Jump to content
Claris Engage 2025 - March 25-26 Austin Texas ×

This topic is 5550 days old. Please don't post here. Open a new topic instead.

Recommended Posts

  • Newbies
Posted

Hi. I have searched everywhere I can think of and I have not seen a solution to this problem. I am using Filemaker Pro Advance 8.5.

I have a file with the following fields: Name, Mailing Address and Age. It contains over 200,000 records. Many people in the database have the same address (husband and wife for example) and I would like to omit or delete the duplicate addresses and only keep one record for each mailing address. I want to keep one person per address- preferably the oldest resident at the address.

I know some of the addresses will not be exactly the same format (example 24-A vs 24a or Ave. vs Av.) and a solution to my problem may not work for those records but I can live with that. ANY IDEAS?

Thanks!

Posted

I'd start by creating a self-join using the address field as the key on both sides. You can then use a portal to literally see how many duplicates of the addresses there are, and get a feeling for the size of the problem.

Posted

Define a new table with exactly the same fields - but set the Mailing Address field's validation to Unique, Validate always.

Go to the old table, show all records, and sort them by Age, descending (or perhaps by Mailing Address and Age, descending).

Go to the new table and import records from the old table.

Posted

You'll have to work out which is the worse situation: some addresses get set multiple letters (ie, you err on the side of leaving some duplicates); or some people might not get letters (ie, err on the side of removing too many duplicates).

If you were sending out electoral information, notice for an AGM or something else important that the people really must get, it might be better to err on the side of leaving some duplicates in there. If however it was requests for donations and cost was an issue but it didn't matter if a couple of people missed out, then be liberal with the removal of duplicates.

Posted (edited)

Just a tip. As you mentioned, addresses can have slightly different forms. What I've found is that the first part of the address is much more likely to be written the same. If you try a bit with calculations you can greatly increase the chances of finding what should be a match. What I've done is take only the first 10 characters, and added the zip code (and maybe the WordCount?, or only the first few characters of the first words). The zip code helps eliminate false positives.

After (of course) eliminating all those which matched exactly on the whole thing. It's a process of elimination; do the full match first, mark them, via a Replace using a self-relationship, then try harder. P.O. Boxes are kind of a problem, and need to be standardized. You can do that in a calculation, or do them more or less separately.

Usually when I've been doing this (way more than I wanted to :-), I was also looking for "same person". Because you do not really want 2 addresses of the same person either; similar but different problem.

What I did was this: a multi-line text calculation of all identifying info. So it would look kind of like this (pseudo code):

Last, First

Phones (stripped to numbers only, each on its own line)

emails (each on its own line)

address (stripped and shortened)

birth date (converted to same form, number is best)

whatever else you have to id a person

Create a self-relationship on that, then Count ( self-relationship ). So anyone who has a match >1 has a possible duplicate on any one of those fields.

P.S. If you use Replace or any operation which changes lots of records, use Save a Copy As to backup the file. Do this before any such action, so you effectively have an "undo." If this takes too long, buy a new computer :-]

Edited by Guest

This topic is 5550 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.