My current project consists of two tables that make up a mailing database.

Records holds all the companies.

Issue Sent holds all the issues sent, each referenced to a Company Id.

Now when I get updates of sent issues, new companies are mixed in with old companies and I end up with duplicate records.

How do I go about searching my Records table to find identical records? Some are similar but are different clients. Say I have Company XYZ, Company XYZ, Company ZZZ, and Company ZZZ. The second ZZZ may be a different department and would have a different address and contact person, while the second XYZ has the same contact, address, etc.

What I need to do is automatically sort through the exact duplicates, and if any Issue Sent info references a duplicate ID, update that ID to the "master" and delete the duplicates.
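
As a rough sketch of that repoint-then-delete step, written in Python rather than as a FileMaker script: the field names (`id`, `company_id`) and the sample rows are made up for illustration, and it assumes the master and its duplicates have already been identified.

```python
def repoint_and_delete(records, issues_sent, master_id, duplicate_ids):
    """Point Issue Sent rows at the master, then drop the duplicate Records rows."""
    dupes = set(duplicate_ids) - {master_id}  # never delete the master itself
    for issue in issues_sent:
        if issue["company_id"] in dupes:
            issue["company_id"] = master_id
    # Remove the duplicates only after every reference has been moved.
    records[:] = [r for r in records if r["id"] not in dupes]


# Hypothetical sample data: record 2 is an exact copy of record 1.
records = [
    {"id": 1, "company": "Company XYZ"},
    {"id": 2, "company": "Company XYZ"},
    {"id": 3, "company": "Company ZZZ"},
]
issues_sent = [
    {"issue": "A", "company_id": 1},
    {"issue": "B", "company_id": 2},
    {"issue": "C", "company_id": 3},
]

repoint_and_delete(records, issues_sent, master_id=1, duplicate_ids=[2])
```

The order matters: the references are updated first, so no Issue Sent row is ever left pointing at a deleted record.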

But there's a catch that makes it even more complicated. Some duplicates, while obvious duplicates (same address; address is mandatory, after all, it's a mailing list), may not have all their info in sync.

Copy 1 may have the address, first name, and last name.

Copy 2 may have the address, title, last name, and phone number.

I need to make sure that when I clean up duplicates, any extra info that a copy has and the master doesn't gets put into the master.

Attached is a zip of a build of my current FileMaker file.


Well, really I'm just trying to remove duplicates and make sure the remaining single record has any extra info that may have been spread among the duplicates. A duplicate is a record that shares a majority of key fields: same address, same company name, same city, same state, same zip, and same title, first name, middle name, and last name (but if a couple of those fields are blank, it's still considered a copy).
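
One possible reading of that rule, sketched in Python: the company and address fields must match after normalizing case and spacing, the name fields must not conflict, and a blank on either side counts as compatible rather than as a mismatch. The field names and the `min_person_matches` threshold are my own assumptions, not anything from the FileMaker file.

```python
ADDRESS_FIELDS = ("company", "address", "city", "state", "zip")
PERSON_FIELDS = ("title", "first_name", "middle_name", "last_name")


def norm(value):
    """Lowercase and collapse whitespace so trivial differences don't count."""
    return " ".join((value or "").lower().split())


def is_duplicate(a, b, min_person_matches=1):
    # Company and address fields must all agree.
    for field in ADDRESS_FIELDS:
        if norm(a.get(field)) != norm(b.get(field)):
            return False
    matches = 0
    for field in PERSON_FIELDS:
        x, y = norm(a.get(field)), norm(b.get(field))
        if not x or not y:
            continue  # blank on either side: compatible, but not evidence
        if x != y:
            return False  # two different non-blank values: different contact
        matches += 1
    return matches >= min_person_matches


# Hypothetical sample records.
copy1 = {"company": "Company XYZ", "address": "12 Main St", "city": "Springfield",
         "state": "IL", "zip": "62701", "first_name": "John", "last_name": "Smith"}
copy2 = {"company": "Company XYZ", "address": "12 Main St", "city": "Springfield",
         "state": "IL", "zip": "62701", "title": "Mr", "last_name": "smith",
         "phone": "555-0100"}
other_dept = dict(copy1, address="99 Oak Ave")
other_contact = dict(copy2, last_name="Jones")
```

With this reading, `copy1` and `copy2` count as duplicates, while a different address or a conflicting last name keeps the records separate.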

And if one of the copies has a telephone number but the other copies don't, the new master record would have the telephone number or other auxiliary info.
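
A minimal sketch of that fill-in-the-blanks merge, in Python with hypothetical field names. It assumes all fields except `id` are text, and it never overwrites a value the master already has.

```python
def merge_into_master(master, copies):
    """Return a copy of master with its blank fields filled from the duplicates."""
    merged = dict(master)
    for copy in copies:
        for field, value in copy.items():
            if field == "id":
                continue  # the master keeps its own ID
            master_blank = not (merged.get(field) or "").strip()
            copy_has_value = bool((value or "").strip())
            if master_blank and copy_has_value:
                merged[field] = value
    return merged


# Hypothetical sample data.
master = {"id": 1, "address": "12 Main St", "title": "",
          "first_name": "John", "last_name": "Smith", "phone": ""}
copy = {"id": 2, "address": "12 Main St", "title": "Mr",
        "first_name": "", "last_name": "Smith", "phone": "555-0100"}

merged = merge_into_master(master, [copy])
```

Here `merged` keeps ID 1 and the first name from the master, and picks up the title and phone number from the copy.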

The problem of determining whether a record is a duplicate is fairly complicated, and cannot always be done by scripting rules alone (especially if the data has inconsistencies). A human eye may be needed to sift through similar records to determine whether a pair of records are actually duplicates (maybe one has an inconsistent format) or whether the two are actually different companies (two different companies might have the same or similar address, or the same name).
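
To narrow down what the human eye has to look at, one option is to score pairs of records by text similarity and list only the close ones for review. This is only a sketch using Python's standard `difflib`; the `0.85` threshold, the choice of fields, and the sample rows are assumptions you would need to tune against real data.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a, b):
    """Ratio between 0 and 1 describing how alike two strings are."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def review_candidates(records, threshold=0.85):
    """Return (id_a, id_b, score) for pairs similar enough to need a human look."""
    pairs = []
    for a, b in combinations(records, 2):
        key_a = f"{a['company']} {a['address']}"
        key_b = f"{b['company']} {b['address']}"
        score = similarity(key_a, key_b)
        if score >= threshold:
            pairs.append((a["id"], b["id"], round(score, 2)))
    return pairs


# Hypothetical sample data: records 1 and 2 differ only in address format.
records = [
    {"id": 1, "company": "Company XYZ", "address": "12 Main Street"},
    {"id": 2, "company": "Company XYZ", "address": "12 Main St."},
    {"id": 3, "company": "Company ZZZ", "address": "99 Oak Avenue"},
]

candidates = review_candidates(records)
```

Comparing every pair gets slow on a large table, so in practice you would probably compare only records that share a zip code or similar.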

For these reasons, you should see if your process can be changed so that you're only working from one source database and two files don't need to be merged.

It's also helpful if the data entry process can allow searches (or portal matches) for any new companies, just to make sure they're not already in the system.

I'm importing from a CSV.

My bosses make a list in Word of the addresses that issues were sent to, new mixed with old, and then I have to convert those mailing addresses into a CSV. Then I import the CSV. But this means that existing records will end up duplicated. So I need to find a way to clean up existing duplicates and deal with new duplicates when importing.
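
One way to deal with duplicates at import time is to check each CSV row against the existing records before anything is added. The sketch below is in Python with made-up column names and sample data; it matches on company and address only, so a matched row is a candidate for merging or review, not something to throw away automatically.

```python
import csv
import io

KEY_FIELDS = ("company", "address", "city", "state", "zip")


def match_key(row):
    """Normalized tuple of the key fields, ignoring case and extra spaces."""
    return tuple(" ".join((row.get(f) or "").lower().split()) for f in KEY_FIELDS)


def split_import(csv_file, existing_records):
    """Separate CSV rows into brand-new rows and rows matching an existing record."""
    existing = {match_key(r): r for r in existing_records}
    new_rows, matched = [], []
    for row in csv.DictReader(csv_file):
        key = match_key(row)
        if key in existing:
            matched.append((existing[key]["id"], row))  # merge or review these
        else:
            new_rows.append(row)
    return new_rows, matched


# Hypothetical sample data standing in for the Records table and the CSV file.
existing_records = [
    {"id": 1, "company": "Company XYZ", "address": "12 Main St",
     "city": "Springfield", "state": "IL", "zip": "62701"},
]
csv_text = (
    "company,address,city,state,zip\n"
    "company xyz,12  Main St,Springfield,IL,62701\n"
    "Company ZZZ,99 Oak Ave,Springfield,IL,62702\n"
)

new_rows, matched = split_import(io.StringIO(csv_text), existing_records)
```

In this example only the Company ZZZ row comes back as new; the Company XYZ row is paired with existing record 1 despite the different capitalization and spacing.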

