Jump to content
View in the app

A better way to browse. Learn more.

FMForums.com

A full-screen app on your home screen with push notifications, badges and more.

To install this app on iOS and iPadOS
  1. Tap the Share icon in Safari
  2. Scroll the menu and tap Add to Home Screen.
  3. Tap Add in the top-right corner.
To install this app on Android
  1. Tap the 3-dot menu (⋮) in the top-right corner of the browser.
  2. Tap Add to Home screen or Install app.
  3. Confirm by tapping Install.

Identifying "duplicate" contacts

Featured Replies

Apologies in advance if this is not the right forum - please recommend a better one B)-)

This must come up fairly often: how do you identify "duplicates" in a file of people containing name, address, city, state, zip, and maybe telephone #? My definition of "duplicate" is inherently loose - it tolerates a certain amount of discrepancy due to abbreviations, typos, and omitted info, as well as discrepancies due to a change of address. Thus, any two records would have a certain *probability* of referring to the same individual. A human would make the final call, but I want something that would greatly narrow the field beforehand.

This sort of logic would seem to be well beyond the ability of FM scripting to implement, so I thought there might be a plug-in that did this kind of task.

Thanks,

Chap

You can create a calculated field that aggregates all of the criteria you specify for possible dupes. Say you want to look for all records that have the same first and last name. Create a calc field that concatenates these fields ( or any others you want), then create a self join based on that field (say you named the relationship "dupeFinder" as below). Another calc field would then be used to identify the records:

Case((IsEmpty(FirstName) or IsEmpty(LastName));"Empty";

Case(

Count(dupeFinder::FullName) = 1; "Unique";

Case(

GetAsNumber(contactID) = GetAsNumber(dupeFinder::contactID); "Original"; "Duplicate"

)

)

)

If the count function returns the value "1", that means there is only one value with a match (the record itself) Otherwise it compares the unique ID field of the current record with the first one in the the related set. If it matches, then the first record created must equal the current record, thus it's the original one, and all others will be flagged duplicate by default.

To relax or constrain your criteria, just change how your calc field concatenates the fields of interest.

(Like if you wanted to use just the first initial and the last name)

Hope this helps.

Dana

  • Author

Thanks, Dana

It turns out there's been a sizeable amount of research done on the subject of "approximate string matching". There are some pretty complex algorithms that can assign a probable match rating that tolerates a certain amount of discrepancy such as letter inversions, substitutions, and omissions. It would be interesting to start with keys such as you describe, and then do "approximate matching" to expand the matching set to include minor typos & abbreviations.

Maybe there's a product here! B)-)

Chap

Create an account or sign in to comment

Important Information

By using this site, you agree to our Terms of Use.

Configure browser push notifications

Chrome (Android)
  1. Tap the lock icon next to the address bar.
  2. Tap Permissions → Notifications.
  3. Adjust your preference.
Chrome (Desktop)
  1. Click the padlock icon in the address bar.
  2. Select Site settings.
  3. Find Notifications and adjust your preference.