Grouping duplicates

ghengis · August 10, 2007

Hi,

I have a file with 100 records and I have found 10 duplicates records. I would like to be able to group the duplicates.

10 duplicate records, grouped with a unique id. Producing 5 groups. group ID's 1-5.

I know I have to find my dup's first then sort them. I'm stuck on setting the Group Id field.

How do I do this?

Thanks in advance.

Ghengis

dreamingmind · August 10, 2007

I think I understand, you have 5 records with twins... I'm not clear on your end goal though. You want the pairs numbered (1,1 2,2 ...etc.). To what end? It may make a difference in which technique would help.

Don

ghengis · August 12, 2007

Yes you got it. I want to group the pairs of dups.

The reason I want to do this is because I want to deal with each pair and find the most complete record and mark the incomplete record for deletion.

What I define as a complete record is a record that has a FirstName, LastName, AddressLine1 and PostCode.

:(

dreamingmind · August 12, 2007

How 'bout something simple like a self relationship based on the fields that identify a dup (probably first and last name?).

Then, in a layout where you find your dups, you can show a portal for each dup that show it and its twin. Delete the portal row you don't want and move on to the next record.

You could script the delete-portal-row to do the delete and omit the record you kept from the found set for a smoother interface.

Don

ghengis · August 12, 2007

sounds good, but say I wanted to automate the deletion process. I don't want to delete the records manually. I would like a script to find the most complete record and delete the incomplete record.

What do you suggest?

David Jondreau · August 12, 2007

You need to make 5 decisions, I'd say forget about the script and do it manually.

What makes one address more complete than another? What if all four fields have values, but they're different? What if one has only the first name missing and the other has only the address line missing. Which is "more complete"?

ghengis · August 12, 2007

Good points to ponder over. I would still like to identify those records that need attention. Its easy to manually spot records if you're only dealing with small amounts of records. What about dealing with thousands?

I would like to identify out of a pair any records that don't have values in AddresLine1 and PostCode and then mark them for attention.

This is why I needed to group the duplicates and then find those that have missing AddressLine1 and Postcode. I understand your points about defining a complete record.And I take your points on board. At this stage as I'm still a novice and I would to like to script the steps I mentioned above.

Group the duplicates with group id's and then compare the dups based on the criteria I mentioned above.

So far I've been able to group the dups and give them ID's. I'm still stuck on comparing the records within a group.

Could I for example create global fields for Addressline1 and PostCode then set those fields with the first values in the group then go to the next record within the group and then do a comparison?

Does this method make sense?

Sorry for dragging this out. As I said earlier I'm a newbie and I have this itch that I need to scratch!.

dreamingmind · August 13, 2007

Well, the whole problem rests with you deciding on the rules for a comparison algorithm. Once you do that the problem will be solved.

If you're in fm9, you have conditional formating in your layout to highlight problem records or fields. Or you could set a number field by evaluating the condition of the fields of interest and increasing it's value based on each problem discovered. Then the number value provides a ranking of degree-of-problem.

But it all rests on your choices.

Don

ghengis · August 13, 2007

Thanks for the advice. Let you know how I get on.

Sign In

Grouping duplicates

Recommended Posts

ghengis

Link to comment

Share on other sites

dreamingmind

Link to comment

Share on other sites

ghengis

Link to comment

Share on other sites

dreamingmind

Link to comment

Share on other sites

ghengis

Link to comment

Share on other sites

David Jondreau

Link to comment

Share on other sites

ghengis

Link to comment

Share on other sites

dreamingmind

Link to comment

Share on other sites

ghengis

Link to comment

Share on other sites

Create an account or sign in to comment

Create an account

Sign in

Browse

Site Support

Forums

Blogs

Marketplace

Activity

Important Information