Find Unique records, delete the rest

shady · February 9, 2006

How do I find all UNIQUE records and delete the rest.

Essentially, I am trying to compare two tables by appending them together, and only keeping the changes. So any record that gets carried forward to the next table will get deleted. This way, I only see changes.

Inky Phil · February 9, 2006

Shady

This is what the help file says about finding duplicates and in so doing it identifies the uniques also

Identifying duplicate values using a self-join relationship

This procedure identifies "extra" instances of duplicated records. You specify the criteria that determine which is the primary record.

This procedure uses a self-join relationship and a calculation field referencing the relationship to determine which records are duplicates.

To find duplicate records except the first instance:

1.

If you plan to delete the duplicate records that you find, make a backup copy of the file.

For more information, see Saving and copying files.

2.

Identify a field that determines a unique entity in your file.

For example, in a Contacts database, the Last Name field is probably not a good choice, because you might have several people with the same last name. Social Security Number is a better choice. You can also create a calculation field (returning a text result) that combines data in several fields to make a unique identifier. An example formula is First Name & Last Name & Phone Number.

3.

Define a self-join relationship.

Use your chosen identifying field as the match field in both tables in the relationship. For more information, see About self-joining relationships.

The primary record is the first matching record according to the sort order defined in the relationship.

4.

Define two fields:

Counter, a text field with an auto-entered serial number (select Serial number and accept the default values for Next and Increment by).

Check Duplicates, a calculation field with a text result, with the formula:

If(Counter = table1::Counter, "Unique", "Duplicate")

5.

Choose Records menu > Show All Records.

6.

Click the new Counter field, choose Records menu > Replace Field Contents, and Replace with serial numbers. Again, accept the default values. Select Update serial number in Entry Options, and click Replace.

This will assign a serial number to all existing records in your database. Serial numbers will automatically be entered in new records.

7.

Perform a find for Duplicate in the Check Duplicates field.

The first record in any series of duplicates now holds the value "Unique" in the Check Duplicates field, and all duplicate records within the same series are marked "Duplicate".

Important Records with no value in the match field will be flagged as duplicates. Once set up as above, this system will identify duplicate records automatically as they are created.

HTH

Phil

Søren Dyhr · February 9, 2006

In this template are 6 different methods demontrated:

http://www.kevinfrank.com/download/kf-deduping.zip

--sd

mleiser · February 16, 2006

Yes, I saw the method in the help file. I used it and it worked fine. However, it would be nice if I understood it. Can anyone explain to me in English how the self join does it? I also checked out the 6 method file, but again it just does it, no reasoning. At least point me to a good article on it. Something.

Mike

Søren Dyhr · February 16, 2006

Perhaps you should start out with a straightforward relation. Parent - Child ...where you then after having made several related records in a portal ...jumps to the childtables data and make changes to the keyfield (foreignkey) in some of the records and then turn your attention to the main (parent) tables layout ...to see how you strained in the nunber of records related.

A selfjoin works similar it groups ID's identical.

--sd

mleiser · February 16, 2006

I understand and use related files a lot. In the duplicate example, say the key to three records is abc then another 3 def. Each has a count number 1 thru 6. The objective is to get rid of the dups - 2 abc's and 2 def's. When the first abc record (count=1) looks for the related records in the self joined file it should get the first 3 (1,2 and 3). Then I see how record 1 would get unique (since 1 of table 1 = 1 of table 2). But here's where I'm lost. It now looks at record 2 which key abc matches records 1,2 and 3 again so 2 now gets set to unique as well. How do you get the 2nd and 3rd record to set to duplicate? I hope I'm making some sense.

Sign In

Find Unique records, delete the rest

Recommended Posts

shady

Inky Phil

Søren Dyhr

mleiser

Søren Dyhr

mleiser

Create an account or sign in to comment

Create an account

Sign in

Browse

Site Support

Explore

Affiliate Forums

Activity

Important Information