Deleting Duplicate Records

pteaxwa · June 3, 2004

I am fairly new to FileMaker and come from a relational database background. Working with an old data set strewn with errors, I came upon a set of 300+ records out of 2000 that were duplicates of some sort. I set out to delete those records, but quickly saw some hurdles...

I found this post on how to delete duplicates:

http://www.afilemakeraffliction.com/list/howto/afa1095.html

I didn't really feel like implementing it the way he did, so I fudged this solution.

For the sake of simplicity, say you have one field, 'id', in a file called Foo.

1. Create a self-relationship on 'id'.

2. Create a calculation field '_count_dupes' defined as Count(Foo::id).

3. Write a script to loop through all the records:

# begin the script

Go to Record[First]

Freeze Window

Enter Find Mode

Set Field[id, "!"] #not necessary, but should speed up the search

Perform Find, Replace Found Set

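Loop #outer loop: steps through the found set

Loop #inner loop: deletes while the current record still has duplicates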

Exit Loop If (_count_dupes <= 1) #could just do = 1, but <= to be safe.

Delete Record/Request[No Dialog]

End Loop

Go to Record[Next, Exit after last]

End Loop

Refresh Window

Go To Record[First]

#end the script

Naturally, after the Perform Find step you may want to sort the found set by some sort of creation date. The loop keeps whichever record of each group it reaches last, so sort newest first if you do not want to delete the first-created record...
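For anyone who finds it easier to reason about the loop outside FileMaker, here is a rough Python sketch of the same idea. It is only an illustration under assumptions: records are modeled as dicts, the `created` field and the sample data are made up, and `count_dupes` stands in for the `_count_dupes` calculation.

```python
def count_dupes(records, rec_id):
    """Stand-in for the _count_dupes calculation: how many records share this id."""
    return sum(1 for r in records if r["id"] == rec_id)


def delete_dupes(records):
    """Mimic the scripted loop: delete the current record while it still has a duplicate."""
    records = list(records)  # work on a copy, in found-set order
    i = 0
    while i < len(records):
        if count_dupes(records, records[i]["id"]) > 1:
            del records[i]  # the next record slides into position i
        else:
            i += 1  # this id is unique now, move to the next record
    return records


# Hypothetical sample data; 'created' is an assumed creation-order field.
table = [
    {"id": "A", "created": 1},
    {"id": "B", "created": 2},
    {"id": "A", "created": 3},
    {"id": "A", "created": 4},
]

kept_oldest_first = delete_dupes(sorted(table, key=lambda r: r["created"]))
kept_newest_first = delete_dupes(sorted(table, key=lambda r: r["created"], reverse=True))
```

In this sketch, walking the records oldest first leaves the newest "A" record (created 4), while walking them newest first leaves the original (created 1). The sort order decides which copy survives.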

I haven't seen this particular method posted anywhere. Hopefully, it can be of some use to someone and maybe you might be able to use it or better yet improve upon it!

-Queue- · June 3, 2004

It appears you're deleting originals and leaving duplicates (unless there are records with more than one dupe, in which case you're randomly deleting records) with this technique. Is that really what you wanted to do?

-Queue- · June 3, 2004

It would probably help to use the self-relationship serial technique to flag which records are originals and which are dupes. Then, after performing your find:

If [Status(CurrentFoundCount)]

Loop

If [flag = 1]

Omit Record

Else

Go to Record/Request/Page [Next, Exit after last]

End If

End Loop

Delete All Records

End If
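As a rough illustration of the flag idea (Python, not FileMaker code), the sketch below assumes each record carries a unique `serial` and treats the record with the lowest serial for its `id` as the original. The field names, the sample data, and that rule for picking the original are all assumptions made for the example.

```python
def flag_originals(records):
    """Return {serial: flag}, with flag 1 for the lowest serial of each id, else 0."""
    first_serial = {}
    for rec in records:
        rid = rec["id"]
        if rid not in first_serial or rec["serial"] < first_serial[rid]:
            first_serial[rid] = rec["serial"]
    return {rec["serial"]: int(rec["serial"] == first_serial[rec["id"]]) for rec in records}


def delete_flagged_dupes(records):
    """Omit flagged originals from the found set, then delete what is left."""
    flags = flag_originals(records)
    found_set = [r for r in records if flags[r["serial"]] == 0]  # originals omitted
    doomed = {r["serial"] for r in found_set}  # the "Delete All Records" step
    return [r for r in records if r["serial"] not in doomed]


# Hypothetical sample data.
table = [
    {"serial": 1, "id": "A"},
    {"serial": 2, "id": "B"},
    {"serial": 3, "id": "A"},
    {"serial": 4, "id": "A"},
]
survivors = delete_flagged_dupes(table)
```

Here the survivors are serials 1 and 2: one record per id, and always the one flagged as the original, regardless of how many dupes a group has.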

AbsoluteVoice4u · August 1, 2004

Deleting duplicates? OK, this is basically what I've been doing to delete duplicates. Whatever field I choose to de-dupe by, let's say Address, I create a temp field by merely adding "_temp" to the name, i.e. Address and Address_temp.

Then I utilize the following script...(created correctly)

Find all....

Sort field "address" in ascending order

Loop

Copy field "address"

go to next record, exit after last

paste to field - "address_temp"

End Loop

Then I find all duplicates (of address_temp)

Then I delete all duplicate address_temp records.

Then I delete field address_temp

All of the above is done with one script.....

Is this not an expedient avenue?

Where could I improve this, please?

Ender · August 1, 2004

I don't think your algorithm would work. It looks like it just shifts the address field to the following record, then finds and removes duplicates of that. Just as bad as removing all records following a find of address with !. Worse, because some of those records may not be duplicates in the first place.
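To make the objection concrete, here is a small Python simulation of the steps as described. The sample addresses are made up, and the reading of the script is an assumption: each record receives the previous record's address in address_temp, and then every record whose address_temp value occurs more than once is deleted.

```python
from collections import Counter

# Hypothetical addresses, already sorted ascending as in the script.
addresses = ["1 Elm St", "1 Elm St", "2 Oak Ave", "3 Pine Rd"]

# The copy / next record / paste loop gives each record the previous record's address.
address_temp = [""] + addresses[:-1]

# A "!" find on address_temp matches every record whose value occurs more than once.
temp_counts = Counter(address_temp)
found = [i for i, t in enumerate(address_temp) if t and temp_counts[t] > 1]

deleted = [addresses[i] for i in found]
remaining = [a for i, a in enumerate(addresses) if i not in found]
```

In this run the deleted records are "1 Elm St" and "2 Oak Ave", and "2 Oak Ave" was never a duplicate.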

The algorithm that pteaxwa and Queue have worked out is better.

TSC · August 12, 2004

We have a demo file available at http://www.fmdeveloper.com/php/tips_and_tricks.php that shows one technique.

cognitdiss · July 11, 2005

I tried downloading the file, but the file download would not complete (a problem I have never had before) and when I tried to open the file anyhow it was corrupted (said FM).

Also, the images in the PDF were useless to me, as all I could see were vague boxes with way greeked-out text in them. If the text in the images had been at all legible I might have been able to make it with the FileMaker file, but I couldn't figure out how to get what looked like a script to run in the calculation field without something guiding me besides the text describing the greeked-out image.

Looks like a great resource, hope I can use it someday because I really need all the help I can get!

- - Dave

Lee Smith · July 11, 2005

Hi Dave,

I just verified that there is a problem with this download too.

I don't know if you noticed it or not, but Tim Cormier (TSC) hasn't been around for a while, and I don't know if he will catch your post. Why not just leave a feedback note at his site, or send him an email? To get his email, click on his name, TSC, in the column to the left above, and write him off list about the problems.

Lee

yknot · November 10, 2005

Hi,

I realise this is an old post but I have a question: When deleting dupes, what difference does it make whether I delete the dupe or the original record - aren't they the same??

Regards,

yknot

-Queue- · November 10, 2005

It depends on whether you are using unique serial numbers for any relationships to child tables. If you are, then deleting the original would orphan any child records based on that relationship, or worse, delete them if the 'delete related records' option is selected in the relationship for the child table.
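A tiny Python sketch of the orphaning case, for illustration only. The parent and child tables, the field names, and the serial values are all hypothetical.

```python
# Hypothetical parent and child tables; children point at a parent's serial.
parents = [
    {"serial": 1, "id": "A"},  # original
    {"serial": 2, "id": "A"},  # duplicate, created later
]
children = [{"child_id": 10, "parent_serial": 1}]


def orphans_after_delete(parents, children, deleted_serial):
    """Return the children whose parent_serial no longer matches any parent."""
    live = {p["serial"] for p in parents if p["serial"] != deleted_serial}
    return [c for c in children if c["parent_serial"] not in live]


orphans_if_original_deleted = orphans_after_delete(parents, children, 1)
orphans_if_dupe_deleted = orphans_after_delete(parents, children, 2)
```

Deleting serial 1 leaves child 10 pointing at nothing, while deleting serial 2 leaves every child attached. The two "A" records look the same, but only one of them is referenced by the child table.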

yknot · November 11, 2005

Got it! Thanks

yknot
