pteaxwa Posted June 3, 2004

I am fairly new to FileMaker and come from a relational database background. Working with an old data set strewn with errors, I came upon a set of 300+ records out of 2000 that were duplicates of some sort. I set out to delete those records, but quickly saw some hurdles...

I found this post on how to delete duplicates: http://www.afilemakeraffliction.com/list/howto/afa1095.html

I didn't really feel like implementing it as he did to delete the duplicates, so I fudged this solution. For the sake of simplicity, say you have 1 field, id, in a file called Foo.

1. Create a self-relationship with 'id'.
2. Create a calculation field '_count_dupes' that does this: Count(Foo::id).
3. Write a script to loop through all the records:

# begin the script
Go to Record[First]
Freeze Window
Enter Find Mode
Set Field[id, "!"] #not necessary, but should speed up the search
Perform Find, Replace Found Set
Loop
  Loop
    Exit Loop If (_count_dupes <= 1) #could just do = 1, but <= to be safe.
    Delete Record/Request[No Dialog]
  End Loop
  Go to Record[Next, Exit after last]
End Loop
Refresh Window
Go To Record[First]
#end the script

Naturally, after doing the Perform Find you may want to sort the records by some sort of creation date so that you do not delete the first created record...

I haven't seen this particular method posted anywhere. Hopefully it can be of some use to someone, and maybe you will be able to use it or, better yet, improve upon it!
-Queue- Posted June 3, 2004

It appears you're deleting originals and leaving duplicates (unless there are records with more than one dupe, in which case you're randomly deleting records) with this technique. Is that really what you wanted to do?
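To see which records survive the loop in the first post, here is a rough Python simulation of it. This is only a sketch, not FileMaker code: the `serial` values, the sample data, and the function name `run_delete_loop` are assumptions made up for illustration, and it assumes the found set contains every record whose id occurs more than once.

```python
from collections import Counter


def run_delete_loop(found_set):
    """Simulate the posted loop on a list of (serial, id) tuples in found-set order."""
    remaining = list(found_set)
    pos = 0
    while pos < len(remaining):
        # Inner loop: keep deleting the current record while its id is duplicated.
        # After a delete, the following record slides into the current position.
        while pos < len(remaining):
            counts = Counter(rec_id for _, rec_id in remaining)  # stands in for _count_dupes
            if counts[remaining[pos][1]] <= 1:
                break
            del remaining[pos]
        pos += 1  # Go to Record [Next, Exit after last]
    return remaining


# Hypothetical data: a lower serial means the record was created earlier.
sample = [(1, "A"), (2, "B"), (3, "A"), (4, "B"), (5, "A")]
print(run_delete_loop(sample))
```

With this sample, the record left for each id is the one that comes last in the found set, so the earliest-created records are the ones deleted unless the found set is sorted the other way first.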
-Queue- Posted June 3, 2004

It would probably help to use the self-relationship serial technique to flag which records are originals and which are dupes. Then, after performing your find:

If [Status(CurrentFoundCount)]
  Loop
    If [flag = 1]
      Omit Record
    Else
      Go to Record/Request/Page [Next, Exit after last]
    End If
  End Loop
  Delete All Records
End If
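The flag-and-omit idea above can be sketched in Python as follows. This is an illustration under assumptions, not the FileMaker implementation: it assumes each record has a unique `serial` field, that the "original" is the record with the lowest serial for its id, and the function name `split_originals_and_dupes` is hypothetical.

```python
def split_originals_and_dupes(records):
    """Return (kept, deleted) for a list of dicts with 'serial' and 'id' keys."""
    # Lowest serial per id, standing in for the self-relationship lookup.
    lowest = {}
    for rec in records:
        key = rec["id"]
        if key not in lowest or rec["serial"] < lowest[key]:
            lowest[key] = rec["serial"]

    kept, deleted = [], []
    for rec in records:
        flag = 1 if rec["serial"] == lowest[rec["id"]] else 0
        if flag == 1:
            kept.append(rec)      # Omit Record: the original leaves the found set
        else:
            deleted.append(rec)   # still in the found set when Delete All Records runs
    return kept, deleted


# Hypothetical sample data.
records = [
    {"serial": 1, "id": "A"},
    {"serial": 2, "id": "B"},
    {"serial": 3, "id": "A"},
    {"serial": 4, "id": "B"},
    {"serial": 5, "id": "A"},
    {"serial": 6, "id": "C"},
]
kept, deleted = split_originals_and_dupes(records)
print([r["serial"] for r in kept], [r["serial"] for r in deleted])
```

Because originals are omitted from the found set rather than deleted one by one, which record survives no longer depends on the order in which the loop happens to visit them.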
AbsoluteVoice4u Posted August 1, 2004

Deleting duplicates? OK, this is basically what I've been doing to delete duplicates. Whatever field I choose to de-dupe by, let's say Address, I create a temp field by merely adding "_temp", i.e., Address and Address_temp. Then I use the following script (created correctly):

Find All
Sort field "address" in ascending order
Loop
  Copy field "address"
  Go to next record (exit after last)
  Paste to field "address_temp"
End Loop

Then I find all duplicates (of address_temp).
Then I delete all duplicate address_temp records.
Then I delete field address_temp.

All of the above is done with one script. Is this not an expedient avenue? Where could I improve this, please?
Ender Posted August 1, 2004

I don't think your algorithm would work. It looks like it just shifts the address field to the following record, then finds and removes duplicates of that. Just as bad as removing all records following a find of address with !. Worse, because some of those records may not be duplicates in the first place. The algorithm that pteaxwa and Queue have worked out is better.
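A small Python sketch can make Ender's objection concrete. It follows his reading of the script (shift each address into the next record's address_temp, find every record whose address_temp value is duplicated, delete all of them); the function name and the sample addresses are assumptions for illustration only.

```python
from collections import Counter


def shift_and_delete(addresses):
    """Simulate the copy/paste shift followed by deleting all address_temp duplicates."""
    ordered = sorted(addresses)
    # Each record's address_temp receives the address of the record before it.
    temps = [None] + ordered[:-1]
    counts = Counter(t for t in temps if t is not None)
    # A find on "!" returns every record whose address_temp occurs more than once;
    # all of those records are then deleted.
    return [
        addr
        for addr, temp in zip(ordered, temps)
        if temp is None or counts[temp] <= 1
    ]


# Hypothetical sample: "A" and "C" are duplicated, "B" is unique.
print(shift_and_delete(["A", "A", "B", "C", "C"]))
```

In this sample the unique "B" record is deleted because it inherited a duplicated address_temp, while both "C" records survive because their shifted values differ.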
TSC Posted August 12, 2004

We have a demo file available at http://www.fmdeveloper.com/php/tips_and_tricks.php that shows one technique.
cognitdiss Posted July 11, 2005

I tried downloading the file, but the file download would not complete (a problem I have never had before) and when I tried to open the file anyhow it was corrupted (said FM). Also, the images in the PDF were useless to me, as all I could see were vague boxes with way greeked-out text in them. If the text in the images was at all legible I might have been able to make it w/ the FileMaker file, but I couldn't figure out how to get what looked like a script to run in the calculation field without something guiding me besides the text describing the greeked-out image. Looks like a great resource, hope I can use it someday because I really need all the help I can get!

- Dave
Lee Smith Posted July 11, 2005

Hi Dave,

I just verified that there is a problem with this download too. I don't know if you noticed it or not, but Tim Cormier (TSC) hasn't been around for a while, and I don't know if he will catch your post. Why not just leave a feedback note at his site, or send him an email? To get his email, click on his name, TSC, in the column to the left above, and write him off list about the problems.

Lee
yknot Posted November 10, 2005

Hi, I realise this is an old post but I have a question: When deleting dupes, what difference does it make whether I delete the dupe or the original record - aren't they the same??

Regards, yknot
-Queue- Posted November 10, 2005

It depends on whether you are using unique serial numbers for any relationships to child tables. If you are, then deleting the original would orphan any child records based on that relationship, or worse, delete them if the 'delete related records' option is selected in the relationship for the child table.
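A minimal Python sketch of that point, assuming a hypothetical child table whose records point at the parent's unique serial number (the field names, sample data, and function name are made up for illustration):

```python
def affected_children(children, deleted_parent_serials):
    """Return the child records whose parent serial is among the deleted parents."""
    deleted = set(deleted_parent_serials)
    return [child for child in children if child["parent_serial"] in deleted]


# Two parent records share id "A"; only the original (serial 1) has children.
parents = [{"serial": 1, "id": "A"}, {"serial": 2, "id": "A"}]
children = [
    {"name": "child-1", "parent_serial": 1},
    {"name": "child-2", "parent_serial": 1},
]

print(affected_children(children, [2]))  # deleting the dupe
print(affected_children(children, [1]))  # deleting the original
```

Deleting the dupe touches no child records, while deleting the original leaves both children pointing at a serial that no longer exists (or removes them, if related records are deleted along with the parent).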