Jump to content

Troubleshooting my Find Duplicate script


This topic is 6515 days old. Please don't post here. Open a new topic instead.

Recommended Posts

HI folks, I have created a script to find duplicate email addresses. First it finds all records that has something in the field and that part works fine. But when it enters the loop to mark duplicate records based on two fields, I keep getting the same number of records that it found when it was checking to see what records had something in the email address field to start with. I even threw in some show custom dialog boxes to see what was the content of some of my fields during the script process and I can't figure it out.

Note: Neither the DupCheck or Mark Record fields are globals. These two fields as well as the email address field are text fields.

---------------

Find Duplicate Email Addresses

Allow User Abort [ On ]

#

#Find all records with the email address entered

Show All Records

Enter Find Mode [ ]

Set Field [ wps_contacts::Email Address; "*" ]

Perform Find [ ]

Sort Records [ ]

[ No dialog ]

Show Custom Dialog [ Title: "email address records"; Message: Get ( FoundCount ) & " records found."; Buttons: “OK” ]

#

#

Freeze Window

Sort Records [ ]

[ No dialog ]

Go to Record/Request/Page

[ First ]

Replace Field Contents [ wps_contacts::Mark Record; Replace with calculation: " " ]

[ No dialog ]

Set Field [ wps_contacts:up Check; wps_contacts::Email Address ]

#

Loop

Go to Record/Request/Page

[ Next; Exit after last ]

Show Custom Dialog [ Message: wps_contacts:up Check & wps_contacts::Email Address; Buttons: “OK” ]

If [ wps_contacts:up Check = wps_contacts::Email Address ]

Set Field [ wps_contacts::Mark Record; "X" ]

Else

Set Field [ wps_contacts:up Check; wps_contacts::Email Address ]

End If

End Loop

Perform Find [ ]

#

If [ Get ( LastError ) = 401 ]

Show Custom Dialog [ Title: "Writer's Productivity Solution"; Message: "No records found."; Buttons: “OK” ]

Exit Script [ ]

End If

#

#

Show Custom Dialog [ Title: "Record Deletion"; Message: "Do you want to delete " & Get ( FoundCount ) & " duplicate record(s)?"; Buttons:

“OK”, “Cancel” ]

If [ Get ( LastMessageChoice ) = 2 ]

Exit Script [ ]

End If

#

Show Custom Dialog [ Title: "Record Deletion"; Message: Get ( FoundCount ) & " duplicate record(s) deleted?"; Buttons: “OK” ]

#

Enter Browse Mode

Link to comment
Share on other sites

Are you aware of the following:

finding duplicates can be done by performing a find and placing an exclamation mark in the field you wish to check

This will return all records with the duplicate value. When you sort the found set by that field, the duplicates stand out clearly. You can then create a looping script which deletes the duplicates.

The only extra thing to do is to create a relationship within the file itself based on the field to be checked.

In your script, you set a variable to the result of: exact (duplicate value; :):related duplicate value by selfjoin). Then, you start a loop and check the result and if it returns a 0, the record is a duplicate. You then delete the record, reset the variable and repeat the check for the next record. Or, if it returns a 1, just go to the next record and reset the variable.

Obviously, there are other ways of doing this, but if you need the loop, this is one way.

I hope I understood correctly and that this helps!

Link to comment
Share on other sites

Neither the DupCheck or Mark Record fields are globals

I suspect that is the problem. You need a global field (or a script variable) to hold a value, so that you can compare it to the next record's value. Something like:

...

Sort Records [ by Address ]

Go to Record [ First ]

Set Field [ gAddress; "" ]

Loop

If [ gAddress = Address ]

Set Field [ Duplicate ; 1 ]

Else

Set Field [ Duplicate ; "" ]

Set Field [ gAddress ; Address ]

End If

Go to Record [ Next; Exit after last ]

End Loop

...

Link to comment
Share on other sites

True: the variable I mentioned needs to be a global too. And in fact, if you set it so the address without using a relationship, it is even easier. Simply sort by the field to be checked, then set the global in the first record, go to the next and compare the value in the global to the value of the current record. If it is a match, it is a duplicate, hence it can be deleted.

Link to comment
Share on other sites

I have made DupCheck and Marked Record fields global and I still can't mark the records that are duplicate. I do appreciate all the advance that was given but I must be doing something wrong still.

--------Modified script---------

Find Duplicate Email Addresses

--------------------------------

Allow User Abort [ On ]

#

#Find all records with the email address entered

Show All Records

Enter Find Mode [ ]

Set Field [ wps_contacts::Email Address; "*" ]

Perform Find [ ]

Sort Records [ ]

[ No dialog ]

Show Custom Dialog [ Title: "email address records"; Message: Get ( FoundCount ) & " records found."; Buttons: “OK” ]

#

#Finds duplicate records

Enter Find Mode [ Specified Find Requests: Find Records; Criteria: wps_contacts::Email Address: “!” ]

[ Restore ]

Perform Find [ ]

#

#

#Mark all duplicate records

Freeze Window

Go to Record/Request/Page

[ First ]

Replace Field Contents [ wps_contacts::Mark Record; Replace with calculation: "" ]

[ No dialog ]

Set Field [ wps_contacts::Dup Check; wps_contacts::Email Address ]

#

Go to Record/Request/Page

[ Next; Exit after last ]

Set Field [ wps_contacts::Mark Record; " " ]

Loop

If [ wps_contacts::Dup Check = wps_contacts::Email Address ]

Set Field [ wps_contacts::Mark Record; "X" ]

Else

Set Field [ wps_contacts::Dup Check; wps_contacts::Email Address ]

Set Field [ wps_contacts::Mark Record; "" ]

End If

Go to Record/Request/Page

[ Next; Exit after last ]

End Loop

#

#Find marked records based on "X" in Marked Record field

Perform Find [ ]

If [ Get ( LastError ) = 401 ]

Show Custom Dialog [ Title: "Writer's Productivity Solution"; Message: "No records found."; Buttons: “OK” ]

Exit Script [ ]

End If

#

#

#Ask user if they want to delete records

June 18, 2006 12:15:49 wps_Contacts.fp7 - Find Duplicate Email Addresses -1-

Find Duplicate Email Addresses

Show Custom Dialog [ Title: "Record Deletion"; Message: "Do you want to delete " & Get ( FoundCount ) & " duplicate record(s)?"; Buttons:

“OK”, “Cancel” ]

If [ Get ( LastMessageChoice ) = 2 ]

Exit Script [ ]

End If

#

#

#When coded, deletes duplicate records

Show Custom Dialog [ Title: "Record Deletion"; Message: Get ( FoundCount ) & " duplicate record(s) deleted?"; Buttons: “OK” ]

#

Enter Browse Mode

Link to comment
Share on other sites

It seems to me you are making this more difficult than it is. I know you are new to the product, but, try to forget your old script and write a new one.

First of all, it is never necessary to Show all records if right afterwards you do a find. The find after all is meant to get a non-all-records set, right?

I also do not see the need to report to the user how many records have been found, because they will be right in front of them.

Third, marking the duplicate records seems unnecessary to me too, what is the purpose of that? Do you want to keep them and check them first? Even then, the find duplicates via ! will give you the set. Admittedly, the originals will be in that set too. That might be why you want isolate just the duplicates?

The fourth script step cluster sounds more like it.

I have a few remarks that might be helpful.

When you run this script, in my humble opinion, the first two steps would have to be: find all duplicate email addresses and sort them by that field. After sorting, Filemaker ends up in the first record. So you do not have to include the Gotofirstrecord step.

Then set the variable to the value of the email address of the first record.

Then start the loop. See below.

A variable is a field, in a sense, that will only live during the running time of the script. After the script has run, the variable is gone and this a big advantage because you do not have to clog up your solution with a multitude of system fields.

Your script would look somewhat like this:

Find duplicates (perform find by !)

Sort (by email address)

Set variable of type global ($$variablename)to value of emailaddress field in first record

Goto next record

Loop

If ($$variablename = emailaddress)

delete record

Else

Set $$variablename to emailaddress

Goto next record (exit after last)

End if

Obviously, you can track for records found, if the duplicates do not exist. That is what you have done in the tail end of your own script.

Try this in a new table, with some simple test records first. Just with some records and some test values. Taking it out of the solution and testing in this way might give you a more transparant view of the functions described. You can also use script debugger to see each step happen. Very useful tool, not only for this purpose, but also to track which script steps take more time that anticipated when you made the design. But that is a different subject..

Hope this makes sense and remember to keep it simple!

Oh and by the way: watch out for the Restore option of a find for which the script sets the value. You might have overlooked this. It will never do what you want if restore is ticked! That might be why your script is causing you head aches!

Link to comment
Share on other sites

I found your problem, shawnj985: you're doing a Sort after the first Find, for no apparent reason, but then you're NOT doing Sort after the second Find, which you MUST do.

There are some other problems such as, why do a replace when you're going to set the Mark Record field anyway in your loop; and if you find no records you exit the script, leaving the user in limbo. But I don't want to get into the details.

Also a pet peeve of mine is the (s) in parentheses; if you really care about such things, then go all the way:

Let( [ found = Get ( FoundCount ) ; s = Case( found > 1 ; "s" ) ] ;

"Do you want to delete " & found & " duplicate record" & s & "?" )

Finally in response to Tricky:

"After the script has run, the variable is gone ..." you say, but then you go on to use a GLOBAL variable in your script, which WILL persist after the script runs. And in this instance there's no need to use a global variable, a script variable will do just fine (for those that don't know what I'm talking about, you should replace "$$" in the post above with "$").

Link to comment
Share on other sites

This topic is 6515 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.