Finding the first x number of duplicate records from table relationship

MartinL · September 7, 2020

I have a table of data where many records may be duplicates based on the 'Name' field, although the data in the other fields differs.

I need to find the first 6 duplicated records based on the 'Name' field and then set a number (incrementing from 1) against each of them in a field used as a flag.

The table may contain as many as 20,000 records with each unique 'Name' value having 0 to 50 duplicates.

Does anyone have any idea how I might achieve this using a script or a custom function?

bcooney · September 7, 2020

Do a find in the name field? You might want to only use the first three letters of the first name and last name to get a broader result.

comment · September 7, 2020

2 hours ago, MartinL said:

I need to find the first 6 duplicated records based on the 'Name' field

That is not quite clear. Do you mean the first 6 duplicates of each Name? Or just the first 6 duplicates of some Name? If the latter, which one? And what determines which is "first"? First in what order?

MartinL · September 8, 2020

Thank you for the reply.

It is up to the first 6 duplicates of each name, and the order is simply the order of creation.

comment · September 8, 2020

Here is a rather simple way to do it:

First, find the duplicates by performing a find for ! in the Name field. Then sort them by Name. Then do:

Go to Record/Request/Page [ First ]
Loop
   If [ $name ≠ YourTable::Name ]
      Set Variable [ $name; Value:YourTable::Name ] 
      Set Variable [ $i; Value:1 ]
   Else
      Set Variable [ $i; Value:$i + 1 ]
   End If
   If [ $i ≤ 6 ]
      Set Field [ YourTable::Flag; $i ] 
   End If
   Go to Record/Request/Page [ Next; Exit after last ]
End Loop
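
For readers who do not use FileMaker, the same loop logic can be sketched in Python. This is only an illustration under assumptions: the function name `flag_first_duplicates`, the `limit` parameter, and the dictionary keys `name` and `flag` are made up for the example, and the input list stands in for the found set after the find and the sort by Name.

```python
def flag_first_duplicates(records, limit=6):
    """Number the first `limit` records of each name, starting from 1.

    `records` is assumed to be a list of dicts with 'name' and 'flag' keys,
    already restricted to duplicates and sorted by name (hypothetical layout).
    """
    current_name = None  # plays the role of $name
    i = 0                # plays the role of $i
    for rec in records:
        if rec["name"] != current_name:
            # A new group of names starts here: restart the counter.
            current_name = rec["name"]
            i = 1
        else:
            i += 1
        if i <= limit:
            rec["flag"] = i
        # Records past the limit are left untouched, as in the script above.
    return records
```

As in the script, a record beyond the sixth in its group keeps whatever value its flag already had.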

Now, there is a way to make this faster by jumping from the 6th record directly to the first record of the next group, using a variation of the "Fast Summaries" method by Mikhail Edoshin. But I doubt you need the added complexity - hopefully you don't need to do this often.

Another option is to define a summary field as Count of Name (or any field that cannot be empty), running, with restart when sorted by name. Then (after finding and sorting) do simply:

Replace Field Contents [ YourTable::Flag; Replace with calculation: If ( YourTable::sRunningCount ≤ 6 ; YourTable::sRunningCount ) ] [ No dialog ]
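
The running-count idea can also be sketched outside FileMaker. The Python below is a rough analogue, not FileMaker code: `running_count_flags` and `limit` are hypothetical names, the list index stands in for creation order, and names are compared exactly, which may differ from how a FileMaker find for ! matches values.

```python
from collections import Counter
from itertools import groupby


def running_count_flags(names, limit=6):
    """Map record index -> flag for records whose name is duplicated.

    `names` is assumed to be listed in creation order (hypothetical input).
    """
    counts = Counter(names)
    # Analogue of the find for duplicates: keep only names seen more than once.
    dups = [(idx, name) for idx, name in enumerate(names) if counts[name] > 1]
    # Python's sort is stable, so creation order is kept inside each name.
    dups.sort(key=lambda pair: pair[1])
    flags = {}
    for _, group in groupby(dups, key=lambda pair: pair[1]):
        # The running count restarts at 1 for every group of equal names.
        for running, (idx, _) in enumerate(group, start=1):
            # An If() with no default result returns empty; None mirrors that.
            flags[idx] = running if running <= limit else None
    return flags
```

Unlike the loop version, this one writes an empty value to the records past the limit, because the calculation is evaluated for every record in the found set.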

Edited September 8, 2020 by comment

MartinL · September 8, 2020

Hi,

Thank you for all of the replies.

I used comment's option, which worked really well even with a dataset of over 50,000 records.

comment · September 8, 2020

22 minutes ago, MartinL said:

worked really well even with a dataset of over 50,000 records.

Good. 50k records is not that much these days, especially if you do it in Form view and start by freezing the window (I should have mentioned this in my answer, but I was too absorbed in the logic of the process).

BTW, what is the purpose here? What makes the first 6 records different from the other duplicates?

MartinL · September 8, 2020

The client needed to run the search so that they could export the found records and use them for a mail campaign. They then ran the code again to find the next batch, and so on until there were none left. Although the Company Name value might be duplicated, the rest of the info, such as the email address, is not. They were trying not to bombard their clients with numerous emails at once.
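
A rough Python sketch of that batching idea, for illustration only and not the FileMaker script the client ran. It assumes each record is a (name, email) pair in creation order and that every record takes part, not only those with duplicated names; `mailing_batches` and `per_name` are hypothetical names.

```python
from collections import defaultdict


def mailing_batches(records, per_name=6):
    """Yield batches holding at most `per_name` records for any one name."""
    remaining = list(records)
    while remaining:
        taken = defaultdict(int)  # how many records of each name are in this batch
        batch, rest = [], []
        for name, email in remaining:
            if taken[name] < per_name:
                taken[name] += 1
                batch.append((name, email))
            else:
                # Held back for a later run.
                rest.append((name, email))
        yield batch
        remaining = rest
```

Each pass corresponds to one run of the flagging script followed by an export; the loop ends once no records are left over.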
