Jump to content
Claris Engage 2025 - March 25-26 Austin Texas ×

This topic is 5141 days old. Please don't post here. Open a new topic instead.

Recommended Posts

Posted

I have an excel file to import into FM. It has many duplicates, but they are different records. I need to assign an same Customer ID to all of the duplicates with the same name.

Jack = Customer ID 1

Jack = Customer ID 1

Jim = Customer ID 2

Bob = Customer ID 3

Bob = Customer ID 3

Billy = Customer ID 4

Jack = Customer ID 1

When they created the excel file they did not create a customer ID. There are over 10000 rows in this file and about 700 customers total. Can this be done easily? I can not get any script I make to loop through correctly.

CPE1704TKS

Posted

Yes it can be done.

The tricky bit is correctly identifying the duplicates. No doubt your examples have been simplified for the post, but you'll likely end up with false positives and false negatives.

The false positives are two different people with the same name. Very difficult to separate. Often it will require an additional piece of information such as birth date or address. For example, George Miller from New York and George Miller from Maine.

The false negatives are often data entry problems, where the names are spelt incorrectly or have leading or trailing spaces. Example is Elizabeth Windsor and Elisabeth Windsor.

So start by cleaning up the data and if possible include another field in the duplicate identification process.

One very quick way to assign unique ids is to create 2 tables in FMP. Import the data and use the Replace command to set a serial number for the customerID in the first table, don't worry about duplicates at this stage. Each record should get a unique number, even the duplicates.

In the second table, import the same data but keep the customer id field empty. Make a relationship between the two tables using the name field (or whichever field is used to identify the duplicates). Then in the second table use the Replace command in the CustomerID field to insert the related customer id from the first table. The relationship will pick the first matching record, so all the Johns in the second table will have the id of the first John in the table 1.

Now change the relationship to match on the customer id, and delete all records in table1 that do not match a record in table 2.

This technique means that the final customer ids in table 1 won't be sequential (there will be number gaps where the duplicate records were deleted) but the technique is quick and easy and requires no complex scripting, and you'll end up with a decent relational data structure a the end.

This topic is 5141 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.