How to distinguish between two parties of the same name

dmontano · August 15, 2009

Imagine a DB with 1000 records for people. In this DB there are some people with the exact same name. How do you build the table that holds these people and ensure that they are not the same person - with the restriction that you can not ask them for their social security number?

Some may suggest that when entering the person's record and another record of the exact same name appears - that you just add the record and it does not matter because of the serial ID placed in the DB. I agree that the DB knows they are different - however, how does the person entering the data realize that it is or is not the same person?

This may sound simple - but I think not. It requires some other attributes of the person in order to identify them as being a different person. If so, this forces additional attributes to be recorded in the PEOPLE table to enforce the uniqueness - or the validation that is in fact the person in question.

What attribute could be asked of someone that does not require SS#? If it is the phone number, then that attribute would have to be IN the people table to enforce uniqueness.

How do you build validation in the process of creating or finding a person in a PEOPLE table unless you place the attribute in the people table?

I have a feeling the answers will most likely break "normalization" rules - such as a phone number as the attribute to aid in uniqueness. But, then the User can only have one "Primary" number instead of many numbers held in a separate "PHONES" table.

And what if for some highly unlikely situation where the father AND son have exactly the same name - and their PRIMARY phone is the same.

I hope someone can understand my question - any help appreciated.

bcooney · August 16, 2009

I do not think that there is a way to validate uniqueness. Imho, it requires a human to compare data, perhaps even to make a phone call or send an email.

In a recent system, we added a "match" screen. We have a people table related to addresses which are in turn related to Comm (phones, emails). On this screen the incoming person info is held in a "temp" table. A portal shows possible matches, using a concatenated field of Last Name, Birth Date, City, State, Zip and email. So, a person can be matched to an existing record or the user can create a new person.

dmontano · August 16, 2009

Thanks bcooney,

I have been trying to build a normalized DB and it compounded by the fact that I am using a Super/Sub type structure to help generalize entities. For example, I have a Party Super table with person and organizations as subtables.

This appears to be connected correctly, but as I move into other areas such as a PARTY ROLE supertype that has subtype tables of CUSTOMER, SUPPLIER, Etc. I am trying to see how a person would enter the data.

For example, Jane wants to enter a new customer record. The customer is a person who will play the role of customer. I need to record the people portion of the data in a people table, but record the fact that this is a customer in the customer table. The logical starting point is to add the customers name - which is in the person subtype table.

As you can see the normalization and generalization makes the data entry tricky for me. I am thinking that I need a layout with a button that asks "Is this a Person or and Organization?" If they select "person", I would take them to the person table and add the person information. I need to make sure this "Person" does not already exist.

I can not unique validate the name for the reason stated in original post... there are two "John Doe"s who are different people. So in my quest to build a working prototype of how a person or organization record gets entered is puzzling. Compounding that is the fact that I have CUSTOMER, SUPPLIER tables along with PEOPLE and ORGANIZATION tables.

comment · August 16, 2009

I see two questions here: a conceptual one and a technical one.

The conceptual question - which only you can answer - is what combination of attributes must be identical in order to constitute a duplicate.

The technical question is how to prevent duplicates when these attributes can be spread across multiple tables.

I believe the technical question is solvable (though not necessarily by using the built-in 'Unique' validation). But it cannot be solved before the concept itself is clear.

The order of data entry can be important too. Validation of this type can be only applied after the last significant attribute has been entered.

bcooney · August 16, 2009

Yes, this is definitely both a conceptual and technical question.

But, dmontano (I don't know your name), how about just letting the user find first and if not found, create new?

Your system sounds just like one I did last year. Totally Graham method (subtype/supertype). However, we put Companies in our Contacts table and had "satellite" tables: Vendors, Customers, Dealers, etc.

New Vendor finds first in Contacts. User selects from found set. If Contact does not have _kF_VendorID, then new Vendor record is created and foreign key is populated in Contacts.

dmontano · August 16, 2009

Thanks Comment and bcooney. I am trying to get this solution as functional as possible without scripting. I will have scripts in the solution for sure. I have found in the past that I have built scripts that are a "patch" for my poor table structure and I am forcing myself to stay as "native" as possible.

Once I get things hooked up as soundly as possibly, then I will overlay scripts to facilitate data entry and enforce "rules", etc. I think this is a good strategy, but like everything else with FM and me, only time will tell.

If FM allowed a value list to be created from a related table using a calculation field - much of my troubles would go away. But since I can not "present" these values, it has posed a serious challenge to me. A huge stumbling block.

For example, in the Super/Sub structure it requires a common attribute of the subs to be located in the supertype table. Sounds easy. However, when that attribute is name and it needs to apply to PEOPLE, ORGANIZATIONS (Informal & Formal), it is not easy.

As comment pointed out, what common attributes are required to determine uniqueness... there can be multiple Robert Smiths. If I do not account for this - then the Users would have to create some Name form of Robert Smith 1, Robert Smith 2, etc. That is completely unacceptable for sure.

I accept the fact that Names for people are a cultural construct as pointed out by a few people on this forum, so even if I accept that the name for a person is one field - the problem still exists.

It may seem that the issues I am bringing up are convoluted and not related, but they seem to be interrelated to me. I keep coming back to the same "type" of conundrum.

NYPoke · September 18, 2009

If they are really customers, you need to use a Customer Id. It would be on all correspondence. At the point you mention, you would ask them "Have use done business here before" or something like that. No way around that.

As mentioned above, build a simple look up with lists the Customer Id, Address, most recent transaction...

Robert Smith:

16 123 Main, Cedar Rapids, IA 2009-09-01

33 881 Walnut, Des Moines, IA 2008-12-25

Your data entry person would select from that list, if found.

Your big issue is your database design. I just built something similar. NOT easy at first.

I built a Contact Table (Person, Company, ...). The Client Table relates to the Contact by Contact Id. Not all Contacts will be Clients. Could be Vendors, Employees, ... You build a table for each type.

However, all Client Records base off of a Contact Record.

The iffy part is breaking down between People & Companies.

You can create a Contact Type (1 - People, 2 - Company). You will need a People Table & a Company Table. They each derive their id from the Contact Id.

Tricky part is getting People assigned to Companies (multiple contacts per Company). Not tough, but in general, it needs a Linking Table (Company Id, People Id).

Your interface just needs to handle the new/existing Contacts as either People or Companies. Not too tough, once the dirty work is done in the database.

Once you dig into, you should get the idea.

Jack Rodgers · March 11, 2010

Every unique individual should have a unique id number and all records relating to this individual should link to this id number.

After you have stabilized your db with this concept you can search for individuals using the various fields for name, address, phone number, zip code or just the id number.

Using one unique id number for each person is the only way to maintain sanity...

Sign In

How to distinguish between two parties of the same name

Recommended Posts

dmontano

bcooney

dmontano

comment

bcooney

dmontano

NYPoke

Jack Rodgers

Create an account or sign in to comment

Create an account

Sign in

Browse

Site Support

Forums

Blogs

Marketplace

Activity

Important Information