Identifiying Duplicates / 1 Occurrence

LedHead · March 23, 2005

Thanks for reading my dilemma! Im trying to make a list of clients who visited my office this last year. I have a log of test done for my clients, but some clients might come in 5 to 10 times in a week. So I want to find all clients that I've tested but only have one occurrence not duplicates, if I omit duplicates this takes out every occurrence. I hope this make's some scent's, any post would be most helpful.

Thanks for your help!!!

Ugo DI LUCA · March 23, 2005

Hmm...

From the Log Table, create a relationship to the Customer Table, and you should see them once only.

If some info from the log should be visible too in the portal, as I assume this is where you'd want to see the results, then use a 2 new Occurrence of the LogTable.

1) CheckLogsDupe will link the CustomerID to the CustomerID in the LogTable.

2) FilterLogs will ling the LogID to a calculation in the LogTable, that will only return the first (or last as you wish and sorted the relation) log for any Customer.

The calc, call it cFlagFirst could be Case(logID=CkeckLogsDupe;logID;"")

If you'd be filtering from a global date Ranges, these globals should reside in the LogTable and the Range match should only be used from LogTable to CheckLogsDupe

whitemyer · March 27, 2005

I use a method outlined in Filemaker Help. It's kind of hidden in the Help application, but if you go to

Contents > Finding, sorting, and replacing data > Finding records > Finding duplicate values

and look at the bottom of the page, you'll see a link for identifying duplicate values using a self-join relationship.

Søren Dyhr · March 28, 2005

Fastsummaries could also be exploited here ...checkout:

http://www.onegasoft.com

--sd

cbum · March 28, 2005

Whitemyer,

the problem I have with the method described using self-joins, is that it does not work dynamically for found sets.

There is no guarantee that the record marked unique, i.e. the first (or last) found record for a given ID, will be present in a found set later on.

Ugo,

your solution is too kryptic for me. Where are the 2 tables coming from? The description seems to indicate a single one, which is similar to my case, where I have a file with patient records, and individual patients may have 1 or many record entries.

When I have done a find on various criteria, I need to know how many cases (i.e. individual patients) fit those criteria without duplicate counting, and I'm still looking for a workable solution...

whitemyer · March 28, 2005

cbum,

Could you simply display the data in a subsummary sorted by patient?

cbum · March 28, 2005

Yes, but that does not show up in browsing - you have to create reports, right?

This is to rapidly find the patients that fit various criteria in multiple settings - i.e. interactively - not by creating a report for each question.

Søren Dyhr · March 29, 2005

Quote
your solution is too kryptic for me. Where are the 2 tables coming from?

Reread what Ugo writes, it's "Table Occurence" not "Table" !!! !

--sd

cbum · March 29, 2005

Soren,

I'm afraid rereading did not help me.

His first sentence is "From the Log Table, create a relationship to the Customer Table"

He then mentions table occurences in the context of a portal...

It almost sounds as if you guys were referring to a previous discussion, since most of the details are missing, and I'm not following the logic...

Ugo DI LUCA · March 29, 2005

Solution 1 :

From the Log Table, create a relationship to the Customer Table, and you should see them once only.

Solution 2 :

If some info from the log should be visible too in the portal, as *I assumed this is where you'd want to see the results*, then use 2 new Occurrences of the LogTable.

1) CheckLogsDupe is the name of the first new Table Occurrence, which source Table is LogTable, and you will link the CustomerID in CheckLogsDupe to the CustomerID in the LogTable.

2) FilterLogs is the name of the second new Table Occurrence of the LogTable, and you will link the LogID in FilterLogs Table to a calculation in the LogTable, that will only return the first (or last as you wish and sorted the relation) log for any Customer.

The calc, call it cFlagFirst could be Case(logID=CheckLogsDupe;logID;"")

If you'd be filtering from a global date Ranges, these globals should reside in the LogTable and the Range match should only be used from LogTable to CheckLogsDupe

cbum · March 29, 2005

Ugo,

thanks for chiming in.

Am I correct in assuming that the solution 1 & 2 you mention have nothing to do with each other, and #1 is for the case there are 2 separate tables (which is not the case for me)?

And as to #2, I'm still not following your solution. What portal?

I'm trying to find a way that eliminates duplicates for the found set (while leaving one record for each customer/patient)...

If you solution requires going to a new layout that shows a filtered subset of the found set, eliminating dups, that's fine, but I don't follow your architecture...

I guess I'm trying to walk before running here...

Ugo DI LUCA · March 29, 2005

Hi,

I guess the first thing would be that you tell us clearly what you mean by "eliminating" duplicates.

I was myself referring at 2 methods of filtering duplicates, to get a "temporary" list of unique records in a portal, without deleting any record.

See attached the graph for both methods.

Now if you wanted to clear duplicate entries, just use Method1, trigger a Go To Related Records to the UniqueLogs, Show Omitted records and delete them.

comment · March 30, 2005

I believe the question is: How to omit duplicates from the found set?

Since relationships ignore the found set, IMHO no method that relies on relationships will work reliably. For example, the record marked as unique by a relationship, may not be in the found set - but its duplicates might.

If sub-summaries are not an acceptable answer, then the correct way would be to perform the find in the Customers table, using fields from the related Log table.

IOW, in order to create a list of patients treated this year, do NOT search for visits this year. Instead, search for patients that have been visited this year.

Ugo DI LUCA · March 30, 2005

Hi Comment,

Right.

This is why I stated in my first answer that

Quote
If you'd be filtering from a global date Ranges, these globals should reside in the LogTable and the Range match should only be used from LogTable to CheckLogsDupe

which my previous graph didn't accounted for clarity purpose.

comment · March 30, 2005

I agree. But there seem to be 2 questions in this thread. The original poster clearly stated the search criteria, and your first answer provided a full answer to that.

Then cbum said "the problem I have with the method described using self-joins, is that it does not work dynamically for found sets." That is correct. It will not adjust dynamically to any arbitrarily found set. If the found set cannot be translated to a relationship (as you have done with the date range), then it's best to abandon this method and look for a solution elsewhere.

cbum · March 30, 2005

Ugo, Comment,

first, thanks for your time.

- with eliminating dups, I mean only from the found set, not the records themselves. Now, my read of Leadhead's Q. is the same, I don't believe he wants to eliminate dups from the database either. So I'm confused about comment's comment about 2 Q in the thread, since I see my problem as almost identical to the original one, substituting patient's records, where there may be many for each patient, for client visits (which is why I jumped into the thread in the first place).

- Ugo, how does the portal help, and does your solution work in the context of found sets?

I tried to follow the logic of your relationship picture, but there is too much for me the understand the reasoning behind the various links...

- Comment, " the record marked as unique by a relationship, may not be in the found set - but its duplicates might." is spot on my problem.

However, I don't understand "perform the find in the Customers table, using fields from the related Log table", since I only have one table, not 2...

And as to "in order to create a list of patients treated this year, do NOT search for visits this year. Instead, search for patients that have been visited this year", ... :

comment · March 30, 2005

The difference is that the first poster said "trying to make a list of clients who visited my office this last year". The solution offered did NOT use Find to create a found set of visits this last year first, then omitting duplicate clients. Instead, relationships were filtered to make only unique-client visits this year related.

IIUC, you posed another question, which starts from an arbitrary found set. We don't know how you arrived at the found set, and whether relationships can be used instead.

Quote
I don't understand "perform the find in the Customers table, using fields from the related Log table", since I only have one table, not 2...

The solution is simple - create another table, then you will have two...

One table lists the patients, and each patient has one and only one record. The second table logs the visits/treatments, and each visit has one record.

cbum · March 30, 2005

comment,

ok, I see how the record selection in the first post was not necessarily a found set.

Although I still think the most likely read is that he is interested in finding how many patients were tested for x, or y, or z, counting each patient only once, and his database happens to reflect a year of encounters, not that he is selecting the last year and now wants a count of all patients he tested at all.... So this again would be the same as my Q.

Now to your suggestion of creating another table with the list of patients, one record per patient.

I don't see how that helps though, since if the find is performed on this new table ("perform the find in the Customers table"), the table will not necessarily have the relevant data, e.g.

Assume each record has a few result fields (diagnoses, temp, blood pressure etc.), which are typically searched for, and which can change with every record/encounter for a given patient. The "index" record for a given patient that is reflected in the new patient/customer table may or may not have the result of interest.

It seems incredible that Filemaker doesn't seem to have a simple way of filtering for unique instances for any given field (like pt ID) in a search result ...

:

comment · March 30, 2005

Quote
I don't see how that helps though, since if the find is performed on this new table ("perform the find in the Customers table"), the table will not necessarily have the relevant data

No, but the related table does. You can search for records in one table that meet the specified criteria in a related table. The point is, since there is only one record per patient in the Patients table, you cannot find duplicate patients.

cbum · March 30, 2005

Hmmm,

"search for records in one table that meet the specified criteria in a related table"

I'm still chewing on that one, and I have to go to bed...

How does that work? Is that where Ugo's portal comes in?

If I perform the find in the Patients table, which does not have all the records/findings of the full table with all the dups, how does the search find results not reflected in the patient table? Do you put in the find criteria in the portal fields?

comment · March 30, 2005

Quote
Do you put in the find criteria in the portal fields?

That would be one way, yes. You can input find criteria in a related field in other ways, e.g. by setting a related field in a scripted find.

This may be slightly off-topic, but I believe it needs to be made clear: there really must be a separate Patients table. Consider a regular patient, coming in for a periodic checkup. Is the doctor supposed to enter the patient's name, address, age, billing information etc. every time anew? Or would it be better to enter just the patient's ID and the details of the current visit?

Once you have two tables, Patients and Visits (related by PatientID), it is a matter of what you're looking for: if you say "show me the patients I have treated this quarter", then clearly you are looking for patients, not visits. So the place to look for them is in the Patients table.

ralph.nusser · March 30, 2005

John Mark Osborne has in his DevCon 2003 examples an example to mark the duplicates. You can just omit the marked records:

http://www.databasepros.com/FMPro?-db=re...p=25&-find=

LaRetta · March 30, 2005

cbum,

Instead of thinking about (and chewing on) the information presented, or attempting to keep the vision of it in your head alone, it usually helps to create a dummy file and play with the ideas. It's amazing how clear something becomes when you have a fresh file and can see their ideas in action. Even if you don't understand WHY something works when you start the exercise, you can figure out the logic and principles by the very creation of it.

LaRetta

Søren Dyhr · March 31, 2005

Wow ...... aren't this thread straying? The original question is lost in hijackers attempt to solve thier own problems instead - but the issue was:

Quote
So I want to find all clients that I've tested but only have one occurrence not duplicates, if I omit duplicates this takes out every occurrence.

Which I interpret this way ...I wish to make a list of the clients unaware of their test results - The ones showing up now and then has gradualy been informed of the results, no need to write them anything.

If it's the case is it pretty straight forward to script it or do it as a manual search ...i've chosen to make it scripted:

Enter Find Mode [  ] 

Insert Calculated Result [ clients::ClientName; "!" ] [ Select ] 

Perform Find [  ] 

Show Omitted Only

--sd

Sign In

Identifiying Duplicates / 1 Occurrence

Recommended Posts

LedHead

Ugo DI LUCA

whitemyer

Søren Dyhr

cbum

whitemyer

cbum

Søren Dyhr

cbum

Ugo DI LUCA

cbum

Ugo DI LUCA

comment

Ugo DI LUCA

comment

cbum

comment

cbum

comment

cbum

comment

ralph.nusser

LaRetta

Søren Dyhr

Create an account or sign in to comment

Create an account

Sign in

Browse

Site Support

Forums

Blogs

Marketplace

Activity

Important Information