Jump to content

How to count how many ID's are in a field, after removing duplicates?


This topic is 1617 days old. Please don't post here. Open a new topic instead.

Recommended Posts

In the database I received there are ID numbers for each user.  Sometimes the user's ID is in 10 or 20 rows, depending on how many questions they answered. 

I need to calculate how many different ID numbers there are.  I obviously need to have someone who answered 20 questions only count as 1 person.

So how do I go through the IDNumber field and only count duplicate ID's as a 1?

THANKS!

Link to comment
Share on other sites

Ideally, there would be a parent Users table where each user has a unique record. You can create such table by importing the user IDs into another table where the UserID field is validated as Unique, Validate always.

If you want to calculate this from your existing data, you could define a summary field as List of [UserID],  then do:

ValueCount ( UniqueValues ( sListUserID ) )

Note that like with your previous question, a summary field summarizes the current found set.

 

Link to comment
Share on other sites

I think that's a very good point "comment".  I guess I just need to learn how to do that.  That ID is on every line of the spreadsheet and I need to make it so each user ID has it's own unique record.  I will have to look for some examples on how to do that.   If you know of any good tips on here on how to do that  please share.   

THanks! 

Link to comment
Share on other sites

So I created a new database and imported the data and created an "ID" code for each record.  However there are several lines of data, that have the same user on them, but a unique ID code on each one.  Hope I can separate the data by the question that was asked to them, and get the data, like on how many people said yes to this question, and graph their race, etc.....

 

Link to comment
Share on other sites

If you have a structure of:

Users --< Responses >-- Questions

you can count how many unique users responded to a question by counting the related records from Users. This will count each user only once, even if they responded multiple times.

If you want a more detailed analysis, it would be best to produce a report from the Responses table, sorted by question and by response. Then you can use a summary field to show "how many people said yes to this question" and more.

 

Link to comment
Share on other sites

The only field I have giving me the correct answer is "All Nurses".  It is actually giving me the correct number of nurses.  I am trying to go from there to count the amount of men, women, race, etc in each field.   There are about 1,600 actually users in these fields, but each line for each question has their name, RN number and stuff.  I just simply need to get a count of the data for each field.   Very difficult, but it seems like it would be simple.   

 

So like under the "test" field name I am simply trying to sound how many in the field of "sListGender" answered "Male" and how many answered "Female".  I THOUGHT that would be a simple query, boy was I wrong!  I have a lot to learn in this very powerful program!

Screen Shot 2019-10-18 at 10.43.14 AM.png

Link to comment
Share on other sites

I am afraid you have lost me at this point. I thought you had the problem of summarizing your current flat table solved, and were looking for a better approach by normalizing your data into proper relational structure (where every fact would be recorded only once).

 

Link to comment
Share on other sites

Basically I will be importing a $hit load of data from excel spreadsheets.  I then need to pull data out of them.  What is the average age, gender, what percent are the "races", and so on.    This is like a test part of that I am working on to see if I could do it with all 250,000 users.  Right now I am only working on the about 1,500 users data.  Need to pull data out of it and make some nice fancy pie and flow charts, lol .

Link to comment
Share on other sites

OK, I was out sick for a few days and now I am back.   Basically I have rows of data that I need to get the answers from.  I need to narrow the answers down to only getting one answer per person I need to narrow the data from RN_Number and then field I want the data from.

Example: Find out how many male nurses their are.

RN_Answers::RN_NUMBER  And RN_Answers::Gender

I am not sure if I need to somehow put    UniqueValues ( values {; datatype ; locale } )
Into my valuecount formula or what????

This is what I am currently using, but it doesn't stop duplicates from coming in to the count.  
ValueCount ( FilterValues ( RN_Answers::sListGender ; "Male" ) )
 

So again, PLEASE HELP 🙂

 

Screen Shot 2019-10-24 at 11.34.36 AM.png

Link to comment
Share on other sites

IIUC, you have a "flat" table where the same person can have more than one record, and you want to know how many people are male.

I don't know of a way to do this by a straightforward calculation. You could get the count using a script: for example, find only the Male records, then count the unique values in the ID field.

--
Let me add again that these problems are much easier to solve when you have normalized relational structure.

 

Edited by comment
Link to comment
Share on other sites

The only problem is that the ID field has the same ID in it for however many questions they answered.  It is a mess how they did this.

 

Ok I found one way to filter out the questions!!!  Each question has it's own unique question ID!

RN_Answers::RN_NUMBER  And RN_Answers::Gender AND QUESTION_ID

So if I take the RN_Number and make it Unique to the Question_ID, we should be able to do it.

Now my mind is going crazy on how to add the field Question_ID in there to separate out the answers.

 

15 minutes ago, comment said:

IIUC, you have a "flat" table where the same person can have more than one record, and you want to know how many people are male.

I don't know of a way to do this by a straightforward calculation. You could get the count using a script: for example, find only the Male records, then count the unique values in the ID field.

 

 

Link to comment
Share on other sites

7 minutes ago, 1FilemakerMan said:

The only problem is that the ID field has the same ID in it for however many questions they answered. 

But you already know how to count the unique IDs. It was your original question in this thread and I answered it in my first post here.

 

Link to comment
Share on other sites

I don't understand your reply?  They have Unique Id's, but they appear several times over in the rows for each question they asked.  I found that each question has a unique ID and that is what I need to use to help with the totals.  

Unique_Question_ID per each RN_Number   

Link to comment
Share on other sites

On 10/18/2019 at 6:02 AM, 1FilemakerMan said:

So I created a new database and imported the data and created an "ID" code for each record.  However there are several lines of data, that have the same user on them, but a unique ID code on each one. 

Hi 1FilemakerMan,

If the same User exists as different records in this new table then you did not follow Comment's suggestion of:

"You can create such table by importing the user IDs into another table where the UserID field is validated as Unique, Validate always."

Can you attach your file?  Added:  You did NOT import the User ID field and set it to unique, validate always.

Edited by LaRetta
Link to comment
Share on other sites

Sorry that I am losing you!!!!!     Each question has a unique ID number.  

 

So basically under filtered values I would like to add a third value, which it won't let me, or I just can't figure it out.

(RN_Answers::QuestionID; "Q-034496")  is What I need to add, so it is referring to this quesiton_ID.


 

didn't work either.    
ValueCount ( FilterValues ( ( FilterValues (RN_Answers::sListGender & RN_Answers::QuestionID ; "Q-034496") & (RN_Answers::sListofAllQuestions; "Male" ) ) )

 

Link to comment
Share on other sites

3 minutes ago, comment said:

What is the actual result you want to get here, in what format and for what purpose? (I should have asked this at the very beginning.)

I want to get a count/total for each of the questions these people were asked.  One of them just happens to be what Gender are they, what Ethnicity are they, etc....    I just need to take that data and create pie charts.

All of this data is imported from an Excel file into FileMaker Pro, roughly 50 questions per person, and about  1,500 people.

1. Each question has a unique question ID.
2. Each person has a Unique ID

So I need to find a way to take each person's answer to each question and get total.

For Example:  This gets me the total number of times "Male" appears, but I need to make it unique to the  field of RN_Answers::QUESTION_ID = Q-034496.

ValueCount ( FilterValues ( RN_Answers::sListGender ; "Male" ) )

 

 

 

Link to comment
Share on other sites

11 minutes ago, 1FilemakerMan said:

I just need to take that data and create pie charts.

So you've taken the data and created some pie charts. What next? What will you do with the pie charts?

And what do you do with the data after that? You mentioned getting multiple spreadsheets. Will they be added to the already existing data? Or will you start anew for each batch you import?

 

Link to comment
Share on other sites

1 hour ago, comment said:

So you've taken the data and created some pie charts. What next? What will you do with the pie charts? People will view the data.

And what do you do with the data after that? Keep it.  You mentioned getting multiple spreadsheets. Yes spreadsheets will be imported for each letter of the alphabet. Will they be added to the already existing data? Yes Or will you start anew for each batch you import? NO

 

 

Link to comment
Share on other sites

I am afraid I am still not getting the sense of it. Let me outline two possible strategies: 

  1. Keep your data in a single flat table. To produce the pie chart (or any other statistics), run a script that collects the data ad hoc for the requested view. For this purpose, you'd be well advised to study a method known as Fast Summaries by Mikhail Edoshin. Using this strategy, you should be able to get most, if not all, of your summary values using a single summary field that counts the records.
  2. Normalize your data as suggested earlier.

The way you're going about it now, trying to summarize your flat data using a mixture of summary and calculation fields, is IMHO not viable. You will end up with a ton of extra fields, and there will still be some stats you won't be able to get without running a script.

 

Edited by comment
Link to comment
Share on other sites

This topic is 1617 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.