beckham Posted February 4, 2011 Hello, can anyone advise, please: in FileMaker 11, is it possible to create a find script that finds "similar" records in a set? For example, I have 100 records and I want to find every record where 50% of the data entered is the same. Many thanks for any advice offered.
Russell Barlow Posted February 4, 2011
http://www.filemaker.com/help/html/find_sort.5.9.html#1028377 - KB about using ! for finding duplicates
http://www.filemaker.com/help/html/find_sort.5.2.html#1028377 - KB about Find Requests
The above links should point you in the right direction.
comment Posted February 4, 2011 What exactly would be considered as "50% of the data entered is the same"?
beckham (Author) Posted February 4, 2011 comment wrote: "What exactly would be considered as '50% of the data entered is the same'?" Hi, I have a greeting card database which has text verses entered into a field. Some of these get reused but changed slightly. I need a way of finding records where the verse is reused but is not an exact (!) duplicate; often 20% or more of the words in the verse change to accommodate a different gender, etc. Many thanks.
Lee Smith Posted February 4, 2011 Please update your Profile and include the version of FileMaker and platform (Operating System). Lee
beckham (Author) Posted February 4, 2011 Profile updated, sorry about that, Lee.
Russell Barlow Posted February 4, 2011 beckham wrote: "I need to find a way of looking for records where the verse is reused but it would not be an exact ! duplicate, often 20% or more of the words used in the verse would change to accommodate a different gender etc." You may want to look into using regular expressions, or perhaps a function based on the Levenshtein distance.
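For readers unfamiliar with the Levenshtein distance mentioned above, here is a minimal sketch in Python rather than FileMaker, purely to illustrate the idea. The function names `levenshtein` and `similarity`, and the choice to normalise by the longer length, are assumptions made for this example and are not part of anything suggested in the thread.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b.

    Works on any two sequences: strings (character edits) or
    lists of words (whole-word edits).
    """
    # Keep the shorter sequence in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Hypothetical helper: 1.0 for identical inputs, 0.0 for nothing in common."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest
```

Passing word lists, for example `similarity(verse_1.split(), verse_2.split())`, counts whole-word changes, which may be closer to the "20% of the words change" description than character-level edits.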
Lee Smith Posted February 4, 2011 Click on one of the suspected dates, then right-click and choose "Find Matching Records", and see if that helps. Lee
beckham (Author) Posted February 4, 2011 Thank you, Lee. That's not exactly the result I wanted, but it's far better than I have managed. Many thanks.
Vaughan Posted February 4, 2011 Look into Soundex; it's relatively standard and easy-ish to implement in FMP. Thanks once again to Mr Edoshin. http://edoshin.skeletonkey.com/2006/01/soundex_and_mir.html
comment Posted February 4, 2011
beckham wrote: "I need to find a way of looking for records where the verse is reused but it would not be an exact ! duplicate, often 20% or more of the words used in the verse would change to accommodate a different gender etc."
This is sort of possible, but by no means simple.
First, you need to turn the text into a return-separated list of the words used. Ideally, this process would also eliminate "stop words" (e.g. a, an, the, at, of, etc.). Since you don't have the Advanced version, you cannot use a custom function to split the verse into words, but you could use a repeating calculation field instead.
Next, you define a self-join relationship matching on the word list. This will relate records that share at least one word with the current record.
Finally, you would use portal filtering to show only records that share at least x words with the currently viewed record.
---
Another option would be to replace words that may change with gender etc. with their "root", then look for an exact match (assuming the word order does not change).
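The word-list steps above can be sketched outside FileMaker to show the logic. This is a rough Python illustration only, not the repeating-field, self-join and portal-filter setup itself. The stop-word set contains just the examples named in the post, and the names `word_set`, `shared_words` and `similar_records` are hypothetical.

```python
import re

# Only the stop words named in the post; a real list would be longer.
STOP_WORDS = {"a", "an", "the", "at", "of"}


def word_set(verse):
    """Split a verse into lowercase words and drop the stop words."""
    words = re.findall(r"[a-z']+", verse.lower())
    return {w for w in words if w not in STOP_WORDS}


def shared_words(verse_a, verse_b):
    """Count the distinct non-stop words the two verses have in common."""
    return len(word_set(verse_a) & word_set(verse_b))


def similar_records(verses, current, min_shared):
    """Indexes of the other verses sharing at least min_shared words
    with verses[current] (the role played by the portal filter)."""
    target = word_set(verses[current])
    return [
        i for i, verse in enumerate(verses)
        if i != current and len(target & word_set(verse)) >= min_shared
    ]
```

For example, with the verses "Happy birthday to a wonderful son", "Happy birthday to a wonderful daughter" and "Get well soon", calling `similar_records(verses, 0, 4)` returns `[1]`, because the first two verses share four non-stop words and the third shares none.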
comment Posted February 4, 2011
Russell Barlow wrote: "You may want to look into using regEx expressions or perhaps a function based on the Levenshtein distance."
Lee Smith wrote: "Click on one of the suspected dates, and then Right Click and choose 'Find Matching Records' and see that helps."
Vaughan wrote: "Look into Soundex, it's relatively standard and easy-ish to implement in FMP."
Am I in the right thread?
Lee Smith Posted February 5, 2011
beckham wrote: "i want to find every records where 50% of the data entered is the same."
Lee Smith, on 04 February 2011 - 01:14 PM, said: "Click on one of the suspected dates, and then Right Click and choose 'Find Matching Records' and see that helps."
comment wrote: "Am I in the right thread?"
I don't think so. It works for a lot of simple things like this, and we didn't seem to be gaining on it the other ways. Lee