January 26, 200916 yr Hello everyone I'd like to implement a function in Filemaker Advanced 10 would calculate the Levenshtein distance between 2 words ( the Levenshtein distance is a metric for measuring the amount of difference between two sequences). For example, the distance between "Madonna" and "Modanna" is 2. There is a good article on wikipedia: http://en.wikipedia.org/wiki/Levenshtein_distance Unfortunately, I didn't find this feature in BrianDunning and others :-( If anyone has a solution, I'm interested :-)
January 26, 200916 yr Hi Benka, I'm surprised you didn't follow up on your other post. It seems similar (if not the same concept). Having said that, you would need a dictionary attached. And how would you know the result should be Madonna? How would you specify which words should be compared because if it compared every word then there could be hundreds of results changing each word to another. We need to know what exactly you want to accomplish and, more importantly in this instance, why. LaRetta
January 26, 200916 yr Author Hi I'd like to create a function working like this : LevenshteinDistance ( "madonna";"modanna") = 2 LevenshteinDistance ( "GUMBO";"GAMBOL") = 2 LevenshteinDistance ( "ZXRZYXX";"ZXRZZZXZZ") = 4 like this : http://www.merriampark.com/ld.htm Edited January 26, 200916 yr by Guest
January 26, 200916 yr Okay, but I'm stuck on a few rules: If it were reversed, ie, "GAMBOL" ; "GUMBO" ... would it still equal 2 because there was no L on Gumbo? "ZXRZYXX";"ZXRZZZXZZ") = 4 Please explain (in exact sequence) how this can equal 4. I come up with 5. UPDATE: Oh, the X by position matches the second X. Forget the second question then. Edited January 26, 200916 yr by Guest
January 26, 200916 yr Author :waytogo: http://en.wikipedia.org/wiki/Levenshtein_distance Yes, if it were reversed, ie, "GAMBOL" ; "GUMBO", it would be still equal 2 Edited January 26, 200916 yr by Guest
January 26, 200916 yr Custom Function: CompareStringErr Parameters: NewString OldString Let ( [ $posErr = $posErr + 1 ; newChar = Middle (newString ;$posErr ; 1 ) ; oldChar = Middle (oldString ; $posErr ; 1 ) ; match = If ( newChar ≠ oldChar ; TextColor (newChar ; RGB (255 ; 0 ; 0 ) ) ) ] ; If ( $posErr ≤ Length ( newString ) ; match & CompareStringErr( newString ; oldString ) ) ) Then calculation (number): Let ( $posErr = 0 ; Length ( CompareStringErr ( newString ; oldString ) ) ) Given time, I could fine-tune it further ... UPDATE: I totally expect (and hope) that a few Masters here show us how it REALLY should look! Edited January 26, 200916 yr by Guest Added update & removed incorrect sentence
January 26, 200916 yr It appears (to me) that you are simply comparing letter to letter right down the line. If so, a recursive custom function can handle it. But I'm unsure which string takes priority. For instance, you'd need to adjust the calculation portion of my CF (to account if reversing GAMBOL and GUMBO to: Let ( [ add = Case ( Length ( oldString ) > Length ( newString ) ; Length ( oldString ) - Length ( newString ) ) ; $posErr = 0 ] ; Length ( CompareStringErr ( newString ;oldString) ) + add )
January 26, 200916 yr I see there is a lot more to Levenshtein than simple match letter to letter. You might just search here for Levenshtein and you'll get a few other hits.
January 27, 200916 yr Here's my attempt. FileMaker isn't particularly well suited to this sort of thing, but anything is possible with enough trickery. This is implemented via a set of custom functions to port the code on wikipedia to FM. I use global variable repetitions and dynamic variable names to make up the 2D matrix. It seems speedy enough on small words, but I wouldn't suggest trying it on pages on text.... : Levenshtein.fp7.zip
January 27, 200916 yr Author Thanks The Shadow! :thanks: I'm gonna try to understand what you did It seem to be very clever!
Create an account or sign in to comment