Ugo DI LUCA Posted June 27, 2003
Hi, I've been struggling with this for a while. I'm looking for a way to catch misspellings when the user enters a customer name. It could also be used to check for dupes while entering a new record from globals. The best I could get so far is an extremely long calc of this type: Case(Length(TrimName) >= 8,Replace(TrimName, 8,1,Middle("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1, 1)) & "
ernst Posted June 27, 2003
Hi Ugo, I don't really understand what you want to achieve; maybe you could explain it a bit more. It looks like the Middle("ABCDEFGHIJKLMNOPQRSTUVWXYZ", 1, 1) bits could be replaced with a simple "A", and so forth. So your calc would become: Case(Length(TrimName) >= 8,Replace(TrimName, 8,1,"A") & "
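For readers who don't use FileMaker, here is a rough Python sketch of why that simplification holds. The helpers `middle` and `replace_at` are made-up stand-ins that only mimic how Middle(text, start, size) and Replace(text, start, size, replacement) count positions from 1; they are not FileMaker code, and the sample name is just an illustration.

```python
import string

def middle(text, start, size):
    """Stand-in for FileMaker's Middle: `size` characters from 1-based `start`."""
    return text[start - 1:start - 1 + size]

def replace_at(text, start, size, replacement):
    """Stand-in for FileMaker's Replace: overwrite `size` characters at 1-based `start`."""
    return text[:start - 1] + replacement + text[start - 1 + size:]

ALPHABET = string.ascii_uppercase

# One character taken from position 1 of the alphabet is simply the literal "A".
first_letter = middle(ALPHABET, 1, 1)

# Same shape as the calc: only names of 8 letters or more get their 8th letter replaced.
name = "SCHUMACHER"
key = replace_at(name, 8, 1, first_letter) if len(name) >= 8 else ""
print(first_letter, key)
```

So each Middle(...) call with fixed arguments can be swapped for the letter it returns without changing the resulting key.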
Ugo DI LUCA Posted June 27, 2003
Hi Ernst, thanks for your interest in this. It is directly related to the sample I posted about checking for duplicates while entering new entries. I'm looking for an ingenious calculation that returns a multikey able to catch misspellings. This is different from a Clairvoyance, as the record is not supposed to exist in the db yet. For example, if the user types SHUMACHER, CHUMACHER, SCHUAMCHER or SCHUAMKER (this one's impossible) and then proceeds with the "go entry" script, I'd like the user to be alerted that there is a possible duplicate called SCHUMACHER. Any Ernstein here?
ernst Posted June 27, 2003
Hey Ugo, OK, I understand what you want to do. Seems a hell of a job to me. You have to define how 'equal' the existing names must be before they are put on the list of possible duplicates. You could use length, the first few letters, inversions of letters and presumably many more parameters. But the relations and multikeys involved would personally really put me off, besides the fact that the kind of 'smart guessing' you are after is a science in itself. So, can't you use the built-in spell checker? Just export a list of names and import them into a custom dictionary. I just tried that and it gives usable results. The interface is not as nice as your sample, but it is a lot less work than inventing your own!
Regards, Ernst.
Ugo DI LUCA Posted June 28, 2003
Hi Ernst, I've never worked with the FM dictionary and couldn't find any pointers for this suggestion. Would you mind expanding on it, so I know whether I've been wasting my time...
Nevertheless, here's a first approach in a 500-record sampler. Let's say it gives me 80% of what I asked for, but the calc is awfully long. I even ran into FM's limitations while building it, which was the first time I had seen that alert. The global is now set by script, so that it doesn't trigger too many recalcs, and I was quite surprised at how quickly FM performs the matches. It takes into account little and big inversions, missing characters and additional ones. It is limited to 10-character names at the moment. Have a go and PLEASE report any improvements you think should be made.
SpellChecker.fp5.zip
ernst Posted June 29, 2003
Hey Ugo, I'm playing with your spell checker. I'll let you know my opinion about possible improvements or bugs. I find it impressive that you got so far with it, because it's a difficult problem to tackle and FileMaker is not really (really not?) the tool to do this with, IMO. May I suggest building a non-linear video editing system as your next project?
About FileMaker's spell checker: it's not really the solution you want, I think, but you could have a look at it. Export your name database as tab-separated text, select 'Edit->Spelling->Edit user dictionary', click on the 'Text file' triangle at the bottom of the dialog and import the tab-separated text file from the first step. Your script should then perform the 'check selection' script step with g_name as the target. Clicking 'Learn' in the dialog adds the name to the dictionary; the end user could also select an alternative name from the list. I have not tested whether the spell check dialog also gives some sort of Status(CurrentMessageChoice) output to check whether you cancelled the dialog, but you can easily compare the contents of the g_name field before and after the spell check. Hope this is somehow useful...
Regards, Ernst.
Ugo DI LUCA Posted June 29, 2003
Hi Ernst, thanks for the comments and the details about the FM dictionary. I'll try to look at it.
The current sampler actually finds a match for a related name "SCHUMACHER" even if the user:
- substitutes 1 letter (ACHUMACHER or SCHIMACHER)
- inverts 1 letter or more (SCUHMACHER or CSHMUACHER...)
- omits 1 letter (CHUMACHER or SCHUMCHER) or 2 letters in a row (SCMACHER... but this one is really your "guessing science" statement).
The next step would be to handle one added letter in the global field, leading to SCHUMACHER when the global holds SCHUMACHERY, or to subtract a letter from the indexed key... That leads to the 4 "Matching" functions. That's why I said "80% of what I needed".
There are indeed many adjustments and improvements to be made to this sampler, even if the first results are promising in themselves. I've already corrected the indexed part (changing Name to TrimName in the Replace calcs), but the biggest job would be to rework the calcs in a more intuitive manner. They are currently too long IMO, and I'm sure I'm missing some nested Substitute function, Mod and 10^. My feeling is that the "textToNumber" substitution leading to a range could also be a way to go, combined with the existing permutation, subtraction and omission of any characters in the global text field. For the last function requested, rather than adding 26 letters into the 9 positions of this particular name (which would lead to an additional 260 CR), a "calculated number" range, for the left side as well as for the indexed key, would surely be easier. Some "mathematicians" could also find an easier way to generate a number that guarantees some kind of "uniqueness". As you can probably see, the numbers assigned to each letter of the alphabet are rather non-intuitive in this instance (randomly assigned, I should say).
Thanks anyway for your time, and thanks to anyone, especially the gurus here, who wouldn't mind giving it a go.
Or is this kind of tool totally unnecessary?
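For readers outside FileMaker, the four kinds of error described in this post (substitution, inversion, omission, addition) can be sketched in Python. This is only an illustration under stated assumptions, not the calc from the sample file: the function names `variant_keys` and `possible_duplicates` are made up here, names are compared in upper case, and only a single error per name is handled, so a double inversion such as CSHMUACHER or two missing letters such as SCMACHER stay out of reach.

```python
import string

def variant_keys(name, alphabet=string.ascii_uppercase):
    """Return every string that is one simple typing error away from `name`."""
    name = name.strip().upper()
    keys = set()
    for i in range(len(name) + 1):
        head, tail = name[:i], name[i:]
        # The user omitted a letter: try putting any letter back at position i.
        for letter in alphabet:
            keys.add(head + letter + tail)
        if tail:
            # The user added a letter: drop the letter at position i.
            keys.add(head + tail[1:])
            # The user substituted a letter: try every letter at position i.
            for letter in alphabet:
                keys.add(head + letter + tail[1:])
        if len(tail) > 1:
            # The user inverted two neighbours: swap positions i and i + 1.
            keys.add(head + tail[1] + tail[0] + tail[2:])
    keys.discard(name)
    return keys

def possible_duplicates(typed, known_names):
    """List the known names that equal `typed` or sit one error away from it."""
    typed_key = typed.strip().upper()
    keys = variant_keys(typed_key) | {typed_key}
    return sorted(n for n in known_names if n.strip().upper() in keys)

names = ["SCHUMACHER", "RODRIGUEZ"]
for attempt in ["SHUMACHER", "SCUHMACHER", "SCHUMACHERY", "RODRIGUES", "SCHUAMKER"]:
    print(attempt, possible_duplicates(attempt, names))
```

The set of keys plays the role of the multikey on the global side, and the membership test plays the role of the relationship to the indexed name field. Generating the keys from the typed name keeps the stored names untouched, which is one possible design choice among others.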
ernst Posted June 29, 2003
Hey Ugo, one more thought on your spell checker. Because you activate checking with a button anyway, you could also do all your calculations in a script as opposed to in a calc field. That would mean you could easily use some extra globals to achieve the desired result.
>>Or is this kind of tool totally unnecessary ?
This depends on the kind of morons your target users are...
Regards, Ernst.
Ugo DI LUCA Posted June 29, 2003
Ernst, the left-side global is set by a script, which holds this tremendous calc. The indexed key for the right side is calculated in somewhat fewer steps. What exactly is your idea? I really did hit FM's 3000-line limit, even in the script calculation.
ernst Posted June 29, 2003
Hey Ugo, OK, I see. I thought you were calculating both sides of the relation. But you do the calculation in one big step, and sometimes it may be easier to do a few small steps and/or have some part of the calculation stored in an extra global... But you have obviously thought about this a lot more than I have. Anyway, I'm gonna have a coffee now.
Regards, Ernst.
Ugo DI LUCA Posted June 29, 2003
>>Or is this kind of tool totally unnecessary ? This depends on the kind of morons you're target users are...
This is part of a project I'm on. The users type names all day long. They have been generating duplicates for almost 2 years. At the same time, I'm also involved in another project where customers could be identified as soon as they give their name on the phone. Rodriguez, not Rodrigues; Schumacher, not Chumacher; plus misspellings... Sure, this also depends on how complicated the developer is...
Ugo DI LUCA Posted June 29, 2003
>>>sometimes it my be easier to do a few small steps and/or have some part of the calculation stored in an extra global...
OK, I understand. I even think you made my day... Here's what I have in mind; I will test and report. 4 scripted globals, 1 extra one and 5 relationships.
Global 1: character inversions (the NumbersToText calculation)
Rel 1: Global 1 to Name
Global 2: character omission multikey
Rel 2: Global 2 to Name
Global 3: character substitution multikey
Rel 3: Global 3 to Name
Global 4: character addition multikey
Rel 4: Global 4 to Name
Global 5: c_ValueListItem(FileName,"VL From Rel 1") &" "&c_ValueListItem(FileName,"VL From Rel 2") &" "&c_ValueListItem(FileName,"VL From Rel 3") &" "&c_ValueListItem(FileName,"VL From Rel 4")
Rel 5: Global 5 to Names ----> definitive list of possible duplicates in a portal.
I'm quite sure it won't take longer than it does at the moment. I tested the previous demo on 4,500+ records this morning, and it was really quick. I'll split each calc so that someone can look at it and possibly "simplify" the process. Wonderful. Thanks.
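The plan above (four small checks, then one merged list) can be sketched outside FileMaker as follows. This is a hedged Python illustration, not the FileMaker implementation: the function names are invented, which side of each relationship carries which transformation is an assumption, and the known names are assumed to be stored in upper case.

```python
import string

LETTERS = string.ascii_uppercase

def inversion_keys(name):
    """Like Global 1: swap each pair of neighbouring letters."""
    return {name[:i] + name[i + 1] + name[i] + name[i + 2:]
            for i in range(len(name) - 1)}

def omission_keys(name):
    """Like Global 2: put back a letter the user may have left out."""
    return {name[:i] + c + name[i:]
            for i in range(len(name) + 1) for c in LETTERS}

def substitution_keys(name):
    """Like Global 3: try every letter at every position."""
    return {name[:i] + c + name[i + 1:]
            for i in range(len(name)) for c in LETTERS}

def addition_keys(name):
    """Like Global 4: drop a letter the user may have added."""
    return {name[:i] + name[i + 1:] for i in range(len(name))}

# One entry per check; adding a fifth check only means appending a function.
CHECKS = [inversion_keys, omission_keys, substitution_keys, addition_keys]

def possible_duplicates(typed, known_names, checks=CHECKS):
    """Like Global 5: merge the matches of every check into one list."""
    typed = typed.strip().upper()
    found = []
    for check in checks:
        keys = check(typed)
        for name in known_names:
            if name in keys and name not in found:
                found.append(name)
    return found

known = ["SCHUMACHER", "RODRIGUEZ"]
print(possible_duplicates("SCUHMACHER", known))
print(possible_duplicates("RODRIGUES", known))
```

Keeping each check in its own small function mirrors the idea of one global and one relationship per kind of error, so a single check can be inspected or simplified without touching the others.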
ernst Posted June 29, 2003
Good idea, Ugo. The nice thing about your proposed method is that you can easily add an extra check if you find it necessary, and that you don't have to delve into one long and complicated calc. Good luck!
Ernst
Ugo DI LUCA Posted June 29, 2003
The updated version has moved to the sample section: Matching Mispellings