oilcan Posted January 14, 2010 Posted January 14, 2010 I have a find result that I don't understand, and I was hoping somebody could shed some light for me. I manage and develop a FileMaker database for my lab. In one particular text field, among many other values, we have the following two values: HLA-B-1501 DRB1-1501 It was recently pointed out to me that when you do a find in that field for: 1501 You get hits for the HLA-B-1501, but not the DRB1-1501. Perhaps there is a good reason for this, but I am at a loss to explain this behavior. Does anybody have any insight into this?
Matthew F Posted January 14, 2010 Posted January 14, 2010 No, I don't get it. For some reason having a number instead of a letter in front of -1501 makes a difference. In any case the right way to do the search is to use *1501 (asterisk preceding the term). The asterisk means '0 or more characters' and it is one of the standard operators that shows under the 'Insert' button when you are in find mode.
LaRetta Posted January 14, 2010 Posted January 14, 2010 (edited) The reason is how FM interprets word breaks. HLA-B-1501: First dash has alpha before and after it so it is a word break. And it has an alpha before the second dash and a number after so it is considered a word break (it would count as three words) DRB1-1501: Dash is between two numbers so it is NOT a word break so it counts as one word. So HLA-B-1501 is interpreted as HLA B 1501 and [color:green]FM finds the 1501 but DRB1-1501 doesn't find 1501. There have been a few great posts on the subject by Comment but I don't can't seem to produce the full links. They are Posts 176047 and 175060 (neither are coming up). I will try to find them later but here is a bit of information I have stored about the word break rules ([color:blue]from Comment) [color:blue]RULE 1 (period .) one.two = 1 word one.2 = 2 words 1.two = 2 words 1.2 = 1 word RULE 2 (slash / hyphen - colon : comma ,) one-two = 2 words 1-two = 2 words one-2 = 2 words 1-2 = 1 word RULE 3 (single quote ') one'two = 1 word one'2 = 2 words 1'two = 2 words 1'2 = 2 words The ampersand and hyphen are word delimiters - but they are not the only ones. There are many more, first and foremost a space, of course. I don't know if there's a comprehensive list, but at least !@%^()≠+[]{}<>≤≥`~|? are all word delimiters. Word delimiters are not words, so you cannot get them with the MiddleWords() function. space * # ; " = $ There are other seemingly strange rules, such as text field with abc 34). If you create calculation for RightWords ( text field ; 1 ), it produces 34. The ) is ignored because it is one of the word breaks. UPDATE: Oh! I'm a colorful puppy today! Edited January 14, 2010 by Guest Corrected green
michael3785 Posted January 16, 2010 Posted January 16, 2010 (edited) Hi guys Found this post interesting as I am having a similar problem but mine does not involve any punctuation. A find for 00000000533006B9BD99CC4D8B1F70AF9DA7783B0700A5B4F3D8CC2B3C4FA1DB76709804EC44000000001BA00000A5B4F3D8CC2B3C4FA1DB76709804EC4400000041D2610000 finds things like 00000000533006B9BD99CC4D8B1F70AF9DA7783B0700A5B4F3D8CC2B3C4FA1DB76709804EC44000000001BA00000A5B4F3D8CC2B3C4FA1DB76709804EC4400000000425B0000 and 00000000533006B9BD99CC4D8B1F70AF9DA7783B0700A5B4F3D8CC2B3C4FA1DB76709804EC44000000001BA00000A5B4F3D8CC2B3C4FA1DB76709804EC4400000041D08C0000 (There are no carriage returns in the actual values) I can make the find work properly by putting a quote at each end of the search string, but this does not address my application as I am actually trying to make a relationship between records using this data and the relationship is matching records with the wrong key on the "right" of the relationship. Are the 0's doing something maybe? Any input much appreciated. Many thanks Michael Edited January 16, 2010 by Guest
IdealData Posted January 16, 2010 Posted January 16, 2010 (edited) That's a long key field! In fact they are 140 characters long. Some time ago I discovered an indexing limitation which seemed to be around 100 chars - but a close inspection of your results would suggest 127 or 128 - both are notable numbers. It's difficult to say exactly as the 127/128th chars fall into a block of zeros. Use a shorter key? Or possibly split the key between 2 fields? Edited January 16, 2010 by Guest
comment Posted January 16, 2010 Posted January 16, 2010 Or convert the hexadecimal values to decimal numbers.
michael3785 Posted January 17, 2010 Posted January 17, 2010 Hi all Many thanks for your help on this. I went with the split key option and this work great. BTW, I am not generating the keys - they are from Outlook using the Productive Outlook Maniplulator plug-in so I don't get to choose the key length. I thought about converting the hex values to decimals but I thought this might allow for collisions where record A was "01234A" and record B was "012341". If I was using a A->1, B->2, type conversion scheme, record A would go to "012341" and collide with record B. I should add, that I tried a workaround in the scripting to use a find on the key field rather than a relationship with it, but putting quotes around the key value when finding made the find take 3-4 seconds (I am searching a table with 200,000 records) so it would not have been scalable for batch scripting. The relation look-up is almost instant so works better for me. Anyway, thanks again for your input. All sorted Michael
LaRetta Posted January 17, 2010 Posted January 17, 2010 (edited) I didn't know there were different types of conversion schemes. I would think mathematics would be straightforward, particularly on something like hexadecimal to decimal, but I could be very wrong. I get different numbers and I'm unsure if you could ever get dups because alpha increments 10, 11, 12 (for A, B & C) etc. and hexidecimal works in 16's. // 012341 1 * (16^0 ) + 4 * (16^1 ) + 3 * (16^2) + 2 * (16^3) + 1 * (16 ^4) + 0 * (16 ^5 ) = 74561 // 01234A 10 * (16^0 ) + 4 * (16^1 ) + 3 * (16^2) + 2 * (16^3) + 1 * (16 ^4) + 0 * (16 ^5 ) = 74570 UPDATE: I suppose if G were in the second position, translating to 16 ... but then we look at single digits in the translation so, in looking at the pattern and not testing thoroughly, dups wouldn't make sense. Edited January 17, 2010 by Guest Changed sentence in update :^)
comment Posted January 17, 2010 Posted January 17, 2010 // 012341 1 * (16^0 ) + 4 * (16^1 ) + 3 * (16^2) + 2 * (16^3) + 1 * (16 ^4) + 0 * (16 ^5 ) = 74561 That is correct (except the redundant parentheses). Of course, with 142 digits you'd want to use a custom function for this. There can be no G in a hexadecimal number.
LaRetta Posted January 17, 2010 Posted January 17, 2010 (edited) I can find nothing in what I'm reading which says which alpha characters are valid in hexadecimal. Can you help me isolate valid alpha and could we ever deal with an invalid hexadecimal in which we should trap for invalid alpha [color:green]or other characters? As for the redundant parentheses, LOL, I just used theories on line for the conversion. Of course they aren't necessary and I appreciate you catching me!! :laugh2: Edited January 17, 2010 by Guest Added green
comment Posted January 17, 2010 Posted January 17, 2010 (edited) The valid symbols are 0-9 and A-F (either upper or lower case may be used). If a hexadecimal number contains other characters, it should be treated the same way as a decimal number that contains non-digit characters. Edited January 17, 2010 by Guest
LaRetta Posted January 17, 2010 Posted January 17, 2010 (edited) This custom function might do the trick: HexToDec ( string ) Case ( Length ( string) ≠ Length ( Filter ( string ; "0123456789abcdefABCDEF" ) ) ; "Invalid" ; Let ( [ x = Left ( string ; 1 ) ; alpha = Position ( "ABCDEF" ; x ; 1 ; 1 ) ; char = Case ( alpha ; alpha + 9 ; x ) ; len = Length ( string ) - 1 ]; If ( Length ( string ) > 0 ; char * 16^ len + HexToDec ( Right ( string ; len ) ) )// end if )// end let )// end case If a hexadecimal number contains other characters, it should be treated the same way as a decimal number that contains non-digit characters. I chose to invalidate the string if it contained invalid characters. If I filtered them out and continued, it wouldn't be the same number any more and could match other converted hexadecimal numbers. Edited January 17, 2010 by Guest
Recommended Posts
This topic is 5480 days old. Please don't post here. Open a new topic instead.
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now