November 13, 2010 I'm trying to scrape a web page and clean it up. I've subbed out spaces, tabs, pilcrows, and something that showed up in a text editor as a diamond, but I'm still getting a couple hundred line breaks. What character am I missing? I take the page source and apply this code:
Let([ source = GetLayoutObjectAttribute("viewer"; "content"); clean = Substitute(source; [" "; ""]; ["¶"; ""]; [" "; ""]; [" "; ""]) ]; clean)
and before I get to the initial text of -//W3C, I get this: " " and so on, for a ValueCount() of 209. It's driving me nuts!
November 13, 2010 Hey David, give us something to chew on. -- BTW, congrats on the certifications. Your replies are much smarter now. :P
November 13, 2010 It sounds like the literal characters cannot be pasted directly into the Substitute function. In any event, to determine the Unicode code point values of the offending characters, use Show Custom Dialog with a calculation such as:
Code(Middle(clean; 1; 1)) & "," & Code(Middle(clean; 2; 1)) & "etc"
Once the code point values are known, the substitutions can be made using those values instead of the literal characters, via Char(), the inverse of Code():
clean = Substitute(source; [Char(codepoint1); ""]; [Char(codepoint2); ""])
You may also consider filtering the text for alphanumeric characters, brackets, etc. to clip other values, but that may execute slowly. I'm assuming here that simply pasting the characters from the field into the Substitute calculation isn't working. You might also want to look at ScriptMaster: since it supports regular expressions, it is easy to clip all the tags and other characters, and the SM demo file has an excellent example of stripping all tags. Unfortunately, the basic character code functions were only added in FM10, so if this needs to run in earlier versions, you can always store the 'un-pasteable' character values you need to substitute for in global fields.
Edited November 13, 2010 by Guest
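The Code() approach above could be sketched as a single calculation, for example in a Show Custom Dialog message. This is only a rough sketch: it assumes FM10 or later and the web viewer object name "viewer" from the original post, and it inspects just the first three characters one at a time, since Code() on a longer string packs several code points into one number.

```
// Sketch only: list the code points of the first three characters
// of the scraped source so the unknown character can be identified.
Let ( [
  source = GetLayoutObjectAttribute ( "viewer" ; "content" ) ;
  c1 = Code ( Middle ( source ; 1 ; 1 ) ) ;
  c2 = Code ( Middle ( source ; 2 ; 1 ) ) ;
  c3 = Code ( Middle ( source ; 3 ; 1 ) )
] ;
  c1 & ", " & c2 & ", " & c3
)
```

Any value that shows up repeatedly here is a candidate to wrap in Char() inside the Substitute list.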
November 13, 2010 Author comment: When it's Friday at 6pm and I've been stuck on a stupid problem for an hour, I just need to get it into the aether, good examples or no. Funny thing is, I know more about FileMaker from taking the exams. I certainly know what I don't know, which is considerable! fseipel: Thanks, just blanked on Code(). It was line feed/LF/Code 10, but that doesn't paste into FileMaker; instead it pastes as a space/32 (!), which was really throwing me for a loop.
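For anyone landing here later, the fix described above might look roughly like the following. It is a sketch, not the exact calculation that was used: it assumes FM10 or later and the same "viewer" object name, and the Char(13) and Char(9) entries are included on the assumption that returns and tabs should be removed as well. Because Char(10) is generated by the function instead of being pasted, the line feed is not turned into a space.

```
// Sketch only: strip line feeds (and, by assumption, returns and tabs)
// by code point, since a literal line feed does not survive pasting.
Let ( [
  source = GetLayoutObjectAttribute ( "viewer" ; "content" ) ;
  clean = Substitute ( source ;
    [ Char ( 10 ) ; "" ] ;   // line feed, the character identified above
    [ Char ( 13 ) ; "" ] ;   // carriage return, FileMaker's ¶
    [ Char ( 9 ) ; "" ]      // tab
  )
] ;
  clean
)
```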
November 13, 2010 I'm glad you were able to resolve this. I have had this problem before, but didn't remember which character(s) were "un-pasteable" (is that even a word?). It may be only Char(10). It was particularly frustrating for me too, because when you paste this, no alert is given that FM changed what was in the clipboard. One would think any character that can be stored in a field can be stored in a calculation, but that isn't the case. It just occurred to me that you ought to be able to type Char(10) using the keyboard escape (Alt+0010) under Windoze, or alt+fn on a laptop with no numeric keypad, but that doesn't work either: if you type it into the field, it gives Char(13) instead. If you type it into the calculation, it is transformed into Char(32), the space character, which is consistent with what happens when you paste.