This topic is 5143 days old. Please don't post here. Open a new topic instead.

Posted

I'm trying to scrape a web page and clean it up. I've subbed out spaces, tabs, pilcrows, and something that showed up in a text editor as a diamond. But I'm still getting a couple hundred line breaks. What character am I missing?

I take the page source and apply this code:

Let([
  source = GetLayoutObjectAttribute("viewer"; "content");
  clean = Substitute(source; [" "; ""]; ["¶"; ""]; [" "; ""]; [" "; ""])
];
clean)

and before I get to the initial text of -//W3C, I get this:

"

"

and so on, for a ValueCount() of 209.

It's driving me nuts!

Posted (edited)

It sounds like the literal characters cannot be pasted directly into the substitute command.

In any event, to determine the Unicode code point values of the offending characters, use

Show Custom Dialog

Code(Middle ( clean ; 1; 1 )) & "," &

Code(Middle ( clean ; 2; 1 )) & "etc"

Once the code point values are known, the substitutions can be made with those values instead of the literal characters, using Char(), the inverse of Code():

clean=Substitute(source; [Char(codepoint1); ""]; [Char(codepoint2);""])

You may also consider filtering the text for alphanumeric characters, brackets, etc. to clip other values, but that may execute slowly. I'm assuming here that simply pasting the characters from the field into the Substitute calculation isn't working. You might also want to look at ScriptMaster: it supports regular expressions, so it is easy to clip all the tags and other characters, and the SM demo file has an excellent example of stripping all tags.

Unfortunately, basic character code functions were only added in FM10, so if it needs to run in earlier versions, you can always store the 'un-pasteable' character values you need to substitute for in global fields.

Edited by Guest
Posted

comment: When it's Friday at 6pm and I've been stuck on a stupid problem for an hour, I just need to get it into the aether, good examples or no.

Funny thing is, I know more about FileMaker from taking the exams. I certainly know what I don't know, which is considerable!

fseipel: Thanks, just blanked on Code().

It was line feed (LF, code 10), but that doesn't paste into FileMaker. Instead it pastes as a space, code 32 (!), which was really throwing me for a loop.
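
For anyone landing here later, this is a minimal sketch of the original calculation with the line feed removed by code point rather than by a pasted literal. It assumes FileMaker 10 or later (for Char()) and reuses the "viewer" object name from the first post. The carriage-return and tab entries are illustrative extras, not something confirmed in this thread.

```
Let ( [
  // Page source from the web viewer object named "viewer" (name from the first post)
  source = GetLayoutObjectAttribute ( "viewer" ; "content" ) ;
  clean = Substitute ( source ;
    [ Char ( 10 ) ; "" ] ;   // line feed, the character that would not paste
    [ Char ( 13 ) ; "" ] ;   // carriage return, the same character as ¶
    [ Char ( 9 ) ; "" ] ;    // tab (assumed to be one of the characters already subbed out)
    [ " " ; "" ]             // ordinary space
  )
] ;
  clean
)
```

Because the characters are built with Char() inside the calculation, nothing depends on what the clipboard does to them when pasting.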

Posted

I'm glad you were able to resolve this. I have had this problem before, but didn't remember which character(s) were "un-pasteable" (is that even a word?). It may be only Char(10).

It was particularly frustrating for me too, because when you paste this, no alert is given that FM changed what was in the clipboard. One would think any character that can be stored in a field can be stored in a calculation, but that isn't the case.

It just occurred to me that you ought to be able to type Char(10) using the keyboard escape (Alt+0010) under Windoze, or Alt+Fn on a laptop with no numeric keypad, but that doesn't work either. If you type it into a field, you get Char(13) instead. If you type it into a calculation, it is converted to Char(32), the space character, which is consistent with what happens when you paste.
