March 15, 2008
I am using the function GetLayoutObjectAttribute ( "wv" ; "content" ) to try to parse info out of a web page. However, the HTML this function returns looks a bit different from what you get by copying and pasting out of the HTML Source tab in Netscape Composer. Basically, a lot (but not all) of the carriage returns are missing. Example below.
NETSCAPE
Cast (Cast overview, first billed only)
FILEMAKER FIELD
Cast (Cast overview, first billed only)
March 15, 2008
It is likely that FileMaker doesn't recognize the line endings in the page source. Perhaps they are Unix line feeds (ASCII 10), whereas FileMaker uses carriage returns (ASCII 13) between lines. Browser engines don't care about them, since they are whitespace agnostic.
What exactly are you doing with the source code after putting it in a FileMaker field? Perhaps there is a better way to get it. If you use AppleScript and a shell script you can very quickly get the source code of a web page. Run this in Script Editor:
    do shell script "curl 'http://fentonjones.com'"
In my experience this is a faster and more reliable method of getting the HTML code. You can set the result into a FileMaker field. Or you could use further AppleScript or command line tools to parse the text. In the latter case you need to force the text into Unix line endings. AppleScript itself is line-ending agnostic, but its do shell script command is made to be more compatible with old-style Mac returns, whereas the Unix commands require Unix line endings. Let me know if you want more info on this.
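As a rough illustration of "forcing the text into Unix line endings", here is a minimal shell sketch. The function name normalize_eol is hypothetical (it is not a standard command), and it assumes a POSIX awk is available. It reads text on standard input and writes the same text with every CRLF pair or bare carriage return turned into a line feed.

```shell
#!/bin/sh
# normalize_eol (hypothetical helper): rewrite stdin with Unix (LF) line endings.
normalize_eol() {
    awk '{
        sub(/\r$/, "")      # CRLF case: drop the CR left at the end of the record
        gsub(/\r/, "\n")    # old-style Mac case: remaining bare CRs become LFs
        print               # print re-adds the LF that ended the record
    }'
}
```

Once the text has passed through a filter like this, line-oriented tools such as grep see one HTML line per line instead of one very long line.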
March 16, 2008 (Author)
Thanks for the response, Fenton. I am not familiar with the curl command. In short, I am using FM to find a specific movie on IMDB and parse out the actors for the film. I was using the grep command to pull out the HTML lines containing the names, and FM to pull out just the names from those lines. The problem is what I mentioned before: there are no carriage returns when FM pulls out the code using GetLayoutObjectAttribute ( "wv" ; "content" ). Any help would be much appreciated if you have time.
Thanks,
David
March 16, 2008
"curl" is a Unix command that returns the source of a URL to the standard output. It is sort of like getting the source of a Web Viewer object, but much faster. It works well with grep. I use it within AppleScript, as I can run that directly from a Perform AppleScript step and then set the results into FileMaker fields.
There are some caveats when working with Unix commands within AppleScript (AS), however. The "do shell script" command runs a Unix command line within AS. By default it returns old-style Mac returns, for compatibility's sake I imagine. But you must have Unix line endings to use grep (or other Unix tools). There is a way to get them: the "without altering line endings" option. Example (run in Script Editor):
    set web_txt to do shell script "curl 'http://imdb.com/find?s=all&q=Daywatch&x=18&y=5'" without altering line endings
But an even better way, when you're trying to parse text, is to use the command: strings
It coerces text to Unix line endings and removes extra lines.
    set web_txt to do shell script "curl 'http://imdb.com/find?s=all&q=Daywatch&x=18&y=5' | strings | grep 'Daywatch'"
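For the actor-parsing step, a pipeline along the following lines might work. This is only a sketch under stated assumptions: the function name extract_names is hypothetical, and it assumes each actor appears as a link of the form <a href="/name/...">Actor Name</a>, which may not match the real IMDB markup. It also uses tr rather than strings to turn carriage returns into line feeds.

```shell
#!/bin/sh
# extract_names (hypothetical helper): print the link text of every
# <a href="/name/..."> anchor found in the HTML read from stdin.
# The /name/ link pattern is an assumption about the page markup.
extract_names() {
    tr '\r' '\n' |                                   # CRs become LFs so grep sees lines
    grep -o '<a href="/name/[^"]*">[^<]*</a>' |      # keep only the matching anchors
    sed 's/<[^>]*>//g'                               # strip the tags, leaving the names
}
```

Inside a do shell script call, the same three commands could follow curl in the pipe, for example curl -s 'URL' | tr ... | grep -o ... | sed ..., with the result set into a FileMaker field one name per line.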