get html source without rendering?

lsmall · August 21, 2009

I am currently running an automated script that scrapes data from an HTML-based lab system in my hospital; I collect study data this way. I was wondering whether there is a way to get the HTML source without having to render the page first, as this would, theoretically, cut my data capture time significantly. Right now I use a loop to detect when the page has loaded in the web viewer, then grab the source using GetLayoutObjectAttribute and parse it. It works great, but I'm always looking for ways to shorten the data collection time, and this is the major limiting factor in the script.

Any thoughts would be appreciated.

Thanks

mr_vodka · August 21, 2009

You may be able to produce XML output from that system that FileMaker can then grab.

lsmall · August 21, 2009

I believe you are correct. I have tried this in the past without much luck, probably because my XML knowledge is just not advanced enough; actually, it is almost non-existent. I suppose I would need to create an XSL style sheet, but I really don't have much experience with this.

comment · August 21, 2009

There are many examples of XSLT stylesheets on these forums (both the Importing & Exporting section, and the XML/XSL section).

Another option is to use AppleScript to run some shell commands. IIRC, a combination of curl and textutil can fetch the rendered text from a site.
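
For illustration, the curl and textutil combination might look something like the sketch below. The helper name fetch_text is made up for this example, and the sed fallback is only an assumption for machines without textutil (it strips tags but does not decode entities), so treat it as a starting point rather than a tested recipe.

```shell
#!/bin/sh
# Sketch: fetch a page with curl and reduce it to plain text.
# fetch_text is a hypothetical helper name, not something from the thread.
fetch_text() {
    url=$1
    if command -v textutil >/dev/null 2>&1; then
        # macOS: textutil reads the HTML on stdin and writes plain text to stdout.
        curl -s "$url" | textutil -stdin -stdout -format html -convert txt
    else
        # Assumed fallback elsewhere: crude tag stripping, no entity decoding.
        curl -s "$url" | sed -e 's/<[^>]*>//g'
    fi
}

# Example call (URL taken from the thread):
#   fetch_text "http://www.filemaker.com"
```

If the raw HTML source is what you want to parse, the textutil step can simply be left out.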

bruceR · August 23, 2009

Perform AppleScript

do shell script "curl http://www.filemaker.com"

copy result to cell "Source" of current record
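
For unattended runs, the curl command itself could be hardened a little. This is only a sketch: fetch_source is a hypothetical wrapper name and the 30-second limit is an assumed value, so adjust both to taste.

```shell
#!/bin/sh
# Sketch of a more defensive curl call for automated collection.
# fetch_source is a hypothetical helper; the 30-second limit is an assumed value.
fetch_source() {
    # -s hides the progress meter, -S still prints errors,
    # -f turns HTTP 4xx/5xx responses into a non-zero exit status,
    # -L follows redirects, --max-time caps the whole transfer in seconds.
    curl -sS -f -L --max-time 30 "$1"
}

# Example call (URL taken from the thread):
#   fetch_source "http://www.filemaker.com"
```

The same flags can go straight into the do shell script string. Because do shell script raises an AppleScript error when the command exits with a non-zero status, a failed or timed-out fetch can then be caught in a try block instead of stalling the loop.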

lsmall · August 24, 2009

Thanks for all the help. I will give it a go and see if I can get the AppleScript to work for me.

I found the Troi URL plug-in which seems like it has promise, so I may just take the easy way out.
