Scraping within Frame?


_henry_

This topic is 6007 days old. Please don't post here. Open a new topic instead.

Hello all,

Is there a way for a web viewer to scrape information within a frame? I tried using View Source in IE and Firefox, and both displayed the source fine.

However, when I did the same in the Web Viewer, it returned a message saying that frames are not supported.

I attached the full text of the error message.

Any help would be appreciated. Thank you.

Archive.txt


It's not possible, because the information is not there - it's in another document. It's not possible with a regular browser either.

I am not sure what the attached document means. If you point your web viewer to a simple example using frames, such as http://www.unh.edu/NIS/Courses/Frames/Simple/simple-frame.html you'll see that frames are supported, and that the content returned by GetLayoutObjectAttribute() is identical to what you get by selecting 'View Page Source' in any browser.

If you have problems with a specific site, perhaps you should include its URL.


Hello Comment,

Thank you for your response.

For example, for the website that you mentioned, I wrote a simple script like this to get the "View Source" of the page:

Set Field[Gateway::Testing; GetLayoutObjectAttribute ("myBrowser";"content")]

And it returned this:

You need a frames-capable browser.

FYI, I have IE 7 and Firefox. The page displayed fine in IE 7, but when I viewed it with Firefox 2.0.0.9 it showed the message above.

Moreover, do you know which browser type or engine the Web Viewer uses?


That is not a message. That IS the content of the HTML page you are viewing. You'll get exactly the same thing if you choose 'View Page Source' in your browser.
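To illustrate the point, here is a minimal Python sketch (not FileMaker code) that inspects the source of a frameset page. The HTML sample and the names `FRAMESET_SOURCE`, `FramesetInspector`, and `inspect_frameset` are assumptions made up for this example; the real page may be laid out differently. It shows that the source of a frameset document holds only references to the framed documents plus the `<noframes>` fallback text, not the content of the frames themselves.

```python
from html.parser import HTMLParser

# Hypothetical frameset page, loosely modelled on a simple two-frame layout.
FRAMESET_SOURCE = """<html>
<head><title>Simple frames</title></head>
<frameset cols="30%,70%">
  <frame src="menu.html" name="menu">
  <frame src="content.html" name="main">
  <noframes>You need a frames-capable browser.</noframes>
</frameset>
</html>"""


class FramesetInspector(HTMLParser):
    """Collect frame URLs and the fallback text inside <noframes>."""

    def __init__(self):
        super().__init__()
        self.frame_sources = []
        self.noframes_text = []
        self._in_noframes = False

    def handle_starttag(self, tag, attrs):
        if tag in ("frame", "iframe"):
            # The framed content lives in a separate document at this URL.
            src = dict(attrs).get("src")
            if src:
                self.frame_sources.append(src)
        elif tag == "noframes":
            self._in_noframes = True

    def handle_endtag(self, tag):
        if tag == "noframes":
            self._in_noframes = False

    def handle_data(self, data):
        # Keep only non-blank text that appears inside <noframes>.
        if self._in_noframes and data.strip():
            self.noframes_text.append(data.strip())


def inspect_frameset(html):
    """Return (list of frame src values, noframes fallback text)."""
    parser = FramesetInspector()
    parser.feed(html)
    parser.close()
    return parser.frame_sources, " ".join(parser.noframes_text)
```

Calling `inspect_frameset(FRAMESET_SOURCE)` returns the two `src` values and the fallback sentence. Anything that reads the raw source, whether a script or 'View Page Source', sees that sentence as ordinary page content.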

Web Viewer is just another instance of your OS-native browser: Safari on Mac, Internet Explorer on Windows.


Comment,

As you said, if I view it with "View Source", the message is the same. You can see my attachment. I don't know why that happens.

I also attached a simple file for scraping the URL you gave.

Moreover, if that is the content of the website, is it possible that the Web Viewer is doing a "View Source"?

Thanks

test.gif

Gateway.zip


Sometimes you can get around this by loading only the frame source in the web viewer; I have done this successfully. I have also seen it not work this way (when the frame source is loaded, it checks whether it is inside a frame and, if not, reloads the framed page).

Attached is an example of "breaking" the frames into their own windows.

One thing to remember any time you are scraping a page: you are at the mercy of the code on that page, and there are ways pages can be written that will prevent scraping from working. Even a minor change to the page can force you to rewrite your scripts, so this is a high-maintenance solution.
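As a rough illustration of the idea (in Python, not FileMaker), the frame `src` values found in a frameset page are usually relative, so they have to be resolved against the address of the frameset page before a web viewer can be pointed at them directly. The function name `resolve_frame_urls` and the sample `src` values are hypothetical, chosen only for this sketch.

```python
from urllib.parse import urljoin


def resolve_frame_urls(page_url, frame_sources):
    """Turn the src values of <frame> tags into absolute URLs.

    page_url is the address of the frameset page; frame_sources are the
    raw src attribute values, which may be relative or absolute.
    """
    return [urljoin(page_url, src) for src in frame_sources]


# Frameset page mentioned earlier in the thread; the src values below
# are invented for the example.
PAGE_URL = "http://www.unh.edu/NIS/Courses/Frames/Simple/simple-frame.html"
EXAMPLE_SOURCES = ["menu.html", "/top.html", "http://example.com/other.html"]

if __name__ == "__main__":
    for url in resolve_frame_urls(PAGE_URL, EXAMPLE_SOURCES):
        print(url)
```

Each resulting address can then be loaded on its own, with the caveat already mentioned that some framed pages detect this and reload the frameset.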

Frames.zip


Thank you for your answer and help.

Well, it seems that a program called "User Agent Switcher", made by Chris Pederick, has helped me, especially with a website that does not work directly in Firefox (it can ONLY be opened with IE 6.0 and up).

I will provide his URL for further information:

http://chrispederick.com/work/web-developer/
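For anyone scraping outside a browser, the same trick amounts to sending a different User-Agent header with the request. Below is a minimal Python sketch under that assumption; the constant `IE7_USER_AGENT` and the helper `build_request` are names made up for this example, and the header string is only a typical IE 7 style value, not one confirmed for any particular site.

```python
import urllib.request

# Assumed example of an IE 7 style User-Agent string.
IE7_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"


def build_request(url, user_agent=IE7_USER_AGENT):
    """Build a request that identifies itself with the given User-Agent.

    The request is only constructed here; pass it to
    urllib.request.urlopen() to actually fetch the page.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

Whether a site accepts the request still depends on how that site checks its visitors, so this may not help everywhere.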

Thank you once again. :( :)
