Insert from URL Problem


This topic is 3162 days old. Please don't post here. Open a new topic instead.

I am trying to do some web scraping from the Yahoo! Finance site. With Insert from URL, the content I get back says the document has moved: "The document has moved <A HREF="http://finance.yahoo.com/q?p..." etc. If I use the same URL in Open URL, the site opens perfectly. Do some sites not work with Insert From URL? When I was testing the script, it seemed I was able to grab the source at one point. Is the Yahoo URL dynamic?

Thanks

Link to comment
Share on other sites

4 hours ago, laguna92651 said:

is the Yahoo URL dynamic?

If you don't provide the URL, how are we supposed to know?

 

4 hours ago, laguna92651 said:

Do some sites not work with Insert From URL?

All sites "work" with Insert From URL. But web scraping using Insert From URL will not work with all sites. All that the step does is insert the HTML code of the linked page. If the page redirects, then you will end up with a field containing the redirecting code.

Link to comment
Share on other sites

Well, this is "interesting". If I run cURL with the above URL, I get the expected page. However, if I run the same URL through the BE_GetURL() external function (from the BaseElements plugin), I get the "The document has moved ... " message, even though, according to the documentation, this function uses the cURL library.

I don't know what causes the differences in response. I do, however, have a suggestion: try to get your data through an API, if at all possible, and use web scraping only as the last resort, when no API is available.

---

BTW, I seem to get the same page using only http://finance.yahoo.com/q?s=^GSPC - and this works the same with both methods.

Link to comment
Share on other sites

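A "The document has moved ... " response is a server-generated HTTP redirect (status 302, or 301 for a permanent move) rather than an error as such. It indicates that the document that used to be at that URL has moved to a new URL. The server generally includes the new URL, in the Location header and usually in the response body as well, and most browsers will automatically load the page from there instead.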
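To make the redirect behaviour concrete, here is a minimal Python sketch (not FileMaker code). It assumes a small local test server with made-up paths `/old` and `/new` standing in for the real site. The names `DemoHandler`, `NoRedirect`, `fetch_without_following`, `fetch_following` and `run_demo` are hypothetical helpers for this illustration only. A client that does not follow redirects ends up with the "document has moved" body, while one that follows the Location header gets the actual page.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Local stand-in for a site that has moved a page (paths are made up)."""

    def do_GET(self):
        if self.path == "/old":
            body = b'The document has moved <A HREF="/new">here</A>.'
            self.send_response(302)
            self.send_header("Location", "/new")
        else:
            body = b"<html>real page</html>"
            self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects, so the 302 response itself surfaces."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def fetch_without_following(url):
    """Return (status, Location header, body) of the first response only."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url) as resp:
            return resp.status, resp.headers.get("Location"), resp.read().decode()
    except urllib.error.HTTPError as err:
        # With redirects disabled, urllib reports the 302 as an HTTPError.
        return err.code, err.headers.get("Location"), err.read().decode()


def fetch_following(url):
    """Return (status, final URL, body) after following any redirects."""
    with urllib.request.urlopen(url) as resp:
        return resp.status, resp.geturl(), resp.read().decode()


def run_demo():
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    base = f"http://127.0.0.1:{server.server_port}"
    try:
        return fetch_without_following(base + "/old"), fetch_following(base + "/old")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    raw, followed = run_demo()
    print("not following:", raw)
    print("following:    ", followed)
```

If a scraping step hands back the redirect text, one workaround along these lines is to pull the new URL out of the response and request that URL instead.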

Whether or not the URL is dynamic is a question better asked of Yahoo.

Link to comment
Share on other sites

31 minutes ago, laguna92651 said:

I'm not even sure I would know what to google for, other than Filemaker API.

No, it's the third-party data you want to get, and the third party API you want to get it from.

Hint: if their API can provide an XML response, FileMaker can import it directly (with the help of an XSLT stylesheet): http://www.filemaker.com/help/14/fmp/en/html/import_export.18.17.html#1041831
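As a rough illustration of why a structured API response is easier to work with than scraped HTML, here is a small Python sketch (again, not FileMaker code). The XML layout, the symbols `AAA` and `BBB`, and the prices are all invented placeholders, not the format of any real provider; `parse_quotes` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical payload: element names, symbols and values are placeholders.
SAMPLE_XML = """<quotes>
  <quote symbol="AAA"><last>100.50</last></quote>
  <quote symbol="BBB"><last>200.25</last></quote>
</quotes>"""


def parse_quotes(xml_text):
    """Map each quote's symbol attribute to its <last> value as a float."""
    root = ET.fromstring(xml_text)
    return {q.get("symbol"): float(q.findtext("last")) for q in root.iter("quote")}


if __name__ == "__main__":
    print(parse_quotes(SAMPLE_XML))
```

Each value sits in a named element, so there is no need to search through page markup that can change without notice.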

Link to comment
Share on other sites
