laguna92651 Posted November 29, 2015 I am trying to do some web scraping from the Yahoo! Finance site using Insert From URL. The content I get back says the document has moved: "The document has moved <A HREF="http://finance.yahoo.com/q?p..." etc. If I put the same URL into Open URL, the site opens perfectly. Do some sites not work with Insert From URL? When I was testing the script, it seemed like I was able to grab the source at one time. Is the Yahoo URL dynamic? Thanks
comment Posted November 29, 2015 4 hours ago, laguna92651 said: "is the Yahoo URL dynamic?" If you don't provide the URL, how are we supposed to know? 4 hours ago, laguna92651 said: "Do some sites not work with Insert From URL?" All sites "work" with Insert From URL. But web scraping using Insert From URL will not work with all sites. All that the step does is insert the HTML code of the linked page. If the page redirects, then you will end up with a field containing the redirecting code.
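To illustrate what "a field containing the redirecting code" means in practice, here is a minimal Python sketch (not FileMaker script) that pulls the new address out of such a body. The sample HTML and the `example.com` address are hypothetical, modelled only on the message quoted above, and `redirect_target` is an illustrative helper name, not part of any library.

```python
from html.parser import HTMLParser


class _FirstLinkParser(HTMLParser):
    """Collects the href of the first <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <A HREF=...> matches too.
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def redirect_target(body):
    """Return the link target from a 'document has moved' page, or None."""
    if "document has moved" not in body.lower():
        return None
    parser = _FirstLinkParser()
    parser.feed(body)
    return parser.href


# Hypothetical body, shaped like the message quoted in this thread.
sample = (
    "<HTML><BODY><H1>Moved</H1>"
    'The document has moved <A HREF="http://example.com/new-page">here</A>.'
    "</BODY></HTML>"
)
print(redirect_target(sample))
```

The same idea could be applied to the text a scraping step returns: if a target comes back, request that address instead of treating the body as the page.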
laguna92651 (Author) Posted November 29, 2015 Here is the link to the site. http://finance.yahoo.com/q;_ylt=AtSfJUysKYLIoHTgg2pQ1fEgBrgF;_ylc=X1MDMjE0MjQ3ODk0OARfcgMyBGZyA3VoM19maW5hbmNlX3dlYl9ncwRmcjIDc2EtZ3AEZ3ByaWQDBG5fZ3BzAzEwBG9yaWdpbgNmaW5hbmNlLnlhaG9vLmNvbQRwb3MDMQRwcXN0cgMEcXVlcnkDXkdTUEMsBHNhYwMxBHNhbwMx?p=http%3A%2F%2Ffinance.yahoo.com%2Fq%3Fs%3D^GSPC%26ql%3D0&uhb=uhb2&fr=uh3_finance_vert_gs&s=^GSPC Thank you
comment Posted November 29, 2015 Well, this is "interesting". If I run cURL with the above URL, I get the expected page. However, if I run the same URL inside the BE_GetURL() external function (using the BaseElements plugin), I get the "The document has moved ..." message, although, according to the documentation, this function uses the cURL library. I don't know what causes the difference in response. I do, however, have a suggestion: try to get your data through an API, if at all possible, and use web scraping only as a last resort, when no API is available. BTW, I seem to get the same page using only http://finance.yahoo.com/q?s=^GSPC - and this works the same with both methods. Edited November 29, 2015 by comment
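One way to see why the short URL might be equivalent is to decode the query string of the long one. The sketch below uses Python's standard `urllib.parse`; the `;_ylt=...;_ylc=...` path parameters from the original link are left out for brevity. Reading `p` as "the page being requested" is only an inference from the URL's shape, not something Yahoo documents here.

```python
from urllib.parse import urlsplit, parse_qs

# Query portion of the long URL from this thread; the ";_ylt=...;_ylc=..."
# path parameters are omitted here for brevity.
long_url = (
    "http://finance.yahoo.com/q"
    "?p=http%3A%2F%2Ffinance.yahoo.com%2Fq%3Fs%3D^GSPC%26ql%3D0"
    "&uhb=uhb2&fr=uh3_finance_vert_gs&s=^GSPC"
)

# parse_qs splits the query on "&" and percent-decodes each value.
params = parse_qs(urlsplit(long_url).query)
for name, values in params.items():
    print(name, "=", values[0])
```

The decoded `p` value is itself a short quote URL for the same symbol, which fits the observation that the short form returns the same page.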
laguna92651 (Author) Posted November 29, 2015 Can you point me to some information on how I would get the data with an API?
comment Posted November 29, 2015 It's not fair to ask me to do your Googling for you. Still, I believe this could prove interesting: http://thesimplesynthesis.com/article/finance-apis#yahoo-yql-finance-api
OlgerDiekstra Posted November 29, 2015 A "The document has moved ..." response is a server-generated HTTP redirect (status 302, or another 3xx code) rather than an error as such. It indicates that the document that used to be at that URL has moved to a new URL. The server generally includes the new URL in the response, and most browsers will load the page from there instead. Whether or not the URL is dynamic is a question better asked of Yahoo.
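The difference between a client that follows the redirect and one that does not can be reproduced locally. This is a minimal Python sketch against a toy server on 127.0.0.1, not against Yahoo; the `/old` and `/new` paths, `DemoHandler`, `NoRedirect`, and `fetch` are all made up for the demonstration.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Toy server: /old redirects to /new, anything else returns the page."""

    def do_GET(self):
        if self.path == "/old":
            body = b'The document has moved <A HREF="/new">here</A>.'
            self.send_response(302)
            self.send_header("Location", "/new")
        else:
            body = b"<html>the real page</html>"
            self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


class NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None declines the redirect, so urllib raises HTTPError.
        return None


def fetch(url, follow):
    """Return (status, body) for url, optionally following redirects."""
    if follow:
        opener = urllib.request.build_opener()
    else:
        opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as err:
        return err.code, err.read().decode()


server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
old_url = "http://127.0.0.1:%d/old" % server.server_address[1]

without_follow = fetch(old_url, follow=False)  # what a non-following client stores
with_follow = fetch(old_url, follow=True)      # what a browser ends up showing
server.shutdown()
server.server_close()

print(without_follow)
print(with_follow)
```

A client that does not follow redirects is left holding the short "moved" body, which matches what the field contained in this thread; whether that is actually what happens inside Insert From URL or BE_GetURL() is not established here.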
laguna92651 (Author) Posted November 30, 2015 I didn't mean for you to google it, I thought you just might have something handy. I'm not even sure I would know what to google for, other than Filemaker API. Thanks for your help, I appreciate it.
comment Posted November 30, 2015 31 minutes ago, laguna92651 said: "I'm not even sure I would know what to google for, other than Filemaker API." No, it's the third-party data you want to get, and the third-party API you want to get it from. Hint: if their API can provide an XML response, Filemaker can import it directly (with the help of an XSLT stylesheet): http://www.filemaker.com/help/14/fmp/en/html/import_export.18.17.html#1041831
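Conceptually, the stylesheet's job is to map the provider's XML elements onto records and fields. The Python sketch below shows only that flattening step, not XSLT and not FileMaker's import grammar. The XML shape, element names, and values are entirely hypothetical placeholders; a real API will have its own structure.

```python
import xml.etree.ElementTree as ET

# Hypothetical API response: element names and values are placeholders,
# not taken from any real provider.
sample_xml = """<quotes>
  <quote symbol="AAA"><name>Example One</name><last>1.23</last></quote>
  <quote symbol="BBB"><name>Example Two</name><last>4.56</last></quote>
</quotes>"""


def quotes_to_rows(xml_text):
    """Flatten each <quote> element into one record (a dict of field values)."""
    root = ET.fromstring(xml_text)
    rows = []
    for quote in root.iter("quote"):
        rows.append(
            {
                "symbol": quote.get("symbol"),    # attribute on <quote>
                "name": quote.findtext("name"),   # text of child <name>
                "last": quote.findtext("last"),   # text of child <last>
            }
        )
    return rows


rows = quotes_to_rows(sample_xml)
for row in rows:
    print(row)
```

Each dict here plays the role of one imported record, with its keys standing in for the target fields.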
laguna92651 (Author) Posted November 30, 2015 Okay, thanks for aiming me in the right direction, I'll give it a go.