Getting different results with GetLayoutObjectAttribute in web viewer

hartmut · January 31, 2014

Hi

I am using the formula

GetLayoutObjectAttribute ("page" ; "content" )

to get the page source from a web page

http://www.wordproject.org/bibles/kj/40/18.htm

The calculation gets part of the source and puts it in a field

But when I manually view the page source in Safari, I get the entire page of code.

I am trying to extract different Bible verses, in this case from the above URL.

For example, I would like to get verse 19 and bring it into a field.

The verse shows up in the page source but not in the GetLayoutObjectAttribute ("page" ; "content" ) calc.

How can I get the entire source of the page at the above URL by calculation?

I don't know what I am doing wrong. I need to do this calc a lot and it is very important that I get the full page code. I need to extract different verses.

Thanks for your help here

comment · January 31, 2014

Your question is not quite clear: the URL points to a chapter, not a verse. If I load the URL in a web viewer, I get the HTML code for the entire chapter 18, including verse 19:

...
<br><SPAN class="verse" id="19">19</SPAN> Again I say unto you, That if two of you shall agree on earth as touching any thing that they shall ask, it shall be done for them of my Father which is in heaven.
<br>
...

BTW, there's no shortage of sources for the text of the bible - I don't see why you need to resort to web scraping.

hartmut · February 1, 2014

There wasn't a clear way to point to a URL with a chapter and verse. The website only lets me go to the chapter. I will have to do further processing to extract the verse or verses that I would like to scrape. My goal is to get the verses I want and load them into my database. But my question was basically: "What am I doing wrong that the chapter doesn't fully come up in my web viewer, but it does when I manually load the page source directly from my Safari browser?"

Having said that, I will ask again what I asked before

http://www.wordproje...es/kj/40/18.htm

The verse shows up in the page source but not in the GetLayoutObjectAttribute ("page" ; "content" ) calc. (using the web viewer)

How can I get the entire source of the page at the above URL by calculation?

I don't know what I am doing wrong. I need to do this calc a lot and it is very important that I get the full page code. I need to extract different verses. Even though there are many sources of Bible verses, I would like to know how to extract a particular verse or verses. But I am puzzled why the full code doesn't seem to load in my web viewer but does appear in the Safari browser. Unless I can get the full page source, I can't parse the verse I am looking for.

I have a hard time getting a wifi signal because my house was destroyed in a flood. I have to travel a couple of miles to get a wifi signal, so it is very hard to respond to a reply right away. Sorry about that. I always hope someone will answer while I am waiting in the hotspot so I can at least go home with an answer.

Thanks

NB: To further support my reason for asking this question, I have enclosed two files. One is the text I get in the field set by calculation from the web viewer in FileMaker. The second is the page source copied from the Safari browser by going to the page source and copying it.

The sources .zip

comment · February 1, 2014

Where exactly are you using the GetLayoutObjectAttribute ( "page" ; "content" ) calculation? Is it in a calculation field? If so, is it unstored? What I see in the "from the web viewer .txt" file is that you are not even on the correct web page - so most likely, this is a stored calculation with nothing to trigger it to update.

IMHO, you should use a script to set the web viewer to the correct chapter, wait until it's loaded, then set a text field (or a variable) to the web viewer's content. That's assuming you need to do this at all; if you like, I will give you the text of the entire KJV in a FM-friendly format.

hartmut · February 1, 2014

I thank you for your help

I don't know about stored and unstored. I use it as a Set Field calculation, so I can make a button to execute the script.

I need to do this in other languages, and that is why I am trying to make this happen.

comment · February 1, 2014

"I don't know about stored and unstored. I use it as a set field calculation."

Going by your description alone, it should work - so there must be something else. Does the attached work for you?

Scrape.fp7.zip

BTW, your profile shows v.12: why don't you use the Insert From URL[] script step instead of going through a web viewer?

hartmut · February 1, 2014

Sorry I couldn't get back to you. I had to wait to use the internet today. I didn't know I could do that (Insert From URL). Thanks.

Now I just have to figure out how to get the range of verses I want and then get just those.

BTW, I need to do this because I am using these verses for animations, and they need to be in other languages. Getting a specific verse or several verses will make that process much easier for me, because I need to bring the other languages into Adobe After Effects. So your help is really valuable.

Thanks!

www.youtube.com/watch?v=nAAoLc7gw64

That link shows what I am doing. I need to capture verses or a range of verses in many languages. Just a little thing I am doing.

That is why I need the text in other languages and fonts. As soon as I can capture just those verses, I can really go to town on my project.

hartmut · February 2, 2014

The scrape file is working. Thank you.

I tried to modify the file you gave me, but I can't seem to extract a verse.

Could you help me? Otherwise I am stuck on the extraction.

I am enclosing two files: the modified file and the verse I am trying to extract.

Matthew 18-19 Example.zip

Scrape modified.fmp12.zip

comment · February 3, 2014

See if this helps:

Scrape&Parse.fp7.zip

hartmut · February 3, 2014

Thank you SO much. I will work on understanding it. This is a very happy day for me.

There is much work to do.

Blessings to you

I don't understand why, but when I try the parsing in other languages, sometimes it doesn't work.

The code looks the same, yet there is no result. I have included a file with 7 languages; some work and some don't. I don't know why.

Scrape&Parse check.fmp12.zip

comment · February 4, 2014

I am sorry, I am currently limited to version 11 so I cannot debug your file even if I wanted to.

hartmut · February 4, 2014

Thanks, I appreciate it. You have been most helpful. I will attempt it. Take care. I'll keep you posted. I will keep asking people who know parsing well. I have compared the files and I can't for the life of me figure out why some work and some don't.

LaRetta · February 4, 2014

You are getting pos 0 results on de, it and af and it is because you have an extra space.

18

In Michael's file, there are no spaces before the closing span but in your file on the ones that break, there is a space at the location of the red X and if you remove that bogus space it works.

18X

LaRetta · February 4, 2014

Oh, and on ru ... I do not know html but it produces a Pos yet it still breaks with this type of result:

&nbsp;Истинно говорю вам: что вы свяжете...

When reviewing, it is:

<br><SPAN class="verse" id="18">18</SPAN> &nbsp;Истинно...

None of the other examples or records include a &nbsp; ... why is it here? When I remove it, the record seems to begin to work.

Lee Smith · February 4, 2014

That is a nice catch, my old eyes would have never spotted that.

LaRetta · February 4, 2014

I didn't spot it either, Lee ... when wondering why something breaks, I take the Let() variables and place them one at a time into the calc (commenting out the real calc) so I can view each value and pinpoint exactly where it breaks.

In the offending records, the third variable gave it away - it produced pos of 0 which meant the string being tested did not exist or did not match. Only then did I know to find the string below and look for differences and I STILL did not spot the extra space. But Exact() comparisons failed and Length() was different so I placed my cursor between the chevrons and the characters and backspace-tested until I found the extra space.

I ADORE this kind of stuff; it's like a detective novel (but much better). :yep:
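The approach described above (check the position result, then compare the two strings by equality and length, then hunt for the exact character) can be sketched outside FileMaker as well. The following is a minimal Python illustration, not the calculation from the attached files. The `first_difference` helper and both sample strings are hypothetical, and the position of the stray space (just before the final `>`) is an assumption made for the example.

```python
def first_difference(expected: str, actual: str) -> int:
    """Return the index of the first differing character, or -1 if the strings match."""
    for i, (a, b) in enumerate(zip(expected, actual)):
        if a != b:
            return i
    # One string is a prefix of the other: they differ where the shorter one ends.
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return -1


# Hypothetical marker strings; the stray space in `bad` is assumed for illustration.
good = '<SPAN class="verse" id="18">18</SPAN>'
bad = '<SPAN class="verse" id="18">18</SPAN >'

page = "<br>" + bad + " verse text"

# Step 1: the search fails (Python's -1 plays the role of a position of 0).
print(page.find(good))

# Step 2: an exact comparison fails and the lengths differ.
print(good == bad, len(good), len(bad))

# Step 3: locate the offending character instead of backspace-testing by hand.
i = first_difference(good, bad)
print(i, repr(bad[i]))
```

Printing `repr()` of the character at the reported index makes an otherwise invisible space visible.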

Lee Smith · February 4, 2014

Have you used Wrangler to compare two pages?

You can compare the two pages and it will mark any differences between them, as long as every line is in the same position.

Lee

LaRetta · February 4, 2014

Sure, for larger text comparisons. However I knew there would be many differences in this text and since I knew exactly where the break occurred, I focused there. And I have a tool I use which quickly allows comparison of strings and inserting calculations so I used it instead. :-)

hartmut · February 4, 2014

Thanks for all the hard work that's going on.

It's mysterious, and I thought it might be simple. This is an abnormal text parse, to say the least.

LaRetta · February 4, 2014

Well now you know how it breaks. I should think it wouldn't be difficult to pre-process the html before/during parsing.

I would guess it would NEVER be okay to have a SPACE right before >. Can we count on that? If so this one would be safe:

Substitute ( html ; " >" ; ">" )

As for the &nbsp;, you might be able to use something similar: Substitute ( html ; "> &nbsp;" ; "> " ) but this is all WAG.
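To make the pre-processing idea concrete, here is a rough Python sketch of the same two substitutions followed by a simple verse extraction. It is not the FileMaker calculation from the attached files. The function names `normalize` and `extract_verse` are hypothetical, the sample HTML is trimmed and assumes the stray space sits just before the closing `>`, and the sketch assumes each verse ends at the next `<br>`.

```python
def normalize(html: str) -> str:
    """Apply the two substitutions suggested above before parsing."""
    html = html.replace(" >", ">")        # drop a stray space right before ">"
    html = html.replace("> &nbsp;", "> ")  # drop a non-breaking-space entity after a tag
    return html


def extract_verse(html: str, number: int) -> str:
    """Return the text between the verse marker and the next <br>, or "" if not found."""
    marker = f'<SPAN class="verse" id="{number}">{number}</SPAN>'
    start = html.find(marker)
    if start == -1:
        return ""
    start += len(marker)
    end = html.find("<br>", start)
    if end == -1:
        end = len(html)
    return html[start:end].strip()


# Trimmed, hypothetical sample: verse 18 has both problems, verse 19 is clean.
sample = (
    '<br><SPAN class="verse" id="18">18</SPAN > &nbsp;Истинно говорю вам\n'
    '<br><SPAN class="verse" id="19">19</SPAN> Again I say unto you\n'
    '<br>'
)

print(repr(extract_verse(sample, 18)))             # marker not found in the raw HTML
print(repr(extract_verse(normalize(sample), 18)))  # found after normalizing
print(repr(extract_verse(normalize(sample), 19)))
```

Both replacements are global, so they would also change any verse text that happens to contain " >"; whether that can be ruled out is the same open question raised above.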


Edited February 4, 2014 by LaRetta
