
This topic is 4002 days old. Please don't post here. Open a new topic instead.


Posted

Hi

I am using the formula

 

 GetLayoutObjectAttribute ("page" ; "content" )

to get the page source from a web page 

 

 http://www.wordproject.org/bibles/kj/40/18.htm

 

The calculation gets only part of the source and puts it in a field.

But when I manually view the page source in Safari, I get the entire page of code.

 

 

I am trying to extract different bible verses in this case with the above url.

For example, I would like to get the verse 19 and bring it into a field.

The verse shows up in the page source but not in the GetLayoutObjectAttribute ("page" ; "content" ) calc.

 

 

How can I get the entire source of the page at the above URL by calculation?

I don't know what I am doing wrong. I need to do this calc a lot, and it is very important that I get the full page code. I need to extract different verses.

 

Thanks for your help here

Posted

Your question is not quite clear: the URL points to a chapter, not a verse. If I load the URL in a web viewer, I get the HTML code for the entire chapter 18, including verse 19:

...
<br><SPAN class="verse" id="19">19</SPAN> Again I say unto you, That if two of you shall agree on earth as touching any thing that they shall ask, it shall be done for them of my Father which is in heaven.
<br>
...

BTW, there's no shortage of sources for the text of the bible - I don't see why you need to resort to web scraping.

Posted

There wasn't a clear way to point to a URL with a chapter and verse. The website only lets me go to the chapter. I will have to do further processing to extract the verse or verses that I would like to scrape. My goal is to get the verses I want and load them into my database. But my question was basically: "What am I doing wrong that the chapter doesn't fully come up in my web viewer, but it does when I view the page source directly in my Safari browser?"

 

Having said that, I will ask again what I asked before

 

 http://www.wordproje...es/kj/40/18.htm

 

The verse shows up in the page source but not in the GetLayoutObjectAttribute ("page" ; "content" ) calc. (using the web viewer)

 

How can I get the entire source of the page at the above URL by calculation?

I don't know what I am doing wrong. I need to do this calc a lot, and it is very important that I get the full page code. I need to extract different verses. Even though there are many sources of Bible verses, I would like to know how to extract a particular verse or verses. But I am puzzled why the full code doesn't seem to load in my web viewer but does appear in the Safari browser separately. Unless I can get the full page source, I can't parse out the verse I am looking for.

 

I have a hard time getting a wifi signal because my house was destroyed in a flood. I have to travel a couple of miles to get a signal, so it is very hard to respond to a reply right away. Sorry about that. I always hope someone will answer while I am waiting at the hotspot so I can at least go home with an answer.

Thanks

 

NB: To further support my reason for asking this question, I have enclosed two files. One is the text I get in the field, set by calculation from the web viewer in FileMaker. The second is the page source, copied from the Safari browser by going to the page source and copying it.

The sources .zip

Posted

Where exactly are you using the GetLayoutObjectAttribute ( "page" ; "content" ) calculation? Is it in a calculation field? If so, is it unstored? What I see in the "from the web viewer .txt" file is that you are not even on the correct web page - so most likely, this is a stored calculation with nothing to trigger it to update.

 

IMHO, you should use a script to set the web viewer to the correct chapter, wait until it's loaded, then set a text field (or a variable) to the web viewer's content. That's assuming you need to do this at all; if you like, I will give you the text of the entire KJV in a FM-friendly format.
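A rough sketch of such a script, written as script steps, might look like the following. The web viewer object name "page" comes from the original calculation; the field Scrape::Source, the half-second pause, the 20-try limit, and the test for a closing </html> tag are all assumptions for illustration, not something confirmed in this thread.

```
# Hypothetical names: web viewer object "page", target field Scrape::Source
Set Variable [ $url ; Value: "http://www.wordproject.org/bibles/kj/40/18.htm" ]
Set Web Viewer [ Object Name: "page" ; URL: $url ]
Set Variable [ $tries ; Value: 0 ]
Loop
  Pause/Resume Script [ Duration (seconds): .5 ]
  Set Variable [ $tries ; Value: $tries + 1 ]
  Set Variable [ $html ; Value: GetLayoutObjectAttribute ( "page" ; "content" ) ]
  # Stop once the closing tag has arrived, or give up after about 10 seconds
  Exit Loop If [ PatternCount ( $html ; "</html>" ) or $tries >= 20 ]
End Loop
Set Field [ Scrape::Source ; $html ]
```

Note that a previously loaded page could satisfy the </html> test before the new page arrives, so checking the content for something specific to the requested chapter may be a safer exit condition.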

Posted

I thank you for your help

I don't know about stored and unstored. I use it as a Set Field calculation, so I can make a button to execute the script.

I need to do this in other languages, and that is why I am trying to make this happen.

Posted
I don't know about stored and unstored.   I use it as a set field calculation.

 

Going by your description alone, it should work - so there must be something else. Does the attached work for you?

 

Scrape.fp7.zip

 

BTW, your profile shows v.12: why don't you use the Insert From URL[] script step instead of going through a web viewer?
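For reference, a minimal version of that step might look like this. The field name Scrape::Source is hypothetical, and the sketch assumes the target field is present on the current layout.

```
# Hypothetical target field: Scrape::Source
Insert From URL [ Select ; No dialog ; Scrape::Source ; "http://www.wordproject.org/bibles/kj/40/18.htm" ]
```

Because the step fetches the page source directly, there is no web viewer to wait for.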

Posted

Sorry I couldn't get back to you. I had to wait to use the internet today. I didn't know I could do that (Insert From URL). Thanks.

Now I just have to figure out how to pick the range of verses I want and then get just those.

 

BTW, I need to do this because I am using these verses for animations, and they need to be in other languages. Getting a specific verse or several verses will make that process much easier for me, because I need to bring the other languages into Adobe After Effects. So your help is really valuable.

Thanks!

 

www.youtube.com/watch?v=nAAoLc7gw64 

 

That link is what I am doing. I need to capture verses or a range of verses in many languages.  Just a little thing I am doing.

 

That's why: I actually need the text in other languages and fonts. As soon as I can capture just those verses, I can really go to town on my project.

Posted

Thank you SO much. I can work to try to understand. But this is a very happy day for me 

There is much work to do.

Blessings to you

 

I don't understand why, but when I try the parsing in other languages, sometimes it doesn't work.

The code seemingly looks the same, yet there is no result. I have included a file with 7 languages; some work and some don't. I don't know why.

Scrape&Parse check.fmp12.zip

Posted

Thanks, I appreciate it. You have been most helpful. I will attempt it. Take care. I'll keep you posted. I will keep asking people who know parsing well. I have compared the files and I can't for the life of me figure out why some work and some don't.

Posted

You are getting pos 0 results on de, it and af and it is because you have an extra space.

 

<br><SPAN class="verse" id="18">18</SPAN>

 

In Michael's file, there are no spaces before the closing span but in your file on the ones that break, there is a space at the location of the red X and if you remove that bogus space it works.

 

<br><SPAN class="verse" id="18">18X</SPAN>

 

 

Posted

Oh, and on ru ... I do not know html but it produces a Pos yet it still breaks with this type of result:

&nbsp;Истинно говорю вам: что вы свяжете...

When reviewing, it is:

<br><SPAN class="verse" id="18">18</SPAN> &nbsp;Истинно...

None of the other examples or records include a &nbsp ... why is it here?  When I remove it the record seems to begin to work.

Posted

I didn't spot it either, Lee ... when wondering why something breaks, I take the Let() variables and place them one at a time into the calc (commenting out the real calc) so I can view each value and pinpoint exactly where it breaks.

 

In the offending records, the third variable gave it away - it produced pos of 0 which meant the string being tested did not exist or did not match.  Only then did I know to find the string below and look for differences and I STILL did not spot the extra space.  But Exact() comparisons failed and Length() was different so I placed my cursor between the chevrons and the characters and backspace-tested until I found the extra space.  
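To make that concrete, here is a sketch of the kind of Let() calculation being debugged. It is not the calculation from the attached file, which is not shown in this thread; html is a hypothetical field holding the page source, the variable names are made up, and the assumption that each verse ends at the next <br> is taken from the snippet quoted earlier.

```
// Hypothetical names: html holds the page source; verse is the verse number wanted.
Let ( [
  verse = 19 ;
  tag = "<SPAN class=\"verse\" id=\"" & verse & "\">" & verse & "</SPAN>" ;
  tagPos = Position ( html ; tag ; 1 ; 1 ) ;            // 0 means the tag was not found
  textStart = tagPos + Length ( tag ) ;
  textEnd = Position ( html ; "<br>" ; textStart ; 1 )  // the next line break ends the verse
] ;
  Case (
    tagPos = 0 or textEnd = 0 ; "" ;
    Trim ( Middle ( html ; textStart ; textEnd - textStart ) )
  )
)
```

To inspect a single step, temporarily replace the Case() result with one variable, for example tagPos. A stray space before </SPAN> in the page makes tag fail to match, so tagPos comes back as 0. Trim() removes only leading and trailing spaces, so a trailing line break, if present, would remain.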

 

I ADORE this kind of stuff; it's like a detective novel (but much better).  :yep:

Posted

Have you used Wrangler to compare two pages?

 

You can compare the two pages and it will denote any differences between them, as long as every line is in the same position.

 

Lee

Posted

Sure, for larger text comparisons.  However I knew there would be many differences in this text and since I knew exactly where the break occurred, I focused there.  And I have a tool I use which quickly allows comparison of strings and inserting calculations so I used it instead.  :-)

Posted

Thanks for all the hard work that's going on.

It's mysterious, and I thought it might be simple. This is an abnormal text parse, to say the least.

Posted (edited)

Well now you know how it breaks.  I should think it wouldn't be difficult to pre-process the html before/during  parsing. 

 

I would guess it would NEVER be okay to have a SPACE right before >.  Can we count on that?  If so this one would be safe:

 

Substitute ( html ; " >" ; ">" )

 

As for the &nbsp;, you might be able to use something similar: Substitute ( html ; "> &nbsp;" ; "> " ), but this is all WAG.
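Both clean-ups could be combined into one pre-processing pass, sketched below. This is untested against the real pages; html is a stand-in for the field or variable holding the page source, and the first pair targets the stray space reported earlier in the thread just before the closing </SPAN> tag.

```
// html stands in for the field or variable holding the page source (an assumption).
// Substitute is case-sensitive, so the tag is written as it appears in the page.
Substitute ( html ;
  [ " </SPAN>" ; "</SPAN>" ] ;   // drop the stray space before the closing tag
  [ "> &nbsp;" ; "> " ]          // drop the non-breaking space entity after the tag
)
```

The pairs are applied in order, and each pass removes only one space, so a page with two stray spaces in a row would need the first pair applied more than once.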

 


Edited by LaRetta
