Pardon my brute force learning methods :)  I'm using this from another thread here since it's pretty much what I'm trying to learn.

My intention here is more to verify what I've got already in my database against the book(s) in my possession. But if there's something missing may as well grab it.  (and FWIW, a lot of their data is hokey anyway, at least for books of 1980s computing, but that's for another forum/topic)

 

I've got some general questions on Set Variable. I'm just not getting what's going on here with the script steps, or why some of these work and some don't.

1) Text leading up to the value you want

2) Text after what you want.

3) Setting up the conditions?

4) .. if #1 & #2 are met, this starts? Otherwise, move along. This is not the text you are looking for.

5-6) ...  doing something with the matched text.. but I'm just not following what is happening here. I have some lines that work, others that don't and I'm not seeing why.

7) Stuff the result in the target field and move along. Got it.

It's probably that I have no idea what's going on after line 2.

 

This one for ISBN13 works. 

# ISBN13
1) Set Variable [ $prefix ; Value: "ISBN13</th> <td>" ]
2) Set Variable [ $suffix ; Value:  "</td>" ]
3) Set Variable [ $start ; Value: Position ( $text ; $prefix ; 1 ; 1 ) ]
4) If [ $start ]
5) Set Variable [ $result ; Value: Let ( [ start = $start + Length ( $prefix ) ; end = Position ( $text ; $suffix ; start ; 1 ) ] ; Middle ( $text ; start ; end - start ) ) ]
6) Set Variable [ $result ; Value: Trim( Substitute( TrimAll($result; 0;0 ) ; [Char(10); ""] ; ["'"; ""] )) ]
7) Set Field [ Table::Zip ; $result ]
8) End If

But when I change it slightly for the other ISBN field, I just can't seem to capture that field, no matter which of these scripts I try.

# ISBN10
Set Variable [ $prefix ; Value: "ISBN</td>" ] 
Set Variable [ $suffix ; Value:  "</td> </tr>" ] 
Set Variable [ $start ; Value: Position ( $text ; $prefix ; 1 ; 1 ) ] 
If [ $start ] 
Set Variable [ $result ; Value: Let ( [ start = $start + Length ( $prefix ) ; end = Position ( $text ; $suffix ; start ; 1 ) ] ; Middle ( $text ; start ; end - start ) ) ] 
Set Variable [ $result ; Value: Trim( Substitute( TrimAll($result; 0;0 ) ; [Char(10); ""] ; ["&#039;"; ""] )) ] 
Set Field [ Table::State ; $result ] 
End If
# 
# ISBN10 Alt
Set Variable [ $prefix ; Value: "<th>ISBN</td> <th>" ] 
Set Variable [ $suffix ; Value:  "</td>" ] 
Set Variable [ $start ; Value: Position ( $text ; $prefix ; 1 ; 1 ) ] 
If [ $start ] 
Set Variable [ $start ; Value: 1+ Position ( $text ; ">" ; $start + Length( $prefix); 1 ) ] 
Set Variable [ $result ; Value: Let ( [ start = $start ; end = Position ( $text ; $suffix ; start ; 1 ) ] ; Middle ( $text ; start ; end - start ) ) ] 
Set Variable [ $result ; Value: Trim( Substitute( TrimAll($result; 0;0 ) ; [Char(10); ""] ; ["&#039;"; ""] )) ] 
Set Field [ Table::State ; $result ] 
End If

Below is from Firefox. Why I can't see the same thing in the Content tab in FM, I have no idea. Voodoo probably.

 <!-- <img src="/sites/default/files/default-book-cover.jpg" style="height:250px; width:190px; background-color:#dddddd"/> -->
                <object height="250px" width="190px" data="https://images.isbndb.com/covers/74/35/9780201177435.jpg" type="image/png">
                 <img height="250px" width="190px" src="/modules/isbndb/img/default-book-cover.jpg" />
                </object>
            </div>
            <div class="book-table col-xs-12 col-md-6">
              <table class="table table-hover table-responsive ">
                                <tr> <th>Full Title</th> <td>Apple Iigs Hardware Reference</td> </tr>
                                                <tr> <th>ISBN</td> <th>0201177439</td> </tr>
                                                <tr> <th>ISBN13</th> <td>9780201177435</td> </tr>
                                                <tr> <th>List Price</th> <td>USD $24.95</td> </tr>
                                                <tr> <th>Publisher</th> <td><a href="/publisher/Longman Pub Group">Longman Pub Group</a></td> </tr>
                                                <tr> <th>Authors</th> <td>
                                    <a href="/author/Inc. Apple Computer">Inc. Apple Computer</a><br />
                                </td> </tr>
                                
                
                                <tr> <th>Edition</th> <td>1</td> </tr>
                                                <tr> <th>Publish Date</th> <td>1987</td> </tr>
                                                <tr> <th>Binding</th> <td>Hardcover</td> </tr>

 

Attached file: parsing and scraping MOD ISBN.fmp12.zip


First of all, web scraping is for the dogs. You are at the mercy of the web page's author, and even an addition of an insignificant space will break your code. 

Now, I took a quick look at your file. I see that the ISBNs are on these two lines:

                                                <tr> <th>ISBN </th><th>0830631291 </th></tr>
                                                <tr> <th>ISBN13</th> <td>9780830631292</td> </tr>

Your script says:

Set Variable [ $prefix ; Value: "ISBN</td>" ] 

but the ISBN 10 value is preceded by:

"ISBN </th><th>"

so that's already not working. I did not check the rest.
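As a rough sketch only, here is how a prefix and suffix matching the row quoted above could be tested in the data viewer. The `text` value is just that one row pasted in as a literal, and the variable names are illustrative; with the real page you would point `text` at the variable holding the source.

```
Let ( [
  // The ISBN row exactly as it appears in the source FileMaker receives
  text = "<tr> <th>ISBN </th><th>0830631291 </th></tr>" ;
  // Note the space before </th> and the <th> that opens the value cell
  prefix = "ISBN </th><th>" ;
  suffix = "</th>" ;
  start = Position ( text ; prefix ; 1 ; 1 ) + Length ( prefix ) ;
  end = Position ( text ; suffix ; start ; 1 )
] ;
  // Trim removes the trailing space that precedes the closing tag
  Trim ( Middle ( text ; start ; end - start ) )
)
```

With this sample row the expression should return 0830631291.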

Next you say that the code you get in Firefox is different. I verified that and it is true: the web site returns different content when the browser is Firefox (or Safari). Which brings me back to my first point.

So what exactly was your question?

 

Edited by comment

Yes, I know the evils of scraping and the mercy of the operator.. 

 

ISBN13 works.

The one for ISBN does not. I can use the exact same statement and change the prefix to match the ISBN13 and it works.

There are a couple of versions there. I had them enabled and disabled individually, and I left it like that for the copy/paste. They are different ways of trying it. No matter what, I can't seem to capture that field.

Firefox and FileMaker show slightly different page source for that table, why? I don't know. The other entries work straight up. That's what I'm not getting here. Actually understanding all that syntax in the Set Variable lines would probably be a good start. :)

 


            <div class="book-table col-xs-12 col-md-6">
              <table class="table table-hover table-responsive ">
                                <tbody><tr> <th>Full Title</th> <td>Apple Iigs Hardware Reference</td> </tr>
                                                <tr> <th>ISBN </th><th>0201177439 </th></tr>
                                                <tr> <th>ISBN13</th> <td>9780201177435</td> </tr>
                                                <tr> <th>List Price</th> <td>USD $24.95</td> </tr>
                                                <tr> <th>Publisher</th> <td><a href="/publisher/Longman Pub Group">Longman Pub Group</a></td> </tr>
                                                <tr> <th>Authors</th> <td>
                                    <a href="/author/Inc. Apple Computer">Inc. Apple Computer</a><br>
                                </td> </tr>
                                
                
                                <

 

8 minutes ago, Tony Diaz said:

Firefox and FileMaker show slightly different page source for that table, why?

It is not uncommon for web sites to return different content to different browsers. The most obvious example is mobile browsers (that are often redirected to an entirely different page), but there are other differences that a web site might want to take into account.

BTW, it is interesting to note that the code returned for Firefox and Safari is actually wrong. Both:

<th>ISBN</td>

and:

<th>0201177439</td>

have unmatched start and end tags.
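Since both versions of the page open the label cell and the value cell with <th>, one possible way around the mismatched closing tags is to ignore them altogether: find the label, jump to the next <th>, and read up to the next "<". This is only a sketch. It assumes the page source is in $text, that the ISBN row comes before the ISBN13 row (as it does in both samples posted here), and the names labelPos and cellPos are made up for illustration.

```
Let ( [
  text = $text ;
  // First cell whose content starts with "ISBN"; assumed to be the ISBN-10 row
  labelPos = Position ( text ; "<th>ISBN" ; 1 ; 1 ) ;
  // The next <th> after the label opens the value cell in both page versions
  cellPos = Position ( text ; "<th>" ; labelPos + 1 ; 1 ) ;
  start = cellPos + Length ( "<th>" ) ;
  // Stop at whatever tag closes the cell, </td> or </th>
  end = Position ( text ; "<" ; start ; 1 )
] ;
  // Return nothing unless every marker was found
  If ( labelPos and cellPos and end ; Trim ( Middle ( text ; start ; end - start ) ) )
)
```

Against either of the two ISBN rows quoted in this thread, this should yield 0201177439, but it is just as fragile as any other scraping if the site changes its markup again.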

 

7 minutes ago, Tony Diaz said:

Actually understanding all that syntax in the Set Variable lines would probably be a good start.

It's actually quite simple: first you find the position of the prefix; then you find the position of the suffix; and finally you extract the text between the end of the prefix and the start of the suffix. Load this into your data viewer and break it apart to see how it works:

Let ( [
text = "some text that contains an important message ending here." ; 
prefix = "important " ;
suffix = " ending" ; 
start = Position ( text ; prefix ; 1 ; 1 ) + Length ( prefix ) ;
end = Position ( text ; suffix ; start ; 1 )
] ;
Middle ( text ; start ; end - start )
)

This is also to show that you don't need all those Set Variable steps; it can all be done within a single Let() statement.
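One thing the single Let() does not do by itself is check that the prefix was actually found, which the original script handled with If [ $start ]. When Position() returns 0, the calculation would otherwise start extracting near the top of the text. A minimal sketch of a guarded version, assuming the page source is in $text as in the original script (the name found is illustrative):

```
Let ( [
  text = $text ;
  prefix = "ISBN13</th> <td>" ;
  suffix = "</td>" ;
  // Position returns 0 when the prefix does not occur in the text
  found = Position ( text ; prefix ; 1 ; 1 ) ;
  start = found + Length ( prefix ) ;
  end = Position ( text ; suffix ; start ; 1 )
] ;
  // Extract only when both markers exist; otherwise the result is empty
  If ( found and end ; Trim ( Middle ( text ; start ; end - start ) ) )
)
```

An empty result then plays the same role as skipping the If block in the script.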

 


So, that would be doing it as a function vs. a script.  Position ( table::field ; prefix..  How would it get at the web viewer data then?

(Another area I need to work on more, script, vs function vs calculation)

3 hours ago, comment said:

BTW, it is interesting to note that the code returned for Firefox and Safari is actually wrong. Both:

... and -that- explains it.  I was wondering why those tags were flipped around. I kept looking back and forth at that stuff. The source is easier to read in Firefox. All the other stuff comes across the same, just not that line. Otherwise it works now.

 

Edited by Tony Diaz

I would still keep it as a script, just reduce it to something like:

Set Variable [ $html ; GetLayoutObjectAttribute ( "myWebviewer" ; "content" ) ]
Set Field [ Books::Title ; Let ( ... ) ]
Set Field [ Books::ISBN10 ; Let ( ... ) ]
Set Field [ Books::ISBN13 ; Let ( ... ) ]
# ...

Or even better, eliminate the web viewer and use the Insert from URL script step to populate the $html variable. Just check which version you get from this call.
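To make that concrete, here is a rough sketch of what such a script might look like for one field. It assumes a FileMaker version in which Insert from URL can target a variable, and that $url already holds the address of the book page (it is a placeholder, not something taken from your file). The way the step options are displayed may differ between versions, and the prefix shown is the one that matched the ISBN13 row in the source posted above.

```
# Assumption: $url was set earlier to the page address for this book
Insert from URL [ Select ; With dialog: Off ; Target: $html ; $url ]
If [ Get ( LastError ) = 0 ]
    # One Set Field per value, each with its own prefix and suffix
    Set Field [ Books::ISBN13 ;
        Let ( [
            prefix = "ISBN13</th> <td>" ;
            suffix = "</td>" ;
            found = Position ( $html ; prefix ; 1 ; 1 ) ;
            start = found + Length ( prefix ) ;
            end = Position ( $html ; suffix ; start ; 1 )
        ] ;
            If ( found and end ; Trim ( Middle ( $html ; start ; end - start ) ) )
        ) ]
End If
```

Since the site serves different markup to different clients, it is worth looking at $html in the data viewer once before settling on the prefixes.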

 


Ah, I see now. The Let part is being done as one of the parameters, so you use the calculation editor to put that in, just like I'm doing the simple bits of prefix/suffix now, but use the actual editor instead of the dialog box. D'oh!

I see now. :) a lot less voodoo this way too.
