
This topic is 5732 days old. Please don't post here. Open a new topic instead.


Posted (edited)

I have exported the HTML contents of a web viewer as text to a text field on another layout.

I am trying to write a script that goes through the text and extracts the URLs from it.

Every URL in the text begins with http and is enclosed in double quotes (" ").

I am running a looping script that starts at the first word. If the word's first 4 characters are http (Left ( $thisword ; 4 ) = "http"), I set the current word plus the next 20 words to another variable, which is written to another field.

The current word then increments by 1, and the loop stops when the current word is the last word.

This is ugly! I wonder if anyone can help me get the exact text of the URL contained between the " " separators.

Any help much appreciated!

Basically, how do I exit a loop when the current word is followed by a " separator?

Posted

Roughly:

Loop
  Set Variable [ $i ; $i + 1 ]
  Exit Loop If [ $i > PatternCount ( text ; "http" ) ]
  Set Variable [ $url ; <calculation> ]
  Perform Script [ New URL Record ; parameter: $url ]
End Loop

and the <calculation> would be:

Let ( [
  // position of the $i-th occurrence of "http"
  start = Position ( text ; "http" ; 1 ; $i ) ;
  // position of the first double quote after that
  end = Position ( text ; "\"" ; start ; 1 )
] ;
  Middle ( text ; start ; end - start )
)
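
For anyone who wants to check the logic outside FileMaker, here is a rough Python sketch of the same idea: find each "http", find the next double quote after it, and take the text in between. The function name extract_urls and the sample HTML are made up for illustration. One difference from the calculation above: the search resumes after the closing quote, so an "http" that appears inside a URL is not counted twice.

```python
def extract_urls(text: str) -> list[str]:
    """Collect every substring that starts at "http" and ends before the next double quote."""
    urls = []
    start = text.find("http")
    while start != -1:
        end = text.find('"', start)
        if end == -1:
            break  # no closing quote, so stop rather than guess where the URL ends
        urls.append(text[start:end])
        # Resume searching after the closing quote.
        start = text.find("http", end)
    return urls


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the original poster's data.
    sample = '<a href="http://example.com/a">A</a> <img src="https://example.com/b.png">'
    for url in extract_urls(sample):
        print(url)
```

This assumes, as in the original question, that every URL is closed by a double quote; URLs in single quotes or unquoted attributes would need different handling.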

  • 1 month later...
Posted

A good script for extracting URLs from HTML, ASP, PHP, plain-text, and similar documents is posted at http://www.biterscripting.com/SS_URLs.html.

To use, do the following. (With high speed internet, this entire process, including installation, should take no more than a couple of minutes.)

1. Download and install biterscripting from http://www.biterscripting.com.

2. Start biterscripting and enter the following command.

script "http://www.biterscripting.com/Download/SS_AllSamples.txt"

(biterscripting can execute scripts directly from a web site)

3. Now you are ready to use the SS_URLs script to extract URLs. This is done with the following command.

script "C:/Scripts/SS_URLs.txt" URL("http://....")

The above will extract the URLs referenced in that web page. Or, for a local file:

script "C:/Scripts/SS_URLs.txt" URL("C:/....")

The above will extract the URLs referenced in that local file.

Hope this helps.

Patrick
