FMForums.com

Is it possible to parse the contents of a PDF document? Alternatively, is it possible to 'grab' data from a website when that data is displayed from another database (i.e. not as HTML)? Obviously I could simply cut and paste the info into a text file and then parse it from there, but I wondered if anyone had any other ideas.

Thanks in advance.

I don't know much about PDF. There are applications that deal with PDFs that are AppleScriptable; PDFOpen comes to mind. There are others that are dedicated to getting the contents as text, such as TextLightning and Trapeze (I don't know if they're AppleScriptable). There are probably other geekier options also.

Reading from a web page is easy. You can use AppleScript and Safari.

tell application "Safari"
	source of document 1
	-- or (don't use both :P-)
	text of document 1
end tell

Or you can use the Unix command line to get the source:

do shell script "curl 'http://www.fentonjones.com'" without altering line endings

You'd use that if you wanted to continue parsing the source text with further Unix commands, such as 'grep', 'cut', and 'sed', all of which are confusing to use but powerful. To read the manual for one of them, open Terminal and type:

man grep

Usually it's more useful to get the entire source code, rather than just the "text" of the web page, because the source has all the HTML tags, which you might need to identify exactly what you want to extract.
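As a rough sketch of what that kind of parsing can look like, here is a small shell example that uses the tags in the source to pick out values with 'grep', 'sed', and 'cut'. The sample HTML, the class names, and the variable names are all made up for illustration; in practice the text would come from the curl command above, and the patterns would have to match whatever the real page contains.

```shell
#!/bin/sh
# Hypothetical sample standing in for the output of curl.
# The page layout and class names here are assumptions, not a real site.
html='<html>
<head><title>Example Page</title></head>
<body>
<table>
<tr><td class="name">Widget</td><td class="price">12.50</td></tr>
<tr><td class="name">Gadget</td><td class="price">7.25</td></tr>
</table>
</body>
</html>'

# grep keeps only the line containing the <title> tag;
# sed then keeps just the text between the opening and closing tags.
title=$(printf '%s\n' "$html" | grep '<title>' | sed 's/.*<title>\(.*\)<\/title>.*/\1/')

# grep keeps the table rows that have a "name" cell;
# sed replaces every tag with a | delimiter;
# cut picks the 3rd and 5th delimited fields (the name and the price).
rows=$(printf '%s\n' "$html" | grep 'class="name"' | sed 's/<[^>]*>/|/g' | cut -d'|' -f3,5)

printf '%s\n' "$title"   # prints: Example Page
printf '%s\n' "$rows"    # prints: Widget|12.50 then Gadget|7.25
```

This only works because the tags tell you where each value starts and ends, which is the reason for fetching the source rather than the plain text. Line-based tools like these are fragile on real-world HTML (for example, when a tag spans several lines), so treat this as a starting point rather than a general parser.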

  • Author

In this case the text of the document was what I needed: the web page displays info pulled from another database, so that info doesn't show up in the 'source'. It worked perfectly, and a small script cleared out the info I don't need and parsed the rest into the relevant fields.

Many thanks once again!
