This topic is 4476 days old. Please don't post here. Open a new topic instead.

  • Newbies
Posted

I am pulling down the HTML text into a file and I want to pull out the links.

I saw that image URLs can be retrieved.

My end goal:

Monitor member sites for changes daily.

When a change happens, notify the other members in their network so that the change can also be noted on their sites,

i.e. with a link and teaser text.

Rob

Posted

I'd suggest using an HTML parser such as jsoup. You can then retrieve the list of image links easily, e.g.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

String results = "";
String url = "http://www.amazon.com";

// Fetch and parse the page
Document doc = Jsoup.connect(url).get();

// Select every element that has a src attribute
Elements media = doc.select("[src]");

for (Element src : media) {
    // Keep only images; abs:src resolves the link to an absolute URL
    if (src.tagName().equals("img")) {
        results = results + src.tagName() + ":" + src.attr("abs:src") + "\n";
    }
}

return results;

Posted

@fseipel - thanks for the pointer, that looks a great library.

@rob pritts - do you want to learn how to take advantage of the pointer, or just have someone give you a solution to a workflow that so far is only outlined as a sketch?

If the former:

Go and find jsoup and download the jar.

Read the documentation, particularly the cookbook examples, and you will find that fseipel has already simplified it for you for the case you outlined.

Import the jar into your SM demo file and create a function using the code.

Test it with some real URLs.

Make it a registered function following one of the methods outlined by 360Works.

Integrate the results into a FileMaker workflow to achieve the outcome you outlined.

As a side point, can you explain how you intend to decide whether an image is 'new' if it is uploaded with the same name as one from yesterday?

  • Newbies
Posted

I would love to have someone do it at some point, but I also need to learn.

I am looking for content changes, but I saw that images could be retrieved with SM, so I was pointing that out.

Thanks

Posted

@john: Thanks for fleshing out the code example.

In the past I've also used FileMaker's string functions to parse HTML, but the Java libraries seem like a potentially better choice: less likely to break, and more readable and maintainable.

I don't see how the OP will be able to tell if an image has changed, short of downloading it and doing a byte comparison against the last downloaded copy, or at least a file byte length comparison (less reliable). If it has to do that over a large number of users/pages, it may be quite slow and a bandwidth hog. If the site(s) offer web services, that may be a much better alternative for data acquisition. I interact a lot with amazon.com, and I always use the web services, except for data which isn't provided by the web services. Screen scraping, in contrast, is slower, and more prone to breaking.
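As a rough illustration of the byte comparison described above, one option is to store a digest of each download instead of the full previous copy, and compare digests on the next run. The sketch below assumes SHA-256 is acceptable for this purpose; the class and method names (ChangeDetector, sha256Hex, hasChanged) are hypothetical and not part of jsoup or ScriptMaster. It still requires downloading the content each time, so the bandwidth concern remains.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper: detects changes by comparing content digests.
final class ChangeDetector {

    // Returns the SHA-256 digest of the content as a lowercase hex string.
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // True if there is no stored hash yet, or the content no longer matches it.
    static boolean hasChanged(String previousHash, byte[] currentContent) {
        return previousHash == null || !previousHash.equals(sha256Hex(currentContent));
    }
}
```

In a daily check, the hex string from sha256Hex could be stored in a field next to each URL, and hasChanged called with the stored value and the freshly downloaded bytes. Unlike a length comparison, this also catches a replacement file that happens to be the same size.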
