
This topic is 5797 days old. Please don't post here. Open a new topic instead.


Posted

Hi,

Does anyone know of a good freeware tool that can convert XHTML tags and content to an XML file?

Thanks

Posted

I know, but I want to extract some data from an XHTML source so I can import it into FileMaker.

Posted (edited)

Yes, we can and no, I don't. There are a few examples in the Extras folder next to your application, and more on FMI's site.

And of course there are a number of threads here, either in this or the Export section.

Edited by Guest

Posted

I attached a source file example.

I want to extract the content of the rows where the tr tag has the attribute class="txt_general".

I just want to know where to start.

In this example, the primary key is 5176748. You can locate the right tag with the ID.
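As a possible starting point, here is a minimal Python sketch that pulls the cell text out of every tr with class="txt_general". It uses the standard library's html.parser, which tolerates markup that is not well-formed XML. The sample markup and the names RowExtractor and extract_rows are made up for illustration and are not taken from the attached file; nested tables are not handled.

```python
from html.parser import HTMLParser


class RowExtractor(HTMLParser):
    """Collect the cell text of every <tr> carrying a given class."""

    def __init__(self, row_class="txt_general"):
        super().__init__(convert_charrefs=True)
        self.row_class = row_class
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def _flush_cell(self):
        # Store the current cell (whitespace collapsed) in the current row.
        if self._cell is not None and self._row is not None:
            self._row.append(" ".join("".join(self._cell).split()))
        self._cell = None

    def _flush_row(self):
        self._flush_cell()
        if self._row is not None:
            self.rows.append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._flush_row()  # copes with a missing </tr>
            classes = (dict(attrs).get("class") or "").split()
            self._row = [] if self.row_class in classes else None
        elif tag in ("td", "th") and self._row is not None:
            self._flush_cell()  # copes with a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._flush_cell()
        elif tag == "tr":
            self._flush_row()


def extract_rows(html_text, row_class="txt_general"):
    parser = RowExtractor(row_class)
    parser.feed(html_text)
    parser.close()
    parser._flush_row()  # in case the input ended inside a row
    return parser.rows


# Hypothetical sample, only loosely modelled on the description above.
sample = """
<table>
  <tr class="header"><td>Adresse</td><td>Ville</td></tr>
  <tr class="txt_general"><td>12 rue Principale</td><td>Laval</td></tr>
</table>
"""

if __name__ == "__main__":
    for row in extract_rows(sample):
        print(row)
```

The resulting lists could then be written out as a tab- or comma-separated file for import, which sidesteps the XML compliance question entirely.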

srcData.zip

Posted

I don't really know what you expect. I took a look at your document: although I was able to extract data from it using an external XSLT processor, FileMaker refuses to import it directly for some reason. I suspect it may be because the encoding is declared incorrectly.

Posted

It's an OS X application called TestXSLT, and it has a choice of 4 processors: Sablotron, libxslt, Saxon and Xalan-J. Mind you, it doesn't do anything by itself - I had to write a stylesheet for it to run against your file.

Posted

Can't hurt - but I believe you're going to have problems with your source. It's not strict XHTML (it doesn't even have an XML declaration), and as I said, I suspect it has other issues as well. If you need to do this often, you should look for a better alternative.

Posted

I had to modify the source because two tags weren't standard.

Maybe I'll do something in C# with regular expressions to extract the data.

Posted

Yes, I need to extract some data from a search result. I will need to do this on a regular basis. There might be as many as 200 rows to extract from the page.

I thought it would be easier using an XML+XSLT parser.

Posted

If you have any control over the server side, that would be the best place to make changes. Otherwise, you might be able to script the extraction in a web viewer, I think (you never said what exactly you need to extract, and I think there's only one record in your sample).

If external pre-processing is an option, I believe you could run the file against an XSLT stylesheet from the command line, or use a third-party app (I don't know what's available for Windows).
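For illustration only, a command-line run could be scripted roughly like this in Python, assuming libxslt's xsltproc tool is installed and on the PATH. The file names extract.xsl, page.html and rows.xml are placeholders, and the stylesheet itself still has to be written separately. The --html switch asks xsltproc to read the input with its forgiving HTML parser, which may help with a source that is not well-formed.

```python
import shutil
import subprocess


def build_xslt_command(stylesheet, source, output, html_input=False):
    """Return the xsltproc argument list for one transformation."""
    cmd = ["xsltproc", "--output", output]
    if html_input:
        # Parse the input as HTML rather than as strict XML.
        cmd.append("--html")
    cmd += [stylesheet, source]
    return cmd


def run_xslt(stylesheet, source, output, html_input=False):
    """Run the transformation; raises if xsltproc is missing or fails."""
    if shutil.which("xsltproc") is None:
        raise RuntimeError("xsltproc was not found on the PATH")
    subprocess.run(
        build_xslt_command(stylesheet, source, output, html_input),
        check=True,
    )


if __name__ == "__main__":
    # Placeholder file names; nothing is executed here, the command is only shown.
    print(" ".join(build_xslt_command("extract.xsl", "page.html", "rows.xml", True)))
```

Calling run_xslt with real file names would produce the output file, which could then be offered to the import.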

Posted

I attached a file with multiple rows.

I want to extract the data in the rows, e.g. address, city, price, ...

This is an original source; there are two errors that make the file not XML-compliant.

btw :P It's in French.

srcExample.txt

Posted

Please zip your file before attaching, so we can eliminate a possible cause of the encoding mismatch. The file says "charset=iso-8859-1", but it is really UTF-8.
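If the mismatch turns out to be in the page itself, one way to work around it before any further processing is to ignore the declared charset whenever the bytes are valid UTF-8. This is a rough Python sketch under that assumption; decode_page and fix_declared_charset are hypothetical helper names, and the sample bytes are invented.

```python
def decode_page(raw, declared="iso-8859-1"):
    """Return (text, encoding_used), preferring UTF-8 when the bytes allow it."""
    try:
        # Text with accented characters saved as ISO-8859-1 is almost never
        # valid UTF-8, so a clean decode is strong evidence of UTF-8.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return raw.decode(declared), declared


def fix_declared_charset(text, declared="iso-8859-1", actual="utf-8"):
    """Rewrite the charset declaration so it matches the real encoding."""
    return text.replace("charset=" + declared, "charset=" + actual)


if __name__ == "__main__":
    # Invented sample: UTF-8 bytes that claim to be ISO-8859-1.
    raw = 'content="text/html; charset=iso-8859-1" Montr\u00e9al'.encode("utf-8")
    text, used = decode_page(raw)
    if used == "utf-8":
        text = fix_declared_charset(text)
    print(used, text)
```

Pure ASCII input decodes the same way under either encoding, so the check only matters once accented characters appear, as they would in a French page.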

Posted

I know, but I have no control over the server side. Okay, I'll zip the file next time.
