HTML export text encoding


ccosner

This topic is 5637 days old. Please don't post here. Open a new topic instead.

Issue: In FM6 I exported data to an HTML table and uploaded it to a web server, which parses the data into an SQL database. In FM8 this works fine too, except that diacritics are suddenly encoded differently. The format I was getting from FM6 was very handy because diacritics were encoded, for example, as follows:

ñ = soft n

which required no translation to load cleanly in a web page on pretty much any browser. Now in FM8 I've tried Windows ANSI, Macintosh, and Western European, and none of them give me the encodings for diacritics that FM6 provided in its HTML export. Instead I get codes like

226

often two in a row (presumably Unicode control characters, but I'm not well versed in text encodings). These display erratically or not at all in a browser, and em dashes and quotation marks are also mangled.

So I'm thinking of either revamping this to work with an XML export, where I can hopefully tweak the character encoding easily enough, *OR* writing a Perl script with an array of all the backslash codes to change the output to what I consider better HTML.
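As a rough sketch of the Perl-script option, a lookup array may not be needed: once the text is decoded, every non-ASCII character can be turned into a numeric HTML character reference in one substitution. The helper name `to_html_entities` is hypothetical, and the sample assumes the exported bytes are UTF-8, which the thread does not confirm; change the encoding name if the export turns out to be something else.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: replace every non-ASCII character with a numeric
# HTML character reference, leaving plain ASCII untouched.
sub to_html_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# Assumption: the exported bytes are UTF-8 (not confirmed in this thread).
my $bytes = "Espa\xC3\xB1a";              # "España" as UTF-8 bytes
my $text  = decode('UTF-8', $bytes);      # now a Perl character string
print to_html_entities($text), "\n";      # prints Espa&#241;a
```

Numeric references such as `&#241;` render the same way regardless of the charset the page declares, which is the property the FM6 output had.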

Are there some other ideas out there for how to approach this? Thanks.


Hi, been doing some exports to web sites recently and found this useful...

1. Create a calculation field - GetAsCSS(contentfield). This will also maintain the text formatting when you display it on the web.

2. Try setting up an ODBC connection to your web site. I'm using MySQL on the web servers and it works really well. Just need to click a button to do an update


Thanks Matt. Using ODBC does seem like a better idea than the system I have. GetAsCSS( ) - right. The overkill of tags turned me off to that function at first, but I realize they're harmless, can be put to good use, and even customized with some substitutions.

Now I have to get the ODBC connection to work...


Okay, so after learning that only FMS 8 Advanced will serve up ODBC, I went back to massaging the data and found a solution without ODBC. FM8's HTML export with the Windows ANSI setting does produce ISO-8859-1 (Latin-1), and except for smart quotes, em dashes, and perhaps some other characters we're not using, this text can go right to a browser if the web page declares its charset as such. So I had to make sure the SQL db was storing data in Latin-1 (not ASCII).

The Perl script to get the FileMaker data into the SQL database was straightforward, except for one gotcha. Perl works in UTF-8 internally and handily mangles diacritics given to it in other character sets unless you tell it the input encoding.

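# Read the export as ISO-8859-1 so Perl decodes the diacritics correctly
open(my $fh, "<:encoding(iso-8859-1)", $file) or die "Cannot open $file: $!";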


gets the data in cleanly. You do transformations on the text to clean up quotes and em dashes the way you want, but then Perl has it in UTF-8 again. So right before you put it in your SQL query,


use Encode;    # provides encode()
$string = encode("iso-8859-1", $string);

And voila, no oddball characters. Diacritics look good.
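Put together, the read, clean up, and re-encode steps might look like the sketch below. The `%punct` table and the `clean_line` helper are hypothetical, as are the ASCII substitutes chosen for the quotes and dashes. It assumes the smart quotes and dashes arrive as Windows-1252 bytes 0x91 to 0x97, which decode to control code points when read as ISO-8859-1. An in-memory string stands in for the exported file so the example runs on its own.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical mapping: code points where Windows-1252 punctuation lands
# when the bytes are decoded as ISO-8859-1, and plain substitutes for them.
my %punct = (
    "\x{91}" => "'",  "\x{92}" => "'",     # single smart quotes
    "\x{93}" => '"',  "\x{94}" => '"',     # double smart quotes
    "\x{96}" => '-',  "\x{97}" => '--',    # en dash, em dash
);

# Clean up punctuation, then turn the character string back into
# ISO-8859-1 bytes for the database.
sub clean_line {
    my ($line) = @_;
    $line =~ s/([\x{91}-\x{94}\x{96}\x{97}])/$punct{$1}/g;
    return encode('iso-8859-1', $line);
}

# Stand-in for the exported file: n-tilde (0xF1), an em dash byte (0x97),
# and e-acute (0xE9).
my $bytes = "ma\xF1ana \x97 caf\xE9\n";
open(my $fh, '<:encoding(iso-8859-1)', \$bytes) or die "Cannot open: $!";
while (my $line = <$fh>) {
    chomp $line;
    my $latin1 = clean_line($line);    # bytes ready to go into the SQL query
    print "$latin1\n";
}
close $fh;
```

By default `encode` substitutes a replacement character for anything that has no ISO-8859-1 equivalent, so any cleanup of characters outside Latin-1 needs to happen before that call.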
