Steve G Posted February 7, 2010

I built a script that takes data from a number of different fields, converts it into simple HTML, pastes the aggregate information into a single text field, and then exports that text field to a file on the desktop. It all works quite smoothly. However, there's a problem and I don't know where it stems from. If I open the exported file (which has an .html extension, btw) in my web browser it renders correctly. If I upload the exported file to the web site and then open it, I get garbled Chinese characters along with a few random strings of Roman text. IOW: if I open the file locally it renders correctly, but if I open it remotely it doesn't.

In experimenting I discovered that if I open the file in a text editor (TextWrangler in this case), select all, make a new document, paste, and save the new document, the resulting file renders properly both locally and remotely. Further investigation shows that the file exported from FileMaker is almost twice as large as the file saved from TextWrangler (29,972 bytes vs. 15,009 bytes). In either case TextWrangler says that the two files contain the same number (14,985) of characters.

I think what's happening is FileMaker is exporting the field as a Rich Text file, but I see no way to verify this (or change it if I'm right). This would account for the doubled file size; one is RTF, the other is ASCII. How can I get FileMaker to export my text field as plain old ASCII?
comment Posted February 7, 2010

When you export field contents, the file is UTF-16 encoded. A better way to export to HTML is through XML - see the example files included with the application.
bruceR Posted February 7, 2010

The "Export Field Contents" script step suffers from a couple of problems. The encoding of the result is not documented in Help; it is UTF-16 little endian. Further, this step does not give us encoding options like other export script steps.

Better to select a single record and use a standard tab-delimited export to export just this field. Of course this won't allow you to export any tabs that are contained in the field that you may wish to retain.
bruceR Posted February 7, 2010

Another option is to use the existing command and then use a shell script to do the character translation. If you are familiar with Terminal, do man iconv. Partial result:

ICONV(1)                 Linux Programmer's Manual                 ICONV(1)

NAME
       iconv - character set conversion

SYNOPSIS
       iconv [OPTION...] [-f encoding] [-t encoding] [inputfile ...]
       iconv -l

DESCRIPTION
       The iconv program converts text from one encoding to another encoding. More precisely, it converts from the encoding given for the -f option to the encoding given for the -t option. Either of these encodings defaults to the encoding of the current locale. All the inputfiles are read and converted in turn; if no inputfile is given, the standard input is used. The converted text is printed to standard output.

       The encodings permitted are system dependent. For the libiconv implementation, they are listed in the iconv_open(3) manual page.
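For example, a command along these lines (the file names are just placeholders, and it assumes the exported file really is UTF-16 little endian) would turn the exported file into plain UTF-8:

iconv -f UTF-16LE -t UTF-8 exported.html > exported-utf8.html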
bruceR Posted February 7, 2010

See attached for a simple shell export method; the result is UTF-8. Note that there are limits to the amount of data that can be passed this way. The limit is about 258,000 characters.

ShellExport.fp7.zip
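The general idea, roughly sketched (not necessarily exactly what the attached file does): have the FileMaker script build a shell command with the quoted field contents in it and run it through a "do shell script" call, something like

printf '%s' "$text" > "$HOME/Desktop/export.html"

where $text stands in for the quoted field contents. The character limit presumably comes from how much text can be passed on the command line.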
Steve G Posted February 8, 2010

Better to select a single record and use a standard tab-delimited export to export just this field. Of course this won't allow you to export any tabs that are contained in the field that you may wish to retain.

There are no tabs in the field so that's not a problem. There ARE line breaks (RETURNs), but since I'm going to HTML that shouldn't be a problem.

I tried your advice; I changed the script step from "Export Field Contents" to "Export Records" using the tab-delimited format. This time when I opened the file in my browser it was oddly distorted. Most of my text was there, but every apostrophe and quote mark was turned into a symbol that looks like a white question mark inside a black diamond. Plus, my CSS information was disabled/ignored. When I tried to open the file in TextWrangler I got an alert: "The UTF-8 file...is damaged or incorrectly formed; please proceed with caution." I did a search and replace to swap the question mark-diamond for an apostrophe and resaved. The apostrophes appeared as you'd expect, but my CSS info is still MIA.
Steve G Posted February 8, 2010

See attached for a simple shell export method; the result is UTF-8. Note that there are limits to the amount of data that can be passed this way. The limit is about 258,000 characters.

The limit shouldn't be a problem; the most I could ever envision passing is about 40,000 characters, maybe 50,000 at most. The average is around 15,000-20,000.

I'll admit I haven't done much work with shell commands. I opened your file and ran it, and got two back-to-back errors: "Object not found" followed by "Unknown error: -1728". Looking at your script, the Set Variable step for $sourceField shows a missing function. Forgive my ignorance, but is there something I need to insert there?
Steve G Posted February 8, 2010

After some Googling, it appears that the cause of my woes is something called a "vertical tab" character, which FileMaker uses to replace a CR in any given field. If I export the record (as opposed to the field), open it in TextWrangler, S&R the vertical tab for a CR, then save, it comes out perfectly. There doesn't appear to be a way to manage this inside of FileMaker, so now I just need a way to do it outside. Bruce, do you have a script?

I was able to do it manually in TextWrangler by copying the space between what I knew would be two lines and pasting it into the search field, then using "\r" in the replace field, which is what TextWrangler interprets as a CR. TextWrangler seems to interpret the vertical tab as "\x0B".
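From what I can tell (I haven't tried it myself), a Terminal one-liner along these lines ought to do the same swap outside of TextWrangler - the file names are just placeholders:

tr '\v' '\r' < export.tab > export.html

where '\v' is the vertical tab (0x0B) and '\r' is a CR.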
bruceR Posted February 8, 2010

You might have been too fast. I did some testing and uploaded later versions that should work properly.

Regarding tab export, please note that there you DO have export encoding choices, including ASCII.
Steve G Posted February 8, 2010

Yes, but the ASCII export substitutes those vertical tabs for the CRs that are in the field, so that's unusable without some kind of post-processing. In other words: exporting the field contents gives me the UTF-16 file, and exporting the record gives me vertical tabs. Either way, the resulting file does not render properly despite having otherwise-correct HTML in it.

BTW, I redownloaded your script and I'm getting the same errors and the same missing object that I got the first time.

It really does blow my mind that there's no way to do a straight-up, 100% pure plain-text export from FileMaker.
bruceR Posted February 8, 2010

The export script as currently written requires FileMaker 10; it uses the GetFieldName function in the first line. You can rewrite it to "hard wire" the field names.

Why don't you do a substitution to change the returns to <br> (or whatever would meet your needs)?
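If you'd rather fix it after export instead, something along these lines (an untested sketch; the file names are just placeholders) would turn the vertical-tab placeholders in a tab-delimited export straight into <br> tags:

perl -pe 's/\x0B/<br>/g' export.tab > export.html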