Carl Smith Posted November 21, 2007 After creating a stylesheet that works (with a great deal of help from Fenton), I am now having issues when the XML file to be imported contains (or is supposed to contain) non-English characters (e.g. umlauts, Scandinavian characters, etc.). I've attached my stylesheet, which works for English text. Any suggestions? WebOrdersPlaced.zip
comment Posted November 21, 2007 I'd suggest you describe what those "issues" are, and attach an example of the source XML document. Most likely the source is not using Unicode text encoding.
Carl Smith Posted November 22, 2007 Author Here is the source XML file. Curiously, when viewing it in WordPad the umlaut in the surname is visible, but when opening it in Dreamweaver 2004 the umlaut is not visible, which is even more confusing. Also, when I change the first row of the XML file from <?xml version="1.0" encoding="utf-8"?> to <?xml version="1.0"?> the issue seems to be resolved. My source file is an edited version of what I usually receive from a third party (i.e. the data has been changed but the format is original).
comment Posted November 22, 2007 I don't see a file, but what you said more or less supports my initial suspicion. It seems that while the document declares "utf-8", it doesn't actually conform to it. You should alert your data provider to this problem. Otherwise you'll have to either modify the source XML every time before importing it, or find a way to transliterate the problematic characters during or after import.
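One way to test the suspicion above before importing is to check whether the file's bytes really are valid UTF-8. The following is only a minimal Python sketch, not something used in this thread: the function names are hypothetical, and the fallback encoding `cp1252` is an assumption (WordPad displaying the umlaut hints at a Windows code page, but nothing here confirms it).

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def reencode_to_utf8(data: bytes, assumed_encoding: str = "cp1252") -> bytes:
    """Return UTF-8 bytes, re-encoding from an ASSUMED source encoding.

    If the input is already valid UTF-8 it is returned unchanged.
    Raises UnicodeDecodeError if the bytes do not fit the assumed encoding.
    """
    if is_valid_utf8(data):
        return data
    return data.decode(assumed_encoding).encode("utf-8")


# Example: a surname saved in a Windows code page but labelled as UTF-8.
mislabelled = "Müller".encode("cp1252")
print(is_valid_utf8(mislabelled))        # the lone 0xFC byte is not valid UTF-8
print(reencode_to_utf8(mislabelled))     # same text, now as UTF-8 bytes
```

If the check fails for the provider's files, that is concrete evidence to send back to them; the re-encoding step is only a stopgap and is only as good as the guess about the real encoding.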
Carl Smith Posted May 7, 2008 Author Hi comment, I know it has been some time since you answered my query, but I'm hoping you can look at this XML file, which I should have sent earlier, in reference to this subject. I asked the third party to remove the encoding declaration. The problem is becoming more prevalent as more orders come in from Europe and elsewhere, and ideally I need to automate the import process. Thank you in advance for your assistance. orders.zip
comment Posted May 7, 2008 AFAICT, the problem is caused not by any specific characters, but rather by undefined character entities, such as "auml" or "aring". These are not predefined XML entities, and whoever uses them must make their definitions available to the XML parser. See: http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Entities_representing_special_characters_in_XHTML
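If the provider cannot supply the entity definitions, one possible workaround is to rewrite the HTML-style named entities as numeric character references before the import, since numeric references need no definitions. The sketch below is an illustration in Python, not part of the original stylesheet: the function name is hypothetical, it relies on the standard library's `html.entities.name2codepoint` table, and it treats the file as plain text, so it would also rewrite matches inside CDATA sections or comments.

```python
import re
from html.entities import name2codepoint

# The five entities every XML parser already understands; leave them alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

# Matches named references such as &auml; or &aring;
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def html_entities_to_numeric(xml_text: str) -> str:
    """Replace HTML named entities with numeric character references."""
    def replace(match: re.Match) -> str:
        name = match.group(1)
        # Keep XML's own entities and anything we do not recognise.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return f"&#{name2codepoint[name]};"

    return _ENTITY.sub(replace, xml_text)


sample = "<name>M&auml;kel&aring; &amp; Co</name>"
print(html_entities_to_numeric(sample))
# <name>M&#228;kel&#229; &amp; Co</name>
```

Unknown names are deliberately left untouched so that a genuinely undefined entity still surfaces as a parser error rather than being silently dropped.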