
Text characters causing import errors


This topic is 6044 days old. Please don't post here. Open a new topic instead.

Posted

After creating a stylesheet that works (with great help from Fenton), I am now having issues when the xml file to be imported contains (or is supposed to contain) non-English text characters (e.g. umlauts, Scandinavian characters, etc.).

I've attached my stylesheet (that works for English text!)

Any suggestions?

WebOrdersPlaced.zip

Posted

I'd suggest you describe what those "issues" are, and attach an example of the source xml document.

Most likely the source is not using Unicode text encoding.

Posted

Here is the source xml file.

Curiously, when viewing this in WordPad the umlaut in the surname is visible, but when opening it in Dreamweaver 2004 the umlaut is not visible. Which is more confusing!

Also, when amending the first row of the xml file from:

<?xml version="1.0" encoding="utf-8"?>

to read

<?xml version="1.0"?>

the issue seems to get resolved.

My source file is an edited version of what I usually receive from a third party (i.e. the data is changed but the format is original).

Posted

I don't see a file, but what you said more or less supports my initial suspicion. It seems that while the document declares "utf-8", it doesn't actually conform to it. You should alert your data provider to this problem. Otherwise you'll have to either modify the source xml every time before importing it, or find a way to transliterate the problematic characters during or after import.
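One way to confirm the suspicion above before importing is to test whether the file's bytes really decode as the declared encoding. The following is only a minimal sketch: the function names are hypothetical, and the guess that the file is really Windows-1252 (`cp1252`) is an assumption that would have to be confirmed with the data provider.

```python
def check_declared_encoding(data: bytes, declared: str = "utf-8"):
    """Return (True, "") if data decodes as `declared`, else (False, detail)."""
    try:
        data.decode(declared)
        return True, ""
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first byte that could not be decoded
        return False, f"byte 0x{data[exc.start]:02x} at offset {exc.start}"


def transcode_to_utf8(data: bytes, actual: str = "cp1252") -> bytes:
    """Re-encode data as UTF-8, ASSUMING it was really written as `actual`."""
    return data.decode(actual).encode("utf-8")


if __name__ == "__main__":
    # A surname with an umlaut, saved as Windows-1252 but claiming to be utf-8
    sample = '<?xml version="1.0" encoding="utf-8"?><name>Müller</name>'.encode("cp1252")
    print(check_declared_encoding(sample))
    print(check_declared_encoding(transcode_to_utf8(sample)))
```

If the check fails, the file can be transcoded once the real encoding is known; guessing the wrong source encoding will silently produce wrong characters, so this is best treated as a stopgap until the provider fixes the export.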

  • 5 months later...
Posted

Hi comment

I know it was some time ago that you answered my query, but I'm hoping you can look at this xml file, which I should have sent earlier, in reference to this subject. I requested the third party to remove the reference to encoding.

My problem is becoming more prevalent as more orders are coming from Europe and elsewhere and ideally I need to automate the import process!

Thanking you for your assistance in advance.

orders.zip

Posted (edited)

AFAICT, the problem is caused not by any specific characters, but rather by undefined character entities, such as "auml" or "aring". These are not valid XML entities, and whoever uses them must make their definition available to the XML parser - see:

http://en.wikipedia.org/wiki/List_of_XML_and_HTML_character_entity_references#Entities_representing_special_characters_in_XHTML
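If the provider cannot supply the entity definitions, one possible workaround is to pre-process the file and turn the HTML-style named entities into numeric character references, which any XML parser accepts. This is a rough sketch rather than a tested part of the import: the function name is hypothetical, it relies on Python's standard `html.entities.name2codepoint` table, and it applies a plain regular expression to the whole text, so it would also rewrite matches inside CDATA sections.

```python
import re
from html.entities import name2codepoint

# The five entities XML itself predefines; these must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def replace_named_entities(xml_text: str) -> str:
    """Replace HTML named entities (e.g. &auml;) with numeric references (&#228;)."""
    def _sub(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined or unknown names untouched
        return f"&#{name2codepoint[name]};"
    return _ENTITY.sub(_sub, xml_text)


if __name__ == "__main__":
    import xml.etree.ElementTree as ET

    fixed = replace_named_entities("<name>M&auml;kel&aring; &amp; Co</name>")
    print(fixed)
    print(ET.fromstring(fixed).text)
```

Numeric references do not depend on the file's byte encoding, so this step can be run regardless of what the encoding declaration says.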

Edited by Guest

