Importing from an Excel XML

Raybaudi · February 5, 2007

Hi all

I haven't any notion on XML

I have an Excel file that shows a column of cells with formatted text ( Superscript and Subscript ).

When I open ( convert ) that file with FileMaker, that format is loosed.

So I saved that file ( from Excel ) to XML; the XML mantains the information about Superscript and Subscript.

How can I import that file ? I need a xsl ?

This is an example of part of XML:

xmlns="http://www.w3.org/TR/REC-html40">SrAl₃[(PO₄)(SO₄)](OH)₆

ss:Name="Print_Area"/>

Stuart Taylor · February 5, 2007

hmm ... wonder if you can import the data between the tags and then create a recursive custom function to modify the other tags within a calculation field to add text formatting.

Fenton is good with these things.

Raybaudi · February 5, 2007

Hi Stuart

thanks for replaying...

I think that I need only an XSL to import directly from the XML.

After that I'll have to rebuild the formula:

"SrA l₃[(PO₄& gt;)(SOont>_{4ub>)](OH)& lt;/Font>_{6& lt;/Sub"}}

Stuart Taylor · February 5, 2007

I'd have to agree ... want a hand with the XSL ... im kinda new to it but can probably work this out now.

Fenton · February 5, 2007

Yes, I would say to import the data, and also import whatever you need to reset the formatting. Then use either an auto-enter calculation or a script to reset the Text Formatting via FileMaker functions. It's going to be a bit of a mess however. But I imagine you know which fields might need this Subscript formatting.

I cannot tell much of anything from the code that was posted. Please wrap your xml code in code /code tags. Even after cleaning it up and indenting it, I still can't tell what it means. For example, it uses , which is a common (though deprecated) HTML element. But what the heck would this mean?





Then it further nests all kind of elements within a font element. Are the font elements HTML? Seems so, as the html namespace is declared in the  element. But what the heck is [(PO? This whole thing looks like formatting, other than the "SrA l".



In order to do further processing of this data you have to know what each piece is; and I sure don't.

<?xml version="1.0" encoding="utf-8"?>



  SrA l 

	  [sub]3[/sub]

	  

	  [sub]4[/sub]

	  

		  [sub]4 

			  

			  [sub]6[/sub]

			  

			  

		  [/sub]

Raybaudi · February 5, 2007

Hi Fenton

many thanks for your answer.

It is difficult to explain, as I don't know XML and English, too !

May I send you the Excel file by email ?

So you'll know why I can't import directly from Excel !

Edit: the file was smaller than I thought, so I posted here

AMI_DB_Mineral_list_updating_31.12.2006.zip

Edited February 5, 2007 by Guest

Fenton · February 6, 2007

Yes, that's kind of what I thought, chemical formula names (or whatever they are, excuse my ignorance). I think it would be a real problem to do these with xml/xsl also. Because you don't really know how many _{'s, etc. you might get. About all you could do would be to bring in the whole business, for each name, then loop through it, setting the formatting of the NEXT symbol that came along (since the _{tag is before its text).}}

Another way to try this would be to go ahead and import regularly, then loop through each name, gathering the number groups one at a time, holding in a global field or variable, then either Superscripting or Subscripting that group, then moving on; or possibly a Custom Function, but I'd script it first.

It seems the Superscript ones always have the number(s) then "+", so those could be identified. Whereas the Subscript could have ".", "-", "x", or the Sum symbol. But fairly limited characters in either.

You could format each group as you went, building the result in another field. It looks like it could be done that way, maybe; I don't see any exceptions to the rules (though there's some characters I can't read at all, but I'm running only the Mac OS at the moment).

Raybaudi · February 7, 2007

Yes, that's kind of what I thought, chemical formula names (or whatever they are, excuse my ignorance). I think it would be a real problem to do these with xml/xsl also. Because you don't really know how many _{'s, etc. you might get.}

That isn't important: I only need that string with all sub and super into a field of FileMaker.

Another way to try this would be to go ahead and import regularly,

A normal import looses all the sub and super tags

It seems the Superscript ones always have the number(s) then "+", so those could be identified. Whereas the Subscript could have ".", "-", "x", or the Sum symbol. But fairly limited characters in either.

That was my first approach, but "+" and "-" can stay as sub too ( rarely ) !

BTW: Fenton thank you very much to take a look at the file !

I was hoping to have an xsl to import just the name of the mineral into a field and just the string with ALL sub, normal and sup into another field

Fenton · February 7, 2007

Yes, I know it removes the Sub formatting; else we'd be done already :-| I was just thinking that it would be easier to loop through the characters themselves than it would be to loop thru the xml returned. But if, as you say, there are exceptions which cannot be trapped, then the xml would be more accurate. But I would want to be sure that the exceptions could not be trapped, 'cause looping thru the xml may be no fun.

Are you sure there is no way to identify the super and subs by looking at the "group" of characters? The "-" in the sub was always followed by an "x", wasn't it?

It should not be real difficult to import just the elements, with all their sub-elements. But Excel xml is kind of messy; lots of stuff that is not really data. Could you post another example of the xml, with a few data cells. And this time please either surround it with code brackets, or better, attached as a zipped text file; Windows line endings are fine.

Raybaudi · February 7, 2007

An easy example:

6H2(sub)O = 6 wather's molecola: there isn't an easy way to exstabilish that 6 is "normal"

Now, at work, I haven't Excel programm into my PC... but, if you have, you can export ( from Excel ) the Excel file that I posted as XML

The "-" in the sub was always followed by an "x", wasn't it?

No, try to do a find (while in Excel) for : "-1", for example !

Raybaudi · February 7, 2007

Hi Fenton

now I'm at home, so I can attach the XML file generated by Excel

AMI_DB_Mineral_list.zip

Fenton · February 7, 2007

This is my attempt to begin a solution to the problem. Basically it brings in the formatting and data into a single field, which will then need to be post-processed by looping through the characters, as I mentioned earlier. You can use whatever separator you want; I used a tab. It has to be something not in the data however.

It's still not going to be fun to loop thru and set the super and subscript, but it can be imported into FileMaker.

Minerals_short.zip

Raybaudi · February 7, 2007

Many thanks, Fenton..

I'm just studying ! :

Raybaudi · February 7, 2007

Can the field "markup" contain also the very first "normal" chars ?

for example: Abelsonite

[color:orange]Font: NiC Sub: 31 Font: H Sub: 32 Font: N Sub: 4

And, while you are there, can you change the TAB with a Carriage Return ? ( so the Custom Function will be easier and sure it will never been inside the "formula")

Fenton · February 8, 2007

Yep, you've got a point there. The first little section was missing. The thing is, I cannot see any way to get it out of the xml by itself. Getting the data element gets all of the data; I've already got that field, Text_. IMHO, it's a weakness in the way they're writing the xml; or else I'm missing something (quite possible). That first piece is kind of left hanging at the beginning.

But, because we can get all the text, and also get all the pieces of text AFTER it, we could get it by remvoing the "after" text from the "whole" text; post-process in FileMaker.

This example file is better I think. It puts the "labels" in one field, and the "data" in another, return-separated. Since there is a one-to-one correspondence for the pieces, you can use GetValue ( field; counter ) to loop through both, setting the formatting.

Minerals_short.zip

Raybaudi · February 8, 2007

Good Night, Fenton...

I'll study it tomorrow :

Fenton · February 8, 2007

And, once it's all in its little boxes, it's relatively easy to format. I think.

Minerals_short.zip

comment · February 8, 2007

The first little section was missing. The thing is, I cannot see any way to get it out of the xml by itself. Getting the data element gets all of the data

I believe this can be done by:

Assuming that all the "free" text is always positioned at he beginning, it could be possible to import the contents of the entire cell as a formula, ready to be evaluated by the Evaluate function.

Fenton · February 8, 2007

Right you are.

Minerals_short.zip

Raybaudi · February 8, 2007

Very, very, very good !

Thank you very much

Raybaudi · February 8, 2007

... ready to be evaluated by the Evaluate function.

The evaluate function can format a formula ?

What is the meaning ?

comment · February 8, 2007

I mean that with some effort, a cell like:

H[sub]2[/sub]



could be transformed into:

"H" & TextStyleAdd ( "2" ; Subscript ) & "O"



This would be the imported value, which could then be evaluated in another field to produce the formatted result.





Not sure if it's worth the complexity required to deal with the quotes, but that part could be done in Filemaker. So even a value like:

HSubStart2SubEndO

could be processed without a script.

Edited February 8, 2007 by Guest

Raybaudi · February 8, 2007

I mean that with some effort, a cell like:
H[sub]2[/sub]
could be transformed into:
"H" & TextStyleAdd ( "2" ; Subscript ) & "O"
This would be the imported value, which could then be evaluated in another field to produce the formatted result.

Do you mean that this work can be made by the .xsl file ?

Raybaudi · February 8, 2007

Hi Fenton

I'm trying to do an import, but this error comes up...

Edited February 8, 2007 by Guest

Stuart Taylor · February 8, 2007

I think he means:

Evaluate (

Substitute (

"₂"

; "_{" ; "TextStyleAdd ( "" )}

_{; "}" ; "" ; Subscript )" )

)

comment · February 8, 2007

Do you mean that this work can be made by the .xsl file ?

Yes, and it can be simpler than I thought. The relevant part of the XSL would go something like:

...

...













"



" & 























TextStyleAdd ( "



" ; Subscript ) & 









TextStyleAdd ( "



" ; Superscript ) & 









"



" & 

  















""











...





This would import directly into a text field as:

"H" & TextStyleAdd ( "2" ; Subscript ) & "O"

and a calc of Evaluate ( ImportedField) would produce the final result.

Edited February 8, 2007 by Guest
Code gets mangled

Raybaudi · February 8, 2007

Wow !!

It's to bad that I don't know XML !

Can you modify for me the .xsl maded by our dear friend Fenton ?

Stuart Taylor · February 8, 2007

Nice comment

Very Interesting

Fenton · February 8, 2007

That is very slick. There was one thing missing, which was the element after the . I modified my earlier file to use this method.

Daniele, when you get an error on the 1st line and column, it usually means there's an extra character in front of the declaration. This is usually because of an encoding problem, such as utf-16 or something; xml is supposed to be utf-8, and that is how I'm trying to save it. It's a little difficult to troubleshoot because it's invisible unless you use a dedicated text editor (not a word processor).

For Windows there is Programmers Notepad. For a real xml app there is Altova XMLSpy. It has a free version, somewhere there on the site; pretty awesome.

comment_Evaluate.zip

comment · February 9, 2007

On second thought, if this ever needs to be exported again...

Perhaps it would be better to go back to importing the text and the styles into two separate fields. If the two fields were repeating, the formula could be calculated again, with no need for a post-import script.

Raybaudi · February 9, 2007

Perfect !

Problem solved !!

Thank you all; this problem wasn't a mine problem, but from it I'm learning so much.

Sign In

Importing from an Excel XML

Recommended Posts

Create an account or sign in to comment

Create an account

Sign in

Important Information