Jump to content

Importing from an Excel XML


Raybaudi

This topic is 6310 days old. Please don't post here. Open a new topic instead.

Recommended Posts

Hi all

I haven't any notion on XML

I have an Excel file that shows a column of cells with formatted text ( Superscript and Subscript ).

When I open ( convert ) that file with FileMaker, that format is loosed.

So I saved that file ( from Excel ) to XML; the XML mantains the information about Superscript and Subscript.

How can I import that file ? I need a xsl ?

This is an example of part of XML:

xmlns="http://www.w3.org/TR/REC-html40">SrAl3[(PO4)(SO4)](OH)6

ss:Name="Print_Area"/>

Link to comment
Share on other sites

Yes, I would say to import the data, and also import whatever you need to reset the formatting. Then use either an auto-enter calculation or a script to reset the Text Formatting via FileMaker functions. It's going to be a bit of a mess however. But I imagine you know which fields might need this Subscript formatting.

I cannot tell much of anything from the code that was posted. Please wrap your xml code in code /code tags. Even after cleaning it up and indenting it, I still can't tell what it means. For example, it uses , which is a common (though deprecated) HTML element. But what the heck would this mean?






Then it further nests all kind of elements within a font element. Are the font elements HTML? Seems so, as the html namespace is declared in the  element. But what the heck is [(PO? This whole thing looks like formatting, other than the "SrA l".



In order to do further processing of this data you have to know what each piece is; and I sure don't. 




<?xml version="1.0" encoding="utf-8"?>



  SrA l 

	  [sub]3[/sub]

	  

	  [sub]4[/sub]

	  

		  [sub]4 

			  

			  [sub]6[/sub]

			  

			  

		  [/sub]

	  

  

Link to comment
Share on other sites

Hi Fenton

many thanks for your answer.

It is difficult to explain, as I don't know XML and English, too !

May I send you the Excel file by email ?

So you'll know why I can't import directly from Excel !

Edit: the file was smaller than I thought, so I posted here

AMI_DB_Mineral_list_updating_31.12.2006.zip

Edited by Guest
Link to comment
Share on other sites

Yes, that's kind of what I thought, chemical formula names (or whatever they are, excuse my ignorance). I think it would be a real problem to do these with xml/xsl also. Because you don't really know how many 's, etc. you might get. About all you could do would be to bring in the whole business, for each name, then loop through it, setting the formatting of the NEXT symbol that came along (since the tag is before its text).

Another way to try this would be to go ahead and import regularly, then loop through each name, gathering the number groups one at a time, holding in a global field or variable, then either Superscripting or Subscripting that group, then moving on; or possibly a Custom Function, but I'd script it first.

It seems the Superscript ones always have the number(s) then "+", so those could be identified. Whereas the Subscript could have ".", "-", "x", or the Sum symbol. But fairly limited characters in either.

You could format each group as you went, building the result in another field. It looks like it could be done that way, maybe; I don't see any exceptions to the rules (though there's some characters I can't read at all, but I'm running only the Mac OS at the moment).

Link to comment
Share on other sites

Yes, that's kind of what I thought, chemical formula names (or whatever they are, excuse my ignorance). I think it would be a real problem to do these with xml/xsl also. Because you don't really know how many 's, etc. you might get.

That isn't important: I only need that string with all sub and super into a field of FileMaker.

Another way to try this would be to go ahead and import regularly,

A normal import looses all the sub and super tags

It seems the Superscript ones always have the number(s) then "+", so those could be identified. Whereas the Subscript could have ".", "-", "x", or the Sum symbol. But fairly limited characters in either.

That was my first approach, but "+" and "-" can stay as sub too ( rarely ) !

BTW: Fenton thank you very much to take a look at the file !

I was hoping to have an xsl to import just the name of the mineral into a field and just the string with ALL sub, normal and sup into another field

Link to comment
Share on other sites

Yes, I know it removes the Sub formatting; else we'd be done already :-| I was just thinking that it would be easier to loop through the characters themselves than it would be to loop thru the xml returned. But if, as you say, there are exceptions which cannot be trapped, then the xml would be more accurate. But I would want to be sure that the exceptions could not be trapped, 'cause looping thru the xml may be no fun.

Are you sure there is no way to identify the super and subs by looking at the "group" of characters? The "-" in the sub was always followed by an "x", wasn't it?

It should not be real difficult to import just the elements, with all their sub-elements. But Excel xml is kind of messy; lots of stuff that is not really data. Could you post another example of the xml, with a few data cells. And this time please either surround it with code brackets, or better, attached as a zipped text file; Windows line endings are fine.

Link to comment
Share on other sites

An easy example:

6H2(sub)O = 6 wather's molecola: there isn't an easy way to exstabilish that 6 is "normal"

Now, at work, I haven't Excel programm into my PC... but, if you have, you can export ( from Excel ) the Excel file that I posted as XML

The "-" in the sub was always followed by an "x", wasn't it?

No, try to do a find (while in Excel) for : "-1", for example !

Link to comment
Share on other sites

This is my attempt to begin a solution to the problem. Basically it brings in the formatting and data into a single field, which will then need to be post-processed by looping through the characters, as I mentioned earlier. You can use whatever separator you want; I used a tab. It has to be something not in the data however.

It's still not going to be fun to loop thru and set the super and subscript, but it can be imported into FileMaker.

Minerals_short.zip

Link to comment
Share on other sites

Can the field "markup" contain also the very first "normal" chars ?

for example: Abelsonite

[color:orange]Font: NiC Sub: 31 Font: H Sub: 32 Font: N Sub: 4

And, while you are there, can you change the TAB with a Carriage Return ? ( so the Custom Function will be easier and sure it will never been inside the "formula")

Link to comment
Share on other sites

Yep, you've got a point there. The first little section was missing. The thing is, I cannot see any way to get it out of the xml by itself. Getting the data element gets all of the data; I've already got that field, Text_. IMHO, it's a weakness in the way they're writing the xml; or else I'm missing something (quite possible). That first piece is kind of left hanging at the beginning.

But, because we can get all the text, and also get all the pieces of text AFTER it, we could get it by remvoing the "after" text from the "whole" text; post-process in FileMaker.

This example file is better I think. It puts the "labels" in one field, and the "data" in another, return-separated. Since there is a one-to-one correspondence for the pieces, you can use GetValue ( field; counter ) to loop through both, setting the formatting.

Minerals_short.zip

Link to comment
Share on other sites

The first little section was missing. The thing is, I cannot see any way to get it out of the xml by itself. Getting the data element gets all of the data

I believe this can be done by:


Assuming that all the "free" text is always positioned at he beginning, it could be possible to import the contents of the entire cell as a formula, ready to be evaluated by the Evaluate function.

Link to comment
Share on other sites

I mean that with some effort, a cell like:

H[sub]2[/sub]


could be transformed into:




"H" & TextStyleAdd ( "2" ; Subscript ) & "O"


This would be the imported value, which could then be evaluated in another field to produce the formatted result.





Not sure if it's worth the complexity required to deal with the quotes, but that part could be done in Filemaker. So even a value like:




HSubStart2SubEndO

could be processed without a script.

Edited by Guest
Link to comment
Share on other sites

I mean that with some effort, a cell like:

H[sub]2[/sub]


could be transformed into:




"H" & TextStyleAdd ( "2" ; Subscript ) & "O"

This would be the imported value, which could then be evaluated in another field to produce the formatted result.

Do you mean that this work can be made by the .xsl file ?

Link to comment
Share on other sites

Do you mean that this work can be made by the .xsl file ?

Yes, and it can be simpler than I thought. The relevant part of the XSL would go something like:

...

...













"



" & 























TextStyleAdd ( "



" ; Subscript ) & 









TextStyleAdd ( "



" ; Superscript ) & 









"



" & 

  















""











...




This would import directly into a text field as:


"H" & TextStyleAdd ( "2" ; Subscript ) & "O"

and a calc of Evaluate ( ImportedField) would produce the final result.

Edited by Guest
Code gets mangled
Link to comment
Share on other sites

That is very slick. There was one thing missing, which was the element after the . I modified my earlier file to use this method.

Daniele, when you get an error on the 1st line and column, it usually means there's an extra character in front of the declaration. This is usually because of an encoding problem, such as utf-16 or something; xml is supposed to be utf-8, and that is how I'm trying to save it. It's a little difficult to troubleshoot because it's invisible unless you use a dedicated text editor (not a word processor).

For Windows there is Programmers Notepad. For a real xml app there is Altova XMLSpy. It has a free version, somewhere there on the site; pretty awesome.

comment_Evaluate.zip

Link to comment
Share on other sites

On second thought, if this ever needs to be exported again...

Perhaps it would be better to go back to importing the text and the styles into two separate fields. If the two fields were repeating, the formula could be calculated again, with no need for a post-import script.

Link to comment
Share on other sites

This topic is 6310 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.