otari Posted July 14, 2008

Hi there - I've managed to get the XML (well, XMP) data from imported images into a field labeled "XMP", but now I would like to parse that data into different fields according to hierarchical subject heading. The data in the "XMP" field looks like this:

    Nation|Germany
    Image Type|Photo - WW1 Era Original
    Aircraft Manufacturer|Friedrichshafen
    Aircraft Type|Biplane
    Aircraft Type|Seaplane
    Aircraft Seats/Use|Bomber
    Setting|In Flight

My fields are named "Nation", "Image Type", "Aircraft Manufacturer", "Setting", etc. Obviously, the text string that I would like to parse immediately follows the subject heading and the "|", then ends right before the "". Any help with this is much appreciated!
LelandLong Posted July 14, 2008

I've done a lot of XML scraping. This should work for each field you want to scrape by modifying just the first variable, 'fieldNameStart'. Here's the formula I like to use for each field (modified for your example - hopefully I didn't create an error while modifying my code to match your needs):

    Let ( [
        fieldNameStart = "Nation|" ;
        fieldNameClose = "" ;
        fieldNameLength = Length ( fieldNameStart ) ;
        dataBegin = Position ( YourXMPfield ; fieldNameStart ; 1 ; 1 ) ;
        dataEnd = Position ( YourXMPfield ; fieldNameClose ; dataBegin ; 1 ) ;
        dataLength = dataEnd - dataBegin - fieldNameLength
    ] ;
        Middle ( YourXMPfield ; dataBegin + fieldNameLength ; dataLength )
    )
otari (Author) Posted July 14, 2008

Thanks! I'll give it a try now and report back.
otari (Author) Posted July 14, 2008 (edited)

It works great - you're an absolute legend. Now, one final question: how should I alter this formula if there are multiple instances of a keyword, or if there are none? For example, sometimes the XML data contains multiple "Nation" keywords, like this:

    Nation|Germany
    Nation|France
    Nation|Britain

I would like the formula to add each string to the "Nation" field, separated by a comma or dash. Likewise, sometimes there is no "Nation" data for the record at all, and when this happens, your formula puts a ton of unrelated XML data into the field. Thank you again for your help!!

Edited July 14, 2008 by Guest
LelandLong Posted July 14, 2008

To handle no occurrences, I modified the code to:

    Let ( [
        fieldNameStart = "Nation|" ;
        fieldNameClose = "" ;
        fieldNameLength = Length ( fieldNameStart ) ;
        dataBegin = Position ( YourXMPfield ; fieldNameStart ; 1 ; 1 ) ;
        dataEnd = Position ( YourXMPfield ; fieldNameClose ; dataBegin ; 1 ) ;
        dataLength = dataEnd - dataBegin - fieldNameLength
    ] ;
        Case (
            dataBegin > 0 ; Middle ( YourXMPfield ; dataBegin + fieldNameLength ; dataLength ) ;
            ""
        )
    )

To handle multiple occurrences of 'Nation' and/or others, you have a couple of options.

1) If you have FMP Advanced, you could create a recursive custom function to handle all instances of 'Nation'.

2) If you don't have FMP Advanced, you'll need a field for every possibility, i.e. for up to 5 possible Nations you'll need 5 fields. Use the above code for each field, and change the occurrence parameter of the first Position call (the one that sets dataBegin) in each field to match. For example, Nation3 would use 'Position ( YourXMPfield ; fieldNameStart ; 1 ; 3 )'.
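For option 1, a recursive custom function could look roughly like the sketch below. It is untested and makes some assumptions: the function name ExtractAll and its parameters (source, startTag, closeTag, separator, occurrence) are hypothetical names chosen for illustration, and closeTag is expected to be the same closing string you use for fieldNameClose in the formula above. It collects every value that follows startTag and joins them with separator, returning an empty result when the keyword is missing.

```
// Custom function: ExtractAll ( source ; startTag ; closeTag ; separator ; occurrence )
// All names here are hypothetical; adjust them to your own solution.
Let ( [
    // Where the Nth occurrence of the heading starts (0 if there is none)
    dataBegin = Position ( source ; startTag ; 1 ; occurrence ) ;
    // First character of the value, just past the heading and the "|"
    valueStart = dataBegin + Length ( startTag ) ;
    // First closing delimiter after the value begins
    dataEnd = Position ( source ; closeTag ; valueStart ; 1 ) ;
    thisValue = Middle ( source ; valueStart ; dataEnd - valueStart ) ;
    // Recurse only while an occurrence was found, so the function terminates
    remaining = If ( dataBegin > 0 ;
        ExtractAll ( source ; startTag ; closeTag ; separator ; occurrence + 1 ) ;
        ""
    )
] ;
    Case (
        dataBegin = 0 ; "" ;
        IsEmpty ( remaining ) ; thisValue ;
        thisValue & separator & remaining
    )
)
```

The Nation field would then be a calculation along the lines of ExtractAll ( YourXMPfield ; "Nation|" ; closeTag ; ", " ; 1 ), with closeTag replaced by your actual closing string. Starting at occurrence 1 and counting upward mirrors the per-field approach in option 2, but keeps everything in a single field.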
otari (Author) Posted July 15, 2008

This method for capturing multiple occurrences is working perfectly. Thank you so much for your help!