Jump to content
Quito

xsl to import multiple attributes within a Pubmed subset

Recommended Posts

Quito    1

Hi FMForums Community,

After exporting PubMed records in XML format (where the default file name is pubmed_result.xml), the first two lines within the file must be removed in order to import pubmed_result.xml into FileMaker (14 Adv, MacOS Sierra) using a custom .xsl file.

Within the ArticleIdList subset there is the ArticleId, with multiple IdType. I am including the relevant XML and XSL portions.

----------------

XML

<PubmedArticleSet>

  <PubmedArticle>

    <PubmedData>

        <ArticleIdList>
            <ArticleId IdType="pubmed">27167443</ArticleId>
            <ArticleId IdType="pii">S1806-37132016000200162</ArticleId>
            <ArticleId IdType="doi">10.1590/S1806-37562016000000114</ArticleId>
            <ArticleId IdType="pmc">PMC4853075</ArticleId>
        </ArticleIdList>

    </PubmedData>
  </PubmedArticle>

</PubmedArticleSet>

------------------

XSL

                        <COL>
                            <DATA>
                        <xsl:for-each select="PubmedData/ArticleIdList">
                            <ArticleId>
                               <xsl:for-each select="*">
                                  <xsl:value-of select="." />
                                  <xsl:if test="position()!=last()">
                                  <xsl:value-of select="'&#xD;'"/>
                                  </xsl:if>
                               </xsl:for-each>
                            </ArticleId>
                        </xsl:for-each>
                            </DATA>
                        </COL>
-------------------

The result I'm getting after importing is:

27167443
S1806-37132016000200162
10.1590/S1806-37562016000000114
PMC4853075

But I would like to include the ArticleId IdType attribute to make it look like this:

pubmed: 27167443
pii: S1806-37132016000200162
doi: 10.1590/S1806-37562016000000114
pmc: PMC4853075

In another topic within fmforums.com there was a suggestion to use @ to capture multiple attributes within a subset.  Something like:

<xsl:value-of select="ArticleId/@IdType" />

What should the xsl contain in order to make pubmed: pii: doi: and pmc: visible?

Many thanks in advance.

Kind regards,

Daniel

Edited by Quito
Clarification

Share this post


Link to post
Share on other sites
comment    1,372

I don't think you are approaching this correctly.

If you know in advance which ID types are included (or can be included) in an article, then create a field for each type and populate it explicitly with the value from the ArticleId that has the corresponding IdType - not by using xsl:for-each. If you don't know all the possible ID types, then you need to import them as individual records, not fields, into a related table, where each record would have fields for type, value, and some common value tying all IDs of the same article together.

The way you are going about this has two major flaws: first, you will end up with poorly structured data; a field should not be holding both the type and the value together. Also, the number of fields a stylesheet can import is fixed in advance. The fields are defined in the stylesheet's METADATA section - which is sadly missing from your post - and in Filemaker's field import order. You cannot add fields on the fly - that's why it makes very little sense to try and grab all the IDs using xsl:for-each.

 

--
P.S. I did not understand this part:

Quote

the first two lines within the file must be removed in order to import pubmed_result.xml into FileMaker

If the source file is a well-formed XML document, you should be able to import it as is - without having to modify it in any way.

 

Edited by comment

Share this post


Link to post
Share on other sites
Quito    1
9 hours ago, comment said:

I don't think you are approaching this correctly.

If you know in advance which ID types are included (or can be included) in an article, then create a field for each type and populate it explicitly with the value from the ArticleId that has the corresponding IdType - not by using xsl:for-each. If you don't know all the possible ID types, then you need to import them as individual records, not fields, into a related table, where each record would have fields for type, value, and some common value tying all IDs of the same article together.

The way you are going about this has two major flaws: first, you will end up with poorly structured data; a field should not be holding both the type and the value together. Also, the number of fields a stylesheet can import is fixed in advance. The fields are defined in the METADATA section, which is sadly missing from your post. You cannot add fields on the fly - that's why it makes very little sense to try and grab all the IDs using xsl:for-each.

 

--
P.S. I did not understand this part:

If the source file is a well-formed XML document, you should be able to import it as is - without having to modify it in any way.

 

Hi,

Thanks for replying promptly.  Answers to your comments follow below:

"The fields are defined in the METADATA section, which is sadly missing from your post."

This is the Metadata:

            <METADATA>
                <FIELD EMPTYOK="YES" MAXREPEAT="1" NAME="ArticleID" TYPE="TEXT"/>
            </METADATA>

"If you know in advance which ID types are included (or can be included) in an article..."

A record can include a minimum of 1 (*) and a maximum of 4 IDTypes.

The possible IDTypes are:

pubmed *
pii
doi
pmc
"...create a field for each type and populate it explicitly with the value from the ArticleId that has the corresponding IdType"

That's exactly what I'm wondering how to accomplish.

"a field should not be holding both the type and the value together"

Agreed.  It would make sense to have the IDType and it's content populate different fields.  How do I achieve this?

"P.S. I did not understand this part:

If the source file is a well-formed XML document, you should be able to import it as is - without having to modify it in any way."

These are the first two lines from the xml file:

<?xml version="1.0"?>
<!DOCTYPE PubmedArticleSet PUBLIC "-//NLM//DTD PubMedArticle, 1st January 2016//EN" "https://www.ncbi.nlm.nih.gov/corehtml/query/DTD/pubmed_160101.dtd">

These lines are rejected by FileMaker during XML/XSL import/processing.

I hope this information is useful to correctly sort out the custom XSL.

Best regards,

Daniel

Share this post


Link to post
Share on other sites
comment    1,372
2 hours ago, Quito said:

"...create a field for each type and populate it explicitly with the value from the ArticleId that has the corresponding IdType"

That's exactly what I'm wondering how to accomplish.

Try the following stylesheet:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/PubmedArticleSet">
    <FMPXMLRESULT xmlns="http://www.filemaker.com/fmpxmlresult">
        <METADATA>
            <FIELD NAME="PubmedID"/>
            <FIELD NAME="PiiID"/>
            <FIELD NAME="DoiID"/>
            <FIELD NAME="PmcID"/>
        </METADATA>
        <RESULTSET>
            <xsl:for-each select="PubmedArticle">
                <ROW>
                    <COL><DATA><xsl:value-of select="PubmedData/ArticleIdList/ArticleId[@IdType='pubmed']"/></DATA></COL>
                    <COL><DATA><xsl:value-of select="PubmedData/ArticleIdList/ArticleId[@IdType='pii']"/></DATA></COL>
                    <COL><DATA><xsl:value-of select="PubmedData/ArticleIdList/ArticleId[@IdType='doi']"/></DATA></COL>
                    <COL><DATA><xsl:value-of select="PubmedData/ArticleIdList/ArticleId[@IdType='pmc']"/></DATA></COL>
                </ROW>
            </xsl:for-each>
        </RESULTSET>
    </FMPXMLRESULT>
</xsl:template>

</xsl:stylesheet>

This will present four source fields to the Filemaker's import dialog. Of course, you will also need to have four corresponding target fields in your solution, so that you can map the imported data to them.

 

2 hours ago, Quito said:

<!DOCTYPE PubmedArticleSet PUBLIC "-//NLM//DTD PubMedArticle, 1st January 2016//EN" "https://www.ncbi.nlm.nih.gov/corehtml/query/DTD/pubmed_160101.dtd">

These lines are rejected by FileMaker during XML/XSL import/processing.

Indeed, the DOCTYPE declaration is causing a problem. I am not quite sure why - I suspect it's an OS permissions issue that comes up when the XSLT engine tries to access the online DTD document.

  • Like 1

Share this post


Link to post
Share on other sites
kalle    1

Hi everyone,

i am using fmp 13 to access the pubmed database as well and was happy to see that there are others doing it. i am replying to this thread because my problem is kind of similar.

The thing is, a year ago my setup was working fine: i would generate an url with a formula request xml from pubmed (via the eutils esearch feature) and the resulting xml would be parsed by a custom xslt and the values then would be imported without any problem.

the url would look like this:

http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=10000&usehistory=y&term=Aortic stenosis [tiab] AND 2016 [dp]

 

Going back to this database today my scripts all gave me the error that a whitespace was expected in line 1 column 50.

i tried opening the pubmed URL in a browser. copied the resulting  content to the text editor and saved it as a utf-8 xml  file. when i select this xml file as the source (instead of the URL) everything is working fine. 

i have no idea why this all worked about 12 month ago and doesn't work anymore. furthermore i would really like to get this working again...

have you had the same problems? any ideas?

i guess maybe pubmed changed something in the encoding, or the upgrade to macOS sierra.

 

Thanks

SearchResultXMLConvert.xsl

Share this post


Link to post
Share on other sites
comment    1,372

I don't see what this has to do with the original problem here. And if you get an error when running a script, the smart thing to do is post the exact script you were running, along with the verbatim error message. 

I am guessing you need to percent-encode the spaces in your URL, so that it reads:

https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=10000&usehistory=y&term=Aortic%20stenosis%20%5Btiab%5D%20AND%202016%20%5Bdp%5D

 

Share this post


Link to post
Share on other sites
Quito    1

Hi Kalle,

Just a few thoughts.  The E-utility Web Service (SOAP) was terminated on July 1, 2015.  See https://www.ncbi.nlm.nih.gov/books/NBK43082/?report=reader.

The new syntax is provided here:

https://www.ncbi.nlm.nih.gov/books/NBK25499/?report=reader#chapter4.Eutility_DTDs

Please check this page regarding the NCBI migration to https on September 30, 2016https://www.ncbi.nlm.nih.gov/home/develop/https-guidance/

If you are interested in fleshing out the complete xsl for FileMaker using the PubMed DTD, let me know.  It's close to 3000 options and I've managed to write the main 500 or so.

Best,

Daniel

  • Like 1

Share this post


Link to post
Share on other sites
kalle    1
On 3.6.2017 at 0:09 PM, comment said:

I don't see what this has to do with the original problem here. And if you get an error when running a script, the smart thing to do is post the exact script you were running, along with the verbatim error message. 

I am guessing you need to percent-encode the spaces in your URL, so that it reads:


https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=pubmed&retmax=10000&usehistory=y&term=Aortic%20stenosis%20%5Btiab%5D%20AND%202016%20%5Bdp%5D

Thanks

5 hours ago, Quito said:

Hi Kalle,

Just a few thoughts.  The E-utility Web Service (SOAP) was terminated on July 1, 2015.  See https://www.ncbi.nlm.nih.gov/books/NBK43082/?report=reader.

The new syntax is provided here:

https://www.ncbi.nlm.nih.gov/books/NBK25499/?report=reader#chapter4.Eutility_DTDs

Please check this page regarding the NCBI migration to https on September 30, 2016https://www.ncbi.nlm.nih.gov/home/develop/https-guidance/

If you are interested in fleshing out the complete xsl for FileMaker using the PubMed DTD, let me know.  It's close to 3000 options and I've managed to write the main 500 or so.

Best,

Daniel

Hi Daniel,

thanks for your reply! I will look into those links. 

Kalle

Share this post


Link to post
Share on other sites
kalle    1

Ok. Problem solved.

It was a very simple problem: As Daniel suggested it had to do with the http versus https request. I changed the URL to https and everything is fine!

Thanks Daniel!

  • Like 1

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now


  • Similar Content

    • By Jeff M
      Hello,
      I've done quite a bit with FileMaker in the past, but have never jumped into the XML arena. Now I have a need to do so, and I'm really needing some help with something I'm sure is quite basic...but still foreign to me. I have an external data source which export XML in a very specific format (see attached). I need to import that into FileMaker so I can manipulate the data, do some various lookups, etc. Once I've made my changes in the data, I need to export in the exact format it was when imported so the external data source can properly import.
      Not sure if it makes sense to import into 2 tables (workflows, workflow steps) or just a single table. I'm open either way. Once the data is in FileMaker I can get all my calculations and lookups and scripts going and make the data what I need it to be, but then I'll need to export it.
      Can someone point me in the right direction?
      Thanks so much in advance!
      Jeff
      workflows_export.xml
    • By Andreas T.
      I need some help in determining if I am missing something crucial here or if there is a limitation or bug in the function BE_ApplyXSLT from the BaseElements plugin. I am using version 3.3.4 (which is the latest) on Mac OS X 10.11.6 using FileMaker 15 Adv. and 16 Adv.
      I am unable to get any output from the transformation, nor any error.
      According to the documentation, the command shall return either 0 when successful or an error. Link
      In my testing it only returns blank, and gives no error even when no output file is generated. It should not return blank in cany case according to the docs. (I suspect there may be a documentation error and that it actually returns blank when there is no error.)
      No error is found using Get(LastError). BE_GetLastError returns 0.
      The function format is this:
      BE_ApplyXSLT ( xmlPath ; xslt ; outputPath ) Here are the input parameters I have used:
      BE_ApplyXSLT (      "/var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/ABC_nota_108.xml" ;      BE_ReadTextFromFile ( "/var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/HTMLvisning/Stylesheet_Full.xsl" );     "/var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/output.html" ) I have verified that the files exist and that both the XML and XSLT are valid. All files are in UTF-8. I can transform the XML with this XSLT using other
      methods without errors. I have also tried having the XSLT in a variable.
      As I understand it, BaseElements uses the libxml2 engine. I have tried doing the transform from the command line directly and it works as it should:
      xsltproc -o /var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/output.html /var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/HTMLvisning/Stylesheet_Full.xsl /var/folders/y6/1yf76qcs5y91h_w4j27hxp980000gp/T/S10.2/ABC_nota_108.xml It also works fine using the XMLLib OSAX in AppleScript. Unfortunately, I need Windows support also so I would like to get the BaseElements function to work.
      I hope somone can shed some light on what may be going on.
    • By blueworld4
      I am having some trouble getting my XSLT file to work.  I have followed many of the posts on here and have referenced
      https://www.w3schools.com/xml/xsl_intro.asp
      https://www.w3schools.com/xml/tryxslt.asp?xmlfile=cdcatalog&xsltfile=cdcatalog
      I seem to be able to get the examples to work and have had success with the example files for metars.cache.xml weather meter readings, but can't see why my own isn't working properly when it looks the same to me.  Any guidance would be gratefully received.  I am not new to Filemaker but this is my first attempt at importing and exporting data to/from our solution in XML which meets the UK Government standards.  My XSLT seems only to match fields but not actually import and records.  I have attempted referencing the template match to root with  <xsl:template match="/Message"> but this causes an "Invalid Document Structure" error in Filemaker, which doesn't seem to be a problem with the w3schools try xslt page.  I have reverted back to the format in the meters.xslt example file and now hope that somebody can help!!
      metars.cache.xml
      meters.xslt
      ILR-99999999-1617-20160229-144401-01.xml
      ILRLearningDeliveryImport.xslt
    • By jeffwest2
      Trying to set up mirror sync for the first time using FM15 Pro (Not advanced). 
      I have got past the set up of where the database is, the issue is when I try and set up the , as I don't have Advanced I have to import the MirrorSync.xml file directly, when I click continue I get an error that says, 'XML/XSL information is not enough to proceed with import/export', no matter where I put this file it fails.
      Any body had this? and what did you do to get round it?, be great if I could get rid of this sooner rather than later.
×

Important Information

By using this site, you agree to our Terms of Use.