Quito Posted May 9, 2024 (edited)

Hi,

Using Insert from URL, the result can be:

1. Downloaded as text into a text field (in XML format).
2. Downloaded as a text file into a container field, also in XML format, with a generic file name (efetch.fcgi).

The tricky part is that the second line of the XML contains the DOCTYPE declaration. When FileMaker reads this line, it spends up to twenty seconds there, and then progresses with the script. As there are many XSLTs, FileMaker reads the DOCTYPE that many times, extending the entire process to well over 3 minutes. If I manually remove the second line, the entire processing time goes down to the expected 1-2 seconds in total.

In 1., I've commented out the second line with a calculation, but I haven't been able to get the script to recognize the calculated text as XML and use that modified XML to process with the XSLTs. In 2., I don't know whether I can modify the contents of efetch.fcgi from within its container field.

What would be the best way to do this? Would both cases require the modified XML to be downloaded and re-uploaded?

Best regards,
Daniel

Edited May 9, 2024 by Quito
Added screenshot; added clarification
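For reference, the two variants are roughly these script steps (the table/field names and $url are placeholders for illustration, not the actual solution):

# 1. XML downloaded as text into a text field
Insert from URL [ Select ; With dialog: Off ; Target: Records::XML_source ; $url ]
# 2. XML downloaded as a file (efetch.fcgi) into a container field
Insert from URL [ Select ; With dialog: Off ; Target: Records::XML_container ; $url ]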
comment Posted May 9, 2024 (edited)

1 hour ago, Quito said:
"When FileMaker reads this line, it spends up to twenty seconds there, and then progresses with the script."

Do you mean during the import? After you have exported/written the field's contents to somewhere on your hard disk?

---

Added: Haven't we done this before?
https://fmforums.com/topic/110220-dialog-window-xmlxsl-information-is-not-enough-to-proceed-with-importexport/?do=findComment&comment=492213

Edited May 9, 2024 by comment
Quito (Author) Posted May 9, 2024 (edited)

Hi, Comment,

Thank you. Yes, during the import. The XML processing works if the XML file is manually sent to a container field. This triggers the XML/XSLT processing script correctly: the XML is sent to the Desktop and re-imported using the XSLTs. My problem occurs if the XML is stored as calculated text in a field OR as an XML file within its container field. Then the script fails. Can't the XML + XSLT processing occur directly against the stored files, without the need for the export/re-import step?

-----

Yes, we have discussed this before, and I have updated the scripts accordingly, taking your insight into account as much as possible. Thank you again!

My position is that both the XML and the XSLT are stored in their corresponding container fields, and that exporting the XML just to reimport it seems unnecessary. It does work, yet stripping the second line would make it perform faster.

Also, there can be thousands of separate XMLs in the processing queue (one of my tests involves a batch with 4800 separate XML records; another test involves a single XML with tens of thousands of records that can be over 2 GB in size). If every time an XML needs to be processed the Desktop gets a copy, then pretty soon the Desktop will get madly cluttered with XML files that then need to be deleted. As I do not know the path to the user's Desktop, I don't think I can delete the XML files by script after processing occurs. It also seems pretty dangerous to me to have scripts running against the Desktop, unless the user allows it. Thus the idea of handling everything from within the tool.

Now, I'm thinking that perhaps storing the XML from the text field into a temporary variable might do the trick. Or just importing the record directly from the XML server using an HTTP request, and skipping the Insert from URL step altogether. Yet the problem regarding the second line will persist, and the import of a large file will take months.

All the very best,
Daniel

Edited May 9, 2024 by Quito
Clarification
comment Posted May 9, 2024 (edited)

34 minutes ago, Quito said:
"exporting the XML just to reimport it seems unnecessary."

Let me reiterate something I wrote in the other thread: You cannot import a file from a container field. The file must reside on your hard disk (unless you're importing it directly from a URL). I suspect you are confusing yourself by having the file inserted into a container field as a reference only. In that case the file exists only on your hard disk and the container field stores only the path to it.

If you want to strip the DOCTYPE declaration from the XML before you import it, then your process should follow these steps:

1. Insert the file using the Insert from URL[] script step into a variable;
2. Remove the DOCTYPE declaration;
3. Write the result to a file in the temporary folder;
4. Import the file.

There are several options for performing steps #2 and #3 which I won't go into now.

34 minutes ago, Quito said:
"If every time an XML needs to be processed the Desktop gets a copy, then pretty soon the Desktop will get madly cluttered with XML files that then need to be deleted. As I do not know the path to the user's Desktop, I don't think I can delete the XML files by script after processing occurs."

This is solved easily by using the temporary folder instead. Note that if you wanted, you could just overwrite the same file every time. But it is not necessary.

Edited May 9, 2024 by comment
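To illustrate step 2, a bare-bones calculation along these lines should do it (untested; this assumes the DOCTYPE declaration is always the second line of the downloaded text and that $XML holds that text):

Let ( [ total = ValueCount ( $XML ) ] ;
  GetValue ( $XML ; 1 ) & ¶ &          // keep line 1 (the XML declaration)
  RightValues ( $XML ; total - 2 )     // keep everything after line 2 (the DOCTYPE)
)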
Quito (Author) Posted May 9, 2024

OK, so POE.ai provided the following script, based on your reply:

# Define script variables
Set Variable [ $url ; "https://example.com/file.txt" ]
Set Variable [ $tempFolder ; Get ( TemporaryPath ) ]
Set Variable [ $tempFilePath ; $tempFolder & "temp.txt" ]
# Insert file from URL into a variable
Insert from URL [ Select ; $url ; $tempFilePath ]
# Remove second line from the text
Set Variable [ $text ; Substitute ( $text ; ¶ & GetValue ( $text ; 2 ) & ¶ ; ¶ ) ]
# Write modified text to a temporary file
Set Variable [ $fileHandle ; Open for Write ( $tempFilePath ) ]
If [ $fileHandle ≠ "" ]
Set Variable [ $writeResult ; Write to File ( $fileHandle ; $text ) ]
Close File [ $fileHandle ]
End If
# Import the temporary file
Import Records [ With dialog: Off ; "$tempFilePath" ]

-------------

I don't expect it to work as is, but do you notice anything overtly wrong with any step in particular?
comment Posted May 9, 2024

15 minutes ago, Quito said:
"do you notice anything overtly wrong with any step in particular?"

Just about everything. Let's not do this.
Quito (Author) Posted May 30, 2024

OK, so it's taken me 20 days to progress to a promising, yet non-working script.

Inside the first line is:

Choose ( Abs ( Get ( SystemPlatform ) ) - 1 ;
/*MAC OS X*/ Get ( TemporaryPath ) & "Pubmed.xml" ;
/*WINDOWS*/ "filewin:" & Get ( TemporaryPath ) & "Pubmed.xml"
)

cXML_source contains a Substitute that removes the second line from the XML (the DOCTYPE with the DTD).

Import fails with a [719] Error in transforming XML using XSL (from Xalan).
comment Posted May 30, 2024

50 minutes ago, Quito said:
"cXML_source contains a Substitute that removes the second line from the XML (the DOCTYPE with the DTD)."

We don't see this part, so we don't know if it does what you claim or causes a problem. Nor do we see the XSLT.

I would suspect that one or both files contain an XML declaration like:

<?xml version="1.0" encoding="UTF-8"?>

but since you are doing Export Field Contents for both, these files end up being encoded as UTF-16. But that's just a guess.
Quito (Author) Posted May 30, 2024 (edited)

Hi, Comment,

The contents of cXML_source are:

Substitute ( XML_source ;
[ "<!DOCTYPE PubmedArticleSet PUBLIC \"-//NLM//DTD PubMedArticle, 1st January 2024//EN\" \"https://dtd.nlm.nih.gov/ncbi/pubmed/out/pubmed_240101.dtd\">" ;
"<!-- <!DOCTYPE PubmedArticleSet PUBLIC \"-//NLM//DTD PubMedArticle, 1st January 2024//EN\" \"https://dtd.nlm.nih.gov/ncbi/pubmed/out/pubmed_240101.dtd\"> -->" ]
)

If I import the XML using the XSLT from File > Import Records > XML Data Source, it performs flawlessly. So it has to be something wrong with the script.

The XML contains the following declaration:

<?xml version="1.0" ?>

Your assumption is correct. The XSLT states the "utf-8" encoding, twice. AFAIK, "utf-16" is necessary for Asian languages, but otherwise I don't understand the implications, nor how to correct the script. Maybe I have to force the use of "utf-8" during the download?

Edited May 30, 2024 by Quito
Quito (Author) Posted May 30, 2024 (edited)

Solved it: in GetContainerAttribute, I was writing a specific name for "filename" in some portions of the script. I noticed it when the script went through with "filename", yet failed with the specific name.

Thanks, Comment. After at least 7 years, the PubMed en español project is finally ready for use on macOS. Will be testing it shortly on Windows and on the server.

Best regards,
Daniel

Edited May 30, 2024 by Quito
comment Posted May 30, 2024

I think you could make this significantly simpler by using variables instead of fields.

35 minutes ago, Quito said:
"Will be testing it shortly on Windows and on the server."

It won't work in a server-side script because you are using Export Field Contents. You should be writing to a data file instead. This was also already mentioned in the previous thread.
Quito (Author) Posted May 30, 2024

"I think you could make this significantly simpler by using variables instead of fields."

Please elaborate further.

"It won't work in a server-side script because you are using Export Field Contents. You should be writing to a data file instead. This was also already mentioned in the previous thread."

So, I tested it on Windows, made a few adjustments, and finally it's working on both macOS and Windows 11. Will work on the Write to Data File script now, and I'll open another topic if necessary. Although the software has always been intended to be used on a server, I had to see it working locally first.

Does Write to Data File work for both local and server use?

All the very best,
Daniel
comment Posted May 30, 2024

3 hours ago, Quito said:
"Please elaborate further."

I thought I already did in my 4 points above. To expand further, it would probably look something like this (pseudocode, untested):

# Download and pre-process the XML
Insert from URL [ $XML; "https://your.source.com/xml" ]
Set Variable [ $XML; RightValues ( $XML ; ValueCount ( $XML ) - 2 ) ]
# Write to file
Set Variable [ $filePath_XML; Get (TemporaryPath) & "source.xml" ]
Create Data File [ $filePath_XML ]
Open Data File [ $filePath_XML ; Target: $dataFile_XML ]
Write to Data File [ File ID: $dataFile_XML ; Data source: $XML ]
Close Data File [ File ID: $dataFile_XML ]

for the XML part. For the XSLT, I would use the Insert Text[] step to store the XSLT in the script itself as a $XSLT variable, then write it to file using the same method as the XML:

Set Variable [ $filePath_XSLT; Get (TemporaryPath) & "stylesheet.xml" ]
Create Data File [ $filePath_XSLT ]
Open Data File [ $filePath_XSLT ; Target: $dataFile_XSLT ]
Write to Data File [ File ID: $dataFile_XSLT ; Data source: $XSLT ]
Close Data File [ File ID: $dataFile_XSLT ]

Now you can do:

Import Records [ $filePath_XML; $filePath_XSLT ]

3 hours ago, Quito said:
"Does Write to Data File work for both local and server use?"

Yes. FYI, every script step's help page shows a compatibility table. In addition, you can see which script steps are server-compatible by selecting "Server" from the compatibility menu in the top right corner of the Script Workspace.
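The Insert Text[] part would look roughly like this, with your actual stylesheet pasted in place of the ellipsis:

Insert Text [ Select ; Target: $XSLT ; "<?xml version="1.0" encoding="UTF-8"?><xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"> … </xsl:stylesheet>" ]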
Quito (Author) Posted June 15, 2024

Hi, @comment,

Write to Data File has replaced what was scripted previously. I am getting a 300 error ("File is locked or in use"; because the file is open?). I've checked around the forums but cannot find a way to fix it. Is something missing in the script? I'm adding a screenshot.

All the very best,
Daniel
comment Posted June 15, 2024

I don't know, because I am not able to reproduce the problem. See what happens if you pause the script for a second before trying to open the data file. And you certainly must close the data file after writing to it, before you attempt to use it in the import.
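Schematically, something along these lines (untested; same placeholder names as my earlier sketch):

# give the OS a moment to finish creating the file, if that is indeed the issue
Pause/Resume Script [ Duration (seconds): 1 ]
Open Data File [ $filePath_XML ; Target: $dataFile_XML ]
Write to Data File [ File ID: $dataFile_XML ; Data source: $XML ]
# the data file must be closed before Import Records can read it
Close Data File [ File ID: $dataFile_XML ]
Import Records [ $filePath_XML ; $filePath_XSLT ]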
Quito (Author) Posted July 12, 2024

Hi, Comment,

Following your advice, after making the script a bit more legible and adding Close Data File, the script progressed successfully:

-----------
Insert from URL [ Select; With dialog: Off; Target: $Pubmedxml ; "https://eutils.ncbi.nlm.nih.gov/...
Set Variable [ $Pubmedxml; Value: RightValues ( $Pubmedxml ; ValueCount ( $Pubmedxml ) - 2 ) ]
Set Variable [ $filePath_XML; Value: Get (DesktopPath) & "pubmed.xml" ]
Get File Exists [ "$filePath_XML" ; Target: $fileExists ]
If [ not $fileExists ]
Create Data File [ "$filePath_XML" ; Create folders: Off ]
End If
Open Data File [ "$filePath_XML" ; Target: $dataFile_XML ]
Show Custom Dialog [ "File ID" ; "File ID for " & $filePath_XML & ": " & $dataFile_XML ]
Write to Data File [ File ID: $dataFile_XML ; Data source: $Pubmedxml ; Write as: UTF-8 ]
Close Data File [ File ID: $dataFile_XML ]
---------

In time, the DesktopPath will change to TemporaryPath. Now I'm getting a 719 error (Error in transforming XML using XSL) when parsing the second stylesheet, but that's for another topic.

Thank you sooo much and,
All the very best,
Daniel
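P.S. For completeness, the import that now follows this block (and where the 719 error appears) is essentially the last step of your earlier sketch, with $filePath_XSLT written to disk the same way as $filePath_XML (pseudocode shorthand; the real step is Import Records with an XML data source and an XSL style sheet from file):

Set Variable [ $filePath_XSLT ; Value: Get (DesktopPath) & "stylesheet.xml" ]
# ...Create / Open / Write to / Close Data File for $filePath_XSLT, as above...
Import Records [ With dialog: Off ; $filePath_XML ; $filePath_XSLT ]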