Jump to content
Joost Miltenburg

better way to go through data

Recommended Posts

Posted (edited)

LS,

There must be a better way to go through the data I receive from a webpage. It returns a list of comma separated emailaddresses about 33k.Currently I am doing it like this.( See below). It takes about 4 minutes. I can move it to FMServer. On FMServer it takes 2:20 for 32114 records. So I'll stick to the server.

I using FM15 at this client. Also multiple plugins are available to me, like MonkeyBread 7.3

Kind regards,

----

Freeze Window
Delete All Records [ With dialog: Off ]
New Record/Request
Set Variable [ $i ; Value: 0 ]

Insert from URL [ Select ; With dialog: On ; tester::g_text_field ; "url left out for privacy issues.php" ]
Set Variable [ $emailAddresses ; Value: Substitute ( tester::g_text_field ; "," ; "¶" )
] Set Variable [ $emailAdressesCount ; Value: ValueCount ( $emailAddresses ) ]
Set Field [ tester::g_text_field ; "" ]

Set Variable [ $tsStart ; Value: Get (CurrentTimestamp) ]
Loop     Set Variable [ $i ; Value: GetAsNumber ($i) + 1 ]     
Set Field [ tester::emailaddress ; GetValue ($emailAddresses; GetAsNumber ( $i )) ]     
Exit Loop If [ GetAsNumber ($i)>GetAsNumber ($emailAdressesCount) ]     
New Record/Request
End Loop

Set Variable [ $tsEnd ; Value: Get (CurrentTimestamp) ]
Set Variable [ $$duration ; Value: $tsEnd - $tsStart ]

 

Edited by Joost Miltenburg
update

Share this post


Link to post
Share on other sites
comment    1,392
Posted (edited)

Would it be faster if you export them to a file (as a return-separated list), then import it?

P.S. Does your data provider offer other formats?

 

 

Edited by comment

Share this post


Link to post
Share on other sites

Thnx !

I can check... Exporting to a return seperated list could be intresting. I'll try that!

Probably some JSON object would be possible, maybe xml, but no xml-result, but I could do that myself via xslt. 

Share this post


Link to post
Share on other sites
comment    1,392

IMHO, if you can get XML go for it, because it can be imported directly from the web (with suitable XSLT) without all these manipulations.

 

Share this post


Link to post
Share on other sites
Wim Decorte    456

Or json if you are using FM16 and that is an available format.  Parsing json is fast and easy and probably easier to pick up than XSLT if you are not already proficient in xslt.

Share this post


Link to post
Share on other sites

Thnx Wim,

But I am proficient enough with xslt to tackle this and the client isn't on FM16 yet. Also, I have yet to see a convincing JSON Demo on this, but if you could point me to one...

Share this post


Link to post
Share on other sites
comment    1,392
7 minutes ago, Wim Decorte said:

Parsing json is fast and easy

I doubt it's any faster or easier than parsing a comma-separated list. It's still a matter of downloading the document, parsing one value at a time, creating a new record, populating field/s ... Unlike XML that can be just imported. 

Share this post


Link to post
Share on other sites
Wim Decorte    456
22 minutes ago, Joost Miltenburg said:

Also, I have yet to see a convincing JSON Demo on this, but if you could point me to one...

 

What specifically would you want to be convinced of?  Perhaps I can point you to some demos.

14 minutes ago, comment said:

I doubt it's any faster or easier than parsing a comma-separated list. It's still a matter of downloading the document, parsing one value at a time, creating a new record, populating field/s ... Unlike XML that can be just imported. 

'Easier' is often and largely a matter of proficiency.  Joost indicated that he is proficient at XSLT so importing from XML is a good fit.  For many others it is not.

Few things are as fast as importing but we do have to consider the whole mechanics.  If 'insert from URL' can give you the JSON and you can parse it right there from a variable instead of having to output something like XML to disk and potentially also output the XSLT to disk before importing then perhaps just parsing the JSON could be faster.  Just highlighting options.

Share this post


Link to post
Share on other sites
comment    1,392
8 minutes ago, Wim Decorte said:

'Easier' is often and largely a matter of proficiency.  Joost indicated that he is proficient at XSLT so importing from XML is a good fit.  For many others it is not.

I am afraid we are not on the same page. I was comparing what OP does at the moment  - parsing a comma-separated list - with parsing JSON. There is no advantage to JSON in this comparison that I can see. There is a huge advantage to XML, because you do not have to output it to disk before importing it. You can import it directly from the URL (I think this is the third time I've said this in this thread).

As for skills: no one is (yet) proficient in parsing JSON. If you need to spend time learning one or the other, I recommend spending it on XSLT.

 

Share this post


Link to post
Share on other sites
Wim Decorte    456
1 minute ago, comment said:

As for skills: no one is (yet) proficient in parsing JSON. If you need to spend time learning one or the other, I recommend spending it on XSLT.

 

 

We differ on that point.  Plenty of people are already proficient in parsing JSON.  Heck, most of guys are and some of them presented on it at Devcon.  And I would probably direct people towards learning JSON parsing over XSLT any place where the two formats are available as outputs.  That would obviously just be a generalization, the task at hand and the skills at hand would dictate what we use in what scenario.  We try to have all the tools in our tool belt.

Share this post


Link to post
Share on other sites
comment    1,392
2 minutes ago, Wim Decorte said:

Plenty of people are already proficient in parsing JSON.  Heck, most of guys are

Guys?

Share this post


Link to post
Share on other sites

Both tools can be useful in this case. XML/XSLT is going to be my choice for this case. Importing with xslt seems faster than parsing and populating record after record and field after field.

Share this post


Link to post
Share on other sites
Wim Decorte    456
33 minutes ago, comment said:

Guys?

my guys

Share this post


Link to post
Share on other sites
comment    1,392

I presume you and your guys have had an opportunity to play with this before the rest of us. In the short time since version 16 came out, I haven't had a real need to process JSON input yet, and I suspect that's true for most of the community. So no advantage to JSON there, and a huge advantage to XML being directly importable. The only reason to learn JSON is the growing number of services that provide no XML alternative. 

Share this post


Link to post
Share on other sites

I agree with comment here. I spent a bit of time learning xml in the FM6 days. Thank you Beverly! Some of these new webservices just do JSON. 

JSON may be maturing, but xml, with stuff like XSD and all the other stuff is pretty useful. Not to mention, xml is pretty well human readable.

Share this post


Link to post
Share on other sites
Mike Duncan    14

If you have to reformat the data and handle it anyway, why not use a simpler format like CSV or Merge and skip XML or JSON? I have had issues in the past with arbitrarily large XML when importing into FM, both slow and on occasion crashing. I would avoid invoking xalan and xerces for processing and parsing XML unless that is really needed. I think they work best for smaller XML files.

I would also suspect arbitrarily large JSON would exhibit performance issues as well. When I have had to deal with lots of record as fast as possible, it was to grab data via a web viewer (or insert from URL) then export out to the file system and import directly to a table. Seems like this would be the fastest for you as well, if there are consistently a lot of records.

Share this post


Link to post
Share on other sites
jbante    141

XSLT seems to me like it would probably be the fastest. Exporting the CSV to disc and importing can also be advantageous if your schema is already a convenient match for that or you're willing to adapt your schema for that.

The built-in JSON functions have many virtues. Execution speed is not one of them. In every test I've run, GetValue on return-delimited lists leaves the built-in JSON functions in the dust. The MBS plug-in handles JSON differently in a way that makes it much faster (MBS does what I was hoping FileMaker would do with an in-memory data structure), if you're into plug-ins.

If you like sticking with return-delimited lists, you can make the processing of a large list go faster by pruning the list as you go:

Set Variable [ $value ; GetValue ( $list ; 1 ) ]

Set Variable [ $list ; Replace ( $list ; 1 ; Length ( $value ) + 1 ; "" )]

This makes the parsing of the whole list work in roughly O ( N^1.5 ) time, rather than O ( N^2 ).

  • Like 1

Share this post


Link to post
Share on other sites
comment    1,392
Posted (edited)
1 hour ago, Mike Duncan said:

If you have to reformat the data and handle it anyway, why not use a simpler format like CSV or Merge and skip XML or JSON?

No one suggested reformatting the data to XML or to JSON. When the data is available as XML or as JSON directly from the data provider, there is no need to reformat it: it can be consumed as is.

 

23 minutes ago, jbante said:

MBS does what I was hoping FileMaker would do with an in-memory data structure

I was hoping they would switch to XSLT 3.0 which can accept JSON as the input. Then you could import it as you do XML and not have to parse it yourself so tediously.

 

 

Edited by comment

Share this post


Link to post
Share on other sites

So, a little update. I created the xsl in about 10 minutes, but that is skill.

Now for the import via import records, type xml using the xsl. The import took about 5 seconds for a slightly larger data set.

 

So JSON with parsing vs importing with xml/xsl isn't in the same ballpark.
I would even go so far as to call it a different game.

Edited by Joost Miltenburg
grammer

Share this post


Link to post
Share on other sites
BruceR    154

Different game. OK; how so? You do not seem to have actually revealed the performance stats for parsing.

XML import took 5 seconds.

JSON parsing was; somewhere wandering a different ballpark. Buying hotdogs with Applepay.

Specifically, completion of JSON parsing took .... ?

Share this post


Link to post
Share on other sites

In my original post I mentioned that an insert from URL and then parsing the whole thing takes about 4 minutes on the client. Importing with xml/xsl takes me about 5 seconds to get the exact same result

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now


×

Important Information

By using this site, you agree to our Terms of Use.