bdam Posted March 13, 2009 Posted March 13, 2009 I'm wondering if anyone can shed a light on something for me regarding the performance of summary fields. I'm working on a large data set, currently 1.6 million records and will ultimately balloon beyond 16 million when data for all years is added. Each year will bring on 1.4 to 1.6 million new records. If I use four summary fields to calculate the total quantity, value, tax, and cost sold on the current set of 1.6 million the first time I do this it takes about 10 minutes of which the first summary field takes the most time. Subsequent summaries take a fraction of this time even if I go back to showing all the records. If I close FM and do it again I'm back to 10 minutes. So what is happening the first time that makes the second go so much faster? Is there a way to preserve that beyond a single session? Client: 8.5 and 9 Server: 8 Advanced on Windows 2003 All four fields being summarized are indexed
Søren Dyhr Posted March 13, 2009 Posted March 13, 2009 My take here is that the use of summary fields in a spreadsheet'ish fashion is a wrong approach, summary fields belong to dedicated subsummary reports derived by a scripts interaction establishing a found set. If an instant figure of the consequence of a posting in a ledger, is attempted - must it be established much more Paco di Pacioli'ish using lookups and redundant storage of data: http://fmforums.com/forum/showpost.php?post/312277/ ...so all in all is you problem, expectations originating from another tool - which realm is entirely different ... you have hit a FAQ! --sd
bdam Posted March 13, 2009 Author Posted March 13, 2009 Most of the time, these fields will be used as (sub)summaries for a found set although those sets and their groups are going to be large in and of themselves. For example, the total of all sales in New Jersey is going to be many thousands of records. I'm not complaining about the time it takes, I'm just wondering why the second time around it's really fast. >expectations originating from another tool - which realm is entirely different We're getting off topic but Filemaker.com's header is 'Database Software'. That, to me, means that comparing Filemaker to other database software is valid. If summarizing data in a straight-forward manner isn't in the realm of database software then what exactly is? Let's call a spade a spade: if Filemaker does something poorly then just say so and don't try to make excuses.
gdurniak Posted March 14, 2009 Posted March 14, 2009 it's faster the 2nd time, because FileMaker caches it's results, both locally, and on the server. The local cache is cleared, with every new session. Unfortunately, FileMaker is not designed for number crunching, so summaries are sluggish Make sure all your number fields are "stored". Calculations on Related Values are the worst. greg
Søren Dyhr Posted March 14, 2009 Posted March 14, 2009 Unfortunately, FileMaker is not designed for number crunching, so summaries are sluggish This is exactly the point i wish to convey - vessel of meaning via storage of data is more the name of the game. Given as an attempt to answer this: If summarizing data in a straight-forward manner isn't in the realm of database software then what exactly is? It can even be questioned if Filemaker then really adhere to the term database at all. Interface building has nothing to do with the being a database. I'm with Ernest Koe when he writes: In spite of the popular misnomer, FileMaker Pro is not a "database"; at least, not in the strictest sense of the word. It has an embedded database engine wrapped in an easily accessible interface that together make it a terrific tool based on a database, but calling FileMaker Pro a database misrepresents its real value. ....snipped boldly from: http://www.proofgroup.com/articles/2006/jun/filemakery_part_i You could then claim that Ryan Rosenberg either not knows what he's selling, or deliberately are giving us porkies to which we could do nothing, other than shrug pragmatically. Unfortunately can't some kind of obstacles in notions be eliminated completely - but getting the gist of a tool has a lot to do with it ... which deliberate or ignorant attempts of tearing the tool out of it's realm ....obviously (to me) can't be fully pre-emptively covered. --sd
LaRetta Posted March 14, 2009 Posted March 14, 2009 Hi bdam, I agree with you on this one. With the numbers you are running, you need true database speed. I would suggest that you include MySQL and Navicat in the mix. Use FileMaker for rapid development and UI and use separation (data in its own FM file). Then, when you have very heavy requirements such as this, let MySQL (its open source) handle it. Navicat (costs I think $30) is the easiest program I have EVER used (there is simply NO learning curve) and it eliminates the command line. You can get up to speed and begin producing lightening fast results in a few hours.
Søren Dyhr Posted March 14, 2009 Posted March 14, 2009 Neither am I directly disagreeing here, but this was what caught me: That, to me, means that comparing Filemaker to other database software is valid Because there is much more to the package than positioning it fair a squarely in the core of the term database as such, trade off's exist which craft-fully must be circumvented instead of being taken for granted. ESS'ing was introduced since Filemaker's philosophy fit's into the owner Apple's strategy: To develop around the actual use of the tool instead of being victimised by not quite being reaching demands from technology. Neither Apple products nor Filemaker recieves much thumbs ups from .... well read this: http://youritgovernancecoach.typepad.com/governance/2007/06/lessons_in_inno.html --sd
bdam Posted March 17, 2009 Author Posted March 17, 2009 Thanks for the thoughtful replies. I can appreciate that FM can't be everything to everyone; but if they do something poorly it would be nice if that was just said outright. Telling people that they have unreasonable expectations doesn't seem fair in light of FM's own marketing. They have a separate server product after all; how can people not expect it to do well at number crunching? What is a dedicated server for if not hosting and processing data? LaRetta, if I read into your comment you suggest using FM as the UI to a MySQL database. If I'm misunderstanding, my apologies. Have you, or anyone watching, done this and achieved performance benefits? It appears to me that ESS is still built upon ODBC which doesn't handle large datasets particularly well either.
Søren Dyhr Posted March 17, 2009 Posted March 17, 2009 Telling people that they have unreasonable expectations doesn't seem fair in light of FM's own marketing. I'm not designing Ryan Rosenbergs campaigns, I'm merely suggesting you apply the Lookup Last approach in stead of expecting summaries could do what ever you throw at them... you can not apply the Ochams Razor approach just picking the first method you encounter because it appears to be the simplest - pretty much the same with repeating fields vs portals and relations. Just as a sculpturer might pick a different chisel than you might expect him/her to do, is there how unfortunate as it might seem tool specific methods to everything ... here is it in the past said that filemaker is 80% circumvention and barely 20% straight forward programming - but the angle is then that it does something else than... not making the choice entirely daft. There are (as with everything) drawbacks to the approach since exclusion of a record here and there screws up the summing ... here should Pacioli's approach of postings in ink, be observed - So if an error occurs, should the posting be countered to eliminate the works of it. I hope you can see that getting the previous records recorded sum and adding the new postings figure to it ...is utterly ignorant to the size of the fund set. The field to show is best obtained by an extra relation sorting opposite of the postings which would make the last the first. --sd
mr_vodka Posted March 18, 2009 Posted March 18, 2009 We're getting off topic but Filemaker.com's header is 'Database Software'. That, to me, means that comparing Filemaker to other database software is valid. If summarizing data in a straight-forward manner isn't in the realm of database software then what exactly is? Let's call a spade a spade: if Filemaker does something poorly then just say so and don't try to make excuses. Yes but you can get a full size spade tool or you can get a small hand spade tool. In both cases they are indeed spades but one can move a lot more dirt. FileMaker is great for lots of things but when it comes to millions of records and then trying to summarize them, it will take a while. Then question that should be asked here is why you need to summarize 3 million records on a regular basis. Perhaps you can store the summaries for each year into another table or look into archiving some old data. As for using mySQL in conjunction with FileMaker, although it was not the intended purpose of using FileMaker strictly as a UI to an external database, you can take advantage of a hybrid solution. You can create a view in mySQL that will have the totals summed up and then just have FileMaker's ESS display that view. Even using mySQL though, you probably will find that it will take asome time to sum up 3 million records. You can create a stored procedure to save those sums into another table nightly if you wish but that would pretty much be the same as saving it in another table in FileMaker as I stated earlier.
bdam Posted March 19, 2009 Author Posted March 19, 2009 To re-iterate: I wasn't complaining about the time it took to calculate although I don't think it should be excused. I grasp that 'big sets are big' and usually we won't do the whole thing. My primary interest is what was going on with the subsequent queries, why they were fast, and if there was something I could do so attain that speed from the get-go. Gdurniak answered those questions succinctly and I thank him for it. My rant is more about accusing users of 'doing it wrong' when it's just something that FM is deficient at. I can appreciate that FM isn't great at calculating summaries but to say that summary fields aren't _meant_ to summarize large sets just doesn't past muster. It's the difference between "FM isn't good at that, here's a workaround" and "so all in all [it] is you[r] problem, expectations originating from another tool" suggesting that the user has unrealistic expectations. Then question that should be asked here is why you need to summarize 3 million records on a regular basis. That is a good question. I'm working on a solution for our sales division that will provide better access to our sales data. The database will essentially mirror the transaction table which lists every line item on every order. That gets to be a lot of records pretty quickly. Each year is approx 1.4 millions records and we've been keeping data since 1995. So something simple like company-wide sales will include a lot of records. It's true, we could do some back-end summary processing and that's something I've considered. However, there are a hundred and one different ways our sales division is going to want to slice this so going that route changes the project from a simple import and UI thing into a full blow data warehouse project. You're right to suggest it, it's probably the way it has to be done, but I just wanted to see if there was some performance boost I was missing first.
Søren Dyhr Posted March 19, 2009 Posted March 19, 2009 but I just wanted to see if there was some performance boost I was missing first. All right I infuriated you with my reply - which I regret it did, but did you give the Lookup Last approach a stab, my take it's much much faster than any aggregate or summary function, since it has been making sense ever since year 1290 here in Europe, whenever the arabs began approaching the situation this way ...should be unsaid - but it's way before Pacioli? It could be the frustration over "Not invented here" that are haunting here ... but how unfortunate it then might be, is a lot of our everyday gestalts invented elsewhere and often brought to new life in hybrid cultures. Take the mixture your language consist off... 40% Saxon 40% French 10 % Danish and 10% Latin.... it would make no sense to insist on only using the saxon terms for things, what if they would prove insufficient. --sd
gdurniak Posted March 25, 2009 Posted March 25, 2009 number crunching on SQL is over 1000x faster, so we keep our numeric data on Sybase SQL Anywhere, then query it from FileMaker, to display totals Since the data is entered once, and not edited, this works well just as we used to have the ability to "store compatible graphic", I would love to see "store compatible number", to allow for faster totals all FileMaker data is stored as text greg
mr_vodka Posted March 25, 2009 Posted March 25, 2009 Yes it would be nice to specify precise data types in FileMaker, but there are some great advantages to not having those strict restrictions as well.
Recommended Posts
This topic is 5721 days old. Please don't post here. Open a new topic instead.
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now