March 6, 201114 yr I have a Person table and a Payroll table, and they are related by a numeric ID number. The Payroll table has entries only when a paycheck is issued (once per month, on the 1st). I need to calculate, for each person, the # of months since the last paycheck. This is in a very large database (nearly 100,000 people with several million payroll records) so performance is critical. Question: what's the most efficient way to do this calculation? * Using an un-sorted relationship based on Person::ID = Payroll::ID, I could use Max(Payroll to Person::Date) However, logically this requires FM to access every related record to figure out which is the highest value. * Or, I could use a sorted relationship (Person::ID = Payroll::ID, sort by Date [Descending], in which case Payroll to Person::Date will give me the most recent date. However, logically this requires FM to sort every related record to figure the highest value -- sorts in FM seem fairly slow. As it turns out, the records in the Payroll table always arrive in chronological order, so in theory I could use something like this to simply grab the last record (using an un-sorted relationship) * Let ( dates=List(Payroll:Date) ; RightValues(dates;1) ) I'd appreciate real-world advice/experience if anyone has benchmarked these methods and found one to be better than the other, or if there are other techniques that are even faster? Also, this is in FMServerAdvanced 9v3 on Mac. EDIT: typos, clarification.
March 6, 201114 yr As it turns out, the records in the Payroll table always arrive in chronological order How about: Last ( Payroll::Date )
March 6, 201114 yr Author How about: Last ( Payroll::Date ) Thanks : I didn't know about that Last() function. I will experiment with this. It turns out that there are some cases where the payroll data is out-of-order (I had forgotten this) so I can't use this in every case, but in those cases where I can, I suspect Last() will be much faster than using Max() or the sorted relationship...
March 6, 201114 yr Disclaimer: I have not tested this and I know practically nothing about the inner workings of FMP. The assumption that Max() would be slower than Last() may seem reasonable - but over the years I've seen even more reasonable assumptions refuted by actual tests.
March 6, 201114 yr Author Disclaimer: I have not tested this and I know practically nothing about the inner workings of FMP. The assumption that Max() would be slower than Last() may seem reasonable - but over the years I've seen even more reasonable assumptions refuted by actual tests. I agree 100% -- some things that should be faster aren't. Here's some actual benchmarking data. The scenario is as described. The test is using a "replace all" to set a numeric field with a calculated result. 1,000 records. All times run 3x in a row and the final result taken (so that all data is cached in memory). Time Calculation ------------------------------------------------------------------------------- 8.5 Person to Payroll:Date (relationship by ID, sorted on date reverse) 5.6 Max(Person to Payroll::Date) -- relationship by ID only, not sorted 4.3 Last(Person to Payroll::Date) -- relationship by ID only, not sorted So, it does appear that the speeds make sense -- Last() is faster than Max() which is faster than getting the first record of a reverse-sorted date relationship. What's really interesting is the speed difference on repeating the runs. The first time I ran the test (when none of the records were in the cache) it took over 110 seconds, or 15 times slower. So I think in my case, the problem is not so much sorting or processing the records, as it is simply reading the records from disk. I know this server needs more ram and a faster disk, so that may be my next step rather than mucking around with the calculations...
March 6, 201114 yr What's really interesting is the speed difference on repeating the runs. The first time I ran the test (when none of the records were in the cache) it took over 110 seconds, or 15 times slower. That's the problem with these tests: you have to be really careful about what exactly are you testing. Indexing and caching can be very significant factors here. On the same note, it is my personal opinion that Anchor/Buoy (which you appear to be using) can also detract from performance - see: http://fmforums.com/forum/topic/56132-under-the-hood-question/page__p__265221#entry265221 http://fmforums.com/forum/topic/61041-relationship-graph-valuelists-by-groups-like-scriptmaker/page__view__findpost__p__288582
Create an account or sign in to comment