Efficiency of Calculations - cross referencing?

McCormick · August 26, 2003

Let's say I'm working with lots of dates, and for various reasons, I need to use lots of different manipulations of those dates: Week, Month, Next Week, Last Week, etc.

I have two choices:

This_Week = WeekofYear(Date)

Next_Week = (WeekofYear(Date) + 1)

or

Next_Week = (This_Week + 1)

In other words, referring to existing calcs or starting fresh. For a big database with lots of files, which is the more efficient approach to take?

Chuck · August 29, 2003

Honestly, I don't know which is more efficient. If you haven't done so already, I would test both methods, perhaps using a looping script to reference the data in each type. My guess would be that the first method (calculating Next_Week directly from Date) would be faster, since I'm assuming that these fields are unstored, and evaluating the second one would take a bit longer because it has to evaluate two unstored fields.
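A minimal sketch of what such a comparison loop might look like, written in Python purely as an illustration. It is not FileMaker script, and its timings say nothing about how FileMaker evaluates stored or unstored calculations. The helper names (week_of_year, fresh, referenced) are hypothetical, and week_of_year uses ISO week numbering as a stand-in, which may not match WeekofYear( ).

```python
import timeit
from datetime import date, timedelta


def week_of_year(d):
    # Stand-in for WeekofYear(Date); ISO week numbering is an assumption
    # and may differ from FileMaker's own week numbering.
    return d.isocalendar()[1]


def fresh(dates):
    # Each result starts from the date: WeekofYear is resolved twice per record.
    return [(week_of_year(d), week_of_year(d) + 1) for d in dates]


def referenced(dates):
    # Next_Week refers to the This_Week value that was already worked out.
    results = []
    for d in dates:
        this_week = week_of_year(d)
        results.append((this_week, this_week + 1))
    return results


# 1000 consecutive sample dates; purely made-up test data.
dates = [date(2003, 1, 1) + timedelta(days=i) for i in range(1000)]

if __name__ == "__main__":
    for fn in (fresh, referenced):
        seconds = timeit.timeit(lambda: fn(dates), number=200)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

Both functions return the same values, so only the timing differs. The real test would be the same idea run as a looping script against your own files.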

Chuck

CobaltSky · August 30, 2003

Hello McCormick,

In many cases, including the example you have cited, using a single calculation is more efficient if taken in isolation.

To illustrate this, in your example, with the single calc, the process for arriving at Next_Week involves four steps, viz:

1. Retrieve the value for date

2. Resolve the WeekofYear( ) function

3. Add 1

4. Write the result out to cache/disk

i.e. four calls to the CPU (not necessarily four CPU cycles, though, depending on the system and hardware architecture you're running on),

whereas the two-calc version requires:

1. Retrieve the value for date

2. Resolve the WeekofYear( ) function

3. Write the result of calc 1 out to cache/disk

and then

4. Retrieve the value for calc1

5. Add 1 to it

6. Write the result out to cache/disk

Now the plot thickens. If, as would seem to be the case, you are already having to calculate This_Week in its own right, then there will be an overall saving of one step by referencing This_Week within the expression for Next_Week, thus reducing it from four steps to three.

But while on the face of it this seems more efficient (albeit by only one step in this instance), it may not be so in practice. This is because many contemporary operating systems (and hardware platforms) provide the capacity to process more than one instruction simultaneously within each CPU cycle. The six-step procedure outlined above is one step fewer than the seven steps that would be required to calculate the two results independently, but its second three steps depend on the result of the first three and therefore cannot commence until the first three have concluded, thus forfeiting any system architecture advantages from simultaneous instruction handling.

So overall, in the case of this specific example, calculating the two results independently would be likely to be more efficient on most if not all system/hardware platforms.

However, in a case where the calc for This_Week was a much more convoluted one (e.g. involving 50 steps rather than three), the balance would tip in the other direction, as there would be a lot more steps to be saved by referencing This_Week within the expression for Next_Week.
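The step counting above can be put into a small Python sketch. It is an idealised model only: every step is assumed to cost the same, the two independent calcs are assumed to run fully side by side, and the function name step_counts is hypothetical. It does not describe how any particular database engine schedules its work.

```python
def step_counts(this_week_steps):
    """Compare the two approaches for a This_Week calc of the given length.

    "total" is the number of steps performed overall; "critical_path" is the
    longest run of steps that must happen one after another.
    """
    # Calculated fresh, Next_Week repeats This_Week's steps plus "add 1".
    next_week_fresh = this_week_steps + 1
    # Referencing This_Week: retrieve it, add 1, write the result.
    next_week_referenced = 3

    independent = {
        "total": this_week_steps + next_week_fresh,
        # Idealised assumption: the two calcs overlap completely.
        "critical_path": max(this_week_steps, next_week_fresh),
    }
    chained = {
        "total": this_week_steps + next_week_referenced,
        # Next_Week cannot start until This_Week has finished.
        "critical_path": this_week_steps + next_week_referenced,
    }
    return independent, chained


if __name__ == "__main__":
    for n in (3, 50):
        independent, chained = step_counts(n)
        print(n, independent, chained)
```

With three steps the totals are 7 against 6, as in the example. With 50 steps they are 101 against 53, which is where the saving from referencing This_Week starts to dominate unless the system really can overlap the two calcs.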

Irrespective of this, I would caution against putting in place long chains of calcs each of which depends on the preceding calc because each must then be processed in turn before the final result can be returned. Try to keep the tiers of the table of dependencies down to two or three - solutions which have dozens of layers of dependency invariably struggle.
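One way to check how many tiers a set of calcs has is to walk the references and count the layers. The following Python sketch assumes you have listed, by hand, which fields each calc refers to. The function name calc_tiers and the field Week_After_Next are hypothetical, and the sketch assumes there are no circular references.

```python
def calc_tiers(deps):
    """Return the dependency tier of each calc.

    deps maps a calc name to the list of fields it references. Any name that
    is not a key in deps is treated as a plain data field (tier 0).
    Assumes the references contain no cycles.
    """
    cache = {}

    def tier(name):
        if name not in deps:
            return 0  # plain field, nothing to evaluate first
        if name not in cache:
            cache[name] = 1 + max((tier(ref) for ref in deps[name]), default=0)
        return cache[name]

    return {name: tier(name) for name in deps}


# Each calc refers to the one before it: three tiers deep.
chained = {
    "This_Week": ["Date"],
    "Next_Week": ["This_Week"],
    "Week_After_Next": ["Next_Week"],
}

# Each calc starts fresh from Date: one tier only.
flat = {
    "This_Week": ["Date"],
    "Next_Week": ["Date"],
    "Week_After_Next": ["Date"],
}

if __name__ == "__main__":
    print(calc_tiers(chained))
    print(calc_tiers(flat))
```

The highest number in the result is the number of layers that have to be resolved in turn before the last calc can return a value.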

McCormick · September 2, 2003

Thanks, Ray - a very well explained summary.

And thanks, Chuck - I'll have to just test to find out the best answer for my situation.
