Jump to content
madmac33

How to average a subquery

Recommended Posts

Hi,

i am a novice when it comes to anything more than basic SQL and am stuck trying to get an average (using SQL) of grouped query result.

My first query is:

SELECT SUM(diffDateTime) as theDiffSum 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1' 
GROUP BY ows_ID 
HAVING theDiffSum>99
ORDER BY theDiffSum ASC

this gives me a result like:
100
120
123
123
290

Ideally, i would like to form a query on this result set to give an average - a single number, using SQL (not FMP functions).

I've looked around but have been unable to find anything that works (or understand).

e.g.

SELECT AVG(
SELECT SUM(diffDateTime) as theDiffSum 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1' 
GROUP BY ows_ID 
HAVING theDiffSum>99
ORDER BY theDiffSum ASC) as test

Any ideas would be greatly appreciated - it is really screwing with my brain (or what's left of it?

Share this post


Link to post
Share on other sites

Could you not just replace the SUM with AVG like this :-

SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

 

Share this post


Link to post
Share on other sites
2 minutes ago, rwoods said:

Could you not just replace the SUM with AVG like this :-


SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

 

Unfortunately not. I need the average of the SUMs.

e.g.

1. the AVG of the SUMs (100,120,123,123,290) = 756 / 5 = 151.2

2. the AVG of the diffDateTime could be ((10,20,30,40), (10,20,20,30,40), (10,20,30,40,23), (10,20,30,63), (10,20,30,40,50,50,90), = 756 / 25 = 30.24

 

Share this post


Link to post
Share on other sites

What is the reason you would prefer to do this with SQL instead of a function in FileMaker's calculation engine? You could easily use a (non-SQL) custom function to calculate an average from the return-delimited list result of your first query.

Edited by jbante

Share this post


Link to post
Share on other sites

I currently have a custom function doing this, but it seems a fiddly workaround and inefficient for this purpose as eventually I'll be iteratively calling this SQL (or write a more complex call if I can work it out). 

Share this post


Link to post
Share on other sites

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

Share this post


Link to post
Share on other sites
4 hours ago, jbante said:

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

I'm drawing on a database of 100k+ records.

  1. First Query - retrieves up to 1,000 results (each is a SUM of up to 1-1,000 records - GROUPS of records)
    SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID ORDER BY sumT ASC
    Grouped Data = (1,2,3) (10,2,3,5) (23,2) (10,90)
    >> SUMS = 6, 20, 25, 100  (could be up to a 1,000 results)
     
  2. Next I calculate quartile info - Q1, Q3, IQR, Upper/Lower limits ($Upper might be 45)
     
  3. Second Query (orig post) - similar to First Query but limiting the SUMS by the HAVING clause (45) using UpperLimit from quartile calculations)
    SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID HAVING sumT>$Upper   ORDER BY sumT ASC
    >> SUMS = 6, 20, 25, 100
     
  4. Now I calculate the adjusted AVERAGE (= 17) of the Second Query resulting SUMS = 6, 20, 25 (with custom function)
    >> AVG = 17

 

Steps 2/3/4 could be combined into a custom function (is there a simple function that returns a list from a list based on a maximum value  - any ideas?)
i.e. loop through sorted SUMS (step 1), total SUMS up to $Upper (45 - step 2), then average the total (step 4)


Just thought it might be faster to ExecuteSQL >>> AVG( Step 3 query ) - to get the single number output.

The above sequence to be performed approx 3,640 times to produce various graphs for analysis (each week). 

I will also try to produce an SQL using  clauses and subqueries to simplify:
e.g. from Step 1
>> SUMS = ABC   |    1   |   6, 20, 25, 100 
>> SUMS = ABC   |    2   |   16, 20, 25, 33 
>> SUMS = XYZ   |    1   |    20, 20, 25 
>> SUMS = XYZ   |    2   |    6, 20, 25, 100, 200 
>> SUMS = QJZ   |    1   |    1, 11, 22, 33, 55, 77, 200, ... 13600
>> SUMS = QJZ   |    2   |    11, 88

Hope this give a better picture to what i'm trying to accomplish. Any help/suggestions welcome :)

Share this post


Link to post
Share on other sites

Since your result from step 1 is sorted, you can get your result from step 3 without making another ExecuteSQL call. You could do a binary search on the result from step 1 to find the last value within your upper limit, and use LeftValues ( $step1Result ; $positionOfLastValueToInclude ). This may or may not be faster; try it, measure the time, and use the faster approach.

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×

Important Information

By using this site, you agree to our Terms of Use.