How to average a subquery

December 28, 20169 yr

Hi,

i am a novice when it comes to anything more than basic SQL and am stuck trying to get an average (using SQL) of grouped query result.

My first query is:

SELECT SUM(diffDateTime) as theDiffSum
FROM Briefs
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'
GROUP BY ows_ID
HAVING theDiffSum>99
ORDER BY theDiffSum ASC

this gives me a result like:
100
120
123
123
290

Ideally, i would like to form a query on this result set to give an average - a single number, using SQL (not FMP functions).

I've looked around but have been unable to find anything that works (or understand).

e.g.

SELECT AVG(
SELECT SUM(diffDateTime) as theDiffSum
FROM Briefs
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'
GROUP BY ows_ID
HAVING theDiffSum>99
ORDER BY theDiffSum ASC) as test

Any ideas would be greatly appreciated - it is really screwing with my brain (or what's left of it?

December 28, 20169 yr

Could you not just replace the SUM with AVG like this :-

SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

December 28, 20169 yr

Author

2 minutes ago, rwoods said:
Could you not just replace the SUM with AVG like this :-
SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

Unfortunately not. I need the average of the SUMs.

e.g.

1. the AVG of the SUMs (100,120,123,123,290) = 756 / 5 = 151.2

2. the AVG of the diffDateTime could be ((10,20,30,40), (10,20,20,30,40), (10,20,30,40,23), (10,20,30,63), (10,20,30,40,50,50,90), = 756 / 25 = 30.24

December 28, 20169 yr

What is the reason you would prefer to do this with SQL instead of a function in FileMaker's calculation engine? You could easily use a (non-SQL) custom function to calculate an average from the return-delimited list result of your first query.

Edited December 28, 20169 yr by jbante

December 28, 20169 yr

Author

I currently have a custom function doing this, but it seems a fiddly workaround and inefficient for this purpose as eventually I'll be iteratively calling this SQL (or write a more complex call if I can work it out).

December 29, 20169 yr

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

December 30, 20169 yr

Author

4 hours ago, jbante said:

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

I'm drawing on a database of 100k+ records.

First Query - retrieves up to 1,000 results (each is a SUM of up to 1-1,000 records - GROUPS of records)
SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID ORDER BY sumT ASC
Grouped Data = (1,2,3) (10,2,3,5) (23,2) (10,90)
>> SUMS = 6, 20, 25, 100 (could be up to a 1,000 results)
Next I calculate quartile info - Q1, Q3, IQR, Upper/Lower limits ($Upper might be 45)
Second Query (orig post) - similar to First Query but limiting the SUMS by the HAVING clause (45) using UpperLimit from quartile calculations)
SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID HAVING sumT>$Upper ORDER BY sumT ASC
>> SUMS = 6, 20, 25, ~~100~~
Now I calculate the adjusted AVERAGE (= 17) of the Second Query resulting SUMS = 6, 20, 25 (with custom function)
>> AVG = 17

Steps 2/3/4 could be combined into a custom function (is there a simple function that returns a list from a list based on a maximum value - any ideas?)
i.e. loop through sorted SUMS (step 1), total SUMS up to $Upper (45 - step 2), then average the total (step 4)

Just thought it might be faster to ExecuteSQL >>> AVG( Step 3 query ) - to get the single number output.

The above sequence to be performed approx 3,640 times to produce various graphs for analysis (each week).

I will also try to produce an SQL using clauses and subqueries to simplify:
e.g. from Step 1
>> SUMS = ABC | 1 | 6, 20, 25, 100
>> SUMS = ABC | 2 | 16, 20, 25, 33
>> SUMS = XYZ | 1 | 20, 20, 25
>> SUMS = XYZ | 2 | 6, 20, 25, 100, 200
>> SUMS = QJZ | 1 | 1, 11, 22, 33, 55, 77, 200, ... 13600
>> SUMS = QJZ | 2 | 11, 88

Hope this give a better picture to what i'm trying to accomplish. Any help/suggestions welcome

December 30, 20169 yr

Since your result from step 1 is sorted, you can get your result from step 3 without making another ExecuteSQL call. You could do a binary search on the result from step 1 to find the last value within your upper limit, and use LeftValues ( $step1Result ; $positionOfLastValueToInclude ). This may or may not be faster; try it, measure the time, and use the faster approach.