Jump to content

How to average a subquery


This topic is 2645 days old. Please don't post here. Open a new topic instead.

Recommended Posts

Hi,

i am a novice when it comes to anything more than basic SQL and am stuck trying to get an average (using SQL) of grouped query result.

My first query is:

SELECT SUM(diffDateTime) as theDiffSum 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1' 
GROUP BY ows_ID 
HAVING theDiffSum>99
ORDER BY theDiffSum ASC

this gives me a result like:
100
120
123
123
290

Ideally, i would like to form a query on this result set to give an average - a single number, using SQL (not FMP functions).

I've looked around but have been unable to find anything that works (or understand).

e.g.

SELECT AVG(
SELECT SUM(diffDateTime) as theDiffSum 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1' 
GROUP BY ows_ID 
HAVING theDiffSum>99
ORDER BY theDiffSum ASC) as test

Any ideas would be greatly appreciated - it is really screwing with my brain (or what's left of it?

Link to comment
Share on other sites

Could you not just replace the SUM with AVG like this :-

SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

 

Link to comment
Share on other sites

2 minutes ago, rwoods said:

Could you not just replace the SUM with AVG like this :-


SELECT AVG(diffDateTime) as theDiffAvg 
FROM Briefs 
WHERE diffDateTime>4 AND Group1 = 'ABC' AND Status = '1'

 

Unfortunately not. I need the average of the SUMs.

e.g.

1. the AVG of the SUMs (100,120,123,123,290) = 756 / 5 = 151.2

2. the AVG of the diffDateTime could be ((10,20,30,40), (10,20,20,30,40), (10,20,30,40,23), (10,20,30,63), (10,20,30,40,50,50,90), = 756 / 25 = 30.24

 

Link to comment
Share on other sites

What is the reason you would prefer to do this with SQL instead of a function in FileMaker's calculation engine? You could easily use a (non-SQL) custom function to calculate an average from the return-delimited list result of your first query.

Edited by jbante
Link to comment
Share on other sites

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

Link to comment
Share on other sites

4 hours ago, jbante said:

When working in FileMaker, you should usually presume that ExecuteSQL is one of the slower links in the chain, not the calculation engine. You may have a point in this case, but only because naïve parsing of return-delimited lists with calculations is a quadratic-time operation — not that it makes much difference up to a couple hundred rows or so.

I'm drawing on a database of 100k+ records.

  1. First Query - retrieves up to 1,000 results (each is a SUM of up to 1-1,000 records - GROUPS of records)
    SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID ORDER BY sumT ASC
    Grouped Data = (1,2,3) (10,2,3,5) (23,2) (10,90)
    >> SUMS = 6, 20, 25, 100  (could be up to a 1,000 results)
     
  2. Next I calculate quartile info - Q1, Q3, IQR, Upper/Lower limits ($Upper might be 45)
     
  3. Second Query (orig post) - similar to First Query but limiting the SUMS by the HAVING clause (45) using UpperLimit from quartile calculations)
    SELECT SUM(t) as sumT FROM Briefs WHERE t>4 AND Group='ABC' AND Status ='1' GROUP BY myID HAVING sumT>$Upper   ORDER BY sumT ASC
    >> SUMS = 6, 20, 25, 100
     
  4. Now I calculate the adjusted AVERAGE (= 17) of the Second Query resulting SUMS = 6, 20, 25 (with custom function)
    >> AVG = 17

 

Steps 2/3/4 could be combined into a custom function (is there a simple function that returns a list from a list based on a maximum value  - any ideas?)
i.e. loop through sorted SUMS (step 1), total SUMS up to $Upper (45 - step 2), then average the total (step 4)


Just thought it might be faster to ExecuteSQL >>> AVG( Step 3 query ) - to get the single number output.

The above sequence to be performed approx 3,640 times to produce various graphs for analysis (each week). 

I will also try to produce an SQL using  clauses and subqueries to simplify:
e.g. from Step 1
>> SUMS = ABC   |    1   |   6, 20, 25, 100 
>> SUMS = ABC   |    2   |   16, 20, 25, 33 
>> SUMS = XYZ   |    1   |    20, 20, 25 
>> SUMS = XYZ   |    2   |    6, 20, 25, 100, 200 
>> SUMS = QJZ   |    1   |    1, 11, 22, 33, 55, 77, 200, ... 13600
>> SUMS = QJZ   |    2   |    11, 88

Hope this give a better picture to what i'm trying to accomplish. Any help/suggestions welcome :)

Link to comment
Share on other sites

Since your result from step 1 is sorted, you can get your result from step 3 without making another ExecuteSQL call. You could do a binary search on the result from step 1 to find the last value within your upper limit, and use LeftValues ( $step1Result ; $positionOfLastValueToInclude ). This may or may not be faster; try it, measure the time, and use the faster approach.

Link to comment
Share on other sites

This topic is 2645 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.