Jump to content
Sign in to follow this  
Mark Welch

Can't get summary or count to work right

Recommended Posts

I'm trying to identify duplicate URLs (for thumbnail images) in a database, in order to identify "stand-in" images that should not be shown (for example, many merchants have a standard image that says "image not available" which is silly for me to display).

I've defined a summary field called "count_of_thumb" and I've tried creating new layouts several different ways, but all I ever end up with is either all 1's or all values the same (being the count of all records, I'm not sure if it's all records with a value in the field or a count of unique valies).

Clearly I am not understanding the concept of a "summary field" or a "subsummary" in a layout.

I've tried defining a layout to contain just the URL and the count_of_thumb values, but again it just shows either 1 or the total count; I cannot figure out any way to get a report that lists, for each URL, the number of occurrences of that value in this field across the database.

Share this post


Link to post
Share on other sites

I think you need to define a SELF JOIN RELATIONSHIP using the URL as the match on both sides of the relationship. Then you can use the COUNT function to evaluate the number of records that contain the same URL.

The SELF JOIN is a strange concept at first - a TABLE pointing at ITSELF, however it is perfectly valid and works just like any other relationship.

Share this post


Link to post
Share on other sites

I cannot figure out any way to get a report that lists, for each URL, the number of occurrences of that value in this field across the database.

Add a sub-summary part to your report, defined as "when sorted by: URL". Place the summary field in this part, and sort the records by the URL fields.

If you like, you can delete the layout's body part to get a summary report.

Share this post


Link to post
Share on other sites

IdealData wrote: > "I think you need to define a SELF JOIN RELATIONSHIP using the URL as the match on both sides of the relationship. Then you can use the COUNT function to evaluate the number of records that contain the same URL. The SELF JOIN is a strange concept at first - a TABLE pointing at ITSELF, however it is perfectly valid and works just like any other relationship." <

I simply don't understand this -- it just doesn't sound right, and given the incredibly long delays I've experienced trying to use a summary field in my database, I don't think this is something I should try.

comment wrote: > "Add a sub-summary part to your report, defined as "when sorted by: URL". Place the summary field in this part, and sort the records by the URL fields. // If you like, you can delete the layout's body part to get a summary report." <

I believe that this is exactly what I did, although while repeating the sequence (as described below) I realized that I had not selected "include subtotals").

The result (after a 4-5 minute delay while FileMaker was "Summarizing field 'count_of_thumb'") was either a list of URLs all containing the value 1, or all containing the value 2102538.

Maybe I should replace the Count() function in the summary field with a calculated field that just always contains the value 1. It really does seem absurd to embed a separate field into the database table just to generate this type of report, especially seeing that this field forces many 4-5 minute delays while FileMaker recalculates these values.

Here are my steps:

(1) View the layout that matches the table that contains the data.

(2) Change to Layout Mode (View|Layout Mode)

(3) New Layout/Report

(4) Select "List/Report" as the layout type

(5) Select "Report with grouped data" (this time, I checked "include subtotals" but earlier I did not check that because the count_of_thumb field is supposed to already contain the count.

(6) Select two fields: thumbnail_image_url and count_of_thumb

(7) Organize Records by Category "thumbnail_image_url"

(8) Sort records by thumbnail_image_url

(9) Default theme

(10) no headers or footers; do not create a script

(11) View report in Browse Mode

(12) Finish

FileMaker then sorts the data (another 5-minute wait).

Eventually, the result is a list of URLs, all accompanied by the value 2102538. In layout mode, I can't even see what the sections are called; nothing is resizable except the field values (which oddly default to very short stubby fields, forcing me to change the layout and then wait yet again for 4-5 minutes for FileMaker to summarize.

Eventually, I clicked on the 'count of thumb' field in the bottom row and changed it somehow, I think into a 'subsummary' value which it was NOT in the layout as generated. Then it display some values other than 1 or 2102538 -- but clearly it's not working because there are URLs with the count value of 0 (which is illogical, since the very existence of a URL mandates a 'count' value greater than zero).

Navigating this immense layout is simply impossible; any attempt to scroll triggers a long, long pause.

I surrender -- I am going to write a script that uses brute force: load each record in the database, check if the image URL is the same as the record before, and if so increment a counter and store it in the count field, otherwise store 1 in the count field.

Edited by Guest

Share this post


Link to post
Share on other sites

1. Sub-summaries do not show in Browse mode (in version 9 and below) - you have to view the report in Preview mode or print it.

2. It seems you have more than two million records in this table - it will take some time to sort them (esp. for the first time, when they are being indexed). BTW, your brute force script will need to sort them too.

3. You can reveal/collapse the part names in Layout mode - see the help for more.

4. With that many records, deleting the body part is practically a must. All you need is the sub-summary part with the two fields in it.

Share this post


Link to post
Share on other sites

Thanks -- I definitely did try to exclude the body section, but I'm not sure if I succeeded since I cannot see what the three sections were called.

I am simply not going to waste any more time on a "summary field," as it clearly drags performance.

But then again, maybe it's not the summary field. Even after deleting it, I am getting long periods when FileMaker won't respond but actually doesn't seem to be doing anything. Eventually I had to shut down the program via Windows, and when I reloaded it started "checking for consistency..." and if the progress bar is a reasonable measure, that's going to take at least a half-hour.

When I started, I viewed FileMaker as a prototyping tool, and did not actually expect FileMaker to be scalable beyond 100,000 records, but when it seemed to work even with 250,000 records, I thought maybe I could go live with FileMaker Server as my live database.

Now it seems quite clear that it's not going to work as a live database with millions of records. Now I am beginning to understand why most companies that offer FileMaker Server hosting set such low file-size limits.

Edited by Guest

Share this post


Link to post
Share on other sites

I am simply not going to waste any more time on a "summary field," as it clearly drags performance.

Me neither, as clearly you know more about this than I do.

Share this post


Link to post
Share on other sites

Thanks to everyone for your advice. After spending time on the phone with FileMaker support, it seems very clear that these performance issues are neither unusual nor unexpected -- FileMaker simply cannot handle the size database that I need, despite its marketing claims.

Share this post


Link to post
Share on other sites

Even after deleting it, I am getting long periods when FileMaker won't respond but actually doesn't seem to be doing anything.

I expect you are throwing out a baby with the water. I also expect that FM can handle what you throw at it if given half a chance. It sounds like you were having other issues but you haven't explored why ) -- your quote two sentences above this one). If you are depending upon some help geek at FM, instead of listening to seasoned developers who USE FileMaker, then you're a bit ahead of yourself. :wink2:

Edited by Guest
Added sentence

Share this post


Link to post
Share on other sites

At some point, I must decide whether to continue with a solution that is rapidly getting more complex and unworkable, with the hope that somehow I might make it work, or to move back to start over with another solution. That "help geek at FM," like the other help geeks at FM that I have talked with, made it very clear that FM does not wish to provide any support for its product; he was happy to confirm that yes, FileMaker is just really slow, and with that many records, yes, it might just not work. And absolutely yes, to get a list of unique values and a count for each, I'd need to go through a complex series of steps that would slow down my entire database to a crawl -- for a function that I know how to do with SQL with a single command.

I appreciate the assistance that folks on this forum have provided. And yes, after spending time "exploring" the performance problems, I did not find a solution, I stopped looking for a solution, because I recognized that every single problem I encounter ends up taking many hours or days to solve, and after a few dozen of those any ease-of-use and simplicity benefit from using FileMaker was completely lost.

When things are getting worse and worse, and when two unexpected new problems arise after each problem is solved, I have to accept that I am not making progress, and I need to change direction. Maybe FileMaker could meet my needs, but after spending so much time and realizing how much more work is required to learn this tool, I recognize that there are other tools that are much more likely to get me to a solution much sooner.

The obvious solution, of course, is for me to hire a programmer. I've done that, several times, with ASP/SQL Server and later with PHP/MySQL, and what I ended up with, after spending lots of money, were some programmers (I don't pick 'em very well, I know), and some partial code that I couldn't understand or adapt. I'm sure that there are some extraordinary folks who use FileMaker, whom I might hire to help me, but unfortunately it's more likely that if I keep trying, I'll just spend more time and money on "the wrong things," and I'll end up nowhere.

So it's time for me to stop and find a new strategy.

Share this post


Link to post
Share on other sites

Mark,

Filemaker's Table Occurrences (T.O.) is a powerful construct.

Let me see if I can provide a simple example.

TO1

Create a global field in that table.

This is going to be used as a match field to another T.O. For the sake of this simple example, let's say you have a field named URL.

Create a second field, as a global variable, and name it URL_match.

Duplicate the TO1 parent table - you now have a second instance of the same table - a child table.

Set a relationship between the URL_Match global field in TO1 to the URL field in the child T.O.

Use SET FIELD to set the global URL_Match field to a URL that you know is in the URL field.

Now the child T.O. will contain (virtually) only those records that match the global URL_Match field condition. You can create layouts with these fields from the child table, knowing that they will only contain the fields that match the global URL_Match.

In the parent T.O. (it will also appear in the child T.O), create a field that does a COUNT (pk from child T.O.). This field IN THE PARENT T.O. will contain a valid count of the # of records in the child T.O.

NOTE: The same field in the child T.O will NOT contain valid info - counts in a record do not work to count the records in the same table, but do work to count those in a related table.

This technique can be used with globals that are changeable (as in the example defined), or global constants, or multi-relationship conditions.

I have a person list, where I want to edit just one person. Rather than doing a find (time consuming) I set the global variable on the parent side of the relationship, and the child T.O. has THAT person's data ONLY - instantaneous in my DBs and probably nearly so in your massive example.

The power of T.O.s - took me forever to get the concepts.

Now I use Anchor-Buoy modeling (a.k.a. squids) for everything. Individual table occurrence groups (TOG) instances for almost every layout - and very little code anymore.

My core data model is the primary T.O. and their primary relationships.

For each layout, I have a new T.O. model, using just the T.O.s that are relevant to what I want to display, and relationships that are specific to what I need to display. Few if any FINDs or other time consuming script executions ever need to be run.

There are some excellent white papers out there on TOG, self-joins and anchor-buoy modeling. I think you will find this approach vastly simplifies

Share this post


Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×

Important Information

By using this site, you agree to our Terms of Use.