Jump to content

Only show top 5 categories in a subsummary, omitting other records


This topic is 5437 days old. Please don't post here. Open a new topic instead.

Recommended Posts

Posted

I know I've seen this somewhere before, but I can't find it now. What's the easiest way to limit the number of subsummary categories to the top "N" categories when the subsummary part is resorted based on the count field?

Another way I'm thinking about this is that rather than omitting the minor categories, lumping them in together as "other" and getting the count for the rest. Here's an example


Bob

     Methanol       10

     Water          8

     Chloroform     6

     Other          5

Bill

     Hexanes        15

     Acetonitrile   12

     Formaldehyde   10

     Other          8



etc.  

This is rather than all the different categories grouped into other.

I'm thinking that I'm going to have to use GetSummary() and that always made my head hurt... and I'm out of practice on this stuff.

I never worried about doing this before, but now that I want to use charts to display this subsummary data, the chart labels won't really work when there are too many categories.

Posted

Thanks for the response, I think something like that could work. Although it looks like in that example, each of the scores is a separate record, wher in my example, each of the counts of different items (chemicals in this case) is a subsummary count of that number of records.

Thinking more about this, I don't think omitting records is what I want to do. I want to group the records making up less than say 20% of the total items for a category into an "other" category. Doing this for the whole found set shouldn't be too bad


If(GetSummary(CategoryCount;Category)/Get(Foundcount))<0.2





but this will be different for a subcategory





If (GetSummary(SubCategoryCount;SubCategory)/GetSummary(CategoryCount;Category))<0.2

Then after I find categories where this is true, I'd have to flag all the records belonging to those minor groups. Then my report would actually sort on a calculated field, where the result of the calculation is the same as the category field unless the record was flagged as belonging to a minor category, in which case the field would be "Other"

Essentially, I'd have to run the report twice. Once to get the minor categories, then again to get the actual report

Just thinking out loud and wondering if there's a flaw in this logic

Posted

Well, flagging records can be problematic if your solution is being shared.

I don't know what other factors play here, but you can also count via a self-join.

Posted

Good call on the flagging... I remember a Matt Petrowsky article on flagging a while back. Too bad my subscription ran out. I guess that would require some sort of session table.

Even if I count by a self join, I'm essentially trying to recategorize records on the fly based on how many there are in a certain subset, and I'm not sure if I can sort on a calculation field based on a Get() function off a multikey self join. I'm guessing it might work but be really slow since the field would be unindexed.

Have a good weekend.

This topic is 5437 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.