August 7, 200817 yr Sorry if this has been covered, but "mode" isn't the easiest thing to search for. Has anyone figured out a clever way to find the mode of related values, or a value list, in the context of a calculation? And by mode I mean Mode, like Mode, Median, Mean, etc. Most common value. I smell a custom function, but I'm somewhat baffled that FM doesn't include this function. Or am I blind?
August 7, 200817 yr Yes, I believe it requires a recursive calculation, hence a custom function. There's one here: http://www.fmforums.com/forum/showtopic.php?tid/168216/ though I'm not sure why it requires the list to be sorted - I might take a look at it later.
August 8, 200817 yr This seems to work... /* Mode ( ValueList ) custom function First Try... ( maybe buggy and sure can be nicer ) Returns a string with two words: first word is the number of times that a value is into the value list second word is the value If more than a value has the same number of presence into the list, new rows are added. Values can be numbers or strings */ Let([ adj = ¶ & Substitute ( ValueList ; ¶ ; "¶¶" ) & ¶; first = ¶ & GetValue ( ValueList ; 1) & ¶; count = PatternCount ( adj ; first ); next = Substitute ( TrimAll ( Substitute ( Substitute ( adj ; first ; ¶ ) ; [ " " ; "§" ] ; [ ¶ ; " " ]) ; 1 ; 1 ) ; [ " " ; ¶ ] ; [ "§" ; " " ] ) ]; Case( ValueCount ( ValueList ); Case( LeftWords ( Mode ( next ) ; 1 ) > count ; Mode ( next ) ; LeftWords ( Mode ( next ) ; 1 ) = count ; count & " " & GetValue ( ValueList ; 1) & ¶ & Mode ( next ) ; count & " " & GetValue ( ValueList ; 1) ) ) )
August 8, 200817 yr Bah... If too much text, it slows down until stop working with error... ... and there is a wrong test. Very buggy ! ( for comment ) Maybe I'll need the $
August 8, 200817 yr It could be done like this, I think: Mode ( listOfValues ; result ; freq ) Returns the most frequently occurring value/s in listOfValues, along with the number of occurrences. Let ( [ countValues = ValueCount ( listOfValues ) ; this = GetValue ( listOfValues ; 1 ) ; n = ValueCount ( FilterValues ( listOfValues ; this ) ) ; nextResult = Case ( n < freq ; result ; n = freq ; result & ¶ & this ; this ) ; nextFreq = Max ( n ; freq ) ; xcl = Substitute ( ¶ & listOfValues & ¶ ; [ ¶ ; "¶¶" ] ; [ ¶ & this & ¶ ; "" ] ; [ "¶¶" ; ¶ ] ) ; nextList = Middle ( xcl ; 2 ; Length ( xcl ) - 2 ) ] ; Case ( countValues ; Mode ( nextList ; nextResult ; nextFreq ) ; freq & " x " & Substitute ( result ; ¶ ; ¶ & freq & " x " ) ) ) The result and freq parameters should be empty on initial call. Examples: Mode ( "10¶11¶12¶11" ; "" ; "" ) = "2 x 11" ; Mode ( "a¶b¶c¶a¶c" ; "" ; "" ) = "2 x a¶2 x c" ;
August 8, 200817 yr Hi comment there is something wrong with your CF, too. First of all, if the List is empty, it gives: " x "... but real problems comes with a large text. Try to copy a generic text and to put it into a field that autoenters: Substitute ( generic text ; " " ; ¶ ) I triend with: "Mode ( ValueList ) custom function First Try... ( maybe buggy and sure can be nicer ) Returns a string with two words: first word is the number of times that a value is into the value list second word is the value If more than a value has the same number of presence into the list, new rows are added. Values can be numbers or strings" ... and the result was: 2 x ( 2 x ) BTW: isn't this a case to use $variables ?
August 8, 200817 yr I didn't bother with input validation. If your list is empty, or contains empty values, you will get unexpected results. It's not a matter of size - the function is tail-recursive, so it should handle lists of up to 50,000 DISTINCT values (the actual list could be much larger).
August 8, 200817 yr Author Lovely. Thanks, comment. Figuring out how that function works made my brain literally burst forth from my ears.
August 10, 200817 yr I played around with this and came up with a similar CF but I had few 'wonders'. Would a sort followed by a scan be quicker? Would it be an idea to include a check for the (latest) value count being greater than the count of the remaining values? I think I'd be trying to find out a bit more about the data characteristics if possible, because efficiency depends so much on these. Finally, I wondered where the values came from in the first place. Presumably, they are in individual records? In this case, a sort with summary counts should provide the required values with very little trouble. Even better(?) a self-join on Value'. Just curious but too lazy to satisfy my own curiosity.
August 10, 200817 yr I don't know. I am more lazy than you are, and what's more - I think some considerations are more important than raw speed (unless you need to do this often, on large sets). A script would require adding a summary count field, if one doesn't exists. A script needs to be initiated. A script would require the value field to be indexed. I believe the function is quite efficient (one iteration per distinct value). Adding a check for freq > remaining values is a double-edged sword: it would save evaluations in those cases where one or two values stand out significantly above the rest, but it would also add an evaluation at each iteration - most likely unnecessary with more typical data.
Create an account or sign in to comment