Ocean West Posted November 8, 2015

What I am trying to do: I am importing transactions from a bank file. Each transaction description looks like this:

    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030
    POS 10/07/13 18:16 6901 COSTCO WHSE #108 SAN MARCOS CA 176242
    POS 10/02/13 14:23 6901 OFFICE DEPOT 00 SAN DIEGO CA 051103

I am trying to replicate something that Quicken did: once you edit the string and change "USPS" to "US Post Office", all subsequent imports will translate the values. The original imported description field remains intact and unmodified, and there is another field where the user can enter the desired description.

I have a table that contains a keyword value and a result value. (These examples may be simplistic, but they have at least been consistent in the originating descriptions; some are badly abbreviated. The translation is not just capitalization but abbreviation expansion.)

    NETFLIX.COM = Netflix
    USPS = US Post Office
    COSTCO = Costco
    OFFICE DEPOT = Office Depot

This translation will most likely be a post-import batch process, as the first few attempts have been slow. I have tried a few custom functions, but they seem a bit slow when used in a calculation, and I am looking at rows of data. I was using Agnès Barouh's CustomList function and also FilterList, but I'm just not coming up with the best syntax to achieve consistent results.

Humbly, Stephen
comment Posted November 8, 2015 (edited)

I am not sure I understand your question, especially this part:

1 hour ago, Ocean West said:
    The original imported description field remains intact un-modified, and there is another field where the user can enter the desired description.

Why won't you parse the imported lines into fields? And what exactly is the role of that "another field where the user can enter the desired description"?

If you had the keyword value isolated in a field, you could look up the replacement using a relationship. Without this, you will have to do multiple substitutes, one for each key/value pair. This may be slow if you rely on getting the pairs via a relationship; perhaps you should start by loading them into $$variables.

As for a possible technique, have a look at: http://www.briandunning.com/cf/851 (this may need to be adapted to work on text instead of a list).

Edited November 8, 2015 by comment
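The "multiple substitutes, one for each key/value pair" approach described above can be sketched outside FileMaker like this (Python used purely as illustration; the function name and pair list are my own, not part of the thread's solution):

```python
def substitute_all(text, pairs):
    """Brute-force translation: apply one substitution per
    keyword/result pair, analogous to nested Substitute() calls
    in a FileMaker calculation."""
    for keyword, result in pairs:
        text = text.replace(keyword, result)
    return text

pairs = [
    ("NETFLIX.COM", "Netflix"),
    ("USPS", "US Post Office"),
    ("COSTCO", "Costco"),
    ("OFFICE DEPOT", "Office Depot"),
]
```

Each pair costs one full pass over the text, which is why loading the pairs once (into $$variables) rather than fetching them through a relationship for every row matters for speed.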
Ocean West (Author) Posted November 8, 2015

Yes, the data are imported into separate fields as needed: Date | Description | Amount, etc. The Description field (the imported value) will not be user-editable, so that you can always compare the bank statement against the value imported. However, for other reporting purposes, the Modified Description field is the one the user will enter manually, or it will be translated based on the translation table.

When the user finds they are constantly overriding something and there is a pattern to the entry, they can add the keyword patterns and the desired result value to the translation table.

Thanks, comment. I'll look at the CF.
Ocean West (Author) Posted November 8, 2015

Just for clarification, the sample below is the Description field for 4 different records:

    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030
    POS 10/07/13 18:16 6901 COSTCO WHSE #108 SAN MARCOS CA 176242
    POS 10/02/13 14:23 6901 OFFICE DEPOT 00 SAN DIEGO CA 051103

I guess I need a word filter to compare the contents of this against the keyword list, and then reduce each string to a single unique instance so that I can use it to match via a relationship to the keyword library:

    NETFLIX.COM
    USPS
    COSTCO
    OFFICE DEPOT
comment Posted November 8, 2015

You've lost me there. I asked why you don't parse the lines into fields, and you said that you do. If you do, then you would have 4 records like this:

    RecordNumber | Description
    1            | NETFLIX.COM
    2            | USPS
    3            | COSTCO
    4            | OFFICE DEPOT

and you could use a lookup to change "NETFLIX.COM" to "Netflix", "USPS" to "US Post Office", and so on.
Ocean West (Author) Posted November 8, 2015

The input:

    Description = Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599

The output, after being passed through a custom function:

Quote:
    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    Netflix

A function to filter or strip out all the characters that don't match the strings in the filter list, resulting in a single instance of the string, so it can be related to the translation table — or it could just output the result, since the two lists share the same position in the stack:

    Translate ( table::description ; $$filter_list ; $$result_list )

The $$filter_list would be List ( translate::keyword ) and the $$result_list would be List ( translate::result ). The function would know that NETFLIX.COM is on the 14th return, and would then return the corresponding value from GetValue ( $$result_list ; 14 ).
comment Posted November 8, 2015 (edited)

Okay, I see what you mean (I think). I still don't see why you won't parse the string first, as it seems clearly divided into columns:

    |POS| |09/20/13| |17:05| |6901| |USPS| |0525270026| |SAN DIEGO| |CA 005030|

Anyway, going with your direction: what if there are two matches, and what if there's no match?

The second part of your question is solved here: http://www.briandunning.com/cf/908

Edited November 8, 2015 by comment
Ocean West (Author) Posted November 8, 2015

The data is already in separate fields when importing from the bank; the Description field is just packed with redundant info. The monospace font made it look like columns, and yes, there is a pattern, but most of the stuff is just noise. It is somewhat dependent on how the merchant publishes their transaction data and on the hard limit the bank sets on the number of characters in the description text.

25 minutes ago, comment said:
    what if there are two matches, and what if there's no match?

Stop and ignore after the first match; on no match, return nothing, meaning the user can manually enter text or go add some keyword:result pairs to the library for future operations.
comment Posted November 8, 2015

18 minutes ago, Ocean West said:
    stop & ignore after first match, no match return nothing

Try it this way:

    FirstMatch ( text ; listOfValues ) =
    Let ( [
      searchString = GetValue ( listOfValues ; 1 ) ;
      countValues = ValueCount ( listOfValues )
    ] ;
      Case (
        Position ( text ; searchString ; 1 ; 1 ) ; searchString ;
        countValues > 1 ; FirstMatch ( text ; RightValues ( listOfValues ; countValues - 1 ) )
      )
    )

You would call this with Description as the text parameter and a list of all keywords as listOfValues. The result can be used as the input for CorrespondingValue(), or as the match field for a lookup directly from the Translations table.
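For readers outside FileMaker, the recursive FirstMatch above boils down to a simple loop. This is a Python sketch of the same logic, not part of the thread's actual solution; the names are my own, and an empty string stands in for the custom function's empty result:

```python
def first_match(text, keywords):
    """Return the first keyword (in list order) found anywhere in
    text, mirroring the recursive FirstMatch custom function.
    Returns "" when no keyword matches."""
    for kw in keywords:
        if kw in text:
            return kw
    return ""

desc = "POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030"
keywords = ["NETFLIX.COM", "USPS", "COSTCO", "OFFICE DEPOT"]
```

Note that list order decides ties: if two keywords both occur in the description, the one listed first wins, which satisfies the "stop and ignore after first match" requirement.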
Ocean West (Author) Posted November 9, 2015

Thanks, comment — that worked perfectly. It will probably be even faster once I set the value lists into $$vars at startup (or on changes), versus going through a Cartesian relationship.

    Let ( [
      k = List ( translate_keywords::keywords ) ;
      r = List ( translate_keywords::result ) ;
      m = FirstMatch ( description ; k )
    ] ;
      CorrespondingValue ( k ; m ; r )
    )
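The final calculation — FirstMatch feeding CorrespondingValue over two parallel lists — behaves like the following Python sketch (an illustration only; the function and variable names are mine, assuming the first keyword wins and no match yields an empty result):

```python
def translate(description, keywords, results):
    """Parallel-list lookup: the first keyword found in description
    selects the result at the same index, like FirstMatch() feeding
    CorrespondingValue(). Returns "" when nothing matches, leaving
    the field blank for the user to fill in manually."""
    for kw, res in zip(keywords, results):
        if kw in description:
            return res
    return ""

# keyword at index i pairs with result at index i, mirroring
# List(translate_keywords::keywords) / List(translate_keywords::result)
keywords = ["NETFLIX.COM", "USPS", "COSTCO", "OFFICE DEPOT"]
results = ["Netflix", "US Post Office", "Costco", "Office Depot"]
```

The design hinges on the two lists staying positionally aligned, which is why building both from the same set of translation-table records (in the same order) is essential.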