Ocean West Posted November 8, 2015

What I am trying to do: I am importing transactions from a bank file. Each transaction description looks like this:

    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030
    POS 10/07/13 18:16 6901 COSTCO WHSE #108 SAN MARCOS CA 176242
    POS 10/02/13 14:23 6901 OFFICE DEPOT 00 SAN DIEGO CA 051103

I am trying to replicate something that Quicken did: once you edit the string and change "USPS" to "US Post Office", all subsequent imports will translate the values. The original imported description field remains intact and unmodified, and there is another field where the user can enter the desired description.

I have a table that contains a keyword value and a result value. (These examples may be simplistic, but they have at least been consistent in the originating descriptions; some are badly abbreviated. The translation is not just capitalization but abbreviation expansion.)

    NETFLIX.COM = Netflix
    USPS = US Post Office
    COSTCO = Costco
    OFFICE DEPOT = Office Depot

This translation will most likely be a post-import batch process, as the first few attempts have been slow. I have tried a few custom functions, but they seem a bit slow when used in a calculation, and I am looking at rows of data. I was using Agnès Barouh's CustomList function and also FilterList, but I'm just not coming up with the best syntax to achieve consistent results.

Humbly, Stephen
comment Posted November 8, 2015 (edited)

I am not sure I understand your question, especially this part:

1 hour ago, Ocean West said:
    The original imported description field remains intact un-modified, and there is another field where the user can enter the desired description.

Why won't you parse the imported lines into fields? And what exactly is the role of that "another field where the user can enter the desired description"?

If you had the keyword value isolated in a field, you could look up the replacement using a relationship. Without this, you will have to do multiple substitutes, one for each key/value pair. This may be slow if you rely on getting the pairs via a relationship; perhaps you should start by loading them into $$variables.

As for a possible technique, have a look at: http://www.briandunning.com/cf/851 (this may need to be adapted to work on text instead of a list).

Edited November 8, 2015 by comment
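The "multiple substitutes, one for each key/value pair" approach described above can be sketched outside FileMaker like this (Python used purely as illustration; the function name and pair list are my own, not part of the thread's solution):

```python
def substitute_all(text, pairs):
    """Brute-force translation: apply one substitution per
    keyword/result pair, analogous to nested Substitute() calls
    in a FileMaker calculation."""
    for keyword, result in pairs:
        text = text.replace(keyword, result)
    return text

pairs = [
    ("NETFLIX.COM", "Netflix"),
    ("USPS", "US Post Office"),
    ("COSTCO", "Costco"),
    ("OFFICE DEPOT", "Office Depot"),
]
```

Each pair costs one full pass over the text, which is why loading the pairs once (into $$variables) rather than fetching them through a relationship for every row matters for speed.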
Ocean West (Author) Posted November 8, 2015

Yes, the data are imported into separate fields as needed: Date | Description | Amount, etc. The Description field (the imported value) will not be user-editable, so that you can always compare the bank statement against the value imported. However, for other reporting purposes, the Modified Description field is the one the user will enter manually, or it will be translated based on the translation table.

When the user finds they are constantly overriding something and there is a pattern to the entry, they can add the keyword patterns and the desired result value to the translation table.

Thanks, comment. I'll look at the CF.
Ocean West (Author) Posted November 8, 2015

Just for clarification, the sample below is the Description field for 4 different records:

    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030
    POS 10/07/13 18:16 6901 COSTCO WHSE #108 SAN MARCOS CA 176242
    POS 10/02/13 14:23 6901 OFFICE DEPOT 00 SAN DIEGO CA 051103

I guess I need a word filter to compare the contents of this against the keyword list, and then reduce each string to a single unique instance so that I can use it to match via a relationship to the keyword library:

    NETFLIX.COM
    USPS
    COSTCO
    OFFICE DEPOT
comment Posted November 8, 2015

You've lost me there. I asked why you don't parse the lines into fields, and you said that you do. If you do, then you would have 4 records like this:

    RecordNumber | Description
    1            | NETFLIX.COM
    2            | USPS
    3            | COSTCO
    4            | OFFICE DEPOT

and you could use a lookup to change "NETFLIX.COM" to "Netflix", "USPS" to "US Post Office", and so on.
Ocean West (Author) Posted November 8, 2015

The input:

    Description = Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599

The output, after being passed through a custom function:

Quote:
    Recur Pymnt 09/25/13 18:44 6901 NETFLIX.COM NETFLIX.CO NETFLIX.COM CA 160599
    Netflix

A function to filter or strip out all the characters that don't match the strings in the filter list, resulting in a single instance of the string, so it can be related to the translation table — or it could just output the result, since the two lists share the same position in the stack:

    Translate ( table::description ; $$filter_list ; $$result_list )

The $$filter_list would be List ( translate::keyword ) and the $$result_list would be List ( translate::result ). The function would know that NETFLIX.COM is on the 14th return, and would then return the corresponding value from GetValue ( $$result_list ; 14 ).
comment Posted November 8, 2015 (edited)

Okay, I see what you mean (I think). I still don't see why you won't parse the string first, as it seems clearly divided into columns:

    |POS| |09/20/13| |17:05| |6901| |USPS| |0525270026| |SAN DIEGO| |CA 005030|

Anyway, going with your direction: what if there are two matches, and what if there's no match?

The second part of your question is solved here: http://www.briandunning.com/cf/908

Edited November 8, 2015 by comment
Ocean West (Author) Posted November 8, 2015

The data is already in separate fields when importing from the bank; the Description field is just packed with redundant info. The monospace font made it look like columns, and yes, there is a pattern, but most of the stuff is just noise. It is somewhat dependent on how the merchant publishes their transaction data and on the hard limit the bank sets on the number of characters in the description text.

25 minutes ago, comment said:
    what if there are two matches, and what if there's no match?

Stop and ignore after the first match; on no match, return nothing, meaning the user can manually enter text or go add some keyword:result pairs to the library for future operations.
comment Posted November 8, 2015

18 minutes ago, Ocean West said:
    stop & ignore after first match, no match return nothing

Try it this way:

    FirstMatch ( text ; listOfValues ) =
    Let ( [
      searchString = GetValue ( listOfValues ; 1 ) ;
      countValues = ValueCount ( listOfValues )
    ] ;
      Case (
        Position ( text ; searchString ; 1 ; 1 ) ; searchString ;
        countValues > 1 ; FirstMatch ( text ; RightValues ( listOfValues ; countValues - 1 ) )
      )
    )

You would call this with Description as the text parameter and a list of all keywords as listOfValues. The result can be used as the input for CorrespondingValue(), or as the match field for a lookup directly from the Translations table.
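For readers outside FileMaker, the recursive FirstMatch above boils down to a simple loop. This is a Python sketch of the same logic, not part of the thread's actual solution; the names are my own, and an empty string stands in for the custom function's empty result:

```python
def first_match(text, keywords):
    """Return the first keyword (in list order) found anywhere in
    text, mirroring the recursive FirstMatch custom function.
    Returns "" when no keyword matches."""
    for kw in keywords:
        if kw in text:
            return kw
    return ""

desc = "POS 09/20/13 17:05 6901 USPS 0525270026 SAN DIEGO CA 005030"
keywords = ["NETFLIX.COM", "USPS", "COSTCO", "OFFICE DEPOT"]
```

Note that list order decides ties: if two keywords both occur in the description, the one listed first wins, which satisfies the "stop and ignore after first match" requirement.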
Ocean West (Author) Posted November 9, 2015

Thanks, comment — that worked perfectly. It will probably be even faster once I set the value lists into $$vars at startup (or on changes), versus going through a Cartesian relationship.

    Let ( [
      k = List ( translate_keywords::keywords ) ;
      r = List ( translate_keywords::result ) ;
      m = FirstMatch ( description ; k )
    ] ;
      CorrespondingValue ( k ; m ; r )
    )
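The final calculation — FirstMatch feeding CorrespondingValue over two parallel lists — behaves like the following Python sketch (an illustration only; the function and variable names are mine, assuming the first keyword wins and no match yields an empty result):

```python
def translate(description, keywords, results):
    """Parallel-list lookup: the first keyword found in description
    selects the result at the same index, like FirstMatch() feeding
    CorrespondingValue(). Returns "" when nothing matches, leaving
    the field blank for the user to fill in manually."""
    for kw, res in zip(keywords, results):
        if kw in description:
            return res
    return ""

# keyword at index i pairs with result at index i, mirroring
# List(translate_keywords::keywords) / List(translate_keywords::result)
keywords = ["NETFLIX.COM", "USPS", "COSTCO", "OFFICE DEPOT"]
results = ["Netflix", "US Post Office", "Costco", "Office Depot"]
```

The design hinges on the two lists staying positionally aligned, which is why building both from the same set of translation-table records (in the same order) is essential.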