Jump to content

Best way to handle Imported text and move/copy to other fields?


This topic is 1561 days old. Please don't post here. Open a new topic instead.

Recommended Posts

Using: FMPA 16

Good evening - I have a text field that's auto-populated from an xls file.  It's up to six lines long, and contains various data per record that looks like any of the following:

(SAMPLE 1)
ANYUSER RICK & MELANIE
123 ANYWHERE ST
LAS VEGAS
NV 89108

(SAMPLE 2)
ANYUSER RICK
ANYUSER MELANIE & JANINE
123 ANYWHERE ST
LAS VEGAS
NV 89108

(SAMPLE 3)
ANYUSER RICK 50%
ANYUSER MELANIE 50%
C/O THE RICK AND MEL TRUST
123 ANYWHERE ST
LAS VEGAS
NV 89108

(SAMPLE 4)
THE ANYUSER COMPANY
123 ANYWHERE ST
COLDSTREAM BC V1B 3W8
CANADA

What I need to accomplish: I need to non-manually populate this data into separate fields.  Those  fields are OWNER1, OWNER2, ADDY1, ADDY2, CITY, ST, ZIP, COUNTRY.  Unfortunately, I cannot alter the data any further before it gets to my database.

For the life of me, I cannot figure out the best way to make this happen.

Does anyone have any suggestions for me?  Thank you in advance!

Link to comment
Share on other sites

I don't think it's possible to automate this process without introducing some additional rules that are not apparent from the given examples.

 I notice that the last line can be either the country or the (US?) state and ZIP code. We need a way to determine which one it is - perhaps we can assume that if the line contains any digits, it is the latter?

Next there is the problem of deciding whether the 2nd line is owner #2 or the street address (this is assuming that the number of owners is always either 1 or 2). I don't think a computer is able to look at "ANYUSER MELANIE & JANINE" or "C/O THE RICK AND MEL TRUST" and make this decision on its own, without some helper rule.

Finally, the line "COLDSTREAM BC V1B 3W8" is going to be very difficult to parse into city (which can be more than one word), state and ZIP without some additional rules to determine where the city name ends and where the ZIP code starts.

 

Edited by comment
Link to comment
Share on other sites

A couple things that may help re - your response: 

1) The Country CANADA is usually (lets call it 98% of the time another country appears) the only other variable country besides the US.

2) Sample 1 is the most common occurrence - probably occurs 75% of the time.

3) NV is the most common state, by about 95%.

Maybe we could work our way up from the bottom somehow by detecting and parsing the state and zip code first, then move up to city, then street?  Also, is there a way to detect the numeric portion of the street to differentiate it from the name(s) above it?

 

Link to comment
Share on other sites

Under certain assumptions (which I hope will be self-evident), you could something like:

Check if the last line matches the pattern of "@@ #####"; if yes, set the Country field to USA (or leave it empty?),  the State field to the first word of the last line, the ZIP field to the second word of the last line, and the City to the entire line before last .

Otherwise , set the Country field to the entire last line. 

If the last line is "Canada", parse the line before last as follows: the last 2 words (or 7 characters) go to the ZIP field, the 3rd word from right is the State, and the rest is the City. If the last line is something else, then we don't know how to handle the line before last.

Now we need to look at the remaining lines: if there are 2 remaining lines, then the first of them is Owner1 and the other one is Address 1. If there are 4 of them, then the first 2 are owners and the other two are the address. If there are 3 remaining lines, then check if the 2nd line begins with a digit character (or with a word that consist of digits only?). If yes, then there is one owner and two address lines, otherwise it's two owners and one address line.

The problem with this approach is, of course, that if will fail colossally when any of these assumptions is not true, as well as the missing instructions for countries other than USA and Canada. So at best it can serve as a first draft for human review.

 

Edited by comment
  • Like 1
Link to comment
Share on other sites

This topic is 1561 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.