Text Concatenation, stripping spaces and punctuation

GBNYC · March 22, 2006

Here's what i need to do, and I'm not sure if it's possible:

Out company needs to automatically generate a MATCHCODE field (a unique identifier), based upon the concatenation of three fields

- the first five alphanumeric characters of the COMPANY name

- the five digit ZIP (including leading zero)

- the first four alphanumeric characters of the ADDRESS.

note I say alphanumeric characters - spaces and punctuation are not to be included. Certain records would be easy, ie.

COMPANY: Sharp Imaging

ADDRESS:1143 Main Stree

ZIP: 04054

would result in

MATCHCODE: SHARP040541143

But this:

COMPANY: A.B. Stryker

ADDRESS: 14 Elm Street

ZIP: 11243

would result in something like...

MATCHCODE: A.B. 14 E11243

when I would need, based upon my criteria:

MATCHCODE: ABSTR14EL11243

Note this ignores spaces and punctuation and "counts" only the first five and four (respectively) valid alphanumeric characters in the COMAPNY and ADRRESS fields.

I hope I've made things clear - I'm thinking this can be done, but I could certainly use some help in this. Thanks in advance...

comment · March 22, 2006

You could use nested Substitute()s to filter out any unwanted characters (you'd need to anticipate all possible ones, though).

However, there's no guarantee that the result will be unique. Furthermore, if one of the parameters is changed, then the code cannot be recomputed. IOW, this is not a good idea. Use an auto-entered serial number, with no meaning attached to it, instead.

Lee Smith · March 22, 2006

using version 6, would require using the substitute function for your unique code you are after. (I'm wondering, isn't A.B 14 E1123 just as unique as ABSTR11EL1123?)

anyway, try

Left(Substitute(Substitute(Substitute(Company, "," , ""),

"." , ""),

" " , ""), 5) &

Left(Substitute(Substitute(Substitute(Address, "," , ""),

"." , ""),

" " , ""), 4) &

Left(Substitute(Substitute(Substitute(Zip, "," , ""),

"." , ""),

" " , ""), 5)

HTH

Lee

LaRetta · March 22, 2006

Hmmmm ... assume Denver, Colorado has Brown Company and Brown Hydraulics. And they both have post office boxes.

Brown Company

PO Box 243

Denver, CO 32749

ID = BROWPOBO32749

Brown Hydraulics

PO Box 196

Denver, CO 32749

ID = BROWPOBO32749

Russian roulette comes to mind ... :wink2:

Yeah, then Brown Hydraulics moves - how DARE THEY!!

Edited March 22, 2006 by Guest

GBNYC · March 22, 2006

You are all correct - I am also designing with a Unique serialized id to avoid the potholes of the old approach - this need is for legacy purposes) there will be data I/O with some ancient Paradox system, which no ones knows how to program, and the Matchcode needs to live on for some period of time...)

I got this working - thanks for your help and heads up...

GBNYC · March 23, 2006

Lee

I'm curious, because you tie your "Substitute" answer to version 6 - is there a simpler way of doing this in later versions of Filemaker(6 is what we have, and I don't know that an upgrade is in the budget for now, but I'd like to accumulate ammunition)

Thanks

Lee Smith · March 23, 2006

I'm curious, because you tie your "Substitute" answer to version 6 - is there a simpler way of doing this in later versions of Filemaker(6 is what we have, and I don't know that an upgrade is in the budget for now, but I'd like to accumulate ammunition)

Good question, It converted to this, when I converted the file to v8:

Left(Substitute(Company; ["," ; ""];

["." ; ""];

[" " ; ""]); 5) & Left(Substitute(Address; ["," ; ""];

["." ; ""];

[" " ; ""]); 4) & Left(Substitute(Zip; ["," ; ""];

["." ; ""];

[" " ; ""]); 5)

Lee

Fenton · March 23, 2006

I would agree with LaRetta, that you might want to include a few more characters, and check for mismatches. It is surprising what you see when you have thousands of records. A "1-in-a-thousand" chance happens several times.

There may also be false negatives, where a small mispelling skews the data. This happens often with people's names, less so with companies. But if data is entered, it is not always consistent. So you may also want to cast a slightly wider net, for those which are not matched with a tight one. It requires manual testing, but the numbers are small.

Another useful tool in later versions (since you want ammunition }:( -) is the Filter() function. Then you can be sure you'll have ONLY what you ask for.

Filter ( Company; "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" )

And, you can use compound relationships, with multiple key fields. So you don't need to bloat the database with so many indexed concatenated fields. This makes it fairly painless to see up more than one test.

I would use the legacy "match" to populate/create correct IDs. This should be done right after every Import. Then checked.

Sign In

Text Concatenation, stripping spaces and punctuation

Recommended Posts

GBNYC

comment

Lee Smith

LaRetta

GBNYC

GBNYC

Lee Smith

Fenton

Create an account or sign in to comment

Create an account

Sign in

Browse

Site Support

Forums

Blogs

Marketplace

Activity

Important Information