probably easy to implement, but I'm a beginner

June 4, 2006

Newbies

To anyone who knows how to implement what seems, logically, like an easy problem to solve (considering the power of this amazing program),

I have a directory of text files (~600 of them) that were typed by my dad's secretary about 10 years ago that contain, in each file, one contact's information for an address book.

Unfortunately (for me), the files are not in any delimited format, so I cannot use a standard import to split the data into the necessary fields.

I have included a zip file containing a sample of the variations in the source files that I need to parse. I tried to choose the most diverse ones, so that any suggested algorithm would be over-engineered enough to handle them all, but that may just be optimistic praying:

** This is what my goal is:

I would like to run through these text files (usually one contact per file, so long as it does not contain an alternate contact as in the WEBER.TXT file that is in the archive I attached herein), and pull out the components of a basic address book, provided that they exist for that record.

Such as:

Name (First, Last) or Company Name with Contact

Street Address

City, State Zip

Telephone #

Notes: (TO CONTAIN ALL EXTRANEOUS TEXT THAT REMAINS AFTER PARSING IMPORTANT DATA (ABOVE))

It does not need to be perfect, as there will be a secretary to scrub through and correct any oversights that a script may not catch.

In addition, about 1/5 of the text files were originally written in WordPerfect as envelopes, including a return address. I would like to strip out the return address completely, as it will probably just make it harder to locate the important data (an example can be seen in the file joesmith.txt within the attached archive). This is probably really easy, but I am so new at this that I haven't a clue how to locate and remove multiple lines of text.

Thank you, in advance, for all your help. I really haven't a clue where to begin. I started reading the man pages for 'grep' and it seems like this would be a way to do it, I just don't know how to pass the data to fields and such methods.

I appreciate it very much.

Yours truly,

Jonathan

FileMaker_Project.zip

June 4, 2006

There doesn't seem to be any sort of recurring format between the files...

June 5, 2006

Hi Jonathan,

I agree with Genx on this. Your example text seems to be double-spaced in some places and single-spaced in others.

You should normalize it and remove the returns between the fields. I use TextWrangler, which has some great features for finding and replacing.

The record should look like:

FirstName (tab) LastName (tab) StreetAddress (tab) City (tab) State (tab) Zip

HTH

Lee

June 5, 2006

Author
Newbies

Hey all,

thanks for the replies, but my goal was to try and pick out information based on the few unique traits that exist, such as:

1. Any text in the format (###) ###-#### or ###/###-#### is a phone number.

2. Any line with a number as its first character, followed by a string of one or more characters, is likely a street address

- OR -

The line preceding a line that contains a string, followed by a ",", followed by 5 numbers, will be the street address, while the identified line itself contains "city, state, zipcode".

3. The rest of the information can, for all practical purposes, be lumped together and placed in a "notes" or "misc" field for later revision by my dad.

The frustrating part is that I am able to pick out information like the phone number using a grep statement, but I don't know the syntax well enough to know how to use regular expressions within FileMaker... I think I am a bit confused.

Thanks for any clarification.

Jonathan

June 5, 2006

Author
Newbies

Merged--

I have a problem that I would appreciate some tips on how to solve:

I have a directory of text files that contain contact information of the form:

Name

Company Name

Street Address

City, State Zip

but it is not that straightforward, because the above block of text may occur anywhere within the text file, often accompanied by notes that are pertinent to each contact.

Can someone suggest a way, other than one-by-one, to extract this block of data from about 600 records, regardless of leading or trailing info, so long as the format of the actual address is as above?

Many Thanks,

Jonathan

Edited June 5, 2006 by Guest

June 5, 2006

FileMaker does not have native grep capability. If you have a way of parsing out the required information with grep, go ahead and do it in a text editor such as TextWrangler, or any other tool that supports grep operations. Once you have that, format the data as indicated in Lee's post above, and import it into FileMaker.

TextWrangler's predecessor, BBEdit Lite, could also concatenate several text files into one - sounds like it would be useful to start with something like that.
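If a scripting language is available, the concatenation step could also be done with a short script instead of an editor. The following is only a rough sketch in Python: the function names `flatten` and `merge_text_files`, the " | " separator standing in for the original returns, and the latin-1 encoding guess are all assumptions, not anything taken from the attached files. It writes one tab-delimited row per source file (file name, then the flattened text), which could then be imported and parsed further.

```python
from pathlib import Path


def flatten(text, sep=" | "):
    """Collapse one file's text to a single line with no tabs or returns."""
    # Squeeze runs of whitespace (including tabs) within each line.
    parts = [" ".join(ln.split()) for ln in text.splitlines()]
    # Skip blank lines and mark the remaining line breaks with `sep`.
    return sep.join(p for p in parts if p)


def merge_text_files(src_dir, out_path):
    """Write one row per .txt/.TXT file in src_dir: name <tab> flattened text."""
    files = sorted(
        p for p in Path(src_dir).iterdir()
        if p.is_file() and p.suffix.lower() == ".txt"
    )
    with open(out_path, "w", encoding="utf-8", newline="\n") as out:
        for p in files:
            # latin-1 is assumed because it accepts any byte sequence.
            text = p.read_text(encoding="latin-1")
            out.write(f"{p.name}\t{flatten(text)}\n")
    return len(files)
```

Giving the output file an extension other than .txt (or writing it to a different folder) keeps a second run from picking up the merged file as if it were a contact.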