
Can FMP be used to clean up import files BEFORE the import?



Posted

Hi everyone,

One of the processes I have to manually do prior to importing data into one of our solutions is to clean all the commas out of a tab-delimited file. Currently, I have to open the file in a text editor and replace all commas with a space, and save the file. Then, FMP can be used to import all of the data properly.

I'm trying to automate as much of this process as I can. I'd like to be able to "pre-import" the text file, use FMP to search for the commas and replace them with spaces, and then use the resulting data as the source of the import into the actual data table.

Might this be possible?

Thanks for your help!

Mac Hammer

Posted (edited)

Hi MacHammer:

Yeah, seems like it could be done when thinking in terms of a buffer file. I can't offer great specifics yet on all the routines you would need (only one's hairdresser knows for sure), but much of that "groundskeeping" could be done in the buffer file.

Additionally, we can have a full or partial complement of the "main" file's fields in the buffer file. This provides an opportunity to prep the data, review the data for any integrity issues, as well as a means to check whether the records we're about to import already exist in the main file ... if one needed to do so.
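For example (purely a sketch; the key field is invented), if the Buffer table were related to the Main table on something like a MemberID, a flag calculation in the Buffer could warn of duplicates before the import:

  DuplicateFlag = If ( not IsEmpty ( Main::MemberID ) ; "Already in Main" ; "" )

A scripted find on that flag would then round up the questionable records for review.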

Stick with it. You've got a good idea!

Edited by Guest
for clarity
Posted

Thanks!

I'm playing with it now and have created the "buffer" file you mentioned as a test, but it could grow into the actual solution. What I've toyed with so far is importing the data not into a text field, but into a container. Each line of the data IS coming in, without the comma messing up the import. If we try to import the same data into a TEXT field, the comma screws it up: the data before the comma becomes one record, while the data after becomes a second record. That, of course, is what I'm trying to avoid.

So, the container seems to handle the import, but since it is a container, I can't seem to edit the data. Then, I made a calculation field that basically sets the new field equal to the container, but makes it text. However, I still can't edit it because it is now a calculation based on an un-editable field. AAAAAAAHHHHHHH!!!!!!!
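In case it helps anyone picture it, the calculation I set up is roughly this (the field names are just what I'm calling them here):

  ImportText = GetAsText ( ImportContainer )

I'm wondering whether defining ImportText as a regular text field with an auto-enter calculation, instead of a stored calculation field, would leave it editable after entry. Haven't tried that yet.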

I seem SOOOO close.

Keep those ideas coming! I know we can do it.

Mac

Posted

Cool! You're hot on the trail. Just thinking out loud here, so don't hold me to the fire, but what about ...

• importing the text file into a single text field (a global, perhaps)

• performing any of the preliminary groundskeeping you can at that point (could have an assortment of scripted buttons to remove/replace commas and the like with preferred characters through the use of functions such as, but not limited to, Substitute, Filter, Upper, Lower, etc.); a rough sketch of this and the parsing step appears after this list

• if you want to go with the "shadow" fields approach in your buffer file, a script could then parse those massaged, return-separated values in the "big" field and create a record for each value....

• Otherwise, perhaps the Export Field Contents feature could lend a hand.
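To make the cleanup and parsing bullets a little more concrete, here's an untested sketch in FM8-style script steps (table, field, and variable names are all invented):

  # comma cleanup in the global field
  Set Field [ Buffer::gImportText ; Substitute ( Buffer::gImportText ; "," ; " " ) ]

  # one new record per return-separated line
  Set Variable [ $i ; Value: 0 ]
  Loop
    Exit Loop If [ $i ≥ ValueCount ( Buffer::gImportText ) ]
    Set Variable [ $i ; Value: $i + 1 ]
    New Record/Request
    Set Field [ Buffer::RawLine ; GetValue ( Buffer::gImportText ; $i ) ]
  End Loop

Because gImportText is a global, it stays visible as the loop creates new records, which is what lets this work.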

You gotta stop me here! As long as you don't mind doing the experimenting, I'll try to help out where I can. Let us know what you find.

Posted

Cool.

While you were responding, here is what I've learned:

First off, the container import will actually split the data into properly separated records that INCLUDE the commas, so we already get around the first comma problem right there. And actually, the commas aren't the problem, per se...

The data file we import from has commas that separate Last Name and First Name fields, as in "Hammer, Mac", and in the Residential Address as "City, State ZIP". They aren't problematic, assuming they don't mess with the import. Once in the database, we can deal with the commas, as they mostly just end up printed on envelopes for mail delivery, and the post office certainly doesn't care about them. So, that might solve the actual problem right there.

The next step: since the text file is made of fixed-length data fields, we have to work out that the first 40 characters are the NAME, the next 40 are the Res Address, etc. That isn't hard, just time consuming.
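For illustration, the slicing calculations in the buffer would be something like this (the 40-character widths are just my example; Middle does the real work):

  Name       = Trim ( Middle ( RawLine ; 1 ; 40 ) )
  ResAddress = Trim ( Middle ( RawLine ; 41 ; 40 ) )

Trim knocks off the padding spaces that fixed-width data usually carries.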

Then, I want to IMPORT/EXPORT TO/FROM this buffer to the actual database, where I import just the field data and get rid of all the fixed-length data in the main database.

I'm not trying to duck the actual resolution of finding a way to remove the commas while in the buffer file, but I may have solved the larger issue!


Mac

Posted

That's pretty impressive, Mac! Wow, you've really taken off running with this. That's great!

"Then, I want to IMPORT/EXPORT TO/FROM this buffer to the actual database, where I import just the field data and get rid of all the fixed-length data in the main database." (Emphasis added.)

The more refinement that can be done in the Buffer file, the less drag there may be when it comes to importing into the Main file, especially if the Main file is a big, honkin' monster. Some people have reported very slow import/record creation; some say FM8 is better than FM7, others say it's just the opposite. Either way, the more work that can be done in the Buffer, the better.

If you have FM8.5 Advanced (or even FM8.0 Advanced), you can "copy" whole tables out of the main file and "paste" them into the Buffer file. Same with fields.

I bring this up because one of the "selling points" for me in this Buffer file model is the opportunity to review/clean/purge the data before it is truly added to the Main file. If you've ever squirreled up a target file with an erroneous/premature import, you know why this appeals to me.

In any respect, I commend you for your work! :beertime:

Posted

Hi! I'm also working on an import/export utility - interesting topic....

What I don't understand here, though, is what the problem is when importing tab-delimited data.

I've just made a test file with name and address fields and can quite happily import and export data to/from the fields, including commas. How do commas affect your import? What extension does the file have: .txt, .tab, .csv?

Posted

Mr. Watson,

You know, not being able to import commas seemed very strange to me, so I went back to my original solution and realized that the initial import was probably botched, and I blamed it on the commas. My new "buffer" database is able to handle the commas just fine. Now I'm trying to remember if there was another downstream reason that we needed to kill the commas, but I can't remember one if there was. I still like the idea of having the buffer database to clean up stuff before it lands in the real user database, so I'm still going to keep working on this, but you are right: the commas shouldn't matter in a tab-delimited solution.

Thanks.

Mac

Posted

One issue: I've found that if you don't specify the import format, FileMaker attempts to choose the format based on the contents. So, if you have commas in the text, FM will decide it's a comma-delimited format even if it's really a tab-delimited one. This has bitten me a few times.

However, if you do set it properly to Tab-delimited, it should handle commas just fine.
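If the import itself is scripted, the format can be pinned down instead of left to auto-detection. Very roughly (treat this as pseudocode; the exact Import Records options differ between FileMaker versions):

  Import Records [ No dialog ; "data.tab" ; Add ]

with the file type explicitly set to Tab-Separated Text in the import setup. Giving the file a .tab extension in the first place also seems to nudge FileMaker toward the right guess.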

