Import plain text to FM

Wardiam · October 31, 2013

Hi everybody,

Happy Halloween!!! I have a problem importing a text file into FM. The file has this layout (without the "[" and "]" symbols):

>[Accession Code1] [Description1]

[ADFGDLFGRTGKRTKRDTHKRKTHKDRTHRDKTKHDR

FGFGHSFHSFHSFGHFGSHFSSFHSFHFGHSFGHSFH

GFHSGFHSFGHS]

>[Accession Code2] [Description2]

[ADFGRVSIVNANVAENBVIAENBIOAENBIHRDKTKHDR

IVAVNAOENVIOENVIOAENVVIAENANOIANAVIOENVA

GFHSGFHSFGHS]

....

I would like to include this information in different columns as:

Column1: [Accession Code]

Column2: [Description]

Column3: [Letter Sequence]

Could anyone help me?

Thank you very much,

Wardiam

comment · October 31, 2013

It's not a format that Filemaker can import directly. You could probably import it into a temp table and parse it out from there. It would be best to attach an actual file as an example, so we can see exactly how it's structured (esp. where carriage returns are). Zip the file before posting it here to protect against accidental modification by all those servers along the way.
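As an alternative to parsing inside a temp table, the file can be reshaped before import. The following is a minimal Python sketch, not the approach of any file attached to this thread. It assumes each record starts with a ">" line whose first whitespace-delimited token is the accession code and whose remainder is the description, and that every following non-blank line belongs to the sequence. The function names `parse_fasta` and `fasta_to_tsv` are made up for illustration.

```python
def parse_fasta(lines):
    """Yield (accession, description, sequence) tuples from FASTA-style lines."""
    accession = description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between or inside records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if accession is not None:
                yield accession, description, "".join(chunks)
            header = line[1:].split(None, 1)
            accession = header[0] if header else ""
            description = header[1] if len(header) > 1 else ""
            chunks = []
        elif accession is not None:
            chunks.append(line)  # sequence line; joined later without line breaks
    if accession is not None:
        yield accession, description, "".join(chunks)


def fasta_to_tsv(src, dst):
    """Write one tab-separated row per record to dst; return the row count."""
    count = 0
    for accession, description, sequence in parse_fasta(src):
        # Assumption: a tab inside a description should not create a new column.
        description = description.replace("\t", " ")
        dst.write(f"{accession}\t{description}\t{sequence}\n")
        count += 1
    return count
```

Called with an open source file and an open output file (the file names are up to you), this writes one row per record with the sequence joined into a single line, which can then be imported as tab-separated text.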

Wardiam · November 1, 2013

I attach an example file.

Pputida.zip

Thanks.

comment · November 1, 2013

See if this works for you:

Parser.fp7.zip

Note: make sure "Tab-Separated Text Files" is selected when you choose the file to import.

Wardiam · November 1, 2013

Yes...Great!!! It's exactly what I want. I'm going to study your example to understand it.

Thank you very much,

Wardiam

Wardiam · November 1, 2013

Your example works fabulously, but is it possible to import the "SEQUENCE" field without carriage returns, please?

Thanks again,

Wardiam

comment · November 1, 2013

"I'm going to study your example to understand it."

Please do - that is the purpose of the assistance you get here. Then you'll also be able to make the above (or any other) required modification (as well as remove one entirely redundant step I forgot in there...).

Wardiam · November 1, 2013

Yes, it's true, I understand you. I will continue studying your example, and if I have any doubts I will ask again.

Thanks,

Wardiam

bogwort47 · February 12, 2014

Hi

txt = Text Editor, sorry I was being lazy.

Character Count? The file lines I gave a sample of above are from a population of 28,194,115 lines - 1.02 GB on disk (1,022,402,995 bytes)

The largest file has about 23.5 million lines at 2 to 3 times the character count - 2.16 GB on disk (2,155,500,248 bytes)

This is the link to where the files are: http://data.gov.uk/dataset/anonymised_mot_test

The 2 problem files are the 2013 files, bottom left. Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'.

Is this OK?

Thanks for coming back to me.

PS I am confused - I cannot find my original post and your questions now - Heh Ho

I just copied one of the files and pasted it into TextWrangler without any problems. I then did a find and replace of | with \t and it seemed to do what you want?

Sheesh, Lee. I didn't reply here because I didn't want to ruin your work - and now you do?

Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'.

Please update your profile to reflect your current version. Here is a quick link for your convenience: MY PROFILE http://fmforums.com/forum/index.php?app=core&module=usercp&tab=core&area=profileinfo

Hi comment,

What did I do? I didn't mean to screw things up. :sad:

Uhm, you need to move these last ones (starting from #9) to the new thread you opened here?

http://fmforums.com/forum/topic/91080-how-to-deal-with-large-text-imports/

Ok - I've done that - it now shows '8'.

Sorry, I'm getting confused here - are you suggesting I get hold of TextWrangler and try that, please?

Lee Smith · February 12, 2014

Because the original poster posted this question multiple times, I had to merge a couple of his topics/replies. Unfortunately, that makes this thread hard to follow.

Lee Smith · February 12, 2014

What I'm saying is, you will have to put the files into some text editor or Excel and manipulate the information in order to import it into your FileMaker files.

My example above would be the first step in manipulating the data so it can be brought into FileMaker. I find TextWrangler's ability to use regular expressions very helpful in cleaning up text files like this.

Hope this helps,

Lee
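For files too large to paste into an editor, the same pipe-to-tab replacement can be streamed one line at a time. This is a minimal Python sketch, assuming the data is UTF-8 and that no field contains a literal "|" or tab character (neither is confirmed in this thread). `pipes_to_tabs` and `convert_file` are hypothetical names.

```python
def pipes_to_tabs(src, dst):
    """Copy src to dst line by line, replacing every '|' with a tab."""
    count = 0
    for line in src:
        dst.write(line.replace("|", "\t"))
        count += 1
    return count


def convert_file(in_path, out_path, encoding="utf-8"):
    """Convert a pipe-delimited file on disk; returns the number of lines written."""
    # newline="" keeps the original line endings untouched in both directions.
    with open(in_path, encoding=encoding, newline="") as src, \
         open(out_path, "w", encoding=encoding, newline="") as dst:
        return pipes_to_tabs(src, dst)
```

Because only one line is held in memory at a time, the size of the input file should not matter much here, unlike pasting the whole file into an editor.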

comment · February 12, 2014

@Lee: which file did you download?

Lee Smith · February 12, 2014

@comment

http://data.dft.gov.uk/anonymised-mot-test/12-03/item_detail.txt

comment · February 12, 2014

Why don't you try this one on for size?

http://data.dft.gov.uk/anonymised-mot-test/12-03/test_result_2013.txt.gz

Lee Smith · February 12, 2014

Holy cow, is this one from the same site? Over 4 hours to download a text file?

bogwort47 · February 12, 2014

Hi Lee

Thanks for that - I've been a bit confused this week and I was really beginning to think I was losing it, so thank you for your patience. I've tried TextWrangler with the larger file, both copying across and using the insert file content function, but it failed because of insufficient memory. I guess that because I am limited to 2G, even with the smaller file I'm still looking at a file splitter?

comment · February 12, 2014

Splitting the files into smaller chunks is the smart thing to do here, IMHO - not only because of the required text manipulation, but also because importing such a large data file can easily abort in the middle of the process when RAM runs out.

This is not really a Filemaker question now, but I'd suggest you learn how to use the split command from the command line (i.e. the Terminal application).
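If the Terminal is unfamiliar, the same line-based splitting can be sketched in Python. This is an illustration rather than a replacement for `split`; the function name `split_file`, the chunk size, and the output naming scheme are all assumptions.

```python
from itertools import islice


def split_file(in_path, lines_per_chunk, prefix):
    """Split in_path into numbered files of at most lines_per_chunk lines each.

    Returns the list of chunk paths in order. Files are handled as bytes,
    so the original encoding and line endings are preserved.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    chunk_paths = []
    with open(in_path, "rb") as src:
        while True:
            first = src.readline()
            if not first:
                break  # end of input: do not create an empty chunk
            path = f"{prefix}{len(chunk_paths):04d}.txt"
            with open(path, "wb") as dst:
                dst.write(first)
                # Stream the rest of the chunk without loading it into memory.
                dst.writelines(islice(src, lines_per_chunk - 1))
            chunk_paths.append(path)
    return chunk_paths
```

For example, `split_file("test_result_2013.txt", 500_000, "chunk_")` would write `chunk_0000.txt`, `chunk_0001.txt`, and so on, each small enough to open in a text editor or import on its own.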

bogwort47 · February 12, 2014

Sorry I'm a bit thick - Terminal Application??

OK - In Utilities I guess

bogwort47 · February 13, 2014

Gentlemen - Thanks for getting my brain straightened out. I eventually found a free test version of File Splitter SE (also in Apple Store) which works like a dream. I've split the files into ~ 500,000 Mb chunks (quite quick), used Text Editor to change to csv (~6 hrs) and now I'm importing into FMP the 1st chunk (looking like about 5 hrs).

For what it is worth, Excel imports the original file just fine without converting to csv, BUT the version I have has a max page size of ~65K lines - so with 30 x 10⁶ lines to import that would look like ~ 0.5 x 10³ pages - unless there is a bigger version of Excel??

The question is now: "When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"

Thanks again,

comment · February 13, 2014

AFAIK, the current version of Excel (since 2007?) supports ~1 million rows.

"When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"

My guess? Never. Unlike a spreadsheet, a database requires a very strictly structured input. It makes sense to limit the support to a few well-established formats and leave the "guess what my structure is" game to the more forgiving applications.
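To illustrate why separator guessing is a heuristic rather than a guarantee, Python's standard `csv.Sniffer` makes such a guess from a sample of the data. This is a minimal sketch; the helper name `guess_delimiter` and the candidate list are assumptions.

```python
import csv


def guess_delimiter(sample, candidates="|\t,;"):
    """Guess the field separator from a text sample.

    Only characters listed in candidates are considered.
    Raises csv.Error when no consistent separator can be determined.
    """
    return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
```

On a clean, regular sample the guess is usually right, but on irregular data it can raise `csv.Error` or pick the wrong character, which is the kind of uncertainty a strictly structured import is designed to avoid.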

bogwort47 · February 14, 2014

Thank you very much for your responses - both prompt and helpful.

Wardiam · September 25, 2014

Hi everybody,

comment, I frequently use your import+parse script to reorganize my protein databases, but I have a lot of problems with big databases. This link points to a database with 130 MB of plain text:

https://drive.google.com/file/d/0B5qk9fj1FG3ZMTRoSDQyRF9na0E/edit?usp=sharing

I can import the file into my FileMaker solution at first, but when I run the parse script, FileMaker closes with an error message after about half an hour. I have tried increasing the cache size up to 512 MB (the maximum), but the program closes again. Could anyone help me improve the application's performance?

Thanks,

Wardiam
