October 31, 2013
Hi everybody, happy Halloween!!! I have a problem importing a text file into FM. The file follows this scheme (without the "[" and "]" symbols):

>[Accession Code1] [Description1]
[ADFGDLFGRTGKRTKRDTHKRKTHKDRTHRDKTKHDR
FGFGHSFHSFHSFGHFGSHFSSFHSFHFGHSFGHSFH
GFHSGFHSFGHS]
>[Accession Code2] [Description2]
[ADFGRVSIVNANVAENBVIAENBIOAENBIHRDKTKHDR
IVAVNAOENVIOENVIOAENVVIAENANOIANAVIOENVA
GFHSGFHSFGHS]
....

I would like to import this information into separate columns:
Column1: [Accession Code]
Column2: [Description]
Column3: [Letter Sequence]

Could anyone help me? Thank you very much, Wardiam
October 31, 2013
It's not a format that FileMaker can import directly. You could probably import it into a temp table and parse it out from there. It would be best to attach an actual file as an example, so we can see exactly how it's structured (especially where the carriage returns are). Zip the file before posting it here to protect it against accidental modification by all those servers along the way.
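For readers who would rather pre-process the file outside FileMaker, here is a minimal Python sketch of the "parse it out" step. It is not the approach used in the attached example file later in this thread. It assumes a layout the thread never confirms: each record starts with a line beginning with ">", the accession code is the first whitespace-delimited token on that line, the rest of the line is the description, and every following line up to the next ">" belongs to the sequence. The function names and the sample records are made up for illustration.

```python
import csv
import io


def parse_records(lines):
    """Yield (accession, description, sequence) tuples from FASTA-like lines.

    Assumed layout (not confirmed by the thread): a header line starting
    with ">" followed by one or more sequence lines.
    """
    accession = description = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if accession is not None:
                # A new header closes the previous record.
                yield accession, description, "".join(chunks)
            parts = line[1:].strip().split(None, 1)
            accession = parts[0] if parts else ""
            description = parts[1] if len(parts) > 1 else ""
            chunks = []
        elif accession is not None:
            chunks.append(line)  # sequence lines are joined without breaks
    if accession is not None:
        yield accession, description, "".join(chunks)


def to_tsv(lines, out):
    """Write one tab-separated row per record to the file-like object out."""
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for record in parse_records(lines):
        writer.writerow(record)


# Hypothetical sample data, only to show the shape of the output.
sample = """>P12345 Example protein one
ADFGDLFGRTGK
RTKRDTHK
>Q67890 Example protein two
ADFGRVSIVN
"""
buffer = io.StringIO()
to_tsv(sample.splitlines(), buffer)
print(buffer.getvalue(), end="")
```

Because the generator reads one line at a time, passing an open file object instead of `sample.splitlines()` should keep memory use low even for large inputs. The resulting file has three tab-separated columns per record and could then be imported as tab-separated text.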
November 1, 2013
See if this works for you: Parser.fp7.zip
Note: make sure "Tab-Separated Text Files" is selected when you choose the file to import.
November 1, 2013, Author
Yes... Great!!! It's exactly what I want. I'm going to study your example to understand it. Thank you very much, Wardiam
November 1, 2013, Author
Your example works fabulously, but is it possible to import the "SEQUENCE" field without carriage returns, please? Thanks again, Wardiam
November 1, 2013
"I'm going to study your example to understand it."
Please do - that is the purpose of the assistance you get here. Then you'll also be able to make the above (or any other) required modification (as well as remove one entirely redundant step I forgot in there...).
November 1, 2013, Author
Yes, that's true, I understand. I will continue studying your example, and if I have any doubts I will ask again. Thanks, Wardiam
February 12, 2014
Hi. txt = Text Editor, sorry, I was being lazy. Character count? The file lines I gave a sample of above are from a population of 28,194,115 lines - 1.02 GB on disk (1,022,402,995 bytes). The largest file has about 23.5 million lines with 2 to 3 times the character count - 2.16 GB on disk (2,155,500,248 bytes). This is the link to where the files are: http://data.gov.uk/dataset/anonymised_mot_test The 2 problem files are the 2013 files, bottom left. Sorry, FMP 8.0V1, not '7', which crops up in the file extension as 'fp7'. Is this OK? Thanks for coming back to me. PS I am confused - I cannot find my original post and your questions now - Heh Ho
I just copied one of the files and pasted it into TextWrangler without any problems. I then did a find and replace of | with \t and it seemed to do what you want?
Sheesh, Lee. I didn't reply here because I didn't want to ruin your work - and now you do?
"Sorry, FMP 8.0V1, not '7', which crops up in the file extension as 'fp7'."
Please update your profile to reflect your current version; here is a quick link for your convenience. MY PROFILE: http://fmforums.com/forum/index.php?app=core&module=usercp&tab=core&area=profileinfo
Hi comment, what did I do? I didn't mean to screw things up.
Uhm, you need to move these last ones (starting from #9) to the new thread you opened here? http://fmforums.com/forum/topic/91080-how-to-deal-with-large-text-imports/
Ok - I've done that - it now shows '8'. Sorry, I'm getting confused here - are you suggesting I get hold of TextWrangler and try that, please?
February 12, 2014
Because the original poster posted this question multiple times, I had to merge a couple of his topics and replies. Unfortunately, that makes this thread hard to follow.
February 12, 2014
What I'm saying is, you will have to put the files into some text editor or Excel and manipulate the information in order to import it into your FileMaker files. My example above would be the first step in getting it manipulated to bring it into FileMaker. I find TextWrangler's ability to use regular expressions very helpful in cleaning up text files like this. Hope this helps, Lee
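If a text editor cannot hold the whole file in memory, the same pipe-to-tab replacement can be done as a stream. The following is only a sketch in Python, assuming that "|" is the field separator and that no field value itself contains a pipe or a tab; the function name, the file names in the comment, and the sample rows are hypothetical.

```python
import io


def pipes_to_tabs(src, dst):
    """Copy src to dst line by line, replacing each "|" with a tab.

    Only one line is held in memory at a time, so the size of the
    input file should not matter. Returns the number of lines copied.
    """
    count = 0
    for line in src:
        dst.write(line.replace("|", "\t"))
        count += 1
    return count


# Small in-memory demonstration with made-up rows.
source = io.StringIO("1|2013-01-01|FORD\n2|2013-01-02|VAUXHALL\n")
target = io.StringIO()
copied = pipes_to_tabs(source, target)
print(copied, "lines converted")
print(target.getvalue(), end="")

# With real files (names are placeholders, encoding is an assumption):
# with open("input.txt", encoding="utf-8", newline="") as src, \
#         open("output.tab", "w", encoding="utf-8", newline="") as dst:
#     pipes_to_tabs(src, dst)
```

The output can then be imported as tab-separated text, which is one of the formats FileMaker was said above to accept.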
February 12, 2014
Why don't you try this one on for size? http://data.dft.gov.uk/anonymised-mot-test/12-03/test_result_2013.txt.gz
February 12, 2014
Holy cow, is this one from the same site? Over 4 hours to download a text file?
February 12, 2014
Hi Lee. Thanks for that - I've been a bit confused this week and I was really beginning to think I was losing it - thank you for your patience. I've tried TextWrangler with the larger file, both copying across and using the insert file content function, but it failed because of insufficient memory. I guess that because I am limited to 2 GB, even with the smaller file I'm still looking at a file splitter?
February 12, 2014
Splitting the files into smaller chunks is the smart thing to do here, IMHO - not only because of the required text manipulation, but also because importing such a large data file can easily be aborted in the middle of the process when RAM runs out. This is not really a FileMaker question now, but I'd suggest you learn how to use the split command from the command line (i.e. the Terminal application).
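As an alternative to the command-line tool, the same idea (cutting a file into pieces with a fixed number of lines each) can be sketched in Python. This is an illustration under assumptions, not a tested recipe for the MOT files: the UTF-8 encoding, the chunk naming scheme, and the function name are all hypothetical choices.

```python
import os
import tempfile


def split_by_lines(path, lines_per_chunk, out_dir):
    """Split the text file at path into chunks of at most lines_per_chunk lines.

    Lines are streamed one at a time, so memory use stays small.
    Returns the list of chunk file paths in order.
    """
    paths = []
    dst = None
    try:
        # newline="" keeps the original line endings untouched.
        with open(path, encoding="utf-8", newline="") as src:
            for number, line in enumerate(src):
                if number % lines_per_chunk == 0:
                    if dst is not None:
                        dst.close()
                    out_path = os.path.join(
                        out_dir, f"chunk_{len(paths) + 1:04d}.txt"
                    )
                    dst = open(out_path, "w", encoding="utf-8", newline="")
                    paths.append(out_path)
                dst.write(line)
    finally:
        if dst is not None:
            dst.close()
    return paths


# Demonstration on a tiny temporary file: 5 lines, 2 lines per chunk.
with tempfile.TemporaryDirectory() as tmp:
    big = os.path.join(tmp, "big.txt")
    with open(big, "w", encoding="utf-8", newline="") as f:
        f.writelines(f"row {i}\n" for i in range(5))
    chunks = split_by_lines(big, 2, tmp)
    print([os.path.basename(p) for p in chunks])
```

Splitting on line boundaries rather than on byte counts means no record is cut in half, so each chunk can be imported on its own.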
February 13, 2014
Gentlemen - Thanks for getting my brain straightened out. I eventually found a free test version of File Splitter SE (also in Apple Store) which works like a dream. I've split the files into ~ 500,000 Mb chunks (quite quick), used Text Editor to change to csv (~6 hrs) and now I'm importing the 1st chunk into FMP (looking like about 5 hrs). For what it is worth, Excel imports the original file just fine without converting to csv, BUT the version I have has a max page size of ~65K lines - so with 30 x 10^6 lines to import that would look like ~ 0.5 x 10^6 pages - unless there is a bigger version of Excel??
The question is now: "When will FMP be modified so that the 'field separator' can be selected by the operator before import OR, better still, determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"
Thanks again,
February 13, 2014
AFAIK, the current version of Excel (since 2007?) supports ~1 million rows.
"When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"
My guess? Never. Unlike a spreadsheet, a database requires a very strictly structured input. It makes sense to limit the support to a few well-established formats and leave the "guess what my structure is" game to the more forgiving applications.
September 25, 2014, Author
Hi everybody. Comment, I frequently use your import+parse script to reorganize my protein databases, but I have a lot of problems with big databases. This link points to a database with 130 MB of plain text: https://drive.google.com/file/d/0B5qk9fj1FG3ZMTRoSDQyRF9na0E/edit?usp=sharing
I can import the file into my FileMaker solution, but when I run the parse script, FileMaker closes with an error message after about half an hour. I have tried increasing the cache size up to 512 MB (the maximum), but the program still closes. Could anyone help me improve the application's performance? Thanks, Wardiam