Wardiam Posted October 31, 2013
Hi everybody, happy Halloween!

I have a problem importing a text file into FM. The file follows this scheme (without the "[" and "]" symbols):

>[Accession Code1] [Description1]
[ADFGDLFGRTGKRTKRDTHKRKTHKDRTHRDKTKHDR
FGFGHSFHSFHSFGHFGSHFSSFHSFHFGHSFGHSFH
GFHSGFHSFGHS]
>[Accession Code2] [Description2]
[ADFGRVSIVNANVAENBVIAENBIOAENBIHRDKTKHDR
IVAVNAOENVIOENVIOAENVVIAENANOIANAVIOENVA
GFHSGFHSFGHS]
....

I would like to put this information into separate columns:
Column1: [Accession Code]
Column2: [Description]
Column3: [Letter Sequence]

Could anyone help me?
Thank you very much,
Wardiam
comment Posted October 31, 2013
It's not a format that Filemaker can import directly. You could probably import it into a temp table and parse it out from there.

It would be best to attach an actual file as an example, so we can see exactly how it's structured (esp. where carriage returns are). Zip the file before posting it here to protect against accidental modification by all those servers along the way.
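For readers who prefer to reshape the file before importing, the parse step can be sketched in Python. This is only an illustration, not the method used in any file attached to this thread. It assumes each record starts with a ">" line holding the accession code, then whitespace, then the description, and that every following line up to the next ">" belongs to the sequence. The function names `parse_records` and `to_tab_separated` are made up for this example.

```python
import io
from typing import Iterator, List, TextIO, Tuple

Record = Tuple[str, str, str]  # (accession, description, sequence)


def _make_record(header: str, chunks: List[str]) -> Record:
    # Assumption: the accession code is the first whitespace-delimited
    # token of the header line; the rest is the description.
    parts = header.split(None, 1)
    accession = parts[0] if parts else ""
    description = parts[1].strip() if len(parts) > 1 else ""
    # Joining the chunks removes the line breaks inside the sequence.
    return accession, description, "".join(chunks)


def parse_records(handle: TextIO) -> Iterator[Record]:
    """Yield one record per '>' header, reading line by line."""
    header = None
    chunks: List[str] = []
    for raw in handle:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield _make_record(header, chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _make_record(header, chunks)


def to_tab_separated(src: TextIO, dst: TextIO) -> int:
    """Write one tab-separated line per record; return the record count."""
    count = 0
    for accession, description, sequence in parse_records(src):
        # A tab inside the description would break the column layout.
        dst.write("\t".join((accession, description.replace("\t", " "), sequence)) + "\n")
        count += 1
    return count


# Usage with a small made-up sample (not taken from the attached file):
sample = ">P1 first protein\nADFG\nHKRT\n\n>P2 second\nGFHS\n"
out = io.StringIO()
to_tab_separated(io.StringIO(sample), out)
```

Because the sequence lines are joined as they are read, the resulting third column contains no carriage returns, and the output can be imported as a tab-separated text file.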
Wardiam Posted November 1, 2013
I attach an example file: Pputida.zip
Thanks.
comment Posted November 1, 2013
See if this works for you: Parser.fp7.zip

Note: make sure "Tab-Separated Text Files" is selected when you choose the file to import.
Wardiam Posted November 1, 2013
Yes... Great! It's exactly what I want. I'm going to study your example to understand it.
Thank you very much,
Wardiam
Wardiam Posted November 1, 2013
Your example works fabulously, but is it possible to import the "SEQUENCE" field without carriage returns, please?
Thanks again,
Wardiam
comment Posted November 1, 2013
"I'm going to study your example to understand it."

Please do - that is the purpose of the assistance you get here. Then you'll also be able to make the above (or any other) required modification (as well as remove one entirely redundant step I forgot in there...).
Wardiam Posted November 1, 2013
Yes, that's true, I understand. I will continue studying your example, and if I have any questions I will ask again.
Thanks,
Wardiam
bogwort47 Posted February 12, 2014
Hi

txt = Text Editor, sorry I was being lazy.

Character Count? The file lines I gave a sample of above are from a population of 28,194,115 lines - 1.02 GB on disk (1,022,402,995 bytes). The largest file has about 23.5 million lines with 2 to 3 times the character count - 2.16 GB on disk (2,155,500,248 bytes).

This is the link to where the files are: http://data.gov.uk/dataset/anonymised_mot_test
The 2 problem files are the 2013 files, bottom left.

Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'. Is this OK?

Thanks for coming back to me.

PS I am confused - I cannot find my original post and your questions now - Heh Ho

I just copied one of the files and pasted it into TextWrangler without any problems. I then did a find and replace | with \t and it seemed to do what you want?

Sheesh, Lee. I didn't reply here because I didn't want to ruin your work - and now you do?

"Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'."
Please update your profile to reflect your current version; here is a quick link for your convenience. MY PROFILE http://fmforums.com/forum/index.php?app=core&module=usercp&tab=core&area=profileinfo

Hi comment, What did I do? I didn't mean to screw things up.

Uhm, you need to move these last ones (starting from #9) to the new thread you opened here? http://fmforums.com/forum/topic/91080-how-to-deal-with-large-text-imports/

Ok - I've done that - it now shows '8'.

Sorry, I'm getting confused here - are you suggesting I get hold of Text Wrangler and try that please?
Lee Smith Posted February 12, 2014
Because the original poster posted this question multiple times, I had to merge a couple of his topics/replies. Unfortunately, that makes this thread hard to follow.
Lee Smith Posted February 12, 2014
What I'm saying is, you will have to put the files into some text editor or Excel and manipulate the information in order to import it into your FileMaker files. My example above would be the first step in getting it manipulated to bring it into FileMaker.

I find TextWrangler's ability to use regular expressions very helpful in cleaning up text files like this.

Hope this helps,
Lee
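For files too large to open in a text editor, the same find-and-replace can be done as a stream, one line at a time, so memory use stays small. The following is a minimal Python sketch, assuming the fields are separated by "|" and that no field contains a literal "|" or tab. The function name `pipes_to_tabs` and the file names in the comment are hypothetical.

```python
import io
from typing import TextIO


def pipes_to_tabs(src: TextIO, dst: TextIO) -> int:
    """Copy src to dst, replacing every '|' with a tab; return the line count."""
    count = 0
    for line in src:  # only one line is held in memory at a time
        dst.write(line.replace("|", "\t"))
        count += 1
    return count


# Usage with real files (hypothetical names):
#   with open("input.txt", encoding="utf-8") as src, \
#        open("output.tab", "w", encoding="utf-8") as dst:
#       pipes_to_tabs(src, dst)

# Small in-memory demonstration with made-up data:
demo_out = io.StringIO()
demo_count = pipes_to_tabs(io.StringIO("1|A|x\n2|B|y\n"), demo_out)
```

The result is a tab-separated text file, which is one of the formats FileMaker's import dialog accepts, as noted earlier in this thread.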
comment Posted February 12, 2014
@Lee: which file did you download?
Lee Smith Posted February 12, 2014
@comment http://data.dft.gov.uk/anonymised-mot-test/12-03/item_detail.txt
comment Posted February 12, 2014
Why don't you try this one on for size? http://data.dft.gov.uk/anonymised-mot-test/12-03/test_result_2013.txt.gz
Lee Smith Posted February 12, 2014
Holy cow, is this one from the same site? Over 4 hours to download a text file?
bogwort47 Posted February 12, 2014
Hi Lee

Thanks for that - I've been a bit confused this week and I was really beginning to think I was losing it - thank you for your patience. I've tried Wrangler with the larger file, both copying across and using the insert file content function, but it failed because of insufficient memory. I guess because I am limited to 2G that even with the smaller file I'm still looking at a file splitter?
comment Posted February 12, 2014
Splitting the files into smaller chunks is the smart thing to do here, IMHO - not only because of the required text manipulation, but also because importing such a large data file can easily be aborted in the middle of the process due to RAM running out.

This is not really a Filemaker question now, but I'd suggest you learn how to use the split command from the command line (i.e. the Terminal application).
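As an alternative to the command-line tool, the idea of splitting by line count (similar in spirit to `split -l`) can be sketched in Python. This is an illustration under stated assumptions, not a replacement for the real utility: it assumes lines end in LF or CRLF, and the function name `split_by_lines` and the `.partNNNN` naming scheme are made up for this example. The file is read in binary mode so that no text encoding has to be guessed.

```python
import os
from typing import List


def split_by_lines(path: str, lines_per_chunk: int, out_dir: str) -> List[str]:
    """Split the file at `path` into chunks of at most `lines_per_chunk` lines.

    Returns the chunk paths in order. Only one line is held in memory
    at a time, so the size of the source file does not matter.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    written: List[str] = []
    out = None
    try:
        with open(path, "rb") as src:
            for index, line in enumerate(src):
                if index % lines_per_chunk == 0:
                    # Start a new chunk file every `lines_per_chunk` lines.
                    if out is not None:
                        out.close()
                    chunk_path = os.path.join(out_dir, f"{base}.part{len(written):04d}")
                    out = open(chunk_path, "wb")
                    written.append(chunk_path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


# Usage (hypothetical file name and chunk size):
#   split_by_lines("test_result_2013.txt", 500_000, "chunks")
```

Splitting on line boundaries matters here: a splitter that cuts at a fixed byte count can break a record in two, which would then have to be repaired before import.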
bogwort47 Posted February 12, 2014
Sorry I'm a bit thick - Terminal Application?? OK - In Utilities I guess
bogwort47 Posted February 13, 2014
Gentlemen - Thanks for getting my brain straightened out.

I eventually found a free test version of File Splitter SE (also in Apple Store) which works like a dream. I've split the files into ~ 500,000 Mb chunks (quite quick), used Text Editor to change to csv (~6 hrs) and now I'm importing the 1st chunk into FMP (looking like about 5 hrs).

For what it is worth, Excel imports the original file just fine without converting to csv BUT the version I have has a max page size of ~65K lines - so with 30 x 10^6 lines to import that would look like ~ 0.5 x 10^3 pages - unless there is a bigger version of Excel??

The question is now: "When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"

Thanks again,
comment Posted February 13, 2014
AFAIK, the current version of Excel (since 2007?) supports ~1 million rows.

"When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"

My guess? Never. Unlike a spreadsheet, a database requires a very strictly structured input. It makes sense to limit the support to a few well-established formats and leave the "guess what my structure is" game to the more forgiving applications.
bogwort47 Posted February 14, 2014
Thank you very much for your responses - both prompt and helpful.
Wardiam Posted September 25, 2014
Hi everybody,

Comment, I frequently use your import+parse script to reorganize my protein databases, but I have a lot of problems with big databases. This link points to a database with 130 MB of plain text:
https://drive.google.com/file/d/0B5qk9fj1FG3ZMTRoSDQyRF9na0E/edit?usp=sharing

I can initially import the file into my FileMaker solution, but when I run the parse script, FileMaker closes with an error message after half an hour. I have tried increasing the cache size to 512 MB (the maximum), but the program still closes.

Could anyone help me to improve application performance?
Thanks,
Wardiam