
This topic is 3712 days old. Please don't post here. Open a new topic instead.


Posted

Hi everybody,

 

Happy Halloween! I'm having trouble importing a text file into FM. The file follows this scheme (without the "[" and "]" symbols):

 

>[Accession Code1] [Description1]

[ADFGDLFGRTGKRTKRDTHKRKTHKDRTHRDKTKHDR

FGFGHSFHSFHSFGHFGSHFSSFHSFHFGHSFGHSFH

GFHSGFHSFGHS]

 

>[Accession Code2] [Description2]

[ADFGRVSIVNANVAENBVIAENBIOAENBIHRDKTKHDR

IVAVNAOENVIOENVIOAENVVIAENANOIANAVIOENVA

GFHSGFHSFGHS]

 

....

 

I would like to split this information into separate columns:

 

Column1: [Accession Code]

Column2: [Description]

Column3: [Letter Sequence]

 

Could anyone help me?

 

Thank you very much,

Wardiam

Posted

It's not a format that Filemaker can import directly. You could probably import it into a temp table and parse it out from there. It would be best to attach an actual file as an example, so we can see exactly how it's structured (esp. where carriage returns are). Zip the file before posting it here to protect against accidental modification by all those servers along the way.
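To illustrate the "parse it out" idea outside FileMaker, here is a minimal Python sketch, not the example file discussed in this thread. It assumes each record starts with a ">" line, that the accession code contains no spaces and is separated from the description by the first space, and that every following line up to the next ">" belongs to the sequence. The function name `parse_records` and the sample data are made up for illustration. The sequence lines are joined without line breaks, and each record is printed as one tab-delimited row, which is a format FileMaker can import.

```python
import io


def parse_records(lines):
    """Yield (accession, description, sequence) tuples from '>'-headed records."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between rows
        if line.startswith(">"):
            if header is not None:
                yield (*header, "".join(chunks))
            # Assumption: the first word after ">" is the accession code,
            # everything after the first space is the description.
            accession, _, description = line[1:].partition(" ")
            header = (accession, description.strip())
            chunks = []
        elif header is not None:
            chunks.append(line)  # one line of the letter sequence
    if header is not None:
        yield (*header, "".join(chunks))


# Made-up sample in the same shape as the scheme above.
sample = """>ACC1 Description one
ADFG
FGFG

>ACC2 Description two
ADFGR
"""

records = list(parse_records(io.StringIO(sample)))
for accession, description, sequence in records:
    print("\t".join((accession, description, sequence)))
```

Because the function reads its input one line at a time, it can also be given an open file object instead of the in-memory sample.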

Posted

Yes... Great! That's exactly what I want. I'm going to study your example to understand it.

 

Thank you very much,

Wardiam

Posted

Your example works fabulously but, is it possible to import the "SEQUENCE" field without carriage returns, please?

 

Thanks again,

Wardiam

Posted

I'm going to study your example to understand it.

 

Please do - that is the purpose of the assistance you get here. Then you'll also be able to make the above (or any other) required modification (as well as remove one entirely redundant step I forgot in there...).

Posted

Yes, that's true, I understand. I will continue studying your example, and if I have any doubts I'll ask again.

 

Thanks,

Wardiam

  • 3 months later...
Posted

Hi

txt = Text Editor, sorry I was being lazy.

Character Count? The file lines I gave a sample of above are from a population of 28,194,115 lines - 1.02 GB on disk (1,022,402,995 bytes)

The largest file has about 23.5 Million lines 2 to 3 times the character count - 2.16 GB on disk (2,155,500,248 bytes)

This is the link to where the files are: http://data.gov.uk/dataset/anonymised_mot_test

The 2 problem files are the 2013 files, bottom left. Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'.

Is this OK?

Thanks for coming back to me.

PS I am confused - I cannot find my original post and your questions now - Heh Ho

I just copied one of the files and pasted it into TextWrangler without any problems. I then did a find and replace of | with \t (a tab), and it seemed to do what you want?

Sheesh, Lee. I didn't reply here because I didn't want to ruin your work - and now you do?

Sorry, FMP 8.0V1 not '7' which crops up in the file extn as 'fp7'.

Please update your profile to reflect your current version. Here is a quick link for your convenience: MY PROFILE http://fmforums.com/forum/index.php?app=core&module=usercp&tab=core&area=profileinfo

Hi comment,

What did I do? I didn't mean to screw things up. :sad:

Uhm, you need to move these last ones (starting from #9) to the new thread you opened here?

http://fmforums.com/forum/topic/91080-how-to-deal-with-large-text-imports/

Ok - I've done that - it now shows '8'.

Sorry, I'm getting confused here - are you suggesting I get hold of TextWrangler and try that, please?

Posted

Because the original poster posted this question multiple times, I had to merge a couple of his topics and replies. Unfortunately, that makes this thread hard to follow.

Posted

What I’m saying is, you will have to put the files into some text editor or Excel and manipulate the information in order to import it into your FileMaker files.

 

My example above would be the first step in getting the files manipulated so you can bring them into FileMaker. I find TextWrangler's ability to use regular expressions very helpful in cleaning up text files like this.

 

Hope this helps,

 

Lee
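For files too large to open comfortably in an editor, the same find-and-replace step can be sketched as a short Python script. This is only an illustration under stated assumptions: the fields are separated by "|", no field contains a literal "|" or tab, and the function name `pipe_to_tab` and the sample rows are made up. Reading one line at a time keeps memory use small regardless of file size.

```python
import io


def pipe_to_tab(src, dst):
    """Copy src to dst line by line, replacing each '|' with a tab.

    Returns the number of lines converted.
    """
    count = 0
    for line in src:
        dst.write(line.replace("|", "\t"))
        count += 1
    return count


# Made-up sample rows standing in for a pipe-delimited data file.
src = io.StringIO("1|2013-01-01|FORD\n2|2013-01-02|VAUXHALL\n")
dst = io.StringIO()
converted = pipe_to_tab(src, dst)
print(dst.getvalue(), end="")
```

With real files, `src` and `dst` would be file objects from `open(...)`, with the input opened for reading and the output for writing. The file names and encoding depend on your data.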

Posted

Hi Lee

 

Thanks for that - I've been a bit confused this week and I was really beginning to think I was losing it - thank you for your patience. I've tried TextWrangler with the larger file, both copying across and using the insert file content function, but it failed because of insufficient memory. I guess that because I am limited to 2G, even with the smaller file I'm still looking at a file splitter?

Posted

Splitting the files into smaller chunks is the smart thing to do here, IMHO - not only because of the required text manipulation, but also because importing such a large data file can easily be aborted in the middle of the process due to RAM running out.

 

This is not really a Filemaker question now, but I'd suggest you learn how to use the split command from the command line (i.e. the Terminal application).
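For anyone who prefers a script to the Terminal, here is a rough Python sketch of what splitting by line count does (roughly the job of `split -l`). It is a sketch under assumptions, not a replacement for the real tool: the function name `split_by_lines`, the chunk file naming, and the UTF-8 encoding are all made up for the example, and the demo runs on a tiny temporary file rather than on the real data.

```python
import os
import tempfile


def split_by_lines(path, lines_per_chunk, out_dir):
    """Write consecutive chunks of at most lines_per_chunk lines; return their paths."""
    paths = []
    out = None
    # newline="" keeps the original line endings untouched in every chunk.
    with open(path, "r", encoding="utf-8", newline="") as src:
        for i, line in enumerate(src):
            if i % lines_per_chunk == 0:
                if out is not None:
                    out.close()
                chunk_path = os.path.join(out_dir, f"chunk_{len(paths):04d}.txt")
                out = open(chunk_path, "w", encoding="utf-8", newline="")
                paths.append(chunk_path)
            out.write(line)
    if out is not None:
        out.close()
    return paths


# Demo on a small temporary file: 10 lines split into chunks of 4.
work_dir = tempfile.mkdtemp()
big_file = os.path.join(work_dir, "big.txt")
with open(big_file, "w", encoding="utf-8", newline="") as f:
    f.writelines(f"row {n}\n" for n in range(10))

chunks = split_by_lines(big_file, 4, work_dir)
print(len(chunks), "chunks written")
```

Since only one line is held in memory at a time, the size of the input file should not matter much. The chunk size is the thing to tune so that each piece opens and imports comfortably.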

Posted

Gentlemen - Thanks for getting my brain straightened out.  I eventually found a free test version of File Splitter SE (also in Apple Store) which works like a dream.  I've split the files into ~ 500,000 Mb chunks (quite quick), used Text Editor to change to csv (~6 hrs) and now I'm importing into FMP the 1st chunk (looking like about 5 hrs).

 

For what it is worth, Excel imports the original file just fine without converting to csv, BUT the version I have has a max page size of ~65K lines - so with 30 × 10^6 lines to import that would look like ~0.5 × 10^3 pages - unless there is a bigger version of Excel?

 

The question is now: "When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?'?"

 

Thanks again,

Posted

AFAIK, the current version of Excel (since 2007?) supports ~1 million rows.

 

 

When will FMP be modified so that the 'field separator' can be selected by the operator before import OR better still determine for itself what the separator is after getting an answer to the question 'How many fields in original data?

 

My guess? Never. Unlike a spreadsheet, a database requires a very strictly structured input. It makes sense to limit the support to a few well-established formats and leave the "guess what my structure is" game to the more forgiving applications.

  • 7 months later...
Posted

Hi everybody,

 

Comment, I frequently use your import+parse script to reorganize my protein databases, but I have a lot of problems with big databases. This link points to a database with 130 MB of plain text:

 

https://drive.google.com/file/d/0B5qk9fj1FG3ZMTRoSDQyRF9na0E/edit?usp=sharing

 

I can initially import the file into my FileMaker solution, but when I run the parse script, FileMaker closes with an error message after about half an hour. I have tried increasing the cache size to 512 MB (the maximum), but the program still closes. Could anyone help me improve the performance?

 

Thanks,

Wardiam
