Need script to Extract Text

K1200 · July 24, 2006

I have an instance where an imported text file needs to be parsed into separate blocks of text based on a user-designated delimiter character. For example, the user might specify "#" for the following text file:

First section of text. # Second section of text. # Third section of text. Sections can contain entire paragraphs; even multiple paragraphs. # Final section of text.

This would need to result in the following four separate records in FMP:

First section of text.

Second section of text.

Third section of text. Sections can contain entire paragraphs; even multiple paragraphs.

Final section of text.

I can use the Position and Left text functions to get the first occurrence, but what I need is an "extract text" function that can move forward each time it is called. Any advice will be greatly appreciated.
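For readers who want to see the "move forward each time it is called" idea spelled out, here is a minimal sketch in Python rather than FileMaker. It only illustrates the logic of remembering where the last delimiter was found and searching onward from there. The function name `iter_sections` and the choice to trim surrounding spaces are assumptions made for the illustration, not anything from the original files.

```python
def iter_sections(text, delimiter):
    """Yield each block of text between delimiters, one per call.

    Hypothetical helper: it keeps a running start offset, which plays the
    role of "remembering" where the previous extraction stopped.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    start = 0
    while True:
        end = text.find(delimiter, start)  # next delimiter at or after start
        if end == -1:
            # No more delimiters: the rest of the text is the last section.
            yield text[start:].strip()
            return
        yield text[start:end].strip()
        start = end + len(delimiter)  # move forward past the delimiter


sample = (
    "First section of text. # Second section of text. # Third section of text. "
    "Sections can contain entire paragraphs; even multiple paragraphs. "
    "# Final section of text."
)
sections = list(iter_sections(sample, "#"))
```

With the sample text from the question, `sections` holds four entries, one per intended record. Trimming the spaces around each section is a design choice of this sketch; drop the `.strip()` calls if the whitespace matters.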

Søren Dyhr · July 24, 2006

I think I would approach the matter like this (attachment)

--sd

recieveNSpit.zip

K1200 · July 24, 2006

Very nice. I've never had occasion to use the Extend function, so I have some research to do to fully understand this method. The drawback, if there is one, is that I'll have to determine the maximum number of sections that will ever be encountered (i.e., the 300 repetitions in your example), but I don't think that will be a real problem.

Thanks very much for an elegant solution.

Fenton · July 24, 2006

According to the FileMaker 7 technical specs (I don't know whether they've been expanded since, but that's likely irrelevant for this question), you can have 32,767 repetitions for a field.

K1200 · July 24, 2006

Oops -- there is a problem.

Upon further testing of receiveNSpit.fp7, I realized it breaks the text on carriage returns as well as on the designated delimiter. Although that would be a good method for many uses, it won't quite work for me.

I experimented with changing the ¶ in the calculated field to another symbol (@), but that only caused it to embed the @ symbols in the results and break ONLY on CR's found in the text. The result I need is the opposite of that: leave the CR's in the imported text and terminate each imported text field only at the delimiter.

Would you have a suggestion for an enhancement to receiveNSpit.fp7? My only idea is to flag individual segments (with some "to be continued" symbol) and reconstruct the sections that contained CR's via a second script.

comment · July 24, 2006

Try defining the repeating calculation as:

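Substitute (
    GetValue (
        Substitute ( Extend ( g_RecieverField ) ;
            [ ¶ ; "§" ] ;                        // protect the real carriage returns
            [ Extend ( ChosenDelimiter ) ; ¶ ]   // turn each delimiter into a value separator
        ) ;
        Get ( CalculationRepetitionNumber )      // pick the section for this repetition
    ) ;
    "§" ; ¶                                      // put the carriage returns back
)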
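The three steps in that calculation (swap out the carriage returns, split on the delimiter, swap the carriage returns back) can be sketched outside FileMaker as well. The Python below is only an illustration of the same logic, not a translation of the actual file: the name `nth_section` is hypothetical, `"\n"` stands in for ¶, and `n` plays the role of the repetition number.

```python
PLACEHOLDER = "§"  # stand-in character, assumed not to occur in the text


def nth_section(text, delimiter, n):
    """Return the n-th (1-based) delimiter-separated section, or "" if none.

    Hypothetical illustration of the swap / extract / swap-back approach.
    """
    # Step 1: hide the real line breaks so they no longer separate values.
    protected = text.replace("\n", PLACEHOLDER)
    # Step 2: turn each delimiter into a line break, giving one value per section.
    values = protected.replace(delimiter, "\n").split("\n")
    # Step 3: take the requested value (empty when n is out of range).
    value = values[n - 1] if 1 <= n <= len(values) else ""
    # Step 4: restore the original line breaks inside that section.
    return value.replace(PLACEHOLDER, "\n")


sample = "Para one.\nPara two.#Second section.#Third section."
first = nth_section(sample, "#", 1)
second = nth_section(sample, "#", 2)
beyond = nth_section(sample, "#", 4)
```

Here `first` keeps its internal line break, `second` is the plain second section, and `beyond` is empty because there is no fourth section. One caveat of this sketch: if the text already contains the placeholder character, it will be turned into a line break on the way back, so the placeholder should be something that cannot appear in the import.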

K1200 · July 24, 2006

Thanks, Comment. -- It works!

I was just beginning to grasp how Soren's solution worked and now you've raised the bar. Exchanging the CR's, performing the extract and then exchanging them back -- in one statement!

Thanks much.
