Exporting index data

Paco · August 30, 2010

Hello,

I have a database that consists of about 10,000 records. I want to export all the data in one text field as an alphabetically sorted index of words.

For example:

First record of x field contains "The bog turtle is a semiaquatic turtle"

And the second record contains "It is the smallest North American turtle"

Now I want to export this data like this:

a

American

bog

is

It

North

semiaquatic

smallest

the

turtle

Thank you for your help

Paco

comment · August 30, 2010

I can think of a couple of ways - unfortunately, neither one is quite simple:

1. Export as XML, using a custom XSLT stylesheet. The stylesheet needs to (a) tokenize the field contents into words; (b) sort the tokens; and (c) remove duplicate tokens.

2. Define a repeating calculation field to split the text into individual words. Import this into another table as separate records, then export grouped by word.
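
The three steps listed for the stylesheet (tokenize, sort, remove duplicates) are the same in any tool. Here is a minimal Python sketch of them, assuming the field contents have already been exported as plain strings. The function name `word_index` is made up for illustration, and treating "The" and "the" as one word is an assumption based on the sample list in the first post.

```python
import re

def word_index(texts):
    """Return the unique words of all texts, sorted alphabetically, ignoring case."""
    seen = {}  # casefolded word -> first spelling encountered
    for text in texts:
        # \w+ matches runs of letters, digits and underscores (assumed tokenizer)
        for word in re.findall(r"\w+", text):
            seen.setdefault(word.casefold(), word)
    # Sort by the casefolded key, but return the spelling that was seen first
    return [seen[key] for key in sorted(seen)]

records = [
    "The bog turtle is a semiaquatic turtle",
    "It is the smallest North American turtle",
]
index = word_index(records)
print("\n".join(index))
```

Each word keeps the spelling of its first occurrence, so the entry for "the" comes out as "The" here.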

Paco · August 31, 2010

I can see my data as separate words when I click on the field's "insert from index" option.

Is it possible to export this index data as separate words?

comment · August 31, 2010

I don't know of a way to export (or even copy) the index.

Is this a one-time operation or do you need to do this periodically? For a one-time thing, I believe it would be easier to export the text as is, and produce the word index in another application.

Lee Smith · August 31, 2010

I'm not sure why you would need this, but

I would do it this way.

Export the field using the tab-delimited format

Open it in TextWrangler

Find space and replace with \r

Sort Lines

Process Duplicate Lines

You now have your list.

HTH

Lee
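
For anyone without a text editor at hand, the same recipe (one word per line, sort lines, drop duplicate lines) could be sketched in Python roughly as follows. This is an illustration, not what TextWrangler does internally; the function name `editor_style_word_list` is hypothetical, and the exported text is assumed to be available as one string.

```python
def editor_style_word_list(exported_text):
    """Mimic: replace spaces with line breaks, sort lines, drop duplicate lines."""
    # Tabs and line breaks separate fields and records in a tab-delimited
    # export, so str.split() with no argument splits on them as well as on spaces.
    lines = exported_text.split()
    # A set drops the duplicate lines; sorted() then orders what is left.
    # Plain string comparison puts capitalized words before lowercase ones.
    return sorted(set(lines))

sample = "The bog turtle is a semiaquatic turtle\nIt is the smallest North American turtle"
word_list = editor_style_word_list(sample)
print("\n".join(word_list))
```

Like the editor steps, this comparison is case-sensitive, so "The" and "the" stay separate entries.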

Fenton · August 31, 2010

It is also possible, if your text field is not very large,* to use a Custom Function (requires FileMaker Pro Advanced) to get the unique words of a field, in a single record. It must be a stored calculation.

You can then use the design function ValueListItems ( Get ( FileName ) ; "value list name" ) to produce an index. I don't know if there's a limit on the size that ValueListItems can return.

UniqueWords

http://www.briandunning.com/cf/478

*Actually, you don't really need "unique" words in this first stage. You just need words. ValueListItems will take care of the uniqueness.

P.S. As comment says, if you only need to do this once in a while, it may be easier (since you don't have FileMaker Pro Advanced) to use external tools. The free TextWrangler can do a Find/Replace to create "words", then sort and de-dupe the lines.

comment · August 31, 2010

Find space and replace with \r

You would probably want to replace punctuation symbols with \r, too.

Lee Smith · August 31, 2010

Good Point.

After replacing the spaces, then do a Grep Find and replace using

Find

[?.,!]

Replace with nothing.

HTH

Lee
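
A note on the grep pattern: inside a character class, `|` is a literal character rather than an "or" operator, so `[?.,!]` is enough to match those four marks. A small Python sketch of the same find-and-replace, assuming only these four punctuation marks matter (real text will likely need more); the name `strip_punctuation` is made up for illustration.

```python
import re

# Character class: any one of ? . , !  (no separators needed inside the brackets)
PUNCTUATION = re.compile(r"[?.,!]")

def strip_punctuation(text):
    """Replace each listed punctuation mark with nothing."""
    return PUNCTUATION.sub("", text)

cleaned = strip_punctuation("Is it a turtle? Yes, the smallest!")
print(cleaned)
```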

Paco · September 1, 2010

Thank you for your replies. I think removing the duplicate words is not practical, because my text data is very large (about 5,000 pages) and removing the duplicates would take a long time.

I am a translator and I need this word list because some words are written in several different ways. OK, maybe I can check that using the FileMaker index window.
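
On the size concern: outside FileMaker, removing duplicates does not have to be slow. A set-based approach does roughly constant work per word, so the time grows with the length of the text, and memory only has to hold the distinct words. A rough Python sketch, assuming the text was exported to a plain-text file; the function name `unique_words_from_lines` and the file name in the comment are hypothetical.

```python
import io
import re

WORD = re.compile(r"\w+")

def unique_words_from_lines(lines):
    """Collect distinct words (case-insensitively) from an iterable of lines."""
    seen = set()
    for line in lines:  # only one line is processed at a time
        seen.update(word.casefold() for word in WORD.findall(line))
    return sorted(seen)

# A real export would be read with something like
# open("export.tab", encoding="utf-8"); that file name is hypothetical.
# An in-memory file stands in for it here.
fake_export = io.StringIO("The bog turtle\nthe smallest turtle\n" * 1000)
words = unique_words_from_lines(fake_export)
print(words)
```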

LaRetta · September 1, 2010

I may be missing something here, but wouldn't something like this work (attached)? By its very nature, it eliminates duplicates and it can be sorted. If you want one field with the final results, you can always create a value list based upon this field and export just the calculation, but records would work, wouldn't they?

BTW, by using xWords, non-word characters are dropped automatically.

UPDATE: This will delete the records as it goes.

Peel.zip

Edited September 1, 2010 by Guest

comment · September 1, 2010

Nice (I'm just not sure why you delete the original records, though).

LaRetta · September 1, 2010

No reason other than I envisioned feeding multiple records into it and it spitting the words out the backside and returning to empty state waiting for new supply of records to process. FEED ME SEYMOUR :smile2:

bruceR · September 1, 2010

If AppleScript is allowed, try this.

It gets the field contents (field = cell contents across found set in AppleScript; think column.)

It breaks them into newline-delimited words and passes them to a shell script that uses the sort and uniq commands.

GetWords.fp7.zip
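
The attached script is not reproduced here, but the reason sort comes before uniq is worth spelling out: uniq only collapses identical lines that are adjacent. A small Python sketch of that behaviour, with a hypothetical `uniq` helper standing in for the shell command:

```python
from itertools import groupby

def uniq(lines):
    """Collapse runs of identical adjacent lines, like the shell's uniq."""
    return [line for line, _run in groupby(lines)]

words = ["turtle", "bog", "turtle", "is", "is"]
unsorted_result = uniq(words)        # only the adjacent "is", "is" collapses
sorted_result = uniq(sorted(words))  # sorting first makes all duplicates adjacent
print(unsorted_result)
print(sorted_result)
```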

Paco · September 7, 2010

Your solution is GREAT! It takes some time to split the text data into words because of the large amount (about 500,000 words), but it works! Thanks for sharing.

Paco · September 7, 2010

If applescript is allowed try this.

It gets the field contents (field = cell contents across found set in applescript; think column.)

Breaks it into new-line delimited words, passes it to shell script that uses sort and uniq functions.

Your solution also works if the text data is small. With my large text, I got an error message.

Paco
