
Recommended Posts

  • Newbies
Posted

I have a very large table (137,000 records, a 49 MB database) that I am processing with a script. Part of the script moves the contents of a field in table1 (the large one) to a field in table2. The move is done field by field, and then the script advances to the next record with Go to Record/Request/Page [Next]. The script works fine, but the further it gets into table1, the slower it runs. By the time it reaches record 5,000 the move is taking minutes (per record!!). Letting the script run for 15 hours on a G5 with 2 GB of RAM only processes 10,000 records. I have tried the Flush Cache to Disk script step, increasing the cache to 99 MB, indexing everything, and indexing nothing, but nothing seems to improve the performance. BTW, doing a manual Find/Replace on a particular field takes about 15 minutes. Any ideas?

Posted

Hi John,

To solve this specific question, we'll likely need to see the actual script steps. Also, is this running on Server or not? And what specific rev of FileMaker are you running?

But I would ask why it's necessary to move the field contents from one table to another. Could you simply reference the related field's data? (Are they related?)

  • Newbies
Posted

The version is FileMaker Pro 7. It is running standalone on a G5 (no Server).

Background: I have been given a single large table containing 137,000 records, which I will refer to as table1. This table is basically an aggregation of a large number of lists that have been imported and massaged to line up with the field definitions. The basic fields are there: name, address, city, phone, email, which list the record was imported from, and membership date. There are many, many duplicates. The task is to de-dupe table1 based on name and address without losing any of the other data. So while the name + address fields may match, the rest of the fields contain unique data that needs to be preserved in the resulting de-duped record. That is, it is important to know how many times a person appears in table1 and all of their respective sources.

My solution: I created a new table (in the same database) that I will refer to as table2. This table has all the same primary fields as table1, but I have added additional fields that are basically new instances of important fields from table1. For instance, table1 has a listname field that records which list the record came from; table2 has listname1, listname2, etc. Bad table layout, I know, but I do not want to create additional tables and define relationships between them to hold these values. Table1 is sorted by name and address, and table2 is empty.

The script starts with the first record in table1 and seeds table2 with all the pertinent fields. This sub-script is called MoveRecord and uses the New Record/Request, Copy, and Paste script steps. I discovered that in order to use the Copy and Paste script steps, you have to keep going back and forth between tables using the Go to Layout [] script step. Ponderous, but whatever.

The script then moves to the next record in table1 using Go to Record/Request/Page [Next] and compares its name/address with the name/address of the current record in table2. If they match, the desired fields from table1 are moved into the next available corresponding fields in table2: IsEmpty checks are run on the respective table2 fields, moving on to the next field until an empty destination field is found. Two calculated fields in table2 keep track of how many times the person appears in table1.

If the name/address don't match, then the record in table1 is moved to a new record in table2 and the process starts all over again via the Loop script step.

When the last record in table1 has been processed, table2 contains a de-duped list of names/addresses plus all the additional fields captured in each table2 record.
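
In rough pseudocode (Python here just to show the logic; name, address, and listname are simplified stand-ins for my actual fields, and this is not the FileMaker script itself), the loop amounts to this:

    # Rough sketch of the merge described above: table1 is sorted by
    # name + address, so each run of matching rows collapses into one
    # table2 record that keeps every source list name.
    def dedupe_sorted(table1_rows):
        table2 = []
        for row in table1_rows:
            key = (row["name"], row["address"])
            if table2 and (table2[-1]["name"], table2[-1]["address"]) == key:
                # Same person: drop this row's list name into the next
                # empty slot (listname1, listname2, ... in the real table).
                table2[-1]["listnames"].append(row["listname"])
            else:
                # New person: seed a fresh table2 record from this row.
                table2.append({"name": row["name"],
                               "address": row["address"],
                               "listnames": [row["listname"]]})
        return table2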

Pretty simple; it works, but it is slow. Using the Pause script step and stepping through the script, I have determined that it is the moving of fields from a record in table1 to a record in table2 that takes forever once you get more than a couple thousand records into table1.

Long explanation but I hope it clarifies things a bit.

Posted

From the algorithm you described, it makes sense that it takes longer with the later records.

There are a few ways to get unique records from a set with duplicates, and there are lots of threads about this in the forums. Any of those methods will get your unique name-address combinations into a separate table while you keep the original records to hold the additional fields: all the unique name-address pairs go into a Names table, and the original records with the extra source information stay in your Source table, with no nested looping required.

One de-duping method you might try: sort the Source records by your name-address calc, then export the name and address fields using the 'Group by' option to group the exported records by that calc. Then import the exported data into the Names table.
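
Just to illustrate the shape of the result (this is Python with made-up field names, not FileMaker steps; the grouped export does the equivalent collapsing for you), you end up with one row per unique name-address key:

    # One exported row per unique name-address key, ready to import into Names.
    from itertools import groupby

    def grouped_export(source_rows):
        name_address = lambda r: (r["name"], r["address"])  # the name-address calc
        rows = sorted(source_rows, key=name_address)         # sort by the calc first
        return [{"name": n, "address": a}
                for (n, a), _group in groupby(rows, key=name_address)]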

I'd also suggest creating a serial ID field in the Names table, and populating it along with a corresponding ID field in the Source table, so you don't have to rely on the name-address calc as a key going forward (names and addresses tend to change).
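
Again only as an illustration (Python, with a hypothetical name_id field standing in for whatever you call the ID), the idea is to assign each unique name-address a serial number in Names and stamp the matching number onto every Source record:

    # Give each Names record a serial ID, then write the matching ID
    # into every Source record so the calc key isn't needed later.
    def assign_ids(names_rows, source_rows):
        id_by_key = {}
        for serial, row in enumerate(names_rows, start=1):
            row["name_id"] = serial
            id_by_key[(row["name"], row["address"])] = serial
        for row in source_rows:
            row["name_id"] = id_by_key[(row["name"], row["address"])]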
