Jump to content
Sign in to follow this  
xochi

Fastest way to find non-matching records

Recommended Posts

(Posted to the Server forum as I believe this is a performance issue specific to Server)

I have two very large tables (600,000 records each) that are related by a 2 field relationship -- Date, and ID (numeric). Call them Table A and Table B. It's a one-to-one relationship.

I have a need to make sure that every record in table A has a child in table B.

Right now, I'm doing this:


Go to layout (Table A)

Perform Find (finds for TableB::ID = *)

Show ommtted records

loop

  set field TableB::ID ; Table A::ID

  go to next ; exit after last

end loop

The problem: The initial find step takes 15 minutes or so! I've tried using the inverted find (e.g. search for TableB::ID = "=", but it seems that a Find for a "=" in a non-existant child record will not return anything...

Ideas? Is there a Find syntax for "find for records in table A which have no child in table B"?

I've tried GoToRelatedRecords (match all in found set) but it seems to have the same performance issues as the find for TableB::ID = "*"

Share this post


Link to post
Share on other sites

This may be extra overhead but what happens if you create a concatenated calc field of ID &"_" & Date in both tables, use ≠ as the join operator, and then GTRR... It may be faster.

Share this post


Link to post
Share on other sites

I think you may be right -- with 600K records, the 2-field match is just too much. I may try an alternate approach: an auto-enter calc in TableA that pulls in the serial number from Table B, and then create a new relationship which is based on that serial number alone. This single-key should be much faster (I'd hope).

Share this post


Link to post
Share on other sites

Also...

This recent post may be of interest to you...

http://fmforums.com/forum/showtopic.php?tid/195605

Share this post


Link to post
Share on other sites

Still, I wonder ... have you tried ( and I possibly just didn't read ) starting in Table B and GTRR to Table A? Then Show Omitted?

Share this post


Link to post
Share on other sites

The preliminary testing with GTRR showed it to be no faster... :

Share this post


Link to post
Share on other sites

Also...

This recent post may be of interest to you...

http://fmforums.com/forum/showtopic.php?tid/195605

Trying to understand this technique applied to my case. Would this mean that I would set up an import from table A to table B, and that if field validations in table B are correct (e.g. a Unique requirement on one of the fields) that this process happens very quickly?

Share this post


Link to post
Share on other sites

I'm unsure if it would help you here but I think John is right - I think it would do the trick!!

You concatenate Date and ID in Table B. Set the validation to be 'always' and 'unique.' You don't even need to worry about showing all records first in Table B. Then script would be:

Go To Table A

Show All Records

Go To Table B

Import [ ADD ONLY and specify the two fields and any others you wish)

... if the records don't exist ( based upon the unique calculation you created ), they will be added into Table B. If they already exist in Table B, they will be skipped and not imported because they failed unique validation. It is very fast; particularly when, even after you have your found set, you still need to manipulate or create records in Table B in other methods but not this one.

I should have seen the connection to your need ... way to go, JOhn! :laugh2:

UPDATE: Actually, it is Michael's technique. But I wanted to expain a condensed version because I got a bit long-winded on that thread about it. But the technique is incredible!

LaRetta

Edited by Guest

Share this post


Link to post
Share on other sites

Thanks all for the help. I've found what I think is a decent compromise.

Add a serial number field to table B, and a new foreign field to table A: TableA::fkTableBSerial (auto enter value = TableB::Serial, Indexed)

Then, my script changes as follows:


Go to layout (Table A)

Perform Find (finds for TableA::fkTableBSerial = "")

loop

  set field [TableB::ID ; Table A::ID]  /* creates the sibling record  if needed, but won't create a duplicate if it already exists */

  set field [TableA::fkTableBSerial; TableB::Serial] /* remember the foreign key */

  go to next ; exit after last

end loop

 

With this change, the initial find for non-matched records is lightning fast (a few seconds or two) since it's operating on a stored, indexed field. I don't have to create concatenated fields in table B and A.

However, in the data I'm using, 95%+ of the child records already exist in table B, so some of the performance may not be relevant to other situations.

I like the idea of using an import, too -- seems like both methods have some tradeoffs (adding extra fields vs. adding extra script steps, etc.)

Also, I realized my data set is bigger than I had said: Table A is 600K records, but Table B is 1.2 million+. Also, I was wrong when I said it was one-to-one, it's actually Many(TableA) to one (TableB).

We really need to put together a public domain benchmark suite for FileMaker server. :

Edited by Guest

Share this post


Link to post
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.
Note: Your post will require moderator approval before it will be visible.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

Sign in to follow this  

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.