Jump to content
Server Maintenance This Week. ×

Fastest way to find non-matching records


This topic is 5804 days old. Please don't post here. Open a new topic instead.

Recommended Posts

(Posted to the Server forum as I believe this is a performance issue specific to Server)

I have two very large tables (600,000 records each) that are related by a 2 field relationship -- Date, and ID (numeric). Call them Table A and Table B. It's a one-to-one relationship.

I have a need to make sure that every record in table A has a child in table B.

Right now, I'm doing this:


Go to layout (Table A)

Perform Find (finds for TableB::ID = *)

Show ommtted records

loop

  set field TableB::ID ; Table A::ID

  go to next ; exit after last

end loop

The problem: The initial find step takes 15 minutes or so! I've tried using the inverted find (e.g. search for TableB::ID = "=", but it seems that a Find for a "=" in a non-existant child record will not return anything...

Ideas? Is there a Find syntax for "find for records in table A which have no child in table B"?

I've tried GoToRelatedRecords (match all in found set) but it seems to have the same performance issues as the find for TableB::ID = "*"

Link to comment
Share on other sites

This may be extra overhead but what happens if you create a concatenated calc field of ID &"_" & Date in both tables, use ≠ as the join operator, and then GTRR... It may be faster.

Link to comment
Share on other sites

I think you may be right -- with 600K records, the 2-field match is just too much. I may try an alternate approach: an auto-enter calc in TableA that pulls in the serial number from Table B, and then create a new relationship which is based on that serial number alone. This single-key should be much faster (I'd hope).

Link to comment
Share on other sites

Also...

This recent post may be of interest to you...

http://fmforums.com/forum/showtopic.php?tid/195605

Link to comment
Share on other sites

Also...

This recent post may be of interest to you...

http://fmforums.com/forum/showtopic.php?tid/195605

Trying to understand this technique applied to my case. Would this mean that I would set up an import from table A to table B, and that if field validations in table B are correct (e.g. a Unique requirement on one of the fields) that this process happens very quickly?

Link to comment
Share on other sites

I'm unsure if it would help you here but I think John is right - I think it would do the trick!!

You concatenate Date and ID in Table B. Set the validation to be 'always' and 'unique.' You don't even need to worry about showing all records first in Table B. Then script would be:

Go To Table A

Show All Records

Go To Table B

Import [ ADD ONLY and specify the two fields and any others you wish)

... if the records don't exist ( based upon the unique calculation you created ), they will be added into Table B. If they already exist in Table B, they will be skipped and not imported because they failed unique validation. It is very fast; particularly when, even after you have your found set, you still need to manipulate or create records in Table B in other methods but not this one.

I should have seen the connection to your need ... way to go, JOhn! :laugh2:

UPDATE: Actually, it is Michael's technique. But I wanted to expain a condensed version because I got a bit long-winded on that thread about it. But the technique is incredible!

LaRetta

Edited by Guest
Link to comment
Share on other sites

Thanks all for the help. I've found what I think is a decent compromise.

Add a serial number field to table B, and a new foreign field to table A: TableA::fkTableBSerial (auto enter value = TableB::Serial, Indexed)

Then, my script changes as follows:


Go to layout (Table A)

Perform Find (finds for TableA::fkTableBSerial = "")

loop

  set field [TableB::ID ; Table A::ID]  /* creates the sibling record  if needed, but won't create a duplicate if it already exists */

  set field [TableA::fkTableBSerial; TableB::Serial] /* remember the foreign key */

  go to next ; exit after last

end loop

 

With this change, the initial find for non-matched records is lightning fast (a few seconds or two) since it's operating on a stored, indexed field. I don't have to create concatenated fields in table B and A.

However, in the data I'm using, 95%+ of the child records already exist in table B, so some of the performance may not be relevant to other situations.

I like the idea of using an import, too -- seems like both methods have some tradeoffs (adding extra fields vs. adding extra script steps, etc.)

Also, I realized my data set is bigger than I had said: Table A is 600K records, but Table B is 1.2 million+. Also, I was wrong when I said it was one-to-one, it's actually Many(TableA) to one (TableB).

We really need to put together a public domain benchmark suite for FileMaker server. :

Edited by Guest
Link to comment
Share on other sites

This topic is 5804 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.