Jump to content

help with exceuteSQL, "or" vs. "and" relationship?


Recommended Posts

I have a database of around 35K music tracks, each of which has a file ID and any number of keywords up to around 200. There is a fileID-keywordID table, and a main table with a global keyword field. Using a relationship between keyword ID's in the two tables, the user selects keywords from some pop-down menus for different categories, and I create a list of keywords in the global field - this is the "Magic Key", which automatically shows any tracks that contain those keywords in a related portal. However, this shows *any* tracks containing each keyword (ie this is an "or" relationship). I then wanted the option to show "all" (an "and" relationship) which would only display tracks which contain every one of the keywords in the list. In order to do that, the only thing I could figure out was to change the relationship from keywordID-keywordID to fileID-fileID. Then I do an executeSQL "select fileID from file_kwd where kwdID = ?" and add the result to the global list of file ID's. Then on to the next kwdID, and each time loop through and find only those file ID's that occur in both lists, until I get a final list with all of the intersections. When I do this on my local solution, it is reasonably fast enough. But on a remote server it is way too slow to be useable, and I'm assuming it is because of the many executeSQL statements required. Am I approaching this the wrong way, and/or is there a way to speed it up?

Link to post
Share on other sites
7 hours ago, human said:

some pop-down menus for different categories

This lends me to think that what you might want to do is simply use find mode ( or script it). You might not even need a keyword table. I’d avoid esql for this. 

  • Like 1
Link to post
Share on other sites

Actually I discovered that just changing the order of the relationships and avoiding "daisy chaining" them fixed the issue, and it's now fast enough. However I have a summary field on the portal because i want a count of the records in the portal, and that seems to be very slow to calculate each time.

Link to post
Share on other sites
1 hour ago, human said:

I discovered that just changing the order of the relationships and avoiding "daisy chaining" them fixed the issue,

Can you explain in detail what you did? Offhand, I cannot think of a simple method to accomplish an AND relationship. My first instinct would be to apply a filter to a portal based on an OR relationship - but with the numbers of records you have that could prove to be too slow.

 

1 hour ago, human said:

I have a summary field on the portal because i want a count of the records in the portal, and that seems to be very slow to calculate each time.

See if one of the methods suggested here can help:
https://www.teamdf.com/blogs/a-lightning-fast-alternative-to-the-count-function/

 

Link to post
Share on other sites

Well it's not simple, that's for sure. The relationship between the main file and the fileID-keywordID file is through the fileID. There is a global fileID field in the main file. By populating that with a carriage return delimited list of fileID's (the "magic key"), the portal shows all of the file ID's in that list (I use the "UniqueValues" function to eliminate duplicates).

For the "any" search: Each time to user selects another keyword from a pop-down list, I set a variable to executeSQL "select fileID from fileID-keywordID where kwdID = ?".  That returns a list of fileID's which contain that keyword, and that gets added to the list that is the global fileID field (the magic key). A list of all keywordID's that have been selected is also maintained and displayed (as actual text, not as ID's - eg "ambient" or "moody" etc).

For the "all" search: Same as above, but I first save the previous list of ID's. Then I loop through the current list of keywdID's, and on each item in the keyword list I do the exceuteSQL as above, but then I compare the new list of fileID's to the previous list of fileID's, and use FilterValues to find the intersection of those two lists, which then becomes the new "previous" list. When the loop is done I have a list of fileID's which contain all keywordID's in the keywordID list. This is surprisingly fast, even on the server it only takes a few seconds, though I haven't tried it with very long lists yet.

There is also a "remove" button which removes selected keywords from the list, and then loops again to build the new list.

There are 9 categories of keywords selectable via 9 pop-down menus with a script trigger on each. I decided to use a portal because, although it's easy to display 37,000 lines, it's not very useful for users without being able to narrow it down to keywords. I originally wanted to have a second portal which would be a kind of matrix from which users could select and unselect keywords, instead of selecting from pop-down lists. But currently i have no idea how, unless maybe I leverage javascript to do that.

The actual music files themselves are stored on Amazon S3, and can be played via a small web-viewer pop up which also displays the wave form and other detailed info such as time, date, etc.

 

I'll check out that link, thanks.

Link to post
Share on other sites

as per the link your referenced, using an unstored Get(FoundCount) field is way faster that using a summary count field. Thanks!

Link to post
Share on other sites
47 minutes ago, human said:

first save the previous list of ID's. Then I loop through the current list of keywdID's, and on each item in the keyword list I do the exceuteSQL as above, but then I compare the new list of fileID's to the previous list of fileID's, and use FilterValues to find the intersection of those two lists, which then becomes the new "previous" list.

I thought that's what you did to begin with. Al least I cannot see the difference between this description and the original one:

On 1/22/2021 at 10:22 PM, human said:

I do an executeSQL "select fileID from file_kwd where kwdID = ?" and add the result to the global list of file ID's. Then on to the next kwdID, and each time loop through and find only those file ID's that occur in both lists, until I get a final list with all of the intersections.

Personally, if portal filtering were too slow, I would just find the records using a series of Constrain Found Set steps after the initial find to reduce the list.

 

  • Like 1
Link to post
Share on other sites

In the first one I was describing the "and" process in order to show that for each iteration of the loop there is an executeSQL statement, and I thought that was what was slowing it down. But it turns out that simply changing the order of the related tables resulted in a big speed-up. 

As far as I can see, there would be no way to use a constrained find, because the keywords and their ID's are not in the same table as the other track info. They are only related through the fileID-keywordID table, so that requires a lookup, which I am doing with the executeSQL statement. But maybe I am wrong about that. Anyway, it's working great now, except for the first time a user searches, which always seems to take a bit longer than subsequently. I'm guessing that has something to do with how I have indexing set up, so i am playing around with that.

BTW I have no control over the raw data - that is coming from a mysql database that someone else set up and I can't change it.

Thanks.

Edited by human
Link to post
Share on other sites
28 minutes ago, human said:

As far as I can see, there would be no way to use a constrained find, because the keywords and their ID's are not in the same table as the other track info.

You can search related fields.

 

Link to post
Share on other sites

I will probably re-code to use extended/constrained finds, for two reasons:

- executeSQL works great on a local DB, and on a server to client except for the first time it's run, which can take 20-30 seconds. After that it's fine, but that initial delay is annoying. I've Googled around and can't find a solution to that.

- using a portal to display the data has it's down sides:

 - need a different portal for each different sort order, because you can't sort by calculation. I have 8 columns to sort up or down, so I need 16 copies of the same portal.

 - "Find" mode inside a portal does not confine the search to just the records in the portal, which is a drag. 

Link to post
Share on other sites

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.