Chrisjon Posted August 1, 2013
Hi, I couldn't find a forum that relates specifically to sorting, so I'm posting here. I need to sort URLs, but some of them are very long, so I split them into 100-character chunks and then sort on up to four of these chunks. It works, but it's slow! What I want is to somehow encode the whole URL into a number, or some other representation, that lets me sort correctly and uniquely on a single field holding the encoded value. Code() could work, maybe, if I add up the 5-digit Unicode values it produces for each character of the URL from left to right, but is there a chance that doing this won't give me guaranteed unique values? Any ideas, anyone? Thanks
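On the uniqueness worry above: a minimal Python sketch (not FileMaker code) of the "add up the character codes" idea. `code_sum` is a hypothetical helper, and Python's `ord` stands in for the per-character value that Code() returns. It suggests the sum is neither unique nor order-preserving: rearranging characters leaves the sum unchanged, and a longer string can outrank one that sorts after it.

```python
def code_sum(url: str) -> int:
    # Hypothetical helper: add up the Unicode code point of every character.
    return sum(ord(ch) for ch in url)


# Two different URLs built from the same characters get the same sum.
a = "http://example.com/ab"
b = "http://example.com/ba"
print(a != b, code_sum(a) == code_sum(b))  # True True

# The sum also ignores position: "aa" sorts before "b" alphabetically,
# yet its sum (97 + 97 = 194) is larger than the sum for "b" (98).
print("aa" < "b", code_sum("aa") > code_sum("b"))  # True True
```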
eos Posted August 1, 2013
If you like SF (or “SciFi” … ), you may know Frederik Pohl's short story The Gold at the Starbow's End (1972, expanded into the novel Starburst, 1982): a crew of renegade astronauts sends back a series of earth-shattering discoveries to the old home planet, but out of spite decides to encode them using a Gödel numbering scheme, so even the fastest computers (as imagined ca. 1972) can't decipher those radio messages fast enough to prevent a global downfall. I guess you'd need something similar to get a unique representation of a string. Searching the CF list on briandunning.com for “string” gives you a boatload of hits, “unique” a lot fewer, but I didn't see anything that fits. You can try this one. I don't know if it works correctly (want to write a decoder for the result?), or if it is what you need, or how the sort performance would be, or, most of all, what Kurt Gödel would have to say about all that, but I'll take any old excuse to dabble at writing recursive CFs.
SomeSortofGodel.fmp12.zip
David Jondreau Posted August 1, 2013
Why are you sorting the URLs? "Long URLs" usually indicate some sort of UUID, and I can't imagine why you would need those sorted alphabetically. If you do need them sorted, then a hash scheme is going to screw that up. Sorting speed is not determined by the number of fields you're sorting on or the length of the text in those fields. It's determined by the field type (number is slightly faster than text), whether those fields have an unstored reference to a related field, and the total number of fields and amount of information stored in the table. Hashing the sort field probably won't improve performance.
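The point about hashing and sort order can be sketched in Python (an illustration only, not FileMaker code). SHA-256 from the standard `hashlib` module is an arbitrary choice here, and `hash_key` and the example URLs are made up for the demonstration. A general-purpose hash gives distinct-looking keys, but sorting by those keys has no relation to alphabetical order.

```python
import hashlib


def hash_key(url: str) -> str:
    # Hypothetical helper: a fixed-length hex digest used as the sort key.
    return hashlib.sha256(url.encode("utf-8")).hexdigest()


# Made-up sample data: 20 URLs that differ only in their last three characters.
urls = [f"http://example.com/page/{i:03d}" for i in range(20)]

alphabetical = sorted(urls)
by_hash = sorted(urls, key=hash_key)

# Same URLs in both lists, but the hash-ordered list is effectively shuffled.
print(by_hash == alphabetical)
print(by_hash[:3])
```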
Chrisjon (Author) Posted August 2, 2013
Thanks guys... The issue is that FM will only sort text fields up to a limited length. My solution needs to sort on some very long URLs that are well over that limit, so I split the field into several sub-fields and then sort on those in the appropriate order. Doing so on a data set of some 30,000 records takes a while, so all I'm looking to do is take the full original URL, hash it into a number in some way that correctly represents its alphabetical sort order, and sort on that single number field. Right now eos's solution doesn't crack it (but thanks!), so it looks as though I'll have to code up a CF of some kind that will do it, in order to create a number field that I can index, resulting in a much faster sort. Was hoping to avoid having to do so.......! Thanks all
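For what an order-preserving number could look like, here is a rough Python sketch under loud assumptions: the URLs are ASCII-only, contain no NUL characters, and "sort order" means plain code-point order (FileMaker's own text sort depends on the field's language setting, so it may differ). `prefix_key` is a hypothetical helper, not a FileMaker function. It reads the first `width` characters as one big base-256 number, padding short strings with zero bytes.

```python
def prefix_key(url: str, width: int = 8) -> int:
    # Hypothetical helper. Assumes ASCII-only input with no NUL characters.
    # Take the first `width` bytes, pad short strings with zero bytes,
    # and read the result as one big-endian base-256 integer.
    raw = url.encode("ascii")[:width].ljust(width, b"\x00")
    return int.from_bytes(raw, "big")


# Strings no longer than `width` keep their order under the numeric key.
short = ["b.org", "a.com/z", "a.com", "a.com/y"]
print(sorted(short, key=prefix_key) == sorted(short))  # True

# Strings that share their first `width` characters collapse to one key.
long_a = "http://example.com/a"
long_b = "http://example.com/b"
print(prefix_key(long_a) == prefix_key(long_b))  # True

# Widening the key to the longest URL restores a unique, ordered key,
# but the number grows by about 2.4 decimal digits per character.
width = max(len(long_a), len(long_b))
full_a = prefix_key(long_a, width)
full_b = prefix_key(long_b, width)
print(full_a < full_b, len(str(full_a)))
```

The trade-off this suggests: a number that is both unique and correctly ordered has to carry as much information as the text itself, so a fixed-size number field can only stand in for a prefix of the URL, with ties beyond that prefix still needing the remaining text to break them.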