Alex Taylor Posted October 22, 2013 Posted October 22, 2013 I'm working on a solution which I'm trying to optimize for speed (who isn't!) and I was hoping I could get some feedback on my current method. Here's the problem: I'm working on an assignment which is a scheduling system for shift workers. Each employee can, in advance, specify the dates and time ranges during which they want to be available to work. So some employees might say they would like to work Mondays to Fridays, 9am to 5pm, others might specify night shifts, others may say they're only available certain days out of the week. Given this list of employees, and a list of shifts (consisting of dates and time ranges) that need to be covered, I want to determine the best employee to assign to each shift based on their availability. This is a "work with what you've got" scenario, so if the most available employee can only cover 90% of a shift, that will be considered the "best" employee (i.e. the one with the most availability). My tables (I'm only listing the fields that are necessary for this example): Employee_Availability Stores one or more records for each employee specifying their availability id id_Employee (foreign key) date startTime endTime Shift id date startTime endTime And some example data: Shifts: id date startTime endTime 1 1/1/2013 09:00 17:00 2 1/1/2013 10:00 18:00 3 1/1/2013 15:00 23:00 Employee Availability: id id_Employee date startTime endTime 56 101 1/1/2013 07:00 17:00 57 102 1/1/2013 06:00 18:00 58 103 1/1/2013 10:00 18:00 59 104 1/1/2013 17:00 23:00 60 105 1/1/2013 16:30 23:00 For shift #1, employees 101, 102 and 103 would all be eligible to work, since their availability overlaps with the shift time. But given that #101 and 102 are available for the entire shift, and 103 is not, the best employee would be 101 or 102 (and at that point the choice could essentially be random). For shift #3, all employees are initially considered. None of the employees can take the entire shift, but #105 is the most available out of all of them, so #105 is selected. My current method Currently, I import a list of shifts into the Shift table. This table has a relationship to Employee Availability such that it automatically shows all eligible shifts: Shift -> Employee_Availability ---------------------------------------------- date = date startTime < endTime endTime > startTime From here, I use a custom function that takes the shift data and all related availability data as its input, so the input would look like this (syntax is 'id date startTime endTime'): 1 1/1/2013 09:00 17:00 56 1/1/2013 07:00 17:00 57 1/1/2013 06:00 18:00 58 1/1/2013 10:00 18:00 The recursive custom function compares the first line against each remaining line and determines the line with the highest value of overlapping time, and returns its id. I store that id and use it to assign the shift. I'm using the List() function on a calculated field to pull the related data from Shot over into my calculation in the Shift table for use in this function. I'm finding that this is very slow, however, given that I have 3000+ Employee_Availability records and want to calculate 50-300 Shift records at any given time. So long story short, my current method works, but the use of List makes it slow (Using ExecuteSQL() instead of List() seems to be just as slow). I'm trying to take a step back and think of an alternate solution to this problem which may be more elegant, or faster. Any ideas would be much appreciated.
eos Posted October 22, 2013 Posted October 22, 2013 this should show everyone who is available on those dates and be very fast. This would only match on employees whose availability is identical to the shift. But IIUC, the OP is looking for a degree of availability, since the problem of availability per se is already solved. See if this helps you: EDIT obviously not … removed file.
No_access Posted October 22, 2013 Posted October 22, 2013 I know I wanted to finish, I got sidetracked again "so I deleted my post"... . My finished solutions was going to use a repeating field for the relationship, but I like your solutions better.
comment Posted October 22, 2013 Posted October 22, 2013 I'm working on a solution which I'm trying to optimize for speed I don't think you have much room for optimization as such - it's more a question of which slow method is the least slow. Unless... Are your shifts really so varied in start/end times? And even if they are, is there some method to the madness? Perhaps a way could be found to have the degree of availability for each shift on that day stored in the availability record itself. @eos I don't understand your file. Clearly, the relationship between shifts and availabilities is many-to-many; the result you are calculating is correct for the first related shift only.
Alex Taylor Posted October 22, 2013 Author Posted October 22, 2013 Thanks for the attachment, eos! However, I believe the calculation you're using in the Employee table to determine the amount of overlap would not cover situations where there's more than one shift in a given day. Since the calculation only has access to the first matched record in the relationship, subsequent shifts would see the data from the first shift displayed in the portal. I believe the calculation has to be done from the Shift side so that each shift can be evaluated in turn. I came up with this calculation to determine the value of the overlap: Max( 0; Min ( employeeEnd; shiftEnd ) - Max ( employeeStart; shiftStart ) ) This provides a concise way of figuring out the degree of availability, and is fast. Although what's not fast is grabbing the related data to put through this calculation. Are your shifts really so varied in start/end times? And even if they are, is there some method to the madness? Perhaps a way could be found to have the degree of availability for each shift on that day stored in the availability record itself. This is more of an assignment, not for any real-world scenario just yet, although I wanted to assume a round-the-clock scenario where there might be multiple overlapping shifts and shifts throughout the day. But you're right, I suppose this is a lot of data to crunch, so I shouldn't be surprised by the speed!
eos Posted October 22, 2013 Posted October 22, 2013 I don't understand your file. Clearly, the relationship between shifts and availabilities is many-to-many; the result you are calculating is correct for the first related shift only. What's not to understand? Clear though it may be, I simply overlooked this little fact because I tested on one shift record only and got carried away … Thanks for the attachment, eos! However, I believe the calculation you're using in the Employee table to determine the amount of overlap would not cover situations where there's more than one shift in a given day. Yes, I gathered as much … good you found a solution!
comment Posted October 22, 2013 Posted October 22, 2013 there might be multiple overlapping shifts and shifts throughout the day. Having "multiple overlapping shifts and shifts throughout the day" is fine - but if I know in advance that on a Thursday there are 10 shifts with known start/end times, then I can calculate the degree of availability for each of the 10 shifts right there when the availability record is submitted. From this point on I would be working with stored data only, so the speed should pick up significantly. If that's not possible, I would suggest you try it this way: For each shift, find the related availability records. Next, populate a couple of global fields with the current shift's start and end times and let each availability record calculate the overlap. Finally, sort the found set by overlap and pick the winners. That should minimize the unstored calc effort to the eligible records only, one shift at a time. Also, do not put any unstored calculation on the layout.
Alex Taylor Posted October 22, 2013 Author Posted October 22, 2013 For each shift, find the related availability records. Next, populate a couple of global fields with the current shift's start and end times and let each availability record calculate the overlap. Finally, sort the found set by overlap and pick the winners. That should minimize the unstored calc effort to the eligible records only, one shift at a time. Also, do not put any unstored calculation on the layout. Good method, comment! I didn't think of using global fields before. Looks like using a scripted loop to set the globals and grab the ID of the related record with the highest availability runs the same speed as a calculation that grabs a List() of the related records and passes it to my custom function. So maybe I've hit the natural limit of how fast this is going to be
comment Posted October 22, 2013 Posted October 22, 2013 (edited) Are you sure you are testing this properly? You need to make sure that nothing else is being evaluated other than what's essential to the tested process. Ideally, you'd use a new file without the custom function (or at least without the fields using it). Sorting by unstored calc is bound to be slow - however, I would still expect it to be faster than "pseudo-sorting" by a custom function. Perhaps not much faster, but at least enough to be noticed. --- Another thing you might try is replacing a number field's content with the overlap formula, then sort using that field (we are assuming here that none of the records will be locked by another user). Edited October 22, 2013 by comment
Alex Taylor Posted October 22, 2013 Author Posted October 22, 2013 Are you sure you are testing this properly? You need to make sure that nothing else is being evaluated other than what's essential to the tested process. Ideally, you'd use a new file without the custom function (or at least without the fields using it). Here's how I've got my original method set up: A number field in Shift with an AutoEnter calc that sends the shift data + related availability data (using List() of a stored calc field in the Employee_Availability table that concatenates the required fields) to the custom function. The custom function returns the ID of the best availability record. A script with a "Replace Field Contents" step that clears the result of the AutoEnter calc field, thus forcing it to recalculate for every record in the found set. I've added timers to this script so I can see how long it runs for. I tried the Globals method by adding two Global fields to the Shift table: one for the start time, one for the end time. I created an unstored calc in the Employee_Availability table that calculates availability based on these globals. Then I sorted my Shift-->Employee_Availability relationship by descending availability. I tested it with a script that simply loops through each Shift record, sets the two global fields, then sets a "best_availability" field with the first ID it finds through the Shift-->Employee_Availability relationship, which has already been sorted by merit of setting the global fields and causing the unstored calc to update. Timing this script yields about the same result to within 1 second as the custom function method.
comment Posted October 22, 2013 Posted October 22, 2013 Then I sorted my Shift-->Employee_Availability relationship by descending availability. I can't be sure it makes a real difference, but I meant sorting the found set in the Employee_Availability table itself: Roughly: Freeze Window Loop Set Field [ gStart ; Shift::Start ] Set Field [ gEnd; Shift::End ] Go to Related Record [ Employee_Availability; Show related only ] Sort Records Go to Record [ First ] Set Field [ Shift::best_availability ; Employee_Availability,::EmpID ] Go to Layout [ Shift ] Go to Record [ Next, Exit after last ] End Loop Use empty layouts in Form view. Do not sort the relationship, otherwise the sort will happen twice.
Alex Taylor Posted October 23, 2013 Author Posted October 23, 2013 I implemented your script and gave it a shot, although it looks like it's considerably slower than the first two methods. My guess is because of the multiple layout switching involved. In the first script where I tried the globals idea, I just stayed on one layout and looped through the Shift records, relying on the unstored calc to do the work behind-the-scenes for each availability record. Thanks for all the help, I've definitely learned a few things!
Recommended Posts
This topic is 4118 days old. Please don't post here. Open a new topic instead.
Create an account or sign in to comment
You need to be a member in order to leave a comment
Create an account
Sign up for a new account in our community. It's easy!
Register a new accountSign in
Already have an account? Sign in here.
Sign In Now