Jump to content
Server Maintenance This Week. ×

Calculating Overlapping Number Ranges


Alex Taylor

This topic is 3844 days old. Please don't post here. Open a new topic instead.

Recommended Posts

I'm working on a solution which I'm trying to optimize for speed (who isn't!) and I was hoping I could get some feedback on my current method. Here's the problem:

 

I'm working on an assignment which is a scheduling system for shift workers. Each employee can, in advance, specify the dates and time ranges during which they want to be available to work. So some employees might say they would like to work Mondays to Fridays, 9am to 5pm, others might specify night shifts, others may say they're only available certain days out of the week.

 

Given this list of employees, and a list of shifts (consisting of dates and time ranges) that need to be covered, I want to determine the best employee to assign to each shift based on their availability. This is a "work with what you've got" scenario, so if the most available employee can only cover 90% of a shift, that will be considered the "best" employee (i.e. the one with the most availability).

 

My tables (I'm only listing the fields that are necessary for this example):

 

Employee_Availability

Stores one or more records for each employee specifying their availability

id

id_Employee (foreign key)

date

startTime

endTime

 

Shift

id

date

startTime

endTime

 

 

And some example data:

 

Shifts:

id	date		startTime	endTime
1	1/1/2013	09:00		17:00
2	1/1/2013	10:00		18:00
3	1/1/2013	15:00		23:00

Employee Availability:

id	id_Employee	date		startTime	endTime
56	101		1/1/2013	07:00		17:00
57	102		1/1/2013	06:00		18:00	
58	103		1/1/2013	10:00		18:00
59	104		1/1/2013	17:00		23:00
60	105		1/1/2013	16:30		23:00

For shift #1, employees 101, 102 and 103 would all be eligible to work, since their availability overlaps with the shift time. But given that #101 and 102 are available for the entire shift, and 103 is not, the best employee would be 101 or 102 (and at that point the choice could essentially be random).

 

For shift #3, all employees are initially considered. None of the employees can take the entire shift, but #105 is the most available out of all of them, so #105 is selected.

 

 

My current method

 

Currently, I import a list of shifts into the Shift table. This table has a relationship to Employee Availability such that it automatically shows all eligible shifts:

Shift		->	Employee_Availability
----------------------------------------------
date		=	date
startTime 	<	endTime
endTime		>	startTime

From here, I use a custom function that takes the shift data and all related availability data as its input, so the input would look like this (syntax is 'id   date   startTime   endTime'):


1 1/1/2013 09:00 17:00
56 1/1/2013 07:00 17:00
57 1/1/2013 06:00 18:00
58 1/1/2013 10:00 18:00

The recursive custom function compares the first line against each remaining line and determines the line with the highest value of overlapping time, and returns its id. I store that id and use it to assign the shift.

 

I'm using the List() function on a calculated field to pull the related data from Shot over into my calculation in the Shift table for use in this function. I'm finding that this is very slow, however, given that I have 3000+ Employee_Availability records and want to calculate 50-300 Shift records at any given time. So long story short, my current method works, but the use of List makes it slow (Using ExecuteSQL() instead of List() seems to be just as slow).

 

I'm trying to take a step back and think of an alternate solution to this problem which may be more elegant, or faster. Any ideas would be much appreciated.

 

Link to comment
Share on other sites

this should show everyone who is available on those dates and be very fast.

 

This would only match on employees whose availability is identical to the shift. But IIUC, the OP is looking for a degree of availability, since the problem of availability per se is already solved.

 

See if this helps you:

 

EDIT obviously not … removed file.

Link to comment
Share on other sites

I know I wanted to finish, I got sidetracked again "so I deleted my post"... . My finished solutions was going to use a repeating field for the relationship, but I like your solutions better.

Link to comment
Share on other sites

I'm working on a solution which I'm trying to optimize for speed

 

I don't think you have much room for optimization as such - it's more a question of which slow method is the least slow. Unless...

 

Are your shifts really so varied in start/end times? And even if they are, is there some method to the madness? Perhaps a way could be found to have the degree of availability for each shift on that day stored in the availability record itself.

 

 

@eos

I don't understand your file. Clearly, the relationship between shifts and availabilities is many-to-many; the result you are calculating is correct for the first related shift only.

Link to comment
Share on other sites

Thanks for the attachment, eos! However, I believe the calculation you're using in the Employee table to determine the amount of overlap would not cover situations where there's more than one shift in a given day. Since the calculation only has access to the first matched record in the relationship, subsequent shifts would see the data from the first shift displayed in the portal. I believe the calculation has to be done from the Shift side so that each shift can be evaluated in turn.

 

I came up with this calculation to determine the value of the overlap:

Max( 0; Min ( employeeEnd; shiftEnd ) - Max ( employeeStart; shiftStart ) )

This provides a concise way of figuring out the degree of availability, and is fast. Although what's not fast is grabbing the related data to put through this calculation.


Are your shifts really so varied in start/end times? And even if they are, is there some method to the madness? Perhaps a way could be found to have the degree of availability for each shift on that day stored in the availability record itself.

 

This is more of an assignment, not for any real-world scenario just yet, although I wanted to assume a round-the-clock scenario where there might be multiple overlapping shifts and shifts throughout the day. 

 

But you're right, I suppose this is a lot of data to crunch, so I shouldn't be surprised by the speed!

Link to comment
Share on other sites

I don't understand your file. Clearly, the relationship between shifts and availabilities is many-to-many; the result you are calculating is correct for the first related shift only.

 

What's not to understand? :sleep:  Clear though it may be, I simply overlooked this little fact because I tested on one shift record only and got carried away …

 

Thanks for the attachment, eos! However, I believe the calculation you're using in the Employee table to determine the amount of overlap would not cover situations where there's more than one shift in a given day. 

 

Yes, I gathered as much … good you found a solution!

Link to comment
Share on other sites

there might be multiple overlapping shifts and shifts throughout the day.

 

Having "multiple overlapping shifts and shifts throughout the day" is fine - but if I know in advance that on a Thursday there are 10 shifts with known start/end times, then I can calculate the degree of availability for each of the 10 shifts right there when the availability record is submitted. From this point on I would be working with stored data only, so the speed should pick up significantly.

If that's not possible, I would suggest you try it this way:

For each shift, find the related availability records. Next, populate a couple of global fields with the current shift's start and end times and let each availability record calculate the overlap. Finally, sort the found set by overlap and pick the winners. That should minimize the unstored calc effort to the eligible records only, one shift at a time. Also, do not put any unstored calculation on the layout.

Link to comment
Share on other sites

For each shift, find the related availability records. Next, populate a couple of global fields with the current shift's start and end times and let each availability record calculate the overlap. Finally, sort the found set by overlap and pick the winners. That should minimize the unstored calc effort to the eligible records only, one shift at a time. Also, do not put any unstored calculation on the layout.

 

Good method, comment! I didn't think of using global fields before.

 

Looks like using a scripted loop to set the globals and grab the ID of the related record with the highest availability runs the same speed as a calculation that grabs a List() of the related records and passes it to my custom function. So maybe I've hit the natural limit of how fast this is going to be :D

Link to comment
Share on other sites

Are you sure you are testing this properly? You need to make sure that nothing else is being evaluated other than what's essential to the tested process. Ideally, you'd use a new file without the custom function (or at least without the fields using it).

 

Sorting by unstored calc is bound to be slow - however, I would still expect it to be faster than "pseudo-sorting" by a custom function. Perhaps not much faster, but at least enough to be noticed.

 

 

---

Another thing you might try is replacing a number field's content with the overlap formula, then sort using that field (we are assuming here that none of the records will be locked by another user).

Edited by comment
Link to comment
Share on other sites

Are you sure you are testing this properly? You need to make sure that nothing else is being evaluated other than what's essential to the tested process. Ideally, you'd use a new file without the custom function (or at least without the fields using it).

 

Here's how I've got my original method set up:

 

  • A number field in Shift with an AutoEnter calc that sends the shift data + related availability data (using List() of a stored calc field in the Employee_Availability table that concatenates the required fields) to the custom function. The custom function returns the ID of the best availability record.
  • A script with a "Replace Field Contents" step that clears the result of the AutoEnter calc field, thus forcing it to recalculate for every record in the found set. I've added timers to this script so I can see how long it runs for.

 

I tried the Globals method by adding two Global fields to the Shift table: one for the start time, one for the end time. I created an unstored calc in the Employee_Availability table that calculates availability based on these globals. Then I sorted my Shift-->Employee_Availability relationship by descending availability. I tested it with a script that simply loops through each Shift record, sets the two global fields, then sets a "best_availability" field with the first ID it finds through the Shift-->Employee_Availability relationship, which has already been sorted by merit of setting the global fields and causing the unstored calc to update. 

 

Timing this script yields about the same result to within 1 second as the custom function method.

 

Link to comment
Share on other sites

Then I sorted my Shift-->Employee_Availability relationship by descending availability.

 

I can't be sure it makes a real difference, but I meant sorting the found set in the Employee_Availability table itself: Roughly:

 

Freeze Window

Loop

Set Field [ gStart ; Shift::Start ]

Set Field [ gEnd; Shift::End ]

Go to Related Record [ Employee_Availability; Show related only ]

Sort Records

Go to Record [ First ]

Set Field [ Shift::best_availability ; Employee_Availability,::EmpID ]

Go to Layout [ Shift ]

Go to Record [ Next, Exit after last ]

End Loop

 

Use empty layouts in Form view. Do not sort the relationship, otherwise the sort will happen twice.

Link to comment
Share on other sites

I implemented your script and gave it a shot, although it looks like it's considerably slower than the first two methods. My guess is because of the multiple layout switching involved. In the first script where I tried the globals idea, I just stayed on one layout and looped through the Shift records, relying on the unstored calc to do the work behind-the-scenes for each availability record.

 

Thanks for all the help, I've definitely learned a few things! 

Link to comment
Share on other sites

This topic is 3844 days old. Please don't post here. Open a new topic instead.

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.