Over the summer, I built the Chronos temporal toolkit for Oracle. And for the last while I’ve been porting it to Postgres and working on the documentation.
The SQL standard defines several data types to model instances of times. But rarely does anything happen instantaneously. Often what we need to model is a period of time; for instance an airline flight, a hotel stay, a bus route, an employee’s shift, a doctor’s appointment, a leave request. And yet there is no standard defined to handle this. So instead we use some combination of start and end time, an anchor time and interval, or an anchor time and a number representing seconds, minutes or days.
Dr Snodgrass wrote quite a bit about temporal relations in databases and in the early 90′s there was an effort to add temporal extensions to the SQL standard. Unfortunately, that never made it out of committee. However, they laid a solid foundation that we can implement in an extensible database like Oracle and Postgres. You can read Dr Snodgrass’ work here.
Postgres already has a temporal extension which provides the period data type. The biggest problem with temporal is that it really needs to be a contrib module. If it were, it would be compiled and distributed with the server. To install it, users would only have to run a simple sql script. Instead, we have to compile from source. Its not terribly hard on Linux but not something Windows users are used to or typically have the tools and or know how to do.
So the first thing I did was implement the period data type as a composite type. It is a bit slower for some things and doesn’t have GiST support, but it is fully compatible with the temporal period and can be installed by running a simple sql script. I recommend using the temporal period type because of the GiST index. But the rest of the Chronos toolkit doesn’t care which period type you use.
I’m not going to go into great detail here about the period data type. Suffice it to say that it allows us to do really cool things when modelling a timespan. We can test for overlap, adjacency, and do set operations like intersect and union.
But that is only scratching the surface. Each period is a set or interval (the math interval not the database interval). We’ve got the basic functionality down, but we need to expand our thinking to sets of sets. Joe Celko is an opinionated a$$… in my opinion But one of his catch phrases is “thinking in sets.” And that is just what we are going to do here.
So think about this problem: an employee enters a leave request with a start and end time. In order to figure out how much leave he should be charged, we need to find his scheduled shifts, subtract any existing leave requests and holidays, figure out if the employee needs to take a lunch each day and then tally the leave taken on a day by day basis (since the employee could have split shifts or be working swing shift).
The image below illustrates some possible scenarios. We may have a single shift (inclusion) and a single leave request (exclusion). There could be multiple leave requests for a single shift. And there could be multiple shifts for a single request. So we need to take the set of inclusions and a set of exclusions and somehow come up with the result set.
How do we solve the problem? The correct answer is, “find a different project to work on, and leave this $*(! to someone else.” But if you happen to be like me and are more stubborn than you are smart, then we’ll have to work through this.
If your first response is to pop in to some procedural language and start walking cursors, then you are not “thinking in sets.” If you’ve taken a moment to think about how this problem might be solved, then you have figured out that doing this with only the base data types provided by the SQL spec would be inordinately difficult. Being able to treat each shift, leave request and day as a single unit makes the problem manageable. And being able to treat all shifts, all leave request and all days as arrays of periods makes the problem downright simple.
The Oracle implementation of Chronos is pretty much complete. The Postgres side isn’t quite production ready. But the next time you’re working on a project that involves scheduling, time tracking or calendaring, check out the Chronos temporal toolkit. And forget the cursors, think in sets… of sets.