Scheduling with Date Mod Part 1

19 10 2009

One question that seems to come up a lot is how to model recurring schedules. That’s understandable, because schedules can get out of hand pretty quickly.

We need a template so that we can define a schedule and then some how project it out on a repeating basis. The single week schedules are pretty easy. We just record the day of week and for each schedule and day we look up that day of week to get the shift for that day.

But life is rarely that simple. Chances are that some employees will be working on a one week cycle while others are on a two or four (or more) week schedule.

This tutorial will be using data types for Postgres, but the concepts here can be applied to other databases. That said, we’ll be using tools provided by Chronos, a temporal toolkit for Oracle and Postgres. So if you don’t have that, you’ll have to do some of the heavy lifting yourself.

So assuming you’ve got Chronos already installed, lets get started. The first thing we need is a calendar table. In data warehousing, we almost always have a date dimension table. It is less common to find calendar tables in OLTP databases. But for any kind of scheduling application, a calendar table is invaluable.


CREATE TABLE info_calendar(
cal_date DATE PRIMARY KEY,
cal_period PERIOD,
wk_idx INT2,
wk_nidx INT2
);
--
-- Populate it from 2005 to 2015
-- You'll want a wider range for an actual application
--
INSERT INTO info_calendar (cal_date, cal_period, wk_idx, wk_nidx)
SELECT dt, period(dt), week_of_month(dt, 1), week_of_month(dt, -1)
FROM (
SELECT DATE '2005-01-01' + i AS dt
FROM generate_series(0, 3652) i
) sub;

So a couple things to point out here. One is to point out that a date is a special type of period that always starts at midnight and is 24 hours long. So we will convert each date to a period.

We used the week_of_month() function from Chronos to get week index from the beginning and the end of the month. We wont use it in this tutorial, but later we’ll use so we can schedule recurring meetings on the 3rd Tuesday of the month. For now lets just run a sample query to find all of the Thanksgivings in our calendar. Thanksgiving (in the US) is the 4th Thursday of November.

SELECT c.cal_date
FROM info_calendar c
WHERE EXTRACT(MONTH FROM cal_date) = 11
AND EXTRACT(DOW FROM cal_date) = 4
AND wk_idx = 4;

To be able to model these advanced schedules we’ll need to know how often they repeat, and we need an anchor date. In the next article, we’ll see how to put these pieces together.

About these ads

Actions

Information

14 responses

20 10 2009
greg

you skipped column list in INSERT, which is a really bad practice.
Also, this doesn’t really show usage of cal_period PERIOD

20 10 2009
Scott Bailey

Greg,

You are correct on both counts. On the column list, its not a big deal for a one off query. When building a DML interface you should ALWAYS include the column list. But perhaps all of my readers don’t know this and I should be leading by example. So I changed it.

Regarding the period column, we need a couple more pieces of the puzzle in order to show this off. There is more coming, hence the “part 1.”

20 10 2009
greg

oh, yes – sorry – I overlooked the part that says ‘part one’ in title.
Looking forward to that. I am very interested in applying your toolbox in future, as so far it is been all ‘manual labour’ here.

20 10 2009
Richard Huxton

“One is to point out that a date is a special type of period that always starts at midnight and is 24 hours long.”

Well, 23 to 25 hours long. Unless some country somewhere has a daylight-savings adjustment that overlaps with leap seconds in which case you’ll want to allow for up to 25 hours + 1 second.

All this is provided you aren’t spanning two or more locations or changing location of course. Ah, time and date handling! It’s right up there with locales in terms of oddities to handle.

Looking forward to more details in chapter 2.

20 10 2009
Jeff Davis

Is there a way to manipulate these schedules without trying to materialize all of the instances between two given years?

If we can represent a schedule in a more compact way, like “the 4th Thursday in every November”, I would think that we could avoid materialization.

20 10 2009
Scott Bailey

Jeff,

You absolutely can. You don’t HAVE to materialize a calendar table. But I STRONGLY recommend that you do. And I’m not alone there. Celko would tell you that you are a blathering idiot if you didn’t have a date relation. And many other authors who cover temporal topics typically start out by building a date relation. Having all of the dates available and INDEXED will greatly simplify your queries.

At a minimum, you should have a table with a single date column. But other columns that may prove useful are day_of_year, week_of_year, day_of_week, day_code, and quarter. In Postgres and Oracle those facts are pretty easy to get to. But in others (especially SQL Server) they are really hard. So I’ll typically create the table in Postgres and import it into SQL Server or MySQL.

FWIW creating the above table ended up using 408KB of disk space.

20 10 2009
Jeff Davis

I’m still a little skeptical, but I will take that into consideration.

My primary objection is that we are taking a nice, compact formula that is cheap to perform calculations on, and then materializing a part of it.

What about SRFs, like you’re using to populate the table to begin with?

20 10 2009
Scott Bailey

Jeff,

That’s a valid objection. Scanning the table for all Thanksgivings and running the equivalent query over the results of a SRF performed pretty evenly on my desktop. It was 0.9ms on the table and 3 ms for the function. But we’re splitting hairs here. There is little performance difference and 408KB is irrelevant. So it really comes down to a matter of personal preference. If you feel more comfortable using a table, do that; and if you want to create a SRF to generate dates that will work fine too.

Assuming you are using Postgres, most other databases will only have the table option.

21 10 2009
Scott Bailey

Jeff,

I created a SRF to generate dates and put it on the snippets wiki page.
http://wiki.postgresql.org/wiki/Date_Generator

20 10 2009
Scott Bailey

Richard,

Days are always 24 hours long but some times our timezones change. The timestamptz type is designed to handle that for us, so you’ll end up with the right answer. So for me, date 2009-11-01 would get converted to period [2009-11-01 00:00:00-07, 2009-11-02 00:00:00-08).

But you are correct, If you are going to build an application that needs to schedule employees across multiple timezones, then you would build the day periods on the fly in a view. I’m not going to address scheduling across multiple timezones in this series. But I can address it in a future series if there is enough interest. Honestly, if you belong to a mega corporation and need to schedule enterprise wide, you are probably going to purchase software for 6 or 7 figures anyhow.

Leap seconds are inserted/removed in either the last day of June or December and wouldn’t happen concurrently with a daylight savings time change which will happen some time in the spring or fall. And in any case, Postgres (and every other RDBMS that I work with) ignores the existence of leap seconds.

But you drive home a point that I often try to make. If you don’t think time is one of the most complicated things you’ve ever tried to model, then you haven’t given it enough thought.

21 10 2009
Richard Huxton

“Days are always 24 hours long but some times our timezones change. The timestamptz type is designed to handle that for us, so you’ll end up with the right answer. So for me, date 2009-11-01 would get converted to period [2009-11-01 00:00:00-07, 2009-11-02 00:00:00-08).”

As long as you are modelling nothing smaller than a day you can *pretend* that they are 24 hours, but they aren’t really. On 1st Nov with you, and 25th Oct here in London the day will be 25 hours long:

SELECT extract(epoch from
(’2009-11-02 00:00:00-08′::timestamptz – ’2009-11-01 00:00:00-07′::timestamptz)
)/3600 AS hours;

hours
——-
25

“But you drive home a point that I often try to make. If you don’t think time is one of the most complicated things you’ve ever tried to model, then you haven’t given it enough thought.”

No arguments there.

23 10 2009
Scott Bailey

Richard,

Everyone is well aware of what happens when we go on/off daylight savings time. The only “pretending” here is you pretending not to hear me when I told you that when you convert a date to a period it will correctly handle the timezone change. Either take my word for it, test it for yourself or assume I’m an idiot an move on. But I’m not going to entertain an argument with you.

That said, if the Earth ever decides to take 25 hours to complete a rotation, PLEASE contact me right away! Although when that happens, I doubt the database will be the biggest of our problems.

20 10 2009
Scheduling with Date Mod Part 2 « Flex and Specs()

[...] 20 10 2009 So as the title suggests, this is the second part of a series on scheduling. In the first part we did some basic setup. Now we are ready to dig in. The goal here is to be able to model any [...]

6 05 2014
agen betting bola

Pretty nice post. I just stumbled upon your blog and wished to say that
I have truly enjoyed surfing around your blog posts. In any case I will be subscribing to your rss feed and I hope you write again soon!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

%d bloggers like this: