Timespan – Creating custom data types in PostgreSQL

19 05 2009

Working with time has always been the bane of database development. And one that it seems every DMBS has implemented differently. Postges and MySQL have a DATE type that really just stores the date. Oracle also has a DATE type but it stores time to the second. The equivalent in MySQL is called DATETIME. Postgres doesn’t have a DATETIME, instead you have to use TIMESTAMP(0) (technically TIMESTAMP WITH OUT TIME ZONE, but I like the shorthand personally) And while Oracle and Postgres can handle any reasonable timestamp, MySQL’s TIMESTAMP is limited to the range of the Unix timestamp. Of course MS SQL Server deserves honorable mention as having the absolute worst handling of dates and times.

So after writing dozens of applications that deal with time, I (finally) had an epiphany. Most often I am modeling a span of time with start time and end time or start time and duration. And while a timestamp is ideal for storing a specific point in time (creation time, last login); it is less ideal for recording spans (employee shift, appointment, trip time). And depending on the application (and even the table) some times I would model it as a start time and end time, some times as a start time and duration in minutes or seconds, some times as a start time and interval.

But since Postgres and Oracle make it easy to create custom data types, lets simplify things by bundling the start and end times into a single timespan type. For now we’ll implement it in Postgres and later I’ll show it in Oracle.

There are several ways we could do this in Postgres, but lets stay away from creating types in C. Actually, we don’t even need to use pl/pgsql, as using straight sql seems to perform faster and has the power to do what we need.

CREATE TYPE timespan AS(
start_time  timestamp(0),
end_time    timestamp(0)
);

That was easy. Lets make a few constructors:

CREATE OR REPLACE FUNCTION timespan(
timestamp,
timestamp
) RETURNS timespan AS
$$
SELECT (LEAST($1,$2), GREATEST($1,$2))::timespan;
$$
LANGUAGE 'sql' IMMUTABLE STRICT
COST 1;

-- For time and duration in seconds
CREATE OR REPLACE FUNCTION timespan(
timestamp,
numeric
) RETURNS timespan AS
$$
SELECT CASE WHEN $2 >= 0 THEN
($1, $1 + ($2 || ' seconds')::interval)::timespan
WHEN $2 < 0 THEN
($1 - (ABS($2) || ' seconds')::interval, $1)::timespan
END;
$$
LANGUAGE 'sql' IMMUTABLE STRICT
COST 1;

-- For time and duration as interval
CREATE OR REPLACE FUNCTION timespan(
timestamp,
interval
) RETURNS timespan AS
$$
SELECT CASE WHEN $2 >= interval '0 seconds' THEN
($1, $1 + $2)::timespan
WHEN $2 < interval '0 seconds' THEN
timespan($1 + $2, $1)::timespan END;
$$
LANGUAGE 'sql' IMMUTABLE STRICT
COST 1;

Ok, no wizardry there either. The one thing to note is that we are ensuring that the start time actually does come before the end_time. But the purpose of doing all of this is so we can create functions that operate on timespans and determine if one contains another, how much they overlap, or get the intersection of two timespans. For those who have not used compound data types in SQL, we need to wrap it in parens before accessing an attribute or member function. Everything else should be straight forward.

CREATE OR REPLACE FUNCTION timespan_contained(timespan, timespan)
RETURNS boolean AS
$$
SELECT ($1).start_time >= ($2).start_time AND ($1).end_time <= ($2).end_time;
$$
LANGUAGE 'sql' IMMUTABLE;

CREATE OR REPLACE FUNCTION timespan_contains(timespan, timespan)
RETURNS boolean AS
$$
SELECT ($1).start_time <= ($2).start_time AND ($1).end_time >= ($2).end_time;
$$
LANGUAGE 'sql' IMMUTABLE;

CREATE OR REPLACE FUNCTION timespan_intersect(timespan, timespan)
RETURNS timespan AS
$$
SELECT CASE WHEN ($1).start_time <= ($2).end_time AND ($1).end_time >= ($2).start_time
THEN timespan(GREATEST(($1).start_time, ($2).end_time), LEAST(($1).end_time, ($2).end_time))
ELSE NULL END;
$$
LANGUAGE 'sql' IMMUTABLE;

CREATE OR REPLACE FUNCTION timespan_overlap(
timespan,
timespan
) RETURNS interval AS
$$
SELECT CASE WHEN ($1).start_time <= ($2).end_time AND ($1).end_time >= ($2).start_time
THEN LEAST(($1).end_time, ($2).end_time) - GREATEST(($1).start_time, ($2).start_time)
ELSE '0 seconds'::interval END;
$$
LANGUAGE 'sql' IMMUTABLE;

CREATE OR REPLACE FUNCTION to_interval(timespan)
RETURNS interval AS
$$
SELECT ($1).end_time - ($1).start_time;
$$
LANGUAGE 'sql' IMMUTABLE;

Finally lets add a few operators:

-- contains
CREATE OPERATOR @> (
LEFTARG = timespan,
RIGHTARG = timespan,
PROCEDURE = timespan_contains
);

-- is contained by
CREATE OPERATOR <@ (
LEFTARG = timespan,
RIGHTARG = timespan,
PROCEDURE = timespan_contained
);

-- overlap
CREATE OPERATOR && (
LEFTARG = timespan,
RIGHTARG = timespan,
PROCEDURE = timespan_overlap
);

So you are saying, “that’s a cool concept. But…” And the but’s may be that your schema has already been set in stone, or that you don’t feel comfortable storing your mission critical data in a non-standard data type. Those are perfectly good arguments. So what I did was create views that casts the time representations into timespans. Now its dead easy to do things that would otherwise be quite difficult.

* UPDATE*

So this video on temporal data in Postgres, made mention of a PgFoundry project called Temporal PostgreSQL.  The audio feed drops from the video after the first 10 minutes or so.  And it is not clear what the status of the temporal postgres project is. But you may want to check it out if you are interested in a C implementation instead.

About these ads

Actions

Information

3 responses

21 05 2009
Postgres OnLine Journal

Creating instance of custom type…

Someone asked me this recently and not playing with custom types much, I’m not sure this is
the best solution. Anyrate here it goes. Note this works for PostgreSQL 8.2 and above. Note sure about 8.1 etc.
I assume it probably does.

Let us say y…

8 04 2010
Anonymous Coward

I’d like to create a base type using a function written in PL/pgSQL rather than C. Some examples on the internet show it, but it doesn’t work.

21 05 2012
JoeB

One of the things that is happens when you know a lot is that you don’t know how much you know (and usually start understanding how much you don’t know… but that is another story). You obviously know a lot about custom data types. And although I know a lot about SQL, I have no clue what you just did. Other than based on what you said at the beginning it has something to do with a time duration data type. You didn’t explain anything about what you did… you just said, “hey, that was easy.” Everything is easy to those who know how. I wouldn’t have looked at your page if I knew how. I won’t look at any more of your pages because when I finished, I still didn’t know how. Books don’t just examples because examples without explanations are useless. Take from that what you will.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s




Follow

Get every new post delivered to your Inbox.

%d bloggers like this: