sql

Before reading check out my previous post on (inheritance in SQL)[http://mycodesmells.com/post/how-to:-inheritance-in-sql].

We've taken a look at the ways to represent OOP inheritance in an SQL database: single table, multiple tables, and a reference table. We were using PostgreSQL for those examples, and this database allows us to use one more way of representing such a relation. Let's take a look on PostgreSQL's INHERITS.

INHERITS syntax

Defining parent and child tables using INHERITS syntax is as clear as it can be. First, we need to define all columns for the parent, then define only extra columns for a child, noting that it should inherit all columns from the first one:

CREATE TABLE player (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255),
    club VARCHAR(255),
    country CHAR(2)
);

CREATE TABLE nba_player (
    points_total INT,
) INHERITS (player);

We can verify their schemas:

dev=# \d+ player
                                                    Table "public.player"
 Column  |          Type          |                      Modifiers                      | Storage  | Stats target | Description 
---------+------------------------+-----------------------------------------------------+----------+--------------+-------------
 id      | integer                | not null default nextval('player_id_seq'::regclass) | plain    |              | 
 name    | character varying(255) |                                                     | extended |              | 
 club    | character varying(255) |                                                     | extended |              | 
 country | character(2)           |                                                     | extended |              | 
Indexes:
    "player_pkey" PRIMARY KEY, btree (id)
Child tables: nba_player

dev=# \d+ nba_player
                                                    Table "public.nba_player"
    Column    |          Type          |                      Modifiers                      | Storage  | Stats target | Description 
--------------+------------------------+-----------------------------------------------------+----------+--------------+-------------
 id           | integer                | not null default nextval('player_id_seq'::regclass) | plain    |              | 
 name         | character varying(255) |                                                     | extended |              | 
 club         | character varying(255) |                                                     | extended |              | 
 country      | character(2)           |                                                     | extended |              | 
 points_total | integer                |                                                     | plain    |              | 
Inherits: player

As you can see, everything is in place. Both tables even share a sequence responsible for setting an automatic id value, so that we won't have to worry about duplicate values in this column. Inserting data into each table is very simple, as long as we remember whether we add just a player, or an NBA player:

INSERT INTO player(name, club, country) VALUES('Cristiano Ronaldo', 'Real Madrid', 'PT');
INSERT INTO nba_player(name, club, country, points_total) VALUES('Pau Gasol', 'San Antonio Spurs', 'ES', 20001);

Now when we want to select all players, we get them from both tables:

# SELECT * FROM player;
 id |        name        |         club          | country 
----+--------------------+-----------------------+---------
  3 | Robert Lewandowski | Bayern Munich         | PL
  5 | Cristiano Ronaldo  | Real Madrid           | PT
  1 | LeBron James       | Cleveland Cavaliers   | US
  2 | Kevin Durant       | Golden State Warriors | US
  4 | Marcin Gortat      | Washington Wizards    | PL
  6 | Pau Gasol          | San Antonio Spurs     | ES
(6 rows)

We can obviously select all from nba_player table only, but we can also select all from player without including child tables using ONLY keyword:

# SELECT * from nba_player;
 id |     name      |         club          | country | points_total 
----+---------------+-----------------------+---------+--------------
  1 | LeBron James  | Cleveland Cavaliers   | US      |        28787
  2 | Kevin Durant  | Golden State Warriors | US      |        19121
  4 | Marcin Gortat | Washington Wizards    | PL      |         7028
  6 | Pau Gasol     | San Antonio Spurs     | ES      |        20001
(4 rows)


# select * from ONLY player;
 id |        name        |     club      | country 
----+--------------------+---------------+---------
  3 | Robert Lewandowski | Bayern Munich | PL
  5 | Cristiano Ronaldo  | Real Madrid   | PT
(2 rows)

What is also awesome about table inheritance is that all changes made using ALTER command are propagated to the child tables:

dev=# ALTER TABLE player RENAME COLUMN name TO full_name;
ALTER TABLE
dev=# \d+ player
                                                    Table "public.player"
 Column   |          Type          |                      Modifiers                      | Storage  | Stats target | Description 
-----------+------------------------+-----------------------------------------------------+----------+--------------+-------------
id        | integer                | not null default nextval('player_id_seq'::regclass) | plain    |              | 
full_name | character varying(255) |                                                     | extended |              | 
club      | character varying(255) |                                                     | extended |              | 
country   | character(2)           |                                                     | extended |              | 
Indexes:
    "player_pkey" PRIMARY KEY, btree (id)
Child tables: nba_player

dev=# \d+ nba_player
                                                    Table "public.nba_player"
    Column    |          Type          |                      Modifiers                      | Storage  | Stats target | Description 
--------------+------------------------+-----------------------------------------------------+----------+--------------+-------------
id           | integer                | not null default nextval('player_id_seq'::regclass) | plain    |              | 
full_name    | character varying(255) |                                                     | extended |              | 
club         | character varying(255) |                                                     | extended |              | 
country      | character(2)           |                                                     | extended |              | 
points_total | integer                |                                                     | plain    |              | 
Inherits: player

Downsides

Unfortunately, PostgreSQL is not as perfect as it may seem. There are a few problems we must be aware of.

Not sure if UNIQUE

There is no (easy at least) way to make sure a column is UNIQUE across both player and nba_player. If we set such a constraint on name in player we can still add another nba_player with the same name:

dev=# INSERT INTO player(name, club, country) VALUES('Pau Gasol', 'San Antonio Spurs', 'ES');
INSERT 0 1
dev=# INSERT INTO player(name, club, country) VALUES('Pau Gasol', 'San Antonio Spurs', 'ES');
ERROR:  duplicate key value violates unique constraint "player_name_key"
DETAIL:  Key (name)=(Pau Gasol) already exists.
dev=# INSERT INTO nba_player(name, club, country, points_total) VALUES('Pau Gasol', 'San Antonio Spurs', 'ES', 20001);
INSERT 0 1

We can add another UNIQUE constraint on name column in nba_player (which would require us to define it in both tables), but it would still work only within nba_player and we could have the same name value in for one row in each table.

References not propagated

If we add a REFERENCES constraint to one of the base columns, it is not propagated to the child tables:

CREATE TABLE country (
    code CHAR(2) UNIQUE,
    name VARCHAR(80)
);

CREATE TABLE player (
    id SERIAL PRIMARY KEY,
    name VARCHAR(255) UNIQUE,
    club VARCHAR(255),
    country CHAR(2) REFERENCES country (code)
);

CREATE TABLE nba_player (
    points_total INT,
    country CHAR(2)
) INHERITS (player);

Now we cannot insert a player with a non existing country.code, but we can do that for nba_player:

dev=# INSERT INTO player(name, club, country) VALUES('Cristiano Ronaldo', 'Real Madrid', 'PT');
ERROR:  insert or update on table "player" violates foreign key constraint "player_country_fkey"
DETAIL:  Key (country)=(PT) is not present in table "country".
dev=# INSERT INTO nba_player(name, club, country, points_total) VALUES('Pau Gasol', 'San Antonio Spurs', 'ES', 20001);
INSERT 0 1

Fortunately, there is a simple workaround for that, just redefine country column in the child table and add the REFERENCES constraint there as well.

Being referenced

The last problem we might be facing is having a parent table referenced in another table. If we create a new table called best_player which references a name column from player table:

CREATE TABLE best_player (
    club VARCHAR(255) PRIMARY KEY,
    player VARCHAR(255) REFERENCES player(name)
);

Then we would expect to work with both player and nba_player rows. Unfortunately, we'll be disappointed:

dev=# INSERT INTO best_player(club, player) VALUES('Bayern Munich', 'Robert Lewandowski');
INSERT 0 1
                                    ^
dev=# INSERT INTO best_player(club, player) VALUES('Cleveland Cavaliers', 'LeBron James');
ERROR:  insert or update on table "best_player" violates foreign key constraint "best_player_player_fkey"
DETAIL:  Key (player)=(LeBron James) is not present in table "player".

The worst thing is there is no workaround for this.

Summary

As you can see, PostgreSQL allows us to handle SQL inheritance in yet another way. There are some pros and cons of this approach, so if you are positive that those problems will not hound you in the future, you should give INHERITS syntax a try.