Memory Refresh
SQLAntipatterns



Clone tables





Context: Sales division needed to break down revenew by year

In this way, they could track recently active customers. They decided to add a series of new columns, each column's name indicating the year it covered. CREATE TABLE Customers ( customer_id NUMBER(9) PRIMARY KEY, contact_info VARCHAR(255), business_type VARCHAR(20), revenue NUMBER(9,2) ); ALTER TABLE Customers ADD (revenue2002 NUMBER(9,2)); ALTER TABLE Customers ADD (revenue2003 NUMBER(9,2)); ALTER TABLE Customers ADD (revenue2004 NUMBER(9,2)); Each year, they needed to add one more column. A database administrator was responsible for managing Oracle’s tablespaces. So each year, they had to have a series of meetings, schedule a data migration to restructure the tablespace, and add the new column. Ultimately they wasted a lot of time and money. Performance degrades for any database query as the volume of data goes up. Using indexes intelligently helps, but nevertheless the tables grow.

Antipattern: Clone Tables or Columns

In the television series Star Trek, tribbles are small furry animals kept as pets. Tribbles are very appealing at first, but soon they reveal their tendency to reproduce out of control, and managing the overpopulation of tribbles becomes a serious problem. We know from experience that querying a table with few rows is quicker than querying a table with many rows, all other things being equal. This leads to a common error that we must make every table contain fewer rows, no matter what. You discover that some 2010 bugs were entered in the Bugs_2009 by mistake. There’s no way to limit the data relative to the name of its table automatically, but you can declare a CHECK constraint in each of your tables. CREATE TABLE Bugs_2009 ( -- other columns date_reported DATE CHECK (EXTRACT(YEAR FROM date_reported) = 2009) ); If a dependent table like Comments references Bugs, the dependent table cannot declare a foreign key. A foreign key must specify a single table, but in this case the parent table is split into many. Legitimate Uses of the Antipattern: One good use of manually splitting tables is for archiving.

Solution: Partition and Normalize

There are better ways to improve performance if a table gets too large, instead of splitting the table manually. These include horizontal partitioning, vertical partitioning, and using dependent tables. Physically, the table is split, but you can still execute SQL statements against the table as though it were whole. Partitioning is not defined in the SQL standard. Nevertheless, some form of partitioning is now supported by every major brand of database. CREATE TABLE Bugs ( bug_id SERIAL PRIMARY KEY, -- other columns date_reported DATE ) PARTITION BY HASH ( YEAR(date_reported) ) PARTITIONS 4; Whereas horizontal partitioning splits a table by rows, vertical partitioning splits a table by columns. Splitting a table by columns can have advantages when some columns are bulky or seldom needed. BLOB and TEXT columns have variable size, and they may be very large. Many database brands automatically store columns with these data types separately from the other columns of a given row. This could lead to performance problems if you're in the habit of retrieving all columns using the * wildcard.