ExpertRefresh

SQL / EAV  

1) In EAV design




2) When using class table inheritance solution







Context

How do I count the number of rows by date? This is a simple task for a database programmer. SELECT date_reported, COUNT(*) FROM Bugs GROUP BY date_reported; However, this assumes two things: values are stored in the same column, and values can be compared to one another. What if the date is stored in the date_reported or report_date column? You may encounter these problems and others when you employ the antipattern known as Entity-Attribute-Value.

Antipattern: Attribute Table

The solution that appeals to some programmers is to create a second table, storing attributes as rows. CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY ); INSERT INTO Issues (issue_id) VALUES (1234); CREATE TABLE IssueAttributes ( issue_id BIGINT UNSIGNED NOT NULL, attr_name VARCHAR(100) NOT NULL, attr_value VARCHAR(100), PRIMARY KEY (issue_id, attr_name), FOREIGN KEY (issue_id) REFERENCES Issues(issue_id) ); This appears to be an improved design. However, the simple database structure doesn’t make up for the difficulty of using it.

Problems

Your boss needs to run a report of the bugs reported per day. In a conventional table design, the Issues table would have a simple attribute column such as date_reported. SELECT issue_id, date_reported FROM Issues; To get the same information using the EAV design, the query is more verbose and less clear. SELECT issue_id, attr_value AS 'date_reported' FROM IssueAttributes WHERE attr_name = 'date_reported'; When you use EAV, you sacrifice many advantages that a conventional database design would have given you. You can't make mandatory attributes. In a conventional database design, it would be simple to enforce a mandatory column by declaring the column NOT NULL. You can't use SQL data types. In a conventional database, you can prevent this if you declared the column with the DATE data type. INSERT INTO Issues (date_reported) VALUES ('banana'); -- ERROR! Some people try to extend the EAV design by defining a separate attr_value column for each SQL data type, leaving null in the unused columns. You can't enforce referential integrity. In a conventional database, you can restrict the range of some attributes by defining a foreign key to a lookup table.

Solution: Class table Inheritance

Mimics inheritance, as though tables were object-oriented classes. Create a single table for the base type, containing attributes common to all subtypes. Then create another table for each subtype. CREATE TABLE Issues ( issue_id SERIAL PRIMARY KEY, reported_by BIGINT UNSIGNED NOT NULL, product_id BIGINT UNSIGNED, priority VARCHAR(20), version_resolved VARCHAR(20), status VARCHAR(20), FOREIGN KEY (reported_by) REFERENCES Accounts(account_id), FOREIGN KEY (product_id) REFERENCES Products(product_id) ); CREATE TABLE Bugs ( issue_id BIGINT UNSIGNED PRIMARY KEY, severity VARCHAR(20), version_affected VARCHAR(20), FOREIGN KEY (issue_id) REFERENCES Issues(issue_id) ); CREATE TABLE FeatureRequests ( issue_id BIGINT UNSIGNED PRIMARY KEY, sponsor VARCHAR(50), FOREIGN KEY (issue_id) REFERENCES Issues(issue_id) ); If you have many subtypes or if you must support new attributes frequently, you can add a BLOB column to store data in a format such as XML or JSON, which encodes both the attribute names and their values.

Stuck

Unfortunately, sometimes you’re stuck with the EAV design, such as if you inherited a project and can’t change it. If this is the case, familiarize yourself with the trouble areas. Above all, don’t try to write queries that fetch entities as a single row as though data were stored in a conventional table.


References