Memory Refresh
SQLAntipatterns



Denormalization





Context: Bug tracking application for products

CREATE TABLE Products ( product_id SERIAL PRIMARY KEY, ... account_id BIGINT UNSIGNED, -- many-to-one relationship FOREIGN KEY (account_id) REFERENCES Accounts(account_id) ); Your original design allowed only one user to be the contact for each product. Next, you were requested to support assigning multiple users. Simple, change the database to store a list of user account identifiers separated by commas.

Antipattern: Comma separated list values

CREATE TABLE Products ( product_id SERIAL PRIMARY KEY, product_name VARCHAR(1000), account_id VARCHAR(100), -- comma-separated list ); SELECT * FROM Products WHERE account_id REGEXP '[[:<:]]12[[:>:]]'; Soon your boss approaches: "They tell me they can add five people only. If they try to add more, they get an error." Programmers commonly use comma-separated lists to avoid creating an intersection table for a many-to-many relationship. Pattern-matching expressions may return false matches and can’t benefit from indexes. You might improve performance for some kinds of queries by applying denormalization to your database organization. Be conservative if you decide to employ denormalization. Start by using a normalized database organization.

Solution: Intersection table

CREATE TABLE Contacts ( product_id BIGINT UNSIGNED NOT NULL, account_id BIGINT UNSIGNED NOT NULL, PRIMARY KEY (product_id, account_id), FOREIGN KEY (product_id) REFERENCES Products(product_id), FOREIGN KEY (account_id) REFERENCES Accounts(account_id) ); SELECT p.* FROM Products AS p JOIN Contacts AS c ON (p.account_id = c.account_id) WHERE c.account_id = 34; When the table has foreign keys referencing two tables, it's called an intersection table. Some people resist queries that contain a join, thinking that they perform poorly. However, this query uses indexes much better.