Comments

Context

Readers can add comments and even reply to each other. You choose a simple solution to track these reply: parent_id. This design is called Adjacent List. But it's hard to retrieve a long chain of replies in a single SQL query.
 
CREATE TABLE Comments (
    comment_id SERIAL PRIMARY KEY,
    parent_id BIGINT UNSIGNED,
    bug_id BIGINT UNSIGNED NOT NULL,
    author BIGINT UNSIGNED NOT NULL,
    comment_date DATETIME NOT NULL,
    comment TEXT NOT NULL,
    FOREIGN KEY (parent_id) REFERENCES Comments(comment_id),
    FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id),
    FOREIGN KEY (author) REFERENCES Accounts(account_id)
);    

Antipattern

In Adjacent List design you cannot get all descendants. It can't be an antipattern when it’s the default choice of so many developers. Yet it fails to be a solution for one of the most common tasks. You cannot query all descendants. You can retrieve a comment and its immediate children using.
 
SELECT c1.*, c2.*
FROM Comments c1 LEFT OUTER JOIN Comments c2
    ON c2.parent_id = c1.comment_id;    
The following query retrieves a tree of depth up to four. This makes it hard to compute an aggregate such as COUNT( ).
 
SELECT c1.*, c2.*, c3.*, c4.*
FROM Comments c1 -- 1st level
    LEFT OUTER JOIN Comments c2
        ON c2.parent_id = c1.comment_id -- 2nd level
    LEFT OUTER JOIN Comments c3
        ON c3.parent_id = c2.comment_id -- 3rd level
    LEFT OUTER JOIN Comments c4
        ON c4.parent_id = c3.comment_id; -- 4th level    

Don’t over-engineer

For example, an inventory application. It never needed to create a grouping of equipment with a tree deeper than one.

Adjacent List

Solution 1

A UNIX path like /usr/local/lib/ is a Path Enumeration of the filesystem.
 
CREATE TABLE Comments (
    comment_id SERIAL PRIMARY KEY,
    path VARCHAR(1000),
    bug_id BIGINT UNSIGNED NOT NULL,
    author BIGINT UNSIGNED NOT NULL,
    comment_date DATETIME NOT NULL,
    comment TEXT NOT NULL,
    FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id),
    FOREIGN KEY (author) REFERENCES Accounts(account_id)
);
For example, to find ancestors of comment #7, whose path is 1/4/6/7/, do this:
 
SELECT *
FROM Comments AS c
WHERE '1/4/6/7/' LIKE c.path || '%';        

Closure Table

Solution 2

Table TreePaths, with two columns, each of which is a foreign key to the Comments.
 
CREATE TABLE Comments (
    comment_id SERIAL PRIMARY KEY,
    bug_id BIGINT UNSIGNED NOT NULL,
    author BIGINT UNSIGNED NOT NULL,
    comment_date DATETIME NOT NULL,
    comment TEXT NOT NULL,
    FOREIGN KEY (bug_id) REFERENCES Bugs(bug_id),
    FOREIGN KEY (author) REFERENCES Accounts(account_id)
); 
 
CREATE TABLE TreePaths (
    ancestor BIGINT UNSIGNED NOT NULL,
    descendant BIGINT UNSIGNED NOT NULL,
    PRIMARY KEY(ancestor, descendant),
    FOREIGN KEY (ancestor) REFERENCES Comments(comment_id),
    FOREIGN KEY (descendant) REFERENCES Comments(comment_id)
);    
Store one row in this table for each pair of nodes. Also add a row for each node to reference itself. To retrieve descendants of comment #4, match rows in TreePaths.
 
SELECT c.*
FROM Comments AS c
       JOIN TreePaths AS t ON c.comment_id = t.descendant
WHERE t.ancestor = 4;    
To insert a new leaf node, first insert the self-referencing row.
 
INSERT INTO TreePaths (ancestor, descendant)
    SELECT t.ancestor, 8
    FROM TreePaths AS t
WHERE t.descendant = 5
    UNION ALL
    SELECT 8, 8;    
To delete a complete subtree.
 
DELETE FROM TreePaths
WHERE descendant IN (SELECT descendant
       FROM TreePaths
       WHERE ancestor = 4);    






Questions and answers:
Clink on Option to Answer




1. What antipattern uses parent_id?

  • a) Adjacent List
  • b) Path Enumeration

2. What's the main problem when using parent_id?

  • a) You can't retrieve all descendants
  • b) It's hard to retrieve its immediate children

3. Closure table solution

  • a) uses another table with ancestor / descendant
  • b) uses ancestor within the current table

4. In Closure table design

  • a) you need to add a row for each node to reference itself
  • b) you need to add a row for each ancestor and descendant


References: