I have two tables A and B.
Which design is preferable?
(All in one)
Table A: Article ID || Category || Sub-Category
Table A: Category || Sub - Category
Table B: Article ID || Sub - Category
Select article_id from tableA where article id = foo and
Category = bar and sub category = baz;
Select article_id from tableB inner join tableA
where tableA.sub-category = tableB.sub-category and tableA.category = Category;
ALL in ONE is direct to the point but SEPARATED is much more clean.
Which is faster and more recommended?
The first version is storing all the hierarchy information in a single record
The second version is pointing to the lowest level of the hierarchy and then referring through that to the higher level(s).
In general, a more normalized approach (the second approach) is the more "natural" way of expressing such a relationship. For example, what you are calling a "subcategory" could be a "product" and the "category" could be attributes of a "product". It makes a lot of sense to store the product in a separate table.
There is (at least) one situation when you do not want to do this. Sometimes, the relationships between the category and the subcategory change over time and you want to maintain the relationship at a given point in time. This is called a slowly changing dimension. In this case, you would want to capture all the information about the subcategory and category in a single record.
In other words, it is impossible to say which design is preferable overall. Typically, the second method (more normalized) solves more business problems. There are some circumstances where the first might be more appealing.
For the example query you show, it would likely be faster if you use a single table, and define a multi-column index over the three columns (
But keep in mind that you may want to run another query later against the same table(s), and it would benefit from a different organization and indexes. We decide on performance optimizations based on the queries, not the tables. So it pays to do some analysis about all the ways you will query the data.
PS: There is no
== operator in SQL.
please do not optimize prematurely. start with normalized tables or provide more real info!
If your categorization is so simple and so basic. i.e every article could be found in only one category and there is no hierarchy structure for your categories, you may use the first design of the one table. Otherwise, you have to use another design.