当前位置: 动力学知识库 > 问答 > 编程问答 >

Compact vs Loose Database

问题描述:

I have two tables A and B.

Which design is preferable?

(All in one)

Table A: Article ID || Category || Sub-Category

or (Separated)

Table A: Category || Sub - Category

Table B: Article ID || Sub - Category

Querying ALLINONE:

Select article_id from tableA where article id = foo and

Category = bar and sub category = baz;

Querying SEPARATED:

Select article_id from tableB inner join tableA

where tableA.sub-category = tableB.sub-category and tableA.category = Category;

ALL in ONE is direct to the point but SEPARATED is much more clean.

Which is faster and more recommended?

网友答案:

The first version is storing all the hierarchy information in a single record

The second version is pointing to the lowest level of the hierarchy and then referring through that to the higher level(s).

In general, a more normalized approach (the second approach) is the more "natural" way of expressing such a relationship. For example, what you are calling a "subcategory" could be a "product" and the "category" could be attributes of a "product". It makes a lot of sense to store the product in a separate table.

There is (at least) one situation when you do not want to do this. Sometimes, the relationships between the category and the subcategory change over time and you want to maintain the relationship at a given point in time. This is called a slowly changing dimension. In this case, you would want to capture all the information about the subcategory and category in a single record.

In other words, it is impossible to say which design is preferable overall. Typically, the second method (more normalized) solves more business problems. There are some circumstances where the first might be more appealing.

网友答案:

For the example query you show, it would likely be faster if you use a single table, and define a multi-column index over the three columns (article_id, category, subcategory).

But keep in mind that you may want to run another query later against the same table(s), and it would benefit from a different organization and indexes. We decide on performance optimizations based on the queries, not the tables. So it pays to do some analysis about all the ways you will query the data.

PS: There is no == operator in SQL.

网友答案:

please do not optimize prematurely. start with normalized tables or provide more real info!

网友答案:

If your categorization is so simple and so basic. i.e every article could be found in only one category and there is no hierarchy structure for your categories, you may use the first design of the one table. Otherwise, you have to use another design.

分享给朋友:
您可能感兴趣的文章:
随机阅读: