
database - Implications of SSTable immutability in Cassandra for disk usage

Question:

According to:

http://www.datastax.com/docs/1.0/ddl/column_family#about-column-family-compression

The reason RDBMSs see a performance degradation as a result of compression is that the data being overwritten must be located on disk, decompressed, overwritten, and then recompressed. Cassandra, on the other hand, can see a performance increase for reads and writes because SSTables are immutable: no records are ever overwritten, so the overhead is much smaller than for a compressed RDBMS.

I'm wondering what the implications of this are over the long term, as a Cassandra data store continues to grow. It seems like the only consequence is an ever-growing need for more disk space. Is this correct?

Answer:

Periodically, Cassandra runs a compaction process on your existing SSTables. Compaction merges multiple SSTables into one new, larger SSTable, discarding obsolete data such as overwritten values. After compaction has finished, Cassandra will (eventually) delete the old SSTables.
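
To make the merge step concrete, here is a toy sketch in Python. It is not Cassandra's actual implementation: the `compact` function, the `(timestamp, value)` cell layout, and the use of `None` as a deletion marker are all assumptions made for illustration.

```python
# Toy model of compaction; NOT Cassandra's real code or on-disk format.
from typing import Dict, List, Optional, Tuple

# One cell: (write timestamp, value). A value of None stands in for a
# deletion marker (tombstone). This layout is an assumption.
Cell = Tuple[int, Optional[str]]
SSTable = Dict[str, Cell]  # treated as immutable once written


def compact(sstables: List[SSTable]) -> SSTable:
    """Merge several SSTables into one new table, newest cell per key wins."""
    merged: SSTable = {}
    for table in sstables:
        for key, cell in table.items():
            # Keep the cell with the highest timestamp for each key.
            if key not in merged or cell[0] > merged[key][0]:
                merged[key] = cell
    # Simplification: drop deletion markers immediately. Real Cassandra
    # keeps them for a grace period before purging them.
    return {k: c for k, c in merged.items() if c[1] is not None}


old = {"a": (1, "v1"), "b": (1, "v1"), "c": (1, "v1")}
new = {"a": (2, "v2"), "b": (3, None)}  # overwrite "a", delete "b"

result = compact([old, new])
print(result)  # {'a': (2, 'v2'), 'c': (1, 'v1')}
```

The input tables are never modified. The merged result is a new table holding only live data, and the two inputs can then be discarded, which is where the disk space is reclaimed.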

So if the size of your data set is stable, the total size of your SSTables will not grow without bound. The Cassandra wiki contains more information on compaction.
