当前位置: 动力学知识库 > 问答 > 编程问答 >

c - If all data were put into memory, what's the fastest way to do a "SELECT ...WHERE ..." thing?

问题描述:

If all data were put into memory, which means the media speed is much more faster, what's the fastest way to do a "SELECT .. WHERE .." query (filter data)? So far the options in my mind:

1) b tree like algorithms, but it may still need index and larger space

2) fixed length array, smaller size but may be slower.

So are there any other better ways, if both speed and size are the concerns

网友答案:

It is dependent on the specific case you have - what operations you need fast, what is the exact size, and more. Some examples:

  1. For AND queries, a set of sorted lists is usually maintained (a list for each feature). This data structure is called an inverted index, and is used often by search engines to get the relevant documents from a given query. (Apache Lucene uses this data structure, for example).
  2. If arrays can be used - and iteration over the data is needed - it is a very efficient approach, since arrays are basically the most cache efficient data structure there is. Reading sequentially from an array is much faster in most cases then any other DS, since it gets you the fewest "hit misses", which are often the bottle neck when iterating your data.
  3. If your data is strings for example, and you are going to filter according to some string attributes (prefix for example) using a designed data structure for strings, such as a trie or a radix tree - might get you the best performance.

Buttom line: If you are going to do something custom made in order to enhance performance of the default libraries, you should consider the specific problem details before designing your data structure of choice.

分享给朋友:
您可能感兴趣的文章:
随机阅读: