
performance - c++: Select a subset of a std::vector, based on predefined element indices

Question:

I am looking for an efficient way of either trimming an existing std::vector or copying a subset of it. An element belongs in the subset (or is kept) if its index appears in a separate, predefined std::vector.

e.g. std::vector<std::string> Test = { "A", "B", "C", "D", "E" };

std::vector<int> SelectionV = { 1, 2, 5 };

Result = { "A", "B", "E" }  (the indices in this example count from 1)

I will be doing this on a very large vector, and probably on a regular basis, so I am looking for as efficient a method as possible.

An alternative I am also considering, though again I am unsure how to do it efficiently:

Test is filled from another object (in my case a third-party one) in a single pass using an iterator; no direct element access is possible. I was wondering whether I could instead add to Test only those elements whose position in that pass appears in SelectionV.

e.g.

int count = 0;
for (it = Iterator.begin(); it != Iterator.end(); ++it, ++count) {
    if (count is a number contained in SelectionV)
        add *it to Test
}

But I assume that would mean a pass through SelectionV on every iteration, which would be far less efficient than simply adding all elements and selecting a subset afterwards.

Any help much appreciated.

Answer:

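You can also use the standard library (note that Test[pos] expects zero-based indices):

// Default-construct the strings; a fill value of 0 would build each std::string from a null pointer.
std::vector<std::string> Result(SelectionV.size());

// Capture Test by reference so the lambda does not copy the whole vector.
std::transform(SelectionV.begin(), SelectionV.end(), Result.begin(), [&Test](std::size_t pos) { return Test[pos]; });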
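
For anyone who wants to try the std::transform approach in isolation, here is a minimal self-contained sketch. The function name select_by_index is made up for illustration, and the sketch assumes zero-based indices that are all smaller than source.size(); it does no bounds checking.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the answer above): copies source[i] for each
// i in indices, in the order the indices are given.
// Assumption: indices are zero-based and every one is < source.size().
template <typename T>
std::vector<T> select_by_index(const std::vector<T>& source,
                               const std::vector<std::size_t>& indices)
{
    // Pre-size the output; this requires T to be default-constructible.
    std::vector<T> result(indices.size());
    // The lambda captures source by reference, so nothing is copied
    // except the selected elements themselves.
    std::transform(indices.begin(), indices.end(), result.begin(),
                   [&source](std::size_t pos) { return source[pos]; });
    return result;
}
```

With Test = { "A", "B", "C", "D", "E" }, calling select_by_index(Test, {0, 1, 4}) would yield { "A", "B", "E" }; the question's 1-based indices { 1, 2, 5 } would need 1 subtracted from each first.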

Answer:

You could sort your SelectionV vector in increasing order, and then rewrite your for loop as something like:

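// "source" stands for the third-party object; SelectionV must be sorted, zero-based and free of duplicates.
int index = 0;
std::size_t nextInSelectionV = 0;
for (auto it = source.begin(); nextInSelectionV < SelectionV.size() && it != source.end(); ++it) {
    if (index == SelectionV[nextInSelectionV]) {
        Test.push_back(*it);
        nextInSelectionV++;
    }
    index++;
}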

Answer:

  • It depends on how large Test is, how large SelectionV is relative to Test, and whether indices in SelectionV repeat. If SelectionV covers most of Test, it may be cheaper to work with its complement (the indices that are not selected) instead.
  • Note that in your example SelectionV holds indices rather than values, so each lookup in Test is already O(1), which is a big advantage.
  • If Test and SelectionV do not change and both are large, you can split SelectionV across n threads, have each thread look up its share of values in Test independently, and combine the partial outputs afterwards (not unlike map-reduce). The drawback may be fewer CPU cache hits.
  • On repeated calls, you may want to compute the difference between the old and the new SelectionV and apply only that difference to the previous result. This kind of caching works well when few indices change between calls.

Most importantly, make sure you really need to optimize this before you spend time on it (and, worse, complicate your code).

It is very likely that other parts of your app (e.g. I/O) are orders of magnitude slower.

Answer:

Perhaps the following could be useful for someone in the future:

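#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns a copy of the element at a single index.
template<typename T>
T vector_select(const std::vector<T>& vector, const std::size_t index)
{
  assert(index < vector.size());
  return vector[index];
}

// Function object that looks up indices in a fixed vector.
template<typename T>
class VectorSelector
{
public:
  explicit VectorSelector(const std::vector<T>& v) : _v(&v) { }
  T operator()(const std::size_t index) const { return vector_select(*_v, index); }
private:
  const std::vector<T>* _v;
};

// Returns the elements at all given indices, in the order the indices appear.
template<typename T>
std::vector<T> vector_select(const std::vector<T>& vector,
                             const std::vector<std::size_t>& index)
{
  // max_element returns end() for an empty range, so only dereference it when index is non-empty.
  assert(index.empty() ||
         *std::max_element(index.begin(), index.end()) < vector.size());
  std::vector<T> out(index.size());
  std::transform(index.begin(), index.end(), out.begin(),
                 VectorSelector<T>(vector));
  return out;
}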