
lucene - How to find delta between two SOLR collections

Question:

We are using LucidWorks Solr version 4.6.

Our source system stores data in two destination systems: one in real time and the other in batch mode. Data is ingested into Solr through the real-time route.

We need to periodically sync the data ingested into Solr with the data ingested into the batch system.

The design we are currently evaluating is to import the data from the batch system into another Solr collection, but we are not sure how to sync the two collections (i.e. the one with real-time data and the one populated through batch import).

I read about data import handlers, but they would overwrite the existing data in Solr. Is there any way to identify the delta between the two collections and ingest only that?

Answer:

There is no good way; there are a couple of things you can do:

  1. Add an import timestamp to each document as it comes into the real-time system, then run a range query on that timestamp to pull in the new documents. I think newer versions of Solr already have a field for this.
  2. Log the IDs of documents going into the first Solr collection, then index only those.
  3. Use a separate queue to feed the other collection.
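
The first two options can be sketched in Python as below. This is only a rough illustration, not a tested integration: the field name `import_ts` and the helpers `build_range_query` and `diff_ids` are hypothetical, and the sketch assumes the document IDs of both collections have already been exported into plain lists (for example from a log).

```python
from datetime import datetime, timezone


def build_range_query(field, since):
    """Build a Solr range query matching documents whose `field` is >= `since`.

    `field` is assumed to be a date field in your schema (hypothetical name).
    """
    # Solr expects UTC timestamps in the form YYYY-MM-DDThh:mm:ssZ.
    stamp = since.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{field}:[{stamp} TO *]"


def diff_ids(realtime_ids, batch_ids):
    """Return (missing_in_realtime, missing_in_batch) as sorted lists of IDs."""
    realtime, batch = set(realtime_ids), set(batch_ids)
    return sorted(batch - realtime), sorted(realtime - batch)


if __name__ == "__main__":
    last_sync = datetime(2014, 1, 1, tzinfo=timezone.utc)
    print(build_range_query("import_ts", last_sync))
    print(diff_ids(["a", "b"], ["b", "c"]))
```

The query string would be sent as the `q` or `fq` parameter of a normal select request against the real-time collection. The ID diff only reports documents present in one collection and absent from the other; it does not detect documents that exist in both but differ in content.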