How to use Lucene to query ElasticSearch index

Problem description:

Can I use Lucene to query an ElasticSearch index?

Using ElasticSearch I created an index and inserted these three documents:

$ curl -XPOST localhost:9200/index1/type1 -d '{"f1":"dog"}'

$ curl -XPOST localhost:9200/index1/type2 -d '{"f2":"cat"}'

$ curl -XPOST localhost:9200/index1/type2 -d '{"f3":"horse"}'

So I have one index, two types, and three documents. Now I would like to search these using standard Lucene. Using a hex editor, I identified which shard holds the indexed documents, and I can successfully query that shard's index. However, I can't figure out how to retrieve the field values from the matching document(s).

The following program successfully searches but is unable to retrieve results.

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TopScoreDocCollector;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

import java.io.File;

public class TestES {

    void doWork(String[] args) throws Exception {
        // Index reader for already created ElasticSearch index
        String indx1 = "/path-to-index/elasticsearch-0.90.0.RC2-SNAPSHOT/data/elasticsearch/nodes/0/indices/index1/1/index";
        Directory index = FSDirectory.open(new File(indx1));
        IndexReader reader = DirectoryReader.open(index);
        IndexSearcher searcher = new IndexSearcher(reader);

        // Looks like the query is correct since we do get a hit
        StandardAnalyzer analyzer = new StandardAnalyzer(Version.LUCENE_41);
        Query q = new QueryParser(Version.LUCENE_41, "f2", analyzer).parse("cat");
        TopScoreDocCollector collector = TopScoreDocCollector.create(10, true);
        searcher.search(q, collector);
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        // We do get a hit, but results always displayed as null except for "_uid"
        if (hits.length > 0) {
            int docId = hits[0].doc;
            Document d = searcher.doc(docId);
            System.out.println("DocID " + docId + ", _uid: " + d.get("_uid"));
            System.out.println("DocID " + docId + ", f2: " + d.get("f2"));
        }

        reader.close();
    }

    public static void main(String[] args) throws Exception {
        TestES hl = new TestES();
        hl.doWork(args);
    }
}

Results:

DocID 0, _uid: type2#3K5QXeZhQnit9UXM9_4bng

DocID 0, f2: null

The _uid value above is correct.

Eclipse shows me that the variable Document d does have two fields:

  • stored,indexed,tokenized,omitNorms<_uid:type2#3K5QXeZhQnit9UXM9_4bng>
  • stored<_source:[7b 22 66 32 22 3a 22 63 61 74 22 7d]>

Unfortunately, d.get("_source") also returns null.

How can I retrieve the document fields for a matching query?

Thank you.

Answer:

As stated in the comment, I needed to retrieve the "_source" field as a binary value. d.get() only returns string values, so it gives null for a field stored as bytes. This worked: d.getBinaryValue("_source"). It returned [7b 22 66 32 22 3a 22 63 61 74 22 7d], which is the UTF-8 encoding of {"f2":"cat"}. Javanna, thanks for helping.
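For anyone who wants the JSON text rather than the raw bytes, here is a minimal sketch of the decoding step using only the JDK. The class name SourceDecoder and the helper decodeUtf8 are made up for illustration. The sketch assumes the stored bytes are plain UTF-8 JSON, as they are in the hex dump from the question. In Lucene 4.x, getBinaryValue returns a BytesRef, and the (bytes, offset, length) parameters below are meant to line up with its public fields of the same names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Lucene or ElasticSearch.
class SourceDecoder {

    // Decodes a slice of a byte array as UTF-8 text. With a Lucene 4.x BytesRef
    // named ref, the call would presumably be:
    //   decodeUtf8(ref.bytes, ref.offset, ref.length)
    static String decodeUtf8(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The stored _source bytes shown in the question.
        byte[] source = {
            0x7b, 0x22, 0x66, 0x32, 0x22, 0x3a,
            0x22, 0x63, 0x61, 0x74, 0x22, 0x7d
        };
        System.out.println(decodeUtf8(source, 0, source.length)); // prints {"f2":"cat"}
    }
}
```

Honoring offset and length matters because a BytesRef can point into a larger shared buffer, so decoding the whole backing array may pick up unrelated bytes.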
