lucene-leveldb


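A Lucene index is usually stored on the file system (FSDirectory) or in memory (RAMDirectory). Lucene also exposes the Directory abstraction, so by implementing it you can store the index somewhere else (MySQL, MongoDB, LevelDB, ...); even HDFS is possible. lucene-leveldb stores the index in LevelDB. Its performance is close to that of RAMDirectory. The code is not very complex and is only meant to show the approach; using it in a real environment would still need a good deal of optimization.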
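To make the idea concrete, here is a minimal sketch of how file-style operations can be layered over a sorted key-value store. It is not the lucene-leveldb implementation: the class `KeyValueFileStore`, its key layout (`meta/<name>`, `data/<name>/<block>`) and the block size are all hypothetical, and a `TreeMap` stands in for LevelDB (both keep keys sorted). The methods loosely mirror what Lucene's `Directory` asks of an implementation (`listAll`, `fileLength`, `deleteFile`, plus writing and reading a file's bytes).

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: index "files" stored as fixed-size blocks in a sorted key-value map. */
final class KeyValueFileStore {
    private static final String META = "meta/"; // meta/<name>          -> file length
    private static final String DATA = "data/"; // data/<name>/<block#> -> block bytes

    // Assumption: a TreeMap stands in for LevelDB, which also iterates keys in sorted order.
    private final TreeMap<String, byte[]> kv = new TreeMap<>();
    private final int blockSize;

    KeyValueFileStore(int blockSize) {
        if (blockSize <= 0) throw new IllegalArgumentException("blockSize must be positive");
        this.blockSize = blockSize;
    }

    /** Stores a whole file, split into blocks; replaces any earlier version. */
    void writeFile(String name, byte[] content) {
        deleteFile(name); // drop stale blocks left by a previous, longer version
        int block = 0;
        for (int off = 0; off < content.length; off += blockSize) {
            int end = Math.min(off + blockSize, content.length);
            kv.put(blockKey(name, block++), Arrays.copyOfRange(content, off, end));
        }
        kv.put(META + name, Long.toString(content.length).getBytes(StandardCharsets.UTF_8));
    }

    /** Reassembles a file by scanning its blocks in key order. */
    byte[] readFile(String name) {
        long length = fileLength(name); // also checks that the file exists
        ByteArrayOutputStream out = new ByteArrayOutputStream((int) length);
        String prefix = DATA + name + "/";
        for (Map.Entry<String, byte[]> e : kv.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) break; // left this file's key range
            out.write(e.getValue(), 0, e.getValue().length);
        }
        return out.toByteArray();
    }

    long fileLength(String name) {
        byte[] raw = kv.get(META + name);
        if (raw == null) throw new IllegalArgumentException("no such file: " + name);
        return Long.parseLong(new String(raw, StandardCharsets.UTF_8));
    }

    /** Returns all file names in sorted order. */
    String[] listAll() {
        List<String> names = new ArrayList<>();
        for (String key : kv.tailMap(META, true).keySet()) {
            if (!key.startsWith(META)) break;
            names.add(key.substring(META.length()));
        }
        return names.toArray(new String[0]);
    }

    /** Removes a file's blocks and metadata; a no-op if the file does not exist. */
    void deleteFile(String name) {
        String prefix = DATA + name + "/";
        kv.subMap(prefix, prefix + '\uffff').clear(); // every key that starts with prefix
        kv.remove(META + name);
    }

    // Zero-padded block numbers make lexicographic key order match block order.
    private static String blockKey(String name, int block) {
        return DATA + name + "/" + String.format("%010d", block);
    }

    public static void main(String[] args) {
        KeyValueFileStore store = new KeyValueFileStore(4);
        store.writeFile("_0.cfs", "hello lucene".getBytes(StandardCharsets.UTF_8));
        store.writeFile("segments_1", new byte[] {1, 2, 3});
        System.out.println(Arrays.toString(store.listAll()));
        System.out.println(new String(store.readFile("_0.cfs"), StandardCharsets.UTF_8));
    }
}
```

A real Directory implementation would additionally have to expose streaming `IndexOutput`/`IndexInput` objects with random-access reads, locking and atomic renames. Splitting files into blocks is one plausible way to serve such seeks without loading a whole file, which is why the sketch uses it.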

Environment / dependencies:

Usage:

        // Start from an empty LevelDB data directory.
        // TestUtils and RAMDirectoryTest are helper classes from the project's tests.
        Path path = Paths.get("db-data");
        File indexDir = path.toFile();
        if (indexDir.exists()) {
            TestUtils.deleteDir(indexDir);
        }

        // Open a LevelDB-backed Directory and an IndexWriter on top of it.
        Directory directory = new LeveldbDirectory(path);
        StandardAnalyzer analyzer = new StandardAnalyzer();
        IndexWriterConfig indexWriterConfig = new IndexWriterConfig(analyzer);
        IndexWriter writer = new IndexWriter(directory, indexWriterConfig);

        // Index the text files under /test-data-set and time the run.
        File resourceDir = new File(RAMDirectoryTest.class.getResource("/test-data-set").getPath());

        long startTime = System.currentTimeMillis();
        TestUtils.indexTextFile(writer, resourceDir);
        writer.close();
        System.out.println("Index speed time : " + (System.currentTimeMillis() - startTime));

        // Search the "content" field for "good" and time the query.
        DirectoryReader index = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(index);
        Query query = new QueryParser("content", analyzer).parse("good");
        TopScoreDocCollector collector = TopScoreDocCollector.create(100);

        startTime = System.currentTimeMillis();
        searcher.search(query, collector);
        System.out.println("Search speed time : " + (System.currentTimeMillis() - startTime));
        ScoreDoc[] hits = collector.topDocs().scoreDocs;

        System.out.println("Found " + hits.length + " hits.");
        for (int i = 0; i < hits.length; ++i) {
            // Load each hit's stored fields; uncomment the line below to print them.
            int docId = hits[i].doc;
            Document d = searcher.doc(docId);
            //System.out.println((i + 1) + ". " + d.get("fileName") + " score=" + hits[i].score);
        }

        // Release the reader before closing the underlying Directory.
        index.close();
        directory.close();