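Full-text search is increasingly common and has become close to standard equipment for internet applications. Product search, log analysis, historical data archiving and similar scenarios all involve large volumes of data. For full-text search, the widely used options are Lucene, Solr, and Elasticsearch. Both ES and Solr are built on Lucene, but ES has a lower learning curve than Solr, because its RESTful API is simple and quick to use, which suits internet application development particularly well.
Below, using a practical example, we operate on ES data through the Java API.
The stack is Spring Boot + Spring-data-elasticsearch + elasticsearch.
We use ElasticsearchRepository to connect to and maintain the ES data. ElasticsearchRepository provides a simple set of methods for working with index data. It extends ElasticsearchCrudRepository and covers CRUD, sorting, paging, and other common basic operations.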
@NoRepositoryBean
public interface ElasticsearchRepository<T, ID extends Serializable> extends ElasticsearchCrudRepository<T, ID> {

    <S extends T> S index(S var1);

    Iterable<T> search(QueryBuilder var1);

    Page<T> search(QueryBuilder var1, Pageable var2);

    Page<T> search(SearchQuery var1);

    Page<T> searchSimilar(T var1, String[] var2, Pageable var3);

    void refresh();

    Class<T> getEntityClass();
}
Start with the basic pom configuration:
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.esp.index.data</groupId>
    <artifactId>esp-cube</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>1.5.2.RELEASE</version>
        <relativePath /> <!-- lookup parent from repository -->
    </parent>

    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
        <java.version>1.7</java.version>
    </properties>

    <dependencies>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-jdbc</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.apache.tomcat</groupId>
                    <artifactId>tomcat-jdbc</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-web</artifactId>
            <exclusions>
                <exclusion>
                    <artifactId>log4j-over-slf4j</artifactId>
                    <groupId>org.slf4j</groupId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter</artifactId>
            <exclusions>
                <exclusion>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-starter-logging</artifactId>
                </exclusion>
            </exclusions>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-test</artifactId>
            <scope>test</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-log4j</artifactId>
            <version>1.3.1.RELEASE</version>
        </dependency>
    </dependencies>

    <build>
        <finalName>esp-cube</finalName>
        <plugins>
            <plugin>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-maven-plugin</artifactId>
            </plugin>
        </plugins>
    </build>
</project>
Write your own Repository class:
public interface ArticleSearchRepository extends ElasticsearchRepository<Article, Long> {

    List<Article> findByAbstractsAndContent(String abstracts, String content);
}
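Here Article is the entity class mapped to Elasticsearch, similar in concept to a PO (persistent object). It specifies the index name, the type name, and the number of shards and replicas.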
@Document(indexName = "article_index", type = "article", shards = 5, replicas = 1, indexStoreType = "fs", refreshInterval = "-1")
public class Article implements Serializable {

    /**
     * serialVersionUID:
     *
     * @since JDK 1.6
     */
    private static final long serialVersionUID = 1L;

    /** Document identifier (org.springframework.data.annotation.Id) */
    @Id
    private Long id;

    /** Title */
    private String title;

    /** Abstract */
    private String abstracts;

    /** Content */
    private String content;

    /** Publication time */
    @Field(format = DateFormat.date_time, index = FieldIndex.no, store = true, type = FieldType.Object)
    private Date postTime;

    /** Click count */
    private Long clickCount;

    // Getters and setters omitted
}
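We need to define the domain entity and a Spring Data repository class that provides basic CRUD support. Use the @Id annotation to mark the identifier field; if you do not specify an ID field, Elasticsearch cannot index your documents. The index name and type must also be specified, and the @Document annotation additionally lets us set the number of shards and replicas.
Service interface: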
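The method findByAbstractsAndContent in ArticleSearchRepository has no implementation of its own: Spring Data derives the query from the method name. The following is only a simplified, library-free illustration of that naming convention, not Spring Data's actual parser. The class name DerivedQueryNameSketch and the helper propertyNames are hypothetical, and only the "findBy...And..." form is handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: shows which entity properties a derived query name refers to. */
public class DerivedQueryNameSketch {

    /** Splits a name such as findByAbstractsAndContent into its property names. */
    public static List<String> propertyNames(String methodName) {
        String prefix = "findBy";
        if (!methodName.startsWith(prefix) || methodName.length() == prefix.length()) {
            throw new IllegalArgumentException("Only findBy<Property>[And<Property>] names are handled here");
        }
        String criteria = methodName.substring(prefix.length());
        List<String> properties = new ArrayList<>();
        // Split on "And" only when it is followed by an upper-case letter,
        // so a property that merely contains "and" is left intact.
        for (String part : criteria.split("And(?=[A-Z])")) {
            properties.add(Character.toLowerCase(part.charAt(0)) + part.substring(1));
        }
        return properties;
    }

    public static void main(String[] args) {
        // The two names correspond to the abstracts and content fields of Article.
        System.out.println(propertyNames("findByAbstractsAndContent"));
    }
}
```

Each extracted name has to match a field of Article, and the method parameters are bound to those fields in the same order.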
public interface ArticleService {

    /**
     * saveArticle: write a document<br/>
     *
     * @author guooo Date: 2017-09-27 15:20:06
     * @param article
     * @return
     * @since JDK 1.6
     */
    long saveArticle(Article article);

    /**
     * deleteArticle: delete; the document is not physically removed, it just can no longer be queried<br/>
     *
     * @author guooo Date: 2017-09-27 15:20:08
     * @param id
     * @since JDK 1.6
     */
    void deleteArticle(long id);

    /**
     * findArticle: <br/>
     *
     * @author guooo Date: 2017-09-27 15:20:10
     * @param id
     * @return
     * @since JDK 1.6
     */
    Article findArticle(long id);

    /**
     * findArticlePageable: <br/>
     *
     * @author guooo Date: 2017-09-27 15:20:13
     * @return
     * @since JDK 1.6
     */
    List<Article> findArticlePageable();

    /**
     * findArticleAll: <br/>
     *
     * @author guooo Date: 2017-09-27 15:20:15
     * @return
     * @since JDK 1.6
     */
    List<Article> findArticleAll();

    /**
     * findArticleSort: <br/>
     *
     * @author guooo Date: 2017-09-27 15:20:18
     * @return
     * @since JDK 1.6
     */
    List<Article> findArticleSort();

    /**
     * search: <br/>
     *
     * @author guooo Date: 2017-09-27 15:20:22
     * @param content
     * @return
     * @since JDK 1.6
     */
    List<Article> search(String content);

    /**
     * update: ES has no in-place modify operation, so the update is done through save<br/>
     *
     * @author guooo Date: 2017-09-27 15:20:25
     * @param id
     * @return
     * @since JDK 1.6
     */
    long update(long id);
}
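The comment on update notes that ES has no in-place modification and that an update goes through save. A toy, in-memory model of that behaviour is sketched below, assuming only the general ES rule that indexing a document under an existing id replaces the whole document and increments its version. The class ReindexOnSaveSketch and its members are hypothetical and do not talk to Elasticsearch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of "update = read, modify, save again". */
public class ReindexOnSaveSketch {

    /** A stored document (reduced to its title) together with a version counter. */
    public static final class Stored {
        public final String title;
        public final long version;

        Stored(String title, long version) {
            this.title = title;
            this.version = version;
        }
    }

    private final Map<Long, Stored> index = new HashMap<>();

    /** Saving under an existing id replaces the whole document and bumps the version. */
    public Stored save(long id, String title) {
        Stored previous = index.get(id);
        long nextVersion = (previous == null) ? 1 : previous.version + 1;
        Stored stored = new Stored(title, nextVersion);
        index.put(id, stored);
        return stored;
    }

    public Stored findOne(long id) {
        return index.get(id);
    }

    /** Mirrors the service-level update: look the document up, then save it again. */
    public Stored update(long id, String newTitle) {
        if (findOne(id) == null) {
            throw new IllegalArgumentException("No document with id " + id);
        }
        return save(id, newTitle);
    }

    public static void main(String[] args) {
        ReindexOnSaveSketch store = new ReindexOnSaveSketch();
        store.save(1L, "draft");
        Stored updated = store.update(1L, "test");
        System.out.println(updated.title + " v" + updated.version);
    }
}
```

The point of the model is that the caller always sends the complete document; there is no call that changes one field of the stored copy.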
Service implementation:
@Service
public class ArticleServiceImpl implements ArticleService {

    final int page = 0;
    final int size = 10;

    /* Search mode */
    String SCORE_MODE_SUM = "sum"; // sum the weighted scores
    Float MIN_SCORE = 10.0F; // documents with no relevance score 1 by default, so the minimum weighted score is set to 10

    Pageable pageable = new PageRequest(page, size);

    @Autowired
    ArticleSearchRepository repository;

    @Override
    public long saveArticle(Article article) {
        Article result = repository.save(article);
        return result.getId();
    }

    @Override
    public void deleteArticle(long id) {
        repository.delete(id);
    }

    @Override
    public Article findArticle(long id) {
        return repository.findOne(id);
    }

    @Override
    public List<Article> findArticlePageable() {
        return repository.findAll(pageable).getContent();
    }

    @Override
    public List<Article> findArticleAll() {
        Iterable<Article> iterables = repository.findAll();
        List<Article> articles = new ArrayList<>();
        for (Article article : iterables) {
            articles.add(article);
        }
        return articles;
    }

    @Override
    public List<Article> findArticleSort() {
        // Sort ascending by click count
        List<Order> orders = new ArrayList<>();
        Order order = new Order(Direction.ASC, "clickCount");
        orders.add(order);
        Sort sort = new Sort(orders);
        Iterable<Article> iterables = repository.findAll(sort);
        List<Article> articles = new ArrayList<>();
        for (Article article : iterables) {
            articles.add(article);
        }
        return articles;
    }

    @Override
    public List<Article> search(String content) {
        // The same text is matched against both the abstract and the content
        return repository.findByAbstractsAndContent(content, content);
    }

    @Override
    public long update(long id) {
        Article article = repository.findOne(id);
        if (article == null) {
            // findOne returns null when no document has this id
            throw new IllegalArgumentException("No article with id " + id);
        }
        article.setTitle("test");
        // Saving under the same id replaces the stored document
        Article updated = repository.save(article);
        System.out.println("Updated document id: " + updated.getId());
        return updated.getId();
    }
}
Doesn't this look a lot like working with data through JPA or Hibernate?
Controller class:
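The paged lookup in findArticlePageable uses new PageRequest(page, size) with page = 0 and size = 10, where the page index is zero-based. As a rough illustration of the window such a request selects, here is a library-free sketch; PageWindowSketch and its page helper are hypothetical names, and the offset rule page * size is the only thing being demonstrated.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of the slice selected by a zero-based page index and a page size. */
public class PageWindowSketch {

    /** Returns the elements of the requested page, or an empty list past the end. */
    public static <T> List<T> page(List<T> all, int page, int size) {
        if (page < 0 || size < 1) {
            throw new IllegalArgumentException("page must be >= 0 and size must be >= 1");
        }
        long from = (long) page * size; // offset of the first element on this page
        if (from >= all.size()) {
            return Collections.emptyList();
        }
        int to = (int) Math.min(from + size, all.size());
        return new ArrayList<>(all.subList((int) from, to));
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            ids.add(i);
        }
        System.out.println(page(ids, 0, 10)); // first ten elements
        System.out.println(page(ids, 2, 10)); // last, partially filled page
    }
}
```

With page = 0 and size = 10, as in the service above, only the first ten documents come back, so a real endpoint would usually take both values from the request.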
@RestController
@RequestMapping(value = "/article")
public class APIArticleController {

    @Autowired
    ArticleService articleService;

    @RequestMapping(value = "save", method = RequestMethod.POST)
    public long save() {
        // Index 2000 generated test articles
        for (int i = 10000; i < 12000; i++) {
            Article article = new Article();
            article.setClickCount(Long.valueOf(i + RandomUtils.nextInt(23, i)));
            article.setAbstracts("My test " + i);
            article.setContent(i + " This is the content of the first test @spring-data-elasticsearch");
            article.setPostTime(new Date());
            // The loop counter doubles as the document id
            article.setId(Long.valueOf(i));
            long _id = articleService.saveArticle(article);
            System.out.println(_id);
        }
        return 23;
    }

    @RequestMapping(value = "delete", method = RequestMethod.POST)
    public void deleteArticle(long id) {
        articleService.deleteArticle(id);
    }

    @RequestMapping(value = "findOne", method = RequestMethod.POST)
    public Article findArticle(long id) {
        return articleService.findArticle(id);
    }

    @RequestMapping(value = "findArticlePageable", method = RequestMethod.POST)
    public List<Article> findArticlePageable() {
        return articleService.findArticlePageable();
    }

    @RequestMapping(value = "findArticleAll", method = RequestMethod.POST)
    public List<Article> findArticleAll() {
        return articleService.findArticleAll();
    }

    @RequestMapping(value = "findArticleSort", method = RequestMethod.POST)
    public List<Article> findArticleSort() {
        return articleService.findArticleSort();
    }

    @RequestMapping(value = "search", method = RequestMethod.POST)
    public List<Article> search(String content) {
        return articleService.search(content);
    }

    @RequestMapping(value = "update", method = RequestMethod.POST)
    public long update(long id) {
        return articleService.update(id);
    }
}
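The Spring Boot startup class and configuration are omitted here. Once the project is running, you can perform CRUD operations on the Article index data through the methods exposed by the controller.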
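To try the endpoints without a separate HTTP tool, a small client built only on the JDK could look like the sketch below. It assumes the application is running locally on Spring Boot's default port 8080 and that the controller parameters (id, content) are sent as form fields. The class ArticleApiClientSketch and its helpers are hypothetical, and the id 10001 is just one of the ids generated by the save endpoint.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Scanner;

/** Hypothetical client for the /article endpoints. */
public class ArticleApiClientSketch {

    // Assumption: local instance on the default port, controller mapped under /article
    static final String BASE_URL = "http://localhost:8080/article/";

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always available", e);
        }
    }

    /** Builds an application/x-www-form-urlencoded body from the given parameters. */
    public static String formEncode(Map<String, String> params) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : params.entrySet()) {
            if (body.length() > 0) {
                body.append('&');
            }
            body.append(encode(entry.getKey())).append('=').append(encode(entry.getValue()));
        }
        return body.toString();
    }

    /** Sends a POST to BASE_URL + path and returns the response body as text. */
    public static String post(String path, Map<String, String> params) throws IOException {
        byte[] body = formEncode(params).getBytes("UTF-8");
        HttpURLConnection conn = (HttpURLConnection) new URL(BASE_URL + path).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(body);
        }
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        // Populate the index first, then fetch one of the generated articles
        post("save", new LinkedHashMap<String, String>());
        Map<String, String> params = new LinkedHashMap<>();
        params.put("id", "10001");
        System.out.println(post("findOne", params));
    }
}
```

All endpoints in the controller are mapped to POST, which is why even the read operations are sent as POST requests here.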