Elasticsearch Basics (8): Common Queries and Searching with the Java API Client
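1. Search requirements
The requirements are modeled on the book list page of Douban Read (豆瓣阅读).
Requirements:
- The search term is matched against the title, author, and summary fields, and matches are highlighted in red.
- Results can be sorted by relevance (the default), popularity, most recently updated, best-selling, or best-rated.
- Each page holds 10 results, and the response also reports the total number of hits.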
2. Setting up the test environment
2.1 Defining the Elasticsearch fields
mapping.json
    {
      "mappings": {
        "properties": {
          "title":       { "analyzer": "standard", "type": "text" },
          "author": {
            "analyzer": "standard",
            "type": "text",
            "fields": { "keyword": { "type": "keyword" } }
          },
          "contentDesc": { "analyzer": "standard", "type": "text" },
          "wordCount":   { "type": "double" },
          "price":       { "type": "double" },
          "cover":       { "type": "keyword" },
          "heatCount":   { "type": "integer" },
          "updateTime":  { "type": "date" }
        }
      }
    }
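Field notes:
- id (long): the unique identifier. It is not declared in mapping.json above; the requests below supply it as the document _id instead.
- title (text): the document title, analyzed with the standard analyzer.
- author (text): the document author, also analyzed with the standard analyzer. It additionally defines a keyword sub-field (author.keyword), which is used for exact matching and aggregations.
- contentDesc (text): the content summary, analyzed with the standard analyzer.
- wordCount (double): the word count. It is mapped as double here, although the values are whole numbers.
- price (double): the price of the book, stored as a floating-point value.
- cover (keyword): the cover image URL. Keyword fields are not analyzed and support exact matching only.
- heatCount (integer): the popularity counter, used for sorting by popularity.
- updateTime (date): the last update time, used for sorting by most recently updated.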
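2.2 Creating the index and mapping
Create the index by sending the contents of mapping.json as the body of a PUT /douban request.
2.3 Adding test data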
    POST /douban/_doc/1001
    {
      "title": "诗云",
      "author": "刘慈欣",
      "contentDesc": "伊依一行三人乘坐一艘游艇在南太平洋上做吟诗航行,平时难得一见的美洲大陆清晰地显示在天空中,在东半球构成的覆盖世界的巨大穹顶上,大陆好像是墙皮脱落的区域…",
      "wordCount": 18707,
      "price": 6.99,
      "cover": "https://pic.arkread.com/cover/ebook/f/19534800.1653698501.jpg!cover_default.jpg",
      "heatCount": 201,
      "updateTime": "2023-12-20"
    }

    POST /douban/_doc/1002
    {
      "title": "三体2·黑暗森林",
      "author": "刘慈欣",
      "contentDesc": "征服世界的中国科幻神作!包揽九项世界顶级科幻大奖!《三体》获得第73届“雨果奖”最佳长篇奖!",
      "wordCount": 318901,
      "price": 32.00,
      "cover": "https://pic.arkread.com/cover/ebook/f/110344476.1653700299.jpg!cover_default.jpg",
      "heatCount": 545,
      "updateTime": "2023-12-25"
    }

    POST /douban/_doc/1003
    {
      "title": "三体前传:球状闪电",
      "author": "刘慈欣",
      "contentDesc": "征服世界的中国科幻神作!包揽九项世界顶级科幻大奖!《三体》获得第73届“雨果奖”最佳长篇奖!",
      "wordCount": 181119,
      "price": 35.00,
      "cover": "https://pic.arkread.com/cover/ebook/f/116984494.1653699856.jpg!cover_default.jpg",
      "heatCount": 765,
      "updateTime": "2022-11-12"
    }

    POST /douban/_doc/1004
    {
      "title": "全频带阻塞干扰",
      "author": "刘慈欣",
      "contentDesc": "这是一个场面浩大而惨烈的故事。21世纪的某年,以美国为首的北约发起了对俄罗斯的全面攻击。在残酷的保卫战中,俄国的电子战设备无力抵挡美国的进攻",
      "wordCount": 28382,
      "price": 6.99,
      "cover": "https://pic.arkread.com/cover/ebook/f/19532617.1653698557.jpg!cover_default.jpg",
      "heatCount": 153,
      "updateTime": "2021-03-23"
    }
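3. Running queries
3.1 Lookup by primary key
    # Get API: fetches one document directly by its _id
    GET /douban/_doc/1001

    # Alternative: match on _id through the search API
    POST /douban/_search
    {
      "query": {
        "match": { "_id": 1001 }
      }
    }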
3.2 Match all documents
    POST /douban/_search
    {
      "query": {
        "match_all": {}
      }
    }
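3.3 Pagination
from is the zero-based offset of the first hit to return and size is the page size, so this request skips the first hit and returns the next two.
    POST /douban/_search
    {
      "query": {
        "match_all": {}
      },
      "from": 1,
      "size": 2
    }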
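Because from is an offset rather than a page number, a caller that works with 1-based pages has to convert. The following is a minimal sketch of that conversion; the class and method names are hypothetical and not part of any Elasticsearch API.

```java
/** Hypothetical helper: converts a 1-based page number into the zero-based "from" offset. */
final class PageParams {
    private PageParams() {
    }

    /**
     * @param page 1-based page number
     * @param size number of hits per page
     * @return the value to send as "from" in the search request
     */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        // multiplyExact fails loudly instead of silently overflowing on absurd page numbers
        return Math.multiplyExact(page - 1, size);
    }

    public static void main(String[] args) {
        System.out.println(from(1, 10)); // first page of 10 starts at offset 0
        System.out.println(from(3, 10)); // third page of 10 starts at offset 20
    }
}
```

By default Elasticsearch rejects requests where from + size exceeds 10,000 (the index.max_result_window setting), so deep pages need a different mechanism such as search_after.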
3.4 Sorting
    POST /douban/_search
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        { "price": { "order": "desc" } }
      ]
    }
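3.5 Full-text search
    POST /douban/_search
    {
      "query": {
        "match": { "title": "三体球闪" }
      }
    }
Search results: the standard analyzer splits Chinese text into single characters, and match combines the resulting terms with OR by default. With the test data above, the query therefore matches documents 1002 (三体2·黑暗森林) and 1003 (三体前传:球状闪电), whose titles share at least one character with 三体球闪.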
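The matching behaviour of this query can be approximated in plain Java. The sketch below is a deliberately simplified model, not the real analyzer: it treats every letter or digit as its own token (which is how the standard analyzer handles Chinese characters, but not Latin words or multi-digit numbers) and ignores scoring entirely. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified model of a match query on text analyzed one character at a time. */
final class SingleCharMatchSketch {
    private SingleCharMatchSketch() {
    }

    /** One token per letter or digit; punctuation and whitespace are dropped. */
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>();
        text.codePoints()
                .filter(Character::isLetterOrDigit)
                .forEach(cp -> result.add(new String(Character.toChars(cp))));
        return result;
    }

    /** OR semantics: a title matches when it shares at least one token with the query. */
    static List<String> matching(String query, List<String> titles) {
        Set<String> queryTokens = tokens(query);
        List<String> hits = new ArrayList<>();
        for (String title : titles) {
            Set<String> shared = tokens(title);
            shared.retainAll(queryTokens);
            if (!shared.isEmpty()) {
                hits.add(title);
            }
        }
        return hits;
    }
}
```

Calling matching("三体球闪", ...) with the four test titles returns the two 三体 titles only, because 诗云 and 全频带阻塞干扰 share no character with the query.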
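3.6 Highlighting
pre_tags and post_tags hold the markup inserted before and after each matched term. They are empty in the request below, so fill in the HTML that renders matches in red; when both are omitted, Elasticsearch wraps matches in <em> tags.
    POST /douban/_search
    {
      "query": {
        "match": { "title": "三体球闪" }
      },
      "highlight": {
        "fields": {
          "title": {
            "pre_tags": [ "" ],
            "post_tags": [ "" ]
          }
        }
      }
    }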
3.7 bool query
Full-text search for 三体球闪 in the title, restricted to documents whose price is exactly 35.
    POST /douban/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "title": "三体球闪" } },
            { "term": { "price": 35 } }
          ]
        }
      }
    }
3.8 Multi-field full-text search
Match the search term against the title, author, and summary fields, and highlight matches in all three.
    POST /douban/_search
    {
      "query": {
        "multi_match": {
          "query": "三体球闪",
          "fields": [ "title", "author", "contentDesc" ]
        }
      },
      "highlight": {
        "fields": {
          "title": {},
          "author": {},
          "contentDesc": {}
        }
      }
    }
3.9 Combined search
Match the search term against the title, author, and summary fields and highlight matches in all three, as in the previous query.
In addition, paginate the results, sort by update date in descending order, and return only the fields the page needs.
    POST /douban/_search
    {
      "query": {
        "multi_match": {
          "query": "三体球闪",
          "fields": [ "title", "author", "contentDesc" ]
        }
      },
      "from": 0,
      "size": 2,
      "_source": [ "title", "author", "price", "wordCount" ],
      "sort": [
        { "updateTime": { "order": "desc" } }
      ],
      "highlight": {
        "fields": {
          "title": {},
          "author": {},
          "contentDesc": {}
        }
      }
    }
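The requirements list several sort orders, while this request hard-codes updateTime. One way to connect the two is a small lookup from the sort choice to the field name placed in the sort clause. The sketch below is an assumption about how that could look: the enum and its constant names are hypothetical, and it covers only the orders backed by fields in mapping.json. Best-selling and best-rated are left out because the mapping defines no sales or rating field.

```java
import java.util.Locale;

/** Hypothetical mapping from a sort choice to the field used in the "sort" clause. */
enum SortOption {
    COMPREHENSIVE("_score"),   // relevance, the default order
    HOTTEST("heatCount"),      // popularity
    LATEST("updateTime");      // most recently updated

    private final String field;

    SortOption(String field) {
        this.field = field;
    }

    String field() {
        return field;
    }

    /** Falls back to relevance ordering when the name is missing or unknown. */
    static SortOption parse(String name) {
        if (name == null) {
            return COMPREHENSIVE;
        }
        try {
            return valueOf(name.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return COMPREHENSIVE;
        }
    }
}
```

For example, SortOption.parse("latest").field() yields updateTime, the field sorted on in the request above.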
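4. Integrating Elasticsearch into a Spring project
Reference: Installation, Elasticsearch Java API Client 7.17 (Elastic documentation)
4.1 Creating the Spring project and adding the Elasticsearch dependencies
To build with Java 8, edit pom.xml to change the parent version and the java.version property, then reload the Maven project.
In Elasticsearch 7.15, Elastic deprecated its High Level REST Client (RestHighLevelClient) and introduced a new client, the Elasticsearch Java API Client, which is the officially recommended client from Elasticsearch 8.0 onward.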
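| API | Description |
| --- | --- |
| TransportClient (deprecated, removed in 8.x) | Connects over TCP; Java only; deprecated in 7.x and removed in 8.x. |
| Low Level REST Client | Low-level REST API with minimal dependencies. |
| High Level REST Client (deprecated, no removal date announced) | High-level REST API built on the low-level client; deprecated since 7.15 with no removal date given. The low-level client can be used in its place. |
| Java API Client | HTTP-based and recommended; built on the low-level client; available since 7.15. |

Dependencies:

    <dependency>
        <groupId>co.elastic.clients</groupId>
        <artifactId>elasticsearch-java</artifactId>
        <version>7.17.11</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.12.3</version>
    </dependency>
    <dependency>
        <groupId>jakarta.json</groupId>
        <artifactId>jakarta.json-api</artifactId>
        <version>2.0.1</version>
    </dependency>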
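The complete pom.xml follows. Note that the properties section must set <elasticsearch.version>7.17.11</elasticsearch.version>; otherwise the Elasticsearch version managed by the Spring Boot parent is not overridden.

    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <parent>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-parent</artifactId>
            <version>2.5.15</version>
        </parent>
        <groupId>com.zhouquan</groupId>
        <artifactId>client</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <name>client</name>
        <description>Demo project for Spring Boot</description>
        <properties>
            <java.version>8</java.version>
            <lombok.version>1.18.22</lombok.version>
            <elasticsearch.version>7.17.11</elasticsearch.version>
            <jakarta.version>2.0.1</jakarta.version>
        </properties>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-web</artifactId>
            </dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-test</artifactId>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>co.elastic.clients</groupId>
                <artifactId>elasticsearch-java</artifactId>
                <version>7.17.11</version>
            </dependency>
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-databind</artifactId>
                <version>2.12.3</version>
            </dependency>
            <dependency>
                <groupId>org.glassfish</groupId>
                <artifactId>jakarta.json</artifactId>
                <version>${jakarta.version}</version>
            </dependency>
            <dependency>
                <groupId>org.projectlombok</groupId>
                <artifactId>lombok</artifactId>
                <version>${lombok.version}</version>
            </dependency>
            <dependency>
                <groupId>commons-io</groupId>
                <artifactId>commons-io</artifactId>
                <version>2.11.0</version>
            </dependency>
        </dependencies>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-maven-plugin</artifactId>
                </plugin>
            </plugins>
        </build>
    </project>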
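4.2 Adding the Elasticsearch client configuration class
The client is registered as a Spring bean, so it can be injected wherever it is needed with @Resource private ElasticsearchClient client;. EsConfig, which supplies the node addresses, is not shown here.

    import co.elastic.clients.elasticsearch.ElasticsearchClient;
    import co.elastic.clients.json.jackson.JacksonJsonpMapper;
    import co.elastic.clients.transport.ElasticsearchTransport;
    import co.elastic.clients.transport.rest_client.RestClientTransport;
    import lombok.extern.slf4j.Slf4j;
    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    import javax.annotation.Resource;
    import java.util.Arrays;
    import java.util.List;

    @Configuration
    @Slf4j
    public class EsClient {

        @Resource
        private EsConfig esConfig;

        /**
         * Bean definition that creates the ElasticsearchClient instance.
         *
         * @return an ElasticsearchClient configured with a RestClient and a JSON transport
         */
        @Bean
        public ElasticsearchClient elasticsearchClient() {
            // Turn the configured cluster node addresses into HttpHost objects
            List<String> clusterNodes = esConfig.getClusterNodes();
            HttpHost[] httpHosts = clusterNodes.stream()
                    .map(HttpHost::create)
                    .toArray(HttpHost[]::new);

            // Create the low-level client
            RestClient restClient = RestClient.builder(httpHosts).build();

            // Transport with Jackson-based JSON serialization
            ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
            ElasticsearchClient client = new ElasticsearchClient(transport);

            // Log the nodes the client connects to
            log.info("Elasticsearch client node addresses: {}", Arrays.toString(httpHosts));
            return client;
        }
    }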
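The configuration class passes each configured node string to HttpHost.create, so a malformed entry only surfaces when the bean is built. The following is a minimal sketch of checking entries up front using only the JDK. It assumes the entries are full URLs such as http://127.0.0.1:9200; the article does not show the actual format, and the NodeAddress class is hypothetical.

```java
import java.net.URI;

/** Hypothetical value class for one cluster node entry written as a full URL. */
final class NodeAddress {
    static final int DEFAULT_PORT = 9200; // Elasticsearch's default HTTP port

    final String scheme;
    final String host;
    final int port;

    private NodeAddress(String scheme, String host, int port) {
        this.scheme = scheme;
        this.host = host;
        this.port = port;
    }

    /** Parses entries such as "http://127.0.0.1:9200"; rejects entries without scheme or host. */
    static NodeAddress parse(String node) {
        URI uri = URI.create(node.trim());
        if (uri.getScheme() == null || uri.getHost() == null) {
            throw new IllegalArgumentException("Expected scheme://host[:port] but got: " + node);
        }
        // URI reports -1 when the port is absent
        int port = uri.getPort() == -1 ? DEFAULT_PORT : uri.getPort();
        return new NodeAddress(uri.getScheme(), uri.getHost(), port);
    }

    @Override
    public String toString() {
        return scheme + "://" + host + ":" + port;
    }
}
```

An entry without a scheme, such as localhost:9200, is rejected by this sketch even though HttpHost.create would accept it; loosen the check if the configuration uses that shorter form.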
4.3 Creating an index with the Java API Client
Reference: Using the Java API Client

    /**
     * Creates the index from the mapping file on the classpath.
     */
    @Test
    void createIndex() throws IOException {
        ClassLoader classLoader = ResourceLoader.class.getClassLoader();
        // try-with-resources closes the mapping file once the request has been sent
        try (InputStream input = classLoader.getResourceAsStream("mapping/douban.json")) {
            CreateIndexRequest req = CreateIndexRequest.of(b -> b
                    .index("douban_v1")
                    .withJson(input)
            );
            boolean created = client.indices().create(req).acknowledged();
            log.info("Index created: " + created);
        }
    }
4.4 Saving documents
Entity class DouBan.java

    package com.zhouquan.client.entity;

    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;

    import java.util.Date;

    /**
     * @author ZhouQuan
     * @description todo
     * @date 2024-01-09 15:54
     **/
    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public class DouBan {
        private String id;
        private String title;
        private String author;
        private String contentDesc;
        private Integer wordCount;
        private Double price;
        private String cover;
        private Integer heatCount;
        private Date updateTime;
    }
4.4.1 Indexing a single document
The method below shows five ways to index a document. client, asyncClient, and indexName are fields of the enclosing class.

    public String indexSingleDoc() {
        DouBan douBan = new DouBan("1211", "河边的错误", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        try {
            // 1. Fluent DSL: build the request inline with a lambda
            IndexResponse indexResponse = client.index(i -> i
                    .index(indexName)
                    .id(douBan.getId())
                    .document(douBan));

            // 2. Static of() factory method of the Java API Client
            IndexRequest<DouBan> objectIndexRequest = IndexRequest.of(i -> i
                    .index(indexName)
                    .id(douBan.getId())
                    .document(douBan));
            IndexResponse ofIndexResponse = client.index(objectIndexRequest);

            // 3. Classic builder
            IndexRequest.Builder<DouBan> objectBuilder = new IndexRequest.Builder<>();
            objectBuilder.index(indexName);
            objectBuilder.id(douBan.getId());
            objectBuilder.document(douBan);
            IndexResponse classicIndexResponse = client.index(objectBuilder.build());

            // 4. Asynchronous indexing: the callback runs when the request completes
            asyncClient.index(i -> i
                    .index("douban")
                    .id(douBan.getId())
                    .document(douBan)
            ).whenComplete((response, exception) -> {
                if (exception != null) {
                    log.error("Failed to index", exception);
                } else {
                    log.info("Indexed with version " + response.version());
                }
            });

            // 5. Indexing raw JSON data
            String jsonData = " {\"id\":\"1741\",\"title\":\"三体\",\"author\":\"刘慈欣\",\"contentDesc\":\"内容简介\",\"wordCount\":50000,\"price\":52.5}";
            Reader input = new StringReader(jsonData);
            IndexRequest<JsonData> request = IndexRequest.of(i -> i
                    .index("douban_v1")
                    .withJson(input)
            );
            IndexResponse response = client.index(request);
            log.info("Indexed with version " + response.version());

            // Report whether the first request created a new document
            return String.valueOf(Result.Created.equals(indexResponse.result()));
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
4.4.2 Bulk indexing documents

    /**
     * Bulk save.
     *
     * @throws IOException if the request fails
     */
    @Test
    void bulkSave() throws IOException {
        DouBan douBan1 = new DouBan("1002", "题名1", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        DouBan douBan2 = new DouBan("1003", "题名2", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        DouBan douBan3 = new DouBan("1004", "题名3", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());

        List<DouBan> douBanList = new ArrayList<>();
        douBanList.add(douBan1);
        douBanList.add(douBan2);
        douBanList.add(douBan3);

        // Add one index operation per document to a single bulk request
        BulkRequest.Builder br = new BulkRequest.Builder();
        for (DouBan douBan : douBanList) {
            br.operations(op -> op
                    .index(idx -> idx
                            .index("douban_v1")
                            .id(douBan.getId())
                            .document(douBan)
                    )
            );
        }

        BulkResponse result = client.bulk(br.build());

        // A bulk request can partially fail, so inspect each item
        if (result.errors()) {
            log.error("Bulk had errors");
            for (BulkResponseItem item : result.items()) {
                if (item.error() != null) {
                    log.error(item.error().reason());
                }
            }
        }
    }
4.4.3 Bulk indexing raw data

    /**
     * Bulk save of raw JSON files.
     *
     * @throws IOException if a file cannot be read or the request fails
     */
    @Test
    void rawDataBulkSave() throws IOException {
        File logDir = new File("D:\\IdeaProjects\\client\\src\\main\\resources\\data");
        // Select files named bulk<anything>.json; matches() must match the whole file name
        File[] logFiles = logDir.listFiles(
                file -> file.getName().matches("bulk.*\\.json")
        );
        if (logFiles == null) {
            throw new IOException("Cannot list directory: " + logDir);
        }

        BulkRequest.Builder br = new BulkRequest.Builder();
        for (File file : logFiles) {
            // Read each file as an opaque JSON payload and close the stream afterwards
            try (FileInputStream input = new FileInputStream(file)) {
                BinaryData data = BinaryData.of(IOUtils.toByteArray(input), ContentType.APPLICATION_JSON);
                br.operations(op -> op
                        .index(idx -> idx
                                .index("douban_v1")
                                .document(data)
                        )
                );
            }
        }

        BulkResponse result = client.bulk(br.build());
        if (result.errors()) {
            List<BulkResponseItem> items = result.items();
            items.forEach(x -> System.out.println(x.error()));
        }
        log.info("Bulk save succeeded: " + !result.errors());
    }
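The file filter relies on String.matches, which succeeds only when the pattern covers the entire file name, so the variable middle part needs an explicit .* between the bulk prefix and the escaped .json suffix. Here is a minimal sketch of such a filter in isolation; the class name and the sample file names are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical stand-alone version of the bulk file-name filter. */
final class BulkFileFilterSketch {
    // "bulk", then any characters, then a literal ".json" at the end of the name
    private static final Pattern BULK_FILE = Pattern.compile("bulk.*\\.json");

    private BulkFileFilterSketch() {
    }

    /** matches() anchors the pattern at both ends, unlike find(). */
    static boolean isBulkFile(String fileName) {
        return BULK_FILE.matcher(fileName).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"bulk-001.json", "bulk.json", "mapping.json", "bulk-001.json.bak"};
        for (String name : samples) {
            System.out.println(name + " -> " + isBulkFile(name));
        }
    }
}
```

Precompiling the pattern avoids recompiling the regular expression for every file, which String.matches does on each call.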
4.5 Getting a single document

    // Fetch a document by id and map it to a Java object, using a prebuilt request
    GetRequest getRequest = GetRequest.of(x -> x.index("douban_v1").id("1002"));
    GetResponse<DouBan> douBanGetResponse = client.get(getRequest, DouBan.class);
    DouBan source = douBanGetResponse.source();

    // The same lookup with the fluent DSL
    GetResponse<DouBan> response = client.get(g -> g
                    .index(indexName)
                    .id(id),
            DouBan.class
    );
    if (!response.found()) {
        throw new BusinessException("No document found for the given id");
    }
    DouBan douBan = response.source();
    log.info("Document title: " + douBan.getTitle());
    return douBan;

    // Fetch a document by id as raw JSON
    GetResponse<ObjectNode> response1 = client.get(g -> g
                    .index(indexName)
                    .id(id),
            ObjectNode.class
    );
    if (response1.found()) {
        ObjectNode json = response1.source();
        String name = json.get("title").asText();
        log.info(" title " + name);
    } else {
        log.info("data not found");
    }
    return null;
4.6 Searching for documents
4.6.1 Simple search query

    public List<DouBan> search(String searchText) {
        SearchResponse<DouBan> response;
        try {
            response = client.search(s -> s
                            .index(indexName)
                            .query(q -> q
                                    .match(t -> t
                                            .field("title")
                                            .query(searchText)
                                    )
                            ),
                    DouBan.class
            );
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        // The total is exact only when the relation is Eq; otherwise it is a lower bound
        TotalHits total = response.hits().total();
        boolean isExactResult = total.relation() == TotalHitsRelation.Eq;
        if (isExactResult) {
            log.info("There are " + total.value() + " results");
        } else {
            log.info("There are more than " + total.value() + " results");
        }

        // Collect the mapped source object of every hit
        List<Hit<DouBan>> hits = response.hits().hits();
        List<DouBan> list = new ArrayList<>();
        for (Hit<DouBan> hit : hits) {
            DouBan douBan = hit.source();
            list.add(douBan);
            log.info("Found DouBan " + douBan.getTitle() + ", score " + hit.score());
        }
        return list;
    }
4.6.2 Nested search queries

    public List<DouBan> search2(String searchText, Double price) {
        // Full-text match on the title
        Query titleQuery = MatchQuery.of(m -> m
                .field("title")
                .query(searchText)
        )._toQuery();

        // price >= the given value
        Query rangeQuery = RangeQuery.of(r -> r
                .field("price")
                .gte(JsonData.of(price))
        )._toQuery();

        try {
            // Both conditions must hold
            SearchResponse<DouBan> search = client.search(s -> s
                            .index(indexName)
                            .query(q -> q
                                    .bool(b -> b
                                            .must(titleQuery)
                                            .must(rangeQuery)
                                    )
                            ),
                    DouBan.class
            );

            // Unpack the search results
            List<DouBan> douBanList = new ArrayList<>();
            List<Hit<DouBan>> hits = search.hits().hits();
            for (Hit<DouBan> hit : hits) {
                DouBan douBan = hit.source();
                douBanList.add(douBan);
            }
            return douBanList;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
4.6.3 Templated search

    // Create the template: a stored script that produces the search request body
    client.putScript(r -> r
            .id("query-script")
            .script(s -> s
                    .lang("mustache")
                    .source("{\"query\":{\"match\":{\"{{field}}\":\"{{value}}\"}}}")
            ));

    // Run the search with concrete parameter values
    SearchTemplateResponse<DouBan> response = client.searchTemplate(r -> r
                    .index("douban_v1")
                    .id("query-script")
                    .params("field", JsonData.of("title"))
                    .params("value", JsonData.of("题名")),
            DouBan.class
    );

    // Unpack the results
    List<Hit<DouBan>> hits = response.hits().hits();
    for (Hit<DouBan> hit : hits) {
        DouBan douBan = hit.source();
        log.info("Found DouBan " + douBan.getTitle() + ", score " + hit.score());
    }
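To see what the stored template produces for these parameters, the placeholder substitution can be imitated with plain string replacement. This is only an illustration of the idea: real Mustache rendering in Elasticsearch also handles escaping, sections, and defaults, none of which this sketch attempts. The class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative stand-in for Mustache rendering: replaces {{name}} placeholders only. */
final class TemplateRenderSketch {
    private TemplateRenderSketch() {
    }

    static String render(String source, Map<String, String> params) {
        String rendered = source;
        for (Map.Entry<String, String> param : params.entrySet()) {
            // No JSON escaping is applied here, so the values must be safe to embed as-is
            rendered = rendered.replace("{{" + param.getKey() + "}}", param.getValue());
        }
        return rendered;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("field", "title");
        params.put("value", "题名");
        String source = "{\"query\":{\"match\":{\"{{field}}\":\"{{value}}\"}}}";
        // Prints the match query on the title field that the template expands to
        System.out.println(render(source, params));
    }
}
```

With field set to title and value set to 题名, the template expands to a plain match query on title, equivalent to the request in section 3.5.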
4.7 Aggregations

    Query query = MatchQuery.of(t -> t
            .field("title")
            .query(searchText))._toQuery();

    // Aggregate on the keyword sub-field: a terms aggregation on the analyzed text field
    // "author" is rejected unless fielddata is enabled
    Aggregation authorAgg = AggregationBuilders.terms()
            .field("author.keyword")
            .build()._toAggregation();

    SearchResponse<DouBan> response = client.search(s -> s
                    .index(indexName)
                    .query(query)
                    .aggregations("author", authorAgg),
            DouBan.class
    );
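Conceptually, a terms aggregation groups the matching documents by the exact value of a field and counts each group, which is why it is run against the keyword sub-field author.keyword from mapping.json rather than the analyzed text. The sketch below reproduces that grouping in memory with Java streams. It is an analogy only: the real aggregation runs inside Elasticsearch, returns the ten largest buckets by default, and orders them by document count. The class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

/** In-memory analogy of a terms aggregation: one bucket per exact author value. */
final class TermsAggSketch {
    private TermsAggSketch() {
    }

    static Map<String, Long> countByAuthor(List<String> authors) {
        // TreeMap keeps the buckets sorted by key for stable output
        return authors.stream()
                .collect(Collectors.groupingBy(Function.identity(), TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> authors = Arrays.asList("刘慈欣", "余华", "刘慈欣", "刘慈欣");
        countByAuthor(authors).forEach((author, count) -> System.out.println(author + ": " + count));
    }
}
```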