Elasticsearch Basics (Part 8): Common Queries and Searching with the Java API Client

07-14

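1. Search requirements

Modeled on the list page of Douban Read (豆瓣阅读).

Requirements:

  • The search term is matched against the title, author, and abstract fields, and matches are highlighted in red
  • Results can be sorted by relevance, most popular, most recently updated, best-selling, or best-rated
  • The page size is 10, and the total number of hits is returned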
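
The sort requirement can be tied to concrete index fields with a small lookup. The sketch below is only an illustration: `heatCount` and `updateTime` come from the mapping in section 2.1, `_score` is the built-in relevance score, and `salesCount` and `ratingCount` are hypothetical field names, since the mapping in this article has no sales or rating fields.

```java
import java.util.Locale;

/** Maps the sort options from the requirements to the field each one sorts on. */
enum SortOption {
    RELEVANCE("_score"),          // built-in relevance score
    HOTTEST("heatCount"),         // defined in mapping.json
    LATEST("updateTime"),         // defined in mapping.json
    BEST_SELLING("salesCount"),   // hypothetical field, not in mapping.json
    TOP_RATED("ratingCount");     // hypothetical field, not in mapping.json

    private final String field;

    SortOption(String field) {
        this.field = field;
    }

    String field() {
        return field;
    }

    /** Parses a request parameter; unknown or missing values fall back to relevance. */
    static SortOption parse(String raw) {
        if (raw == null) {
            return RELEVANCE;
        }
        try {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return RELEVANCE;
        }
    }
}
```

Falling back to relevance keeps an unexpected parameter value from turning into a request error.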

    2. Set up the test environment

    2.1 Define the ES fields from the requirements

    mapping.json

     {
      "mappings": {
        "properties": {
          "title": {
            "analyzer": "standard",
            "type": "text"
          },
          "author": {
            "analyzer": "standard",
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword"
              }
            }
          },
          "contentDesc": {
            "analyzer": "standard",
            "type": "text"
          },
          "wordCount": {
            "type": "double"
          },
          "price": {
            "type": "double"
          },
          "cover": {
            "type": "keyword"
          },
          "heatCount": {
            "type": "integer"
          },
          "updateTime": {
            "type": "date"
          }
        }
      }
    }
    

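    Field notes:

    1. id (long): the unique identifier. It is not declared in mapping.json above; the examples below use the document _id (1001, 1002, and so on) for this purpose.
    2. title (text): the book title, analyzed with the standard analyzer.
    3. author (text): the author, also analyzed with the standard analyzer. It carries an additional keyword sub-field (author.keyword), which is the one to use for exact matching and aggregations.
    4. contentDesc (text): the description or abstract, analyzed with the standard analyzer.
    5. wordCount (double): the word count. An integer type would also fit, since word counts are whole numbers.
    6. price (double): the price, stored as a floating-point value.
    7. cover (keyword): the cover image URL. Keyword fields are matched exactly and are not analyzed.
    8. heatCount (integer): the popularity counter, used for the "most popular" sort.
    9. updateTime (date): the last update time, used for the "most recently updated" sort.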

    2.2 Create the index and mapping

    Create the douban index with a PUT /douban request, sending mapping.json above as the request body.

    2.3 Add test data

     POST /douban/_doc/1001
     {
        "title":"诗云",
        "author":"刘慈欣",
        "contentDesc":"伊依一行三人乘坐一艘游艇在南太平洋上做吟诗航行,平时难得一见的美洲大陆清晰地显示在天空中,在东半球构成的覆盖世界的巨大穹顶上,大陆好像是墙皮脱落的区域…",
        "wordCount":18707,
        "price":6.99,
        "cover":"https://pic.arkread.com/cover/ebook/f/19534800.1653698501.jpg!cover_default.jpg",
        "heatCount":201,
        "updateTime":"2023-12-20"
     }
     
      POST /douban/_doc/1002
     {
        "title":"三体2·黑暗森林",
        "author":"刘慈欣",
        "contentDesc":"征服世界的中国科幻神作!包揽九项世界顶级科幻大奖!《三体》获得第73届“雨果奖”最佳长篇奖!",
        "wordCount":318901,
        "price":32.00,
        "cover":"https://pic.arkread.com/cover/ebook/f/110344476.1653700299.jpg!cover_default.jpg",
        "heatCount":545,
        "updateTime":"2023-12-25"
     }
     
      POST /douban/_doc/1003
     {
        "title":"三体前传:球状闪电",
        "author":"刘慈欣",
        "contentDesc":"征服世界的中国科幻神作!包揽九项世界顶级科幻大奖!《三体》获得第73届“雨果奖”最佳长篇奖!",
        "wordCount":181119,
        "price":35.00,
        "cover":"https://pic.arkread.com/cover/ebook/f/116984494.1653699856.jpg!cover_default.jpg",
        "heatCount":765,
        "updateTime":"2022-11-12"
     }
     
      POST /douban/_doc/1004
     {
        "title":"全频带阻塞干扰",
        "author":"刘慈欣",
        "contentDesc":"这是一个场面浩大而惨烈的故事。21世纪的某年,以美国为首的北约发起了对俄罗斯的全面攻击。在残酷的保卫战中,俄国的电子战设备无力抵挡美国的进攻",
        "wordCount":28382,
        "price":6.99,
        "cover":"https://pic.arkread.com/cover/ebook/f/19532617.1653698557.jpg!cover_default.jpg",
        "heatCount":153,
        "updateTime":"2021-03-23"
     }
    

    3. Run queries

    3.1 Query by primary key

    # Get API: fetch the document directly by its ID
    GET /douban/_doc/1001
    # Alternative: run a search against _id
    POST /douban/_search
    {
        "query": {
            "match": {
                "_id": 1001
            }
        }
    }
    

    3.2 Match-all query

    POST /douban/_search
    {
        "query": {
            "match_all": { 
            }
        }
    }
    

    3.3 Paginated query

    POST /douban/_search
    {
        "query": {
            "match_all": { 
            }
        },
        "from":1,
        "size":2
    }
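
Note that `from` is a zero-based offset into the hit list, not a page number, so the request above skips the first hit and returns the second and third. A minimal sketch of the usual conversion from a 1-based page number follows; the class and method names are illustrative and not part of any Elasticsearch API.

```java
/** Converts a 1-based page number into the zero-based "from" offset. */
final class PageOffset {
    private PageOffset() {
    }

    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        // Page 1 starts at offset 0, page 2 at offset size, and so on.
        return Math.multiplyExact(page - 1, size);
    }

    public static void main(String[] args) {
        System.out.println(from(1, 10)); // first page of 10
        System.out.println(from(3, 10)); // third page of 10
    }
}
```

With the default index settings, `from + size` may not exceed 10,000 (`index.max_result_window`), so deep pagination needs a different approach such as `search_after`.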
    

    3.4 Sorted query

    POST /douban/_search
    {
      "query": {
        "match_all": {
        }
      },
      "sort": [
        {
          "price": { 
            "order": "desc" 
          }
        }
      ]
    }
    

    3.5 Full-text search

    POST /douban/_search
    {
      "query": {
        "match": {
          "title":"三体球闪"
      }
     }
    }
    

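    Result: the standard analyzer splits Chinese text into single characters, and match combines the resulting terms with OR, so any title containing 三, 体, 球, or 闪 is returned. With the test data above that is 三体前传:球状闪电 (1003) and 三体2·黑暗森林 (1002).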
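
Why does a query for 三体球闪 find titles that never contain that exact string? The sketch below mimics, in a very simplified way, how the standard analyzer treats Chinese text: each Han character becomes its own token, and a `match` query with the default OR operator needs only one token in common. It ignores everything the real analyzer does with digits, Latin text, and punctuation, and it does no scoring; the class name is illustrative.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Simplified model of single-character tokenization for Han text. */
final class SingleCharMatch {
    private SingleCharMatch() {
    }

    /** Returns every ideographic character of the text as a one-character token. */
    static Set<String> hanTokens(String text) {
        Set<String> tokens = new LinkedHashSet<>();
        text.codePoints()
                .filter(Character::isIdeographic)
                .forEach(cp -> tokens.add(new String(Character.toChars(cp))));
        return tokens;
    }

    /** OR semantics: one shared token is enough for a match. */
    static boolean matches(String query, String title) {
        return !Collections.disjoint(hanTokens(query), hanTokens(title));
    }

    public static void main(String[] args) {
        String[] titles = {"诗云", "三体2·黑暗森林", "三体前传:球状闪电", "全频带阻塞干扰"};
        for (String title : titles) {
            System.out.println(title + " -> " + matches("三体球闪", title));
        }
    }
}
```

If single-character matching is too loose for a real catalogue, a Chinese-aware analyzer or `"operator": "and"` on the match query are the usual ways to tighten it.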

    3.6 Highlighted search

    POST /douban/_search
    {
        "query": {
            "match": {
                "title": "三体球闪"
            }
        },
        "highlight": {
            "fields": {
                "title": {
                    "pre_tags": [
                        ""
                    ],
                    "post_tags": [
                        ""
                    ]
                }
            }
        }
    }
    

    3.7 Bool query

    Find documents whose title matches "三体球闪" in a full-text search and whose price is exactly 35.

    POST /douban/_search
    {
        "query": {
            "bool": {
                "must": [
                    {
                        "match": {
                            "title": "三体球闪"
                        }
                    },
                    {
                        "term": {
                            "price": 35
                        }
                    }
                ]
            }
        }
    }
    

    3.8 Multi-field full-text search

    Run a full-text match against title, author, and abstract, and highlight matches in all three fields.

    POST /douban/_search
    {
        "query": {
            "multi_match": {
                "query": "三体球闪",
                "fields": [
                    "title",
                    "author",
                    "contentDesc"
                ]
            }
        },
        "highlight": {
            "fields": {
                "title": {},
                "author": {},
                "contentDesc": {}
            }
        }
    }
    

    3.9 Combined search

    Full-text match on title, author, and abstract with highlighting on all three fields, plus pagination, sorting by update date in descending order, and returning only the required fields.

    POST /douban/_search
    {
        "query": {
            "multi_match": {
                "query": "三体球闪",
                "fields": [
                    "title",
                    "author",
                    "contentDesc"
                ]
            }
        },
        "from": 0,
        "size": 2,
        "_source": [
            "title",
            "author",
            "price",
            "wordCount"
        ],
        "sort": [
            {
                "updateTime": {
                    "order": "desc"
                }
            }
        ],
        "highlight": {
            "fields": {
                "title": {
                },
                "author": {
                },
                "contentDesc": {
                }
            }
        }
    }
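
The request above is the one the list page needs, with only the keyword, page, and sort field varying. As a rough sketch, the body can be assembled from those three inputs using nothing but the JDK. The class and method names here are illustrative, and `sortField` is assumed to come from a fixed whitelist such as `updateTime` or `heatCount` rather than directly from user input.

```java
/** Builds the JSON body of the combined search for a given keyword, page, and sort field. */
final class SearchBodyBuilder {
    private SearchBodyBuilder() {
    }

    /** Escapes quotes, backslashes, and control characters for use inside a JSON string. */
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == '"') {
                sb.append("\\\"");
            } else if (c == '\\') {
                sb.append("\\\\");
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** page is 1-based; results are sorted by sortField in descending order. */
    static String build(String keyword, int page, int size, String sortField) {
        int from = (page - 1) * size;
        return "{"
                + "\"query\":{\"multi_match\":{\"query\":\"" + escape(keyword) + "\","
                + "\"fields\":[\"title\",\"author\",\"contentDesc\"]}},"
                + "\"from\":" + from + ",\"size\":" + size + ","
                + "\"_source\":[\"title\",\"author\",\"price\",\"wordCount\"],"
                + "\"sort\":[{\"" + escape(sortField) + "\":{\"order\":\"desc\"}}],"
                + "\"highlight\":{\"fields\":{\"title\":{},\"author\":{},\"contentDesc\":{}}}"
                + "}";
    }

    public static void main(String[] args) {
        System.out.println(build("三体球闪", 1, 2, "updateTime"));
    }
}
```

The resulting string can be posted to /douban/_search as is. In a real project a JSON library or the typed builders of the Java API Client are a safer choice than string concatenation; this version only makes the moving parts of the request visible.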
    

    4. Integrating Elasticsearch into a Spring project

    Reference: Installation | Elasticsearch Java API Client 7.17 | Elastic

    4.1 Create the Spring project and add the ES dependencies

    To use Java 8, open pom.xml, change the parent version and the java.version property, then reload the Maven project.

    As of Elasticsearch 7.15, Elastic has deprecated its high-level client, RestHighLevelClient, and introduced a new client, the Elasticsearch Java API Client, which is the officially recommended client from Elasticsearch 8.0 onward.

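    Client overview:

    • TransportClient (deprecated, removed in 8.x): TCP-based access, Java only. Deprecated in 7.x and removed in 8.x.
    • Low Level REST Client: low-level REST API with minimal dependencies.
    • High Level REST Client (deprecated, no removal date announced): high-level REST API built on the low-level client. Deprecated since 7.15 in favor of the Java API Client.
    • Java API Client: HTTP-based API and the recommended choice. Built on the low-level REST client and available since 7.15.

    Dependencies:

    <dependency>
        <groupId>co.elastic.clients</groupId>
        <artifactId>elasticsearch-java</artifactId>
        <version>7.17.11</version>
    </dependency>

    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.12.3</version>
    </dependency>

    <dependency>
        <groupId>jakarta.json</groupId>
        <artifactId>jakarta.json-api</artifactId>
        <version>2.0.1</version>
    </dependency>
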

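    The full pom.xml follows. Note: properties must set <elasticsearch.version>7.17.11</elasticsearch.version>; otherwise the Elasticsearch version managed by the Spring Boot parent is not overridden.

    <?xml version="1.0" encoding="UTF-8"?>
    <project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
        <modelVersion>4.0.0</modelVersion>
        <parent>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-starter-parent</artifactId>
            <version>2.5.15</version>
            <relativePath/>
        </parent>
        <groupId>com.zhouquan</groupId>
        <artifactId>client</artifactId>
        <version>0.0.1-SNAPSHOT</version>
        <name>client</name>
        <description>Demo project for Spring Boot</description>
        <properties>
            <java.version>8</java.version>
            <lombok.version>1.18.22</lombok.version>
            <elasticsearch.version>7.17.11</elasticsearch.version>
            <jakarta.version>2.0.1</jakarta.version>
        </properties>
        <dependencies>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-web</artifactId>
            </dependency>
            <dependency>
                <groupId>org.springframework.boot</groupId>
                <artifactId>spring-boot-starter-test</artifactId>
                <scope>test</scope>
            </dependency>
            <dependency>
                <groupId>co.elastic.clients</groupId>
                <artifactId>elasticsearch-java</artifactId>
                <version>7.17.11</version>
            </dependency>
            <dependency>
                <groupId>com.fasterxml.jackson.core</groupId>
                <artifactId>jackson-databind</artifactId>
                <version>2.12.3</version>
            </dependency>

            <dependency>
                <groupId>org.glassfish</groupId>
                <artifactId>jakarta.json</artifactId>
                <version>${jakarta.version}</version>
            </dependency>
            <dependency>
                <groupId>org.projectlombok</groupId>
                <artifactId>lombok</artifactId>
                <version>${lombok.version}</version>
            </dependency>

            <dependency>
                <groupId>commons-io</groupId>
                <artifactId>commons-io</artifactId>
                <version>2.11.0</version>
            </dependency>
        </dependencies>
        <build>
            <plugins>
                <plugin>
                    <groupId>org.springframework.boot</groupId>
                    <artifactId>spring-boot-maven-plugin</artifactId>
                </plugin>
            </plugins>
        </build>
    </project>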
    

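    4.2 Add the ES client configuration class

    The client is managed by Spring; inject it wherever it is needed with @Resource private ElasticsearchClient client;

    import co.elastic.clients.elasticsearch.ElasticsearchClient;
    import co.elastic.clients.json.jackson.JacksonJsonpMapper;
    import co.elastic.clients.transport.ElasticsearchTransport;
    import co.elastic.clients.transport.rest_client.RestClientTransport;
    import lombok.extern.slf4j.Slf4j;
    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import javax.annotation.Resource;
    import java.util.Arrays;
    import java.util.List;

    @Configuration
    @Slf4j
    public class EsClient {
        // Project configuration class (not shown in this article) that exposes the node addresses
        @Resource
        private EsConfig esConfig;
        /**
         * Bean definition that creates the ElasticsearchClient instance.
         *
         * @return an ElasticsearchClient configured with a RestClient and a transport.
         */
        @Bean
        public ElasticsearchClient elasticsearchClient() {
            // Configure the RestClient with the hosts and ports of the Elasticsearch cluster
            List<String> clusterNodes = esConfig.getClusterNodes();
            HttpHost[] httpHosts = clusterNodes.stream().map(HttpHost::create).toArray(HttpHost[]::new);
            // Create the low-level client
            RestClient restClient = RestClient.builder(httpHosts).build();
            // Create the transport with a Jackson mapper for JSON serialization
            ElasticsearchTransport transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
            ElasticsearchClient client = new ElasticsearchClient(transport);
            // Log the nodes the client connects to
            log.info("Elasticsearch client nodes: {}", Arrays.toString(httpHosts));
            return client;
        }
    }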
    

    4.3 Create an index with the Java API Client

    Reference: Using the Java API Client

    /**
     * Create the index.
     */
    @Test
    void createIndex() throws IOException {
        // Load the mapping JSON from the classpath
        ClassLoader classLoader = ResourceLoader.class.getClassLoader();
        InputStream input = classLoader.getResourceAsStream("mapping/douban.json");
        CreateIndexRequest req = CreateIndexRequest.of(b -> b
                .index("douban_v1")
                .withJson(input)
        );
        boolean created = client.indices().create(req).acknowledged();
        log.info("Index created: " + created);
    }
    

    4.4 Save documents

    Entity class DouBan.java

    package com.zhouquan.client.entity;
    import lombok.AllArgsConstructor;
    import lombok.Data;
    import lombok.NoArgsConstructor;
    import java.util.Date;
    /**
     * @author ZhouQuan
     * @description todo
     * @date 2024-01-09 15:54
     **/
    @Data
    @AllArgsConstructor
    @NoArgsConstructor
    public class DouBan {
        private String id;
        private String title;
        private String author;
        private String contentDesc;
        private Integer wordCount;
        private Double price;
        private String cover;
        private Integer heatCount;
        private Date updateTime;
    }
    

    4.4.1 Index a single document

    public String indexSingleDoc() {
            IndexResponse indexResponse;
            DouBan douBan = new DouBan("1211", "河边的错误", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
            try {
            // Option 1: fluent DSL (lambda builder)
                indexResponse = client.index(i -> i
                        .index(indexName)
                        .id(douBan.getId())
                        .document(douBan));
            // Option 2: the static of() factory method of the Java API Client
                IndexRequest objectIndexRequest = IndexRequest.of(i -> i
                        .index(indexName)
                        .id(douBan.getId())
                        .document(douBan));
                IndexResponse ofIndexResponse = client.index(objectIndexRequest);
            // Option 3: classic builder
                IndexRequest.Builder objectBuilder = new IndexRequest.Builder();
                objectBuilder.index(indexName);
                objectBuilder.id(douBan.getId());
                objectBuilder.document(douBan);
                IndexResponse classicIndexResponse = client.index(objectBuilder.build());
            // Option 4: asynchronous indexing
                asyncClient.index(i -> i
                        .index("douban")
                        .id(douBan.getId())
                        .document(douBan)
                ).whenComplete((response, exception) -> {
                    if (exception != null) {
                        log.error("Failed to index", exception);
                    } else {
                        log.info("Indexed with version " + response.version());
                    }
                });
            // Option 5: index raw JSON data
                IndexResponse response = null;
                try {
                    String jsonData = " {\"id\":\"1741\",\"title\":\"三体\",\"author\":\"刘慈欣\",\"contentDesc\":\"内容简介\",\"wordCount\":50000,\"price\":52.5}";
                    Reader input = new StringReader(jsonData);
                    IndexRequest request = IndexRequest.of(i -> i
                            .index("douban_v1")
                            .withJson(input)
                    );
                    response = client.index(request);
                    log.info("Indexed with version " + response.version());
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
            return Result.Created.equals(indexResponse.result()) + "";
        }
    

    4.4.2 Bulk-index documents

    /**
     * Bulk save.
     *
     * @throws IOException if the request fails
     */
    @Test
    void bulkSave() throws IOException {
        DouBan douBan1 = new DouBan("1002", "题名1", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        DouBan douBan2 = new DouBan("1003", "题名2", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        DouBan douBan3 = new DouBan("1004", "题名3", "余华", "内容简介", 50000, 52.5, "封面1", 74, new Date());
        List<DouBan> douBanList = new ArrayList<>();
        douBanList.add(douBan1);
        douBanList.add(douBan2);
        douBanList.add(douBan3);
        BulkRequest.Builder br = new BulkRequest.Builder();
        for (DouBan douBan : douBanList) {
            br.operations(op -> op
                    .index(idx -> idx
                        .index("douban_v1")
                            .id(douBan.getId())
                            .document(douBan)
                    )
            );
        }
        BulkResponse result = client.bulk(br.build());
        if (result.errors()) {
            log.error("Bulk had errors");
            for (BulkResponseItem item : result.items()) {
                if (item.error() != null) {
                    log.error(item.error().reason());
                }
            }
        }
    }
    

    4.4.3 Bulk-index raw JSON documents

    /**
     * Bulk save of raw JSON data.
     *
     * @throws IOException if a file cannot be read or the request fails
     */
    @Test
    void rawDataBulkSave() throws IOException {
        File logDir = new File("D:\\IdeaProjects\\client\\src\\main\\resources\\data");
        File[] logFiles = logDir.listFiles(
                file -> file.getName().matches("bulk.*\\.json")
        );
        BulkRequest.Builder br = new BulkRequest.Builder();
        for (File file : logFiles) {
            FileInputStream input = new FileInputStream(file);
            BinaryData data = BinaryData.of(IOUtils.toByteArray(input), ContentType.APPLICATION_JSON);
            br.operations(op -> op
                    .index(idx -> idx
                            .index("douban_v1")
                            .document(data)
                    )
            );
        }
        BulkResponse result = client.bulk(br.build());
        if (result.errors()) {
            List<BulkResponseItem> items = result.items();
            items.forEach(x -> System.out.println(x.error()));
        }
        log.info("Bulk save succeeded: " + !result.errors());
    }
    

    4.5 Get a single document

    // Fetch by id and map the source to a Java object
    GetRequest getRequest = GetRequest.of(x -> x.index("douban_v1").id("1002"));
    GetResponse<DouBan> douBanGetResponse = client.get(getRequest, DouBan.class);
    DouBan source = douBanGetResponse.source();
    // The same request written with the fluent DSL
    GetResponse<DouBan> response = client.get(g -> g
                    .index(indexName)
                    .id(id),
            DouBan.class
    );
    if (!response.found()) {
        throw new BusinessException("No document found for the given id");
    }
    DouBan douBan = response.source();
    log.info("title: " + douBan.getTitle());
    return douBan;
    
    // Fetch by id as raw JSON
    GetResponse<ObjectNode> response1 = client.get(g -> g
                    .index(indexName)
                    .id(id),
            ObjectNode.class
    );
    if (response1.found()) {
        ObjectNode json = response1.source();
        String name = json.get("title").asText();
        log.info(" title " + name);
    } else {
        log.info("data not found");
    }
    return null;
    

    4.6 Searching documents

    4.6.1 Simple search query

    public List<DouBan> search(String searchText) {
        SearchResponse<DouBan> response;
        try {
            response = client.search(s -> s
                            .index(indexName)
                            .query(q -> q
                                    .match(t -> t
                                            .field("title")
                                            .query(searchText)
                                    )
                            ),
                    DouBan.class
            );
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        // The total is exact only when the relation is Eq; otherwise it is a lower bound
        TotalHits total = response.hits().total();
        boolean isExactResult = total.relation() == TotalHitsRelation.Eq;
        if (isExactResult) {
            log.info("There are " + total.value() + " results");
        } else {
            log.info("There are more than " + total.value() + " results");
        }
        List<Hit<DouBan>> hits = response.hits().hits();
        List<DouBan> list = new ArrayList<>();
        for (Hit<DouBan> hit : hits) {
            DouBan douBan = hit.source();
            list.add(douBan);
            log.info("Found DouBan " + douBan.getTitle() + ", score " + hit.score());
        }
        return list;
    }
    

    4.6.2 Nested (bool) search query

    public List<DouBan> search2(String searchText, Double price) {
        // Full-text match on the title
        Query titleQuery = MatchQuery.of(m -> m
                .field("title")
                .query(searchText)
        )._toQuery();
        // Price greater than or equal to the given value
        Query rangeQuery = RangeQuery.of(r -> r
                .field("price")
                .gte(JsonData.of(price))
        )._toQuery();
        try {
            SearchResponse<DouBan> search = client.search(s -> s
                            .index(indexName)
                            .query(q -> q
                                    .bool(b -> b
                                            .must(titleQuery)
                                            .must(rangeQuery)
                                    )
                            )
                    ,
                    DouBan.class
            );
            // Collect the hits into a list of entities
            List<DouBan> douBanList = new ArrayList<>();
            List<Hit<DouBan>> hits = search.hits().hits();
            for (Hit<DouBan> hit : hits) {
                DouBan douBan = hit.source();
                douBanList.add(douBan);
            }
            return douBanList;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    

    4.6.3 Template search

    // Create the template: a stored script that returns the search request body
    client.putScript(r -> r
            .id("query-script")
            .script(s -> s
                    .lang("mustache")
                    .source("{\"query\":{\"match\":{\"{{field}}\":\"{{value}}\"}}}")
            ));
    // Run the search with the template parameters filled in
    SearchTemplateResponse<DouBan> response = client.searchTemplate(r -> r
                    .index("douban_v1")
                    .id("query-script")
                    .params("field", JsonData.of("title"))
                    .params("value", JsonData.of("题名")),
            DouBan.class
    );
    // Read the hits
    List<Hit<DouBan>> hits = response.hits().hits();
    for (Hit<DouBan> hit : hits) {
        DouBan douBan = hit.source();
        log.info("Found DouBan " + douBan.getTitle() + ", score " + hit.score());
    }
    

    4.7 Aggregations

    Query query = MatchQuery.of(t -> t
            .field("title")
            .query(searchText))._toQuery();
    // Aggregate on the keyword sub-field; a terms aggregation on the analyzed text field author fails unless fielddata is enabled
    Aggregation authorAgg = AggregationBuilders.terms().field("author.keyword").build()._toAggregation();
    SearchResponse<DouBan> response = client.search(s -> s
                    .index(indexName)
                    .query(query)
                    .aggregations("author", authorAgg),
            DouBan.class
    );
    