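Distributed Search Engine: Elasticsearch (Part 2)
About the author
The author is a third-year university student from Heyuan. These notes record what I have learned while teaching myself; if you spot a mistake, please point it out. I will keep improving them in the hope of helping more Java enthusiasts get started.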
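- Note: the Elasticsearch version used here is 7.6.1
What is Elasticsearch
Elasticsearch is a search server based on Lucene. It provides a distributed, multi-tenant full-text search engine with a RESTful web interface. Elasticsearch is written in Java and released as open source under the Apache License, and it is a popular enterprise search engine. It is designed for cloud environments and offers near-real-time search that is stable, reliable, fast, and easy to install and use.
Suppose we are building a website or application and want to add search, but building search from scratch is hard. We want the search solution to be fast; we want zero configuration and a completely free search model; we want to index data simply by sending JSON over HTTP; we want the search server to be always available; we want to start with one machine and scale to hundreds; we want real-time search, simple multi-tenancy, and a solution built for the cloud. Elasticsearch is meant to solve all of these problems and more. (Excerpted from Baidu Baike.)
Pagination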
GET goods/_search
{
"query": {
"match_all": {}
}
, "sort": [
{
"od": {
"order": "desc"
}
}
]
, "from" : 0
, "size": 2
}
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "4",
"_score" : null,
"_source" : {
"title" : "IQOONEO5",
"content" : "IQOONEO5 高通骁龙870Soc ,",
"price" : "2499",
"od" : 4
},
"sort" : [
4
]
},
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "3",
"_score" : null,
"_source" : {
"title" : "小米11",
"content" : "小米11 高通骁龙888Soc ,1亿像素",
"price" : "4500",
"od" : 3
},
"sort" : [
3
]
}
]
}
}
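In the request above, from is the number of hits to skip and size is the number of hits to return, so together they act as an offset and a page size. As a minimal sketch, assuming a 1-based page number (the PageOffsets class and fromFor method are my own names, not part of Elasticsearch), the offset could be computed like this. The 10000 bound is the default value of the index.max_result_window setting; your index may be configured differently.

```java
class PageOffsets {
    // Default value of the index.max_result_window index setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number into the "from" offset of a search request. */
    static int fromFor(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;
        // With default settings, Elasticsearch rejects requests where from + size is larger than this.
        if (from + size > DEFAULT_MAX_RESULT_WINDOW) {
            throw new IllegalArgumentException("from + size exceeds " + DEFAULT_MAX_RESULT_WINDOW);
        }
        return (int) from;
    }

    public static void main(String[] args) {
        System.out.println(fromFor(1, 2)); // 0, the first page shown above
        System.out.println(fromFor(2, 2)); // 2, the next two hits
    }
}
```

The value returned by fromFor goes into "from", and the page size goes into "size".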
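Field highlighting (highlight)
You can highlight one or more fields. When a selected field matches the query, the matched terms are wrapped in <em> tags by default.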
GET goods/_search
{
"query": {
"match": {
"title": "华为P40"
}
},
"highlight": {
"fields": {
"title": {}
}
}
}
Result
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.7309713,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.7309713,
"_source" : {
"title" : "华为P40",
"content" : "华为P40 8+256G,麒麟990Soc,贼牛逼",
"price" : "4999",
"od" : 1
},
"highlight" : {
"title" : [
"<em>华</em><em>为</em><em>P40</em>"
]
}
},
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.5241971,
"_source" : {
"title" : "华为Mate30",
"content" : "华为Mate30 8+128G,麒麟990Soc",
"price" : "3998",
"od" : 2
},
"highlight" : {
"title" : [
"<em>华</em><em>为</em>Mate30"
]
}
}
]
}
}
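Notice that each Chinese character gets its own tag pair in the result. This is presumably because the title field is analyzed with the default standard analyzer, which emits each Chinese character as a separate token. If you would rather display one continuous highlight, a small post-processing step on the client side is one option. The sketch below is only an illustration (HighlightTags and mergeAdjacent are hypothetical names); it removes a closing tag that is immediately followed by an opening tag. The sample string is the first title fragment from the result, written with Unicode escapes.

```java
class HighlightTags {
    /** Joins highlight tags that touch each other by removing every postTag + preTag pair. */
    static String mergeAdjacent(String fragment, String preTag, String postTag) {
        return fragment.replace(postTag + preTag, "");
    }

    public static void main(String[] args) {
        // The title fragment of document 1 (\u534e\u4e3a is the brand name in the sample data).
        String fragment = "<em>\u534e</em><em>\u4e3a</em><em>P40</em>";
        // Prints the same text wrapped in a single <em>...</em> pair.
        System.out.println(mergeAdjacent(fragment, "<em>", "</em>"));
    }
}
```

The same method works for custom tags as long as you pass the same pre and post tags that were used in the request.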
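The default tag is <em>. You can change the prefix and suffix with pre_tags and post_tags, for example to apply some front-end styling: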
GET goods/_search
{
"query": {
"match": {
"title": "华为P40"
}
},
"highlight": {
"pre_tags": "<span style='color: red'>",
"post_tags": "</span>" ,
"fields": {
"title": {}
}
}
}
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.7309713,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.7309713,
"_source" : {
"title" : "华为P40",
"content" : "华为P40 8+256G,麒麟990Soc,贼牛逼",
"price" : "4999",
"od" : 1
},
"highlight" : {
"title" : [
"<span style='color: red'>华</span><span style='color: red'>为</span><span style='color: red'>P40</span>"
]
}
},
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.5241971,
"_source" : {
"title" : "华为Mate30",
"content" : "华为Mate30 8+128G,麒麟990Soc",
"price" : "3998",
"od" : 2
},
"highlight" : {
"title" : [
"<span style='color: red'>华</span><span style='color: red'>为</span>Mate30"
]
}
}
]
}
}
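Highlighting like Baidu search
When you search Baidu for 华为P40, for example, the match is highlighted not only in the title but also in the content. We can get the same effect with multi_match plus highlight.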
GET goods/_search
{
"query": {
"multi_match": {
"query": "华为P40",
"fields": ["title","content"]
}
}
, "highlight": {
"pre_tags": "<span style='color: red'>",
"post_tags": "</span>",
"fields": {
"title": {},
"content": {}
}
}
}
{
"took" : 8,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.8157697,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "1",
"_score" : 2.8157697,
"_source" : {
"title" : "华为P40",
"content" : "华为P40 8+256G,麒麟990Soc,贼牛逼",
"price" : "4999",
"od" : 1
},
"highlight" : {
"title" : [
"<span style='color: red'>华</span><span style='color: red'>为</span><span style='color: red'>P40</span>"
],
"content" : [
"<span style='color: red'>华</span><span style='color: red'>为</span><span style='color: red'>P40</span> 8+256G,麒麟990Soc,贼牛逼"
]
}
},
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.8023796,
"_source" : {
"title" : "华为Mate30",
"content" : "华为Mate30 8+128G,麒麟990Soc",
"price" : "3998",
"od" : 2
},
"highlight" : {
"title" : [
"<span style='color: red'>华</span><span style='color: red'>为</span>Mate30"
],
"content" : [
"<span style='color: red'>华</span><span style='color: red'>为</span>Mate30 8+128G,麒麟990Soc"
]
}
}
]
}
}
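bool query (for combining multiple conditions)
Similar to AND / OR in MySQL.
Key point: must corresponds to AND and should corresponds to OR. (should behaves like OR when the bool query has no must or filter clause; alongside must or filter, should clauses are optional by default and only affect the score.)
Using must (AND):
Below, must contains two conditions; because it is must, a document has to satisfy both.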
GET goods/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "华为"
}
},
{
"match": {
"content": "MATE30"
}
}
]
}
}
}
Result:
{
"took" : 10,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.9512205,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "2",
"_score" : 2.9512205,
"_source" : {
"title" : "华为Mate30",
"content" : "华为Mate30 8+128G,麒麟990Soc",
"price" : "3998",
"od" : 2
}
}
]
}
}
Using should (OR):
should also contains two conditions, but matching either one is enough.
GET goods/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"title": "华为"
}
},
{
"match": {
"content": "MATE30"
}
}
]
}
}
}
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 2.9512205,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "2",
"_score" : 2.9512205,
"_source" : {
"title" : "华为Mate30",
"content" : "华为Mate30 8+128G,麒麟990Soc",
"price" : "3998",
"od" : 2
}
},
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.5241971,
"_source" : {
"title" : "华为P40",
"content" : "华为P40 8+256G,麒麟990Soc,贼牛逼",
"price" : "4999",
"od" : 1
}
}
]
}
}
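To make the AND / OR reading of the two requests above concrete, here is a toy model in plain Java. It is only an illustration of the Boolean logic: the Doc class, the two predicates, and the substring tests are my own stand-ins, and nothing here reproduces how Elasticsearch analyzes text or computes scores. The brand name from the sample data is written with Unicode escapes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class BoolSemantics {
    static final String HUAWEI = "\u534e\u4e3a"; // brand name used in the sample data

    /** Toy document holding only the fields the two clauses look at. */
    static final class Doc {
        final String id, title, content;
        Doc(String id, String title, String content) {
            this.id = id; this.title = title; this.content = content;
        }
    }

    static final List<Doc> DOCS = Arrays.asList(
            new Doc("1", HUAWEI + "P40", HUAWEI + "P40 8+256G"),
            new Doc("2", HUAWEI + "Mate30", HUAWEI + "Mate30 8+128G"));

    // Crude stand-ins for the two match clauses: plain substring tests.
    static final Predicate<Doc> TITLE = d -> d.title.contains(HUAWEI);
    static final Predicate<Doc> CONTENT = d -> d.content.toLowerCase().contains("mate30");

    /** Returns the ids of all toy documents accepted by the predicate, in list order. */
    static List<String> ids(Predicate<Doc> p) {
        List<String> out = new ArrayList<>();
        for (Doc d : DOCS) {
            if (p.test(d)) out.add(d.id);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println("must:   " + ids(TITLE.and(CONTENT))); // must:   [2]
        System.out.println("should: " + ids(TITLE.or(CONTENT)));  // should: [1, 2]
    }
}
```

The toy keeps list order, whereas Elasticsearch sorts hits by score, which is why document 2 comes first in the real should result.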
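Filters and range conditions (filter, range)
For example, to search for title=xx and additionally require price > 4000, add a range filter. Clauses under filter restrict the result set but do not contribute to the score.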
GET goods/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"title": "小米"
}
}
],"filter": {
"range": {
"price": {
"gt": 4000
}
}
}
}
}
}
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 2.4135482,
"hits" : [
{
"_index" : "goods",
"_type" : "_doc",
"_id" : "3",
"_score" : 2.4135482,
"_source" : {
"title" : "小米11",
"content" : "小米11 高通骁龙888Soc ,1亿像素",
"price" : "4500",
"od" : 3
}
}
]
}
}
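One thing to watch: price appears in _source as a string ("4500"). Whether the range filter compares numbers or strings depends on the field mapping, which these notes do not show. The plain-Java sketch below only illustrates why that matters: a range on a numeric field compares numbers, while a range on string terms (for example a keyword field) compares them in lexicographic order. The class name and the value "10000" are hypothetical.

```java
class RangePitfall {
    /** Numeric "greater than", the way a range on a numeric field behaves. */
    static boolean gtNumeric(String value, long bound) {
        return Long.parseLong(value) > bound;
    }

    /** Lexicographic "greater than", the way a range on string terms behaves. */
    static boolean gtLexicographic(String value, String bound) {
        return value.compareTo(bound) > 0;
    }

    public static void main(String[] args) {
        System.out.println(gtNumeric("4500", 4000));          // true
        System.out.println(gtLexicographic("4500", "4000"));  // true, both strings have four digits
        System.out.println(gtNumeric("10000", 4000));         // true
        System.out.println(gtLexicographic("10000", "4000")); // false, "1" sorts before "4"
    }
}
```

If prices with different digit counts ever return surprising results, checking the mapping with GET goods/_mapping is a reasonable first step.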
View information about all indices in ES
GET _cat/indices?v
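Elasticsearch Java API
Preparation
1. Add the Elasticsearch high-level REST client dependency and the elasticsearch dependency (the versions must match your local ES version). We are using ES 7.6.1 locally.
<!-- Import the two Java Elasticsearch dependencies -->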
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<!-- This version must match your local Elasticsearch version -->
<version>7.6.1</version>
</dependency>
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<!-- This version must match your local Elasticsearch version -->
<version>7.6.1</version>
</dependency>
<!-- Add fastjson -->
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>1.2.75</version>
</dependency>
2. Look at the RestHighLevelClient constructor:
public RestHighLevelClient(RestClientBuilder restClientBuilder) {
this(restClientBuilder, Collections.emptyList());
}
The constructor takes a RestClientBuilder, but we obtain that object through RestClient rather than by constructing a RestClientBuilder ourselves.
3. Look at RestClient:
public static RestClientBuilder builder(HttpHost... hosts) {
if (hosts == null || hosts.length == 0) {
throw new IllegalArgumentException("hosts must not be null nor empty");
}
List<Node> nodes = Arrays.stream(hosts).map(Node::new).collect(Collectors.toList());
return new RestClientBuilder(nodes);
}
RestClient.builder(...) returns a RestClientBuilder. Next, look at HttpHost:
public HttpHost(String hostname, int port, String scheme) { // host where ES runs, ES port, protocol (defaults to http)
this.hostname = (String)Args.containsNoBlanks(hostname, "Host name");
this.lcHostname = hostname.toLowerCase(Locale.ROOT);
if (scheme != null) {
this.schemeName = scheme.toLowerCase(Locale.ROOT);
} else {
this.schemeName = "http";
}
this.port = port;
this.address = null;
}
4. With that, the client is configured as follows:
HttpHost httpHost = new HttpHost("localhost",9200,"http");
RestClientBuilder restClientBuilder = RestClient.builder(httpHost);
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(restClientBuilder);
5. For convenience, we can let the Spring IoC container manage the RestHighLevelClient and simply autowire it later.
@Configuration
public class EsConfig {
@Bean
public RestHighLevelClient restHighLevelClient(){
HttpHost httpHost = new HttpHost("localhost",9200,"http");
RestClientBuilder builder = RestClient.builder(httpHost);
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
return restHighLevelClient;
}
}
Index operations
In the Java Elasticsearch API, index operations all take the form restHighLevelClient.indices().xxxxx()
Create an index
// Create an index
@Test
public void createIndex() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
// Create a CreateIndexRequest, passing the name of the index to create
CreateIndexRequest createIndexRequest = new CreateIndexRequest("java01");
// Send the create-index request to ES
CreateIndexResponse createIndexResponse = restHighLevelClient.indices().create(createIndexRequest, RequestOptions.DEFAULT);
restHighLevelClient.close();
}
Delete an index
// Delete an index
@Test
public void deleteIndex() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
// Create a DeleteIndexRequest, passing the name of the index to delete
DeleteIndexRequest deleteIndexRequest = new DeleteIndexRequest("java01");
// Send the delete-index request through restHighLevelClient
restHighLevelClient.indices().delete(deleteIndexRequest,RequestOptions.DEFAULT);
restHighLevelClient.close();
}
Check whether an index exists
// Check whether an index exists
@Test
public void indexExists() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
GetIndexRequest getIndexRequest = new GetIndexRequest("goods");
boolean exists = restHighLevelClient.indices().exists(getIndexRequest, RequestOptions.DEFAULT);
System.out.println(exists);
}
Document operations
Create a document with a specified id
// Create a document
@Test
public void createIndexDoc() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
IndexRequest indexRequest = new IndexRequest("hello");
// Specify the document id
indexRequest.id("1");
/**
* public IndexRequest source(Map<String, ?> source, XContentType contentType) throws ElasticsearchGenerationException {
* try {
* XContentBuilder builder = XContentFactory.contentBuilder(contentType);
* builder.map(source);
* return this.source(builder);
* } catch (IOException var4) {
* throw new ElasticsearchGenerationException("Failed to generate [" + source + "]", var4);
* }
* }
* source() has several overloads and any of them works; the Map overload is shown above, and below I build the key:value pairs in a Map
*/
Map<String,Object> source=new HashMap<>();
source.put("a_age","50");
source.put("a_address","广州");
// In ES everything is JSON, so we convert the Map to a JSON string with fastjson and set XContentType to JSON
indexRequest.source(JSON.toJSONString(source), XContentType.JSON);
IndexResponse response = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
System.out.println("response:"+response);
System.out.println("status:"+response.status());
}
Delete a document with a specified id
// Delete a document
@Test
public void deleteDoc() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
DeleteRequest deleteRequest = new DeleteRequest("hello");
deleteRequest.id("1");
DeleteResponse delete = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
System.out.println(delete.status());
}
Update a document with a specified id
// Update a document
@Test
public void updateDoc() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
/**
* We call the following constructor
* public UpdateRequest(String index, String id) {
* super(index);
* this.refreshPolicy = RefreshPolicy.NONE;
* this.waitForActiveShards = ActiveShardCount.DEFAULT;
* this.scriptedUpsert = false;
* this.docAsUpsert = false;
* this.detectNoop = true;
* this.id = id;
* }
*/
UpdateRequest updateRequest = new UpdateRequest("hello","1");
Map<String,Object> source=new HashMap<>();
source.put("a_address","河源");
updateRequest.doc(JSON.toJSONString(source),XContentType.JSON);
UpdateResponse response = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
System.out.println(response.status());
}
Get a document with a specified id
// Get a document
@Test
public void getDoc() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
GetRequest getRequest = new GetRequest("hello");
getRequest.id("1");
GetResponse response = restHighLevelClient.get(getRequest, RequestOptions.DEFAULT);
String sourceAsString = response.getSourceAsString();
System.out.println(sourceAsString);
}
Search (match all documents with match_all)
// Search (match all documents with match_all)
@Test
public void search_matchAll() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
/**
* public SearchRequest(String... indices) {
* this(indices, new SearchSourceBuilder());
* }
*/
SearchRequest searchRequest = new SearchRequest("hello");
// Equivalent to the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
MatchAllQueryBuilder matchAllQueryBuilder = QueryBuilders.matchAllQuery();
searchSourceBuilder.query(matchAllQueryBuilder); // Equivalent to "query" in the search request
searchRequest.source(searchSourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] hits = search.getHits().getHits();
for (SearchHit hit : hits) {
System.out.println(hit.getSourceAsString());
}
}
Search (full-text matching with match)
// Full-text search with match
@Test
public void search_match() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
// No index name is passed, so this request searches all indices
SearchRequest searchRequest = new SearchRequest();
// Build the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("a_address", "广州");
searchSourceBuilder.query(matchQueryBuilder);
searchRequest.source(searchSourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] hits = search.getHits().getHits();
for (SearchHit hit : hits) {
System.out.println(hit.getSourceAsString());
}
}
Search (multi-field search with multi_match)
// Search (multi-field search with multi_match)
@Test
public void search_multiMatch() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.multiMatchQuery("华为","title","content"));
searchRequest.source(searchSourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] hits = search.getHits().getHits();
for (SearchHit hit : hits) {
System.out.println(hit.getSourceAsString());
}
}
Search (selecting fields with fetchSource)
The fetchSource method corresponds to _source in the request body
// Select the returned fields with fetchSource (_source)
@Test
public void search_source() throws IOException {
RestClientBuilder builder = RestClient.builder(new HttpHost("localhost", 9200, "http"));
RestHighLevelClient restHighLevelClient = new RestHighLevelClient(builder);
SearchRequest searchRequest = new SearchRequest("goods");
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
/**
* public SearchSourceBuilder fetchSource(@Nullable String[] includes, @Nullable String[] excludes) {
* FetchSourceContext fetchSourceContext = this.fetchSourceContext != null ? this.fetchSourceContext : FetchSourceContext.FETCH_SOURCE;
* this.fetchSourceContext = new FetchSourceContext(fetchSourceContext.fetchSource(), includes, excludes);
* return this;
* }
*
*/
String[] includes={"title"}; // fields to include
String[] excludes={}; // fields to exclude
searchSourceBuilder.fetchSource(includes,excludes);
searchRequest.source(searchSourceBuilder);
SearchResponse search = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
SearchHit[] hits = search.getHits().getHits();
for (SearchHit hit : hits) {
System.out.println(hit.getSourceAsString());
}
}
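To show what the includes and excludes arrays do to each hit, here is a toy version in plain Java. It is a simplified sketch, not the Elasticsearch implementation: SourceFilter is a hypothetical name, and it only handles exact top-level field names, whereas real source filtering also accepts wildcard patterns. An empty includes array means "keep everything", and excludes are then removed.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SourceFilter {
    /** Keeps the included keys (all keys if includes is empty), then drops the excluded ones. */
    static Map<String, Object> filter(Map<String, Object> source, String[] includes, String[] excludes) {
        List<String> in = Arrays.asList(includes);
        List<String> ex = Arrays.asList(excludes);
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : source.entrySet()) {
            boolean included = in.isEmpty() || in.contains(e.getKey());
            if (included && !ex.contains(e.getKey())) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("title", "Mate30");
        doc.put("price", "3998");
        doc.put("od", 2);
        System.out.println(filter(doc, new String[] {"title"}, new String[] {}));  // {title=Mate30}
        System.out.println(filter(doc, new String[] {}, new String[] {"price"})); // {title=Mate30, od=2}
    }
}
```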
Pagination, sorting, and field highlighting
We want to translate the following ES console request into Java code
GET goods/_search
{
"query": {
"match": {
"title": "华为"
}
},"sort": [
{
"od": {
"order": "desc"
}
}
]
,"from": 0,
"size": 1,
"highlight": {
"pre_tags": "<span style='color:red'>",
"post_tags": "</span>",
"fields": {
"title": {}
}
}
}
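The original notes break off here, so what follows is a sketch of how the request above could be expressed with the 7.6.1 high-level REST client, not the author's original code. It assumes the same local node (localhost:9200) and the goods index used throughout; the class name SearchPageSortHighlight and the buildRequest method are my own. Building the request is kept separate from sending it so the two steps are easy to see.

```java
import java.io.IOException;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.text.Text;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
import org.elasticsearch.search.sort.SortOrder;

class SearchPageSortHighlight {
    /** Builds the Java counterpart of the GET goods/_search request above. */
    static SearchRequest buildRequest() {
        HighlightBuilder highlight = new HighlightBuilder()
                .preTags("<span style='color:red'>")    // "pre_tags"
                .postTags("</span>")                    // "post_tags"
                .field("title");                        // "fields": {"title": {}}

        SearchSourceBuilder source = new SearchSourceBuilder()
                .query(QueryBuilders.matchQuery("title", "华为")) // "query": {"match": ...}
                .sort("od", SortOrder.DESC)             // "sort": [{"od": {"order": "desc"}}]
                .from(0)                                // "from"
                .size(1)                                // "size"
                .highlighter(highlight);                // "highlight"

        return new SearchRequest("goods").source(source);
    }

    public static void main(String[] args) throws IOException {
        // RestHighLevelClient is Closeable, so try-with-resources closes it for us.
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            SearchResponse response = client.search(buildRequest(), RequestOptions.DEFAULT);
            for (SearchHit hit : response.getHits().getHits()) {
                System.out.println(hit.getSourceAsString());
                // Highlighted fragments are returned separately from _source.
                HighlightField title = hit.getHighlightFields().get("title");
                if (title != null) {
                    for (Text fragment : title.getFragments()) {
                        System.out.println(fragment.string());
                    }
                }
            }
        }
    }
}
```

Running main needs a reachable Elasticsearch node with the goods index; buildRequest on its own only constructs the request object.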