,本篇介绍,使用python库进行聚合查询操作。
7.3、聚合查询
高阶概念
- Buckets(桶/集合):满足特定条件的文档的集合
-
Metrics(指标):对桶内的文档进行统计计算(例如最小值,求和,最大值等)
- 新建一张测试表
1 PUT cars 2 { 3 "mappings": { 4 "transactions":{ 5 "properties": { 6 "price":{ 7 "type": "integer" 8 }, 9 "color":{10 "type": "text",11 "fielddata": true12 },13 "make":{14 "type": "text",15 "fielddata": true16 },17 "sold":{18 "type": "date",19 "format": "yyyy-MM-dd"20 }21 }22 }23 }24 }
插入数据
1 POST /cars/transactions/_bulk 2 { "index": { "_index": "cars", "_type": "transactions"}} 3 { "price" : 10000, "color" : "red", "make" : "honda", "sold" : "2014-10-28" } 4 { "index": { "_index": "cars", "_type": "transactions"}} 5 { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" } 6 { "index": { "_index": "cars", "_type": "transactions"}} 7 { "price" : 30000, "color" : "green", "make" : "ford", "sold" : "2014-05-18" } 8 { "index": { "_index": "cars", "_type": "transactions"}} 9 { "price" : 15000, "color" : "blue", "make" : "toyota", "sold" : "2014-07-02" } 10 { "index": { "_index": "cars", "_type": "transactions"}} 11 { "price" : 12000, "color" : "green", "make" : "toyota", "sold" : "2014-08-19" }12 { "index": { "_index": "cars", "_type": "transactions"}} 13 { "price" : 20000, "color" : "red", "make" : "honda", "sold" : "2014-11-05" } 14 { "index": { "_index": "cars", "_type": "transactions"}} 15 { "price" : 80000, "color" : "red", "make" : "bmw", "sold" : "2014-01-01" } 16 { "index": { "_index": "cars", "_type": "transactions"}} 17 { "price" : 25000, "color" : "blue", "make" : "ford", "sold" : "2014-02-12" }
- 查询哪个颜色的汽车销量最好(按颜色分类)
1 GET cars/transactions/_search 2 { 3 "size": 0, 4 "aggs": { 5 "popular_colors": { 6 "terms": { 7 "field": "color" 8 } 9 }10 }11 }
1 s = Search(index='cars')2 a = A("terms", field="color")3 s.aggs.bucket("popular_color", a)4 response = s.execute()
或者
1 s.aggs.bucket("popular_color", "terms", field="color")
- 查询每种颜色车的平均价格
1 GET cars/transactions/_search 2 { 3 "size": 0, 4 "aggs": { 5 "colors": { 6 "terms": { 7 "field": "color" 8 }, 9 "aggs": {10 "avg_price": {11 "avg": {12 "field": "price"13 }14 }15 }16 }17 }18 }
1 s = Search(index='cars')2 a1 = A("terms", field="color")3 a2 = A("avg", field="price")4 s.aggs.bucket("colors", a1).metric("avg_price", a2)5 response = s.execute()
或者
1 s = Search(index='cars')2 s.aggs.bucket("colors", "terms", field="color").metric("avg_price", "avg", field="price")3 response = s.execute()
- 先按颜色分,再按品牌分,再求每种品牌的均价
1 GET cars/transactions/_search 2 { 3 "size": 0, 4 "aggs": { 5 "colors": { 6 "terms": { 7 "field": "color" 8 }, 9 "aggs": {10 "make": {11 "terms": {12 "field": "make"13 },14 "aggs": {15 "avg_price": {16 "avg": {17 "field": "price"18 }19 }20 }21 }22 }23 }24 }25 }
1 s = Search(index='cars')2 s.aggs.bucket("colors", "terms", field="color")3 s.aggs["colors"].bucket("make", "terms", field="make")4 s.aggs["colors"].aggs["make"].metric("avg_price", "avg", field="price")5 response = s.execute()
- 先按颜色分,再按品牌分,再求每种品牌的最高和最低价
1 GET cars/transactions/_search 2 { 3 "size": 0, 4 "aggs": { 5 "colors": { 6 "terms": { 7 "field": "color" 8 }, 9 "aggs": {10 "make": {11 "terms": {12 "field": "make"13 },14 "aggs": {15 "min_price": {16 "min": {17 "field": "price"18 }19 },20 "max_price": {21 "max": {22 "field": "price"23 }24 }25 }26 }27 }28 }29 }30 }
1 s = Search(index='cars')2 s.aggs.bucket("colors", "terms", field="color")3 s.aggs["colors"].bucket("make", "terms", field="make")4 s.aggs["colors"].aggs["make"].metric("min_price", "min", field="price")5 s.aggs["colors"].aggs["make"].metric("max_price", "max", field="price")6 response = s.execute()
- 未完待续...
- 新建一张测试表