In Elasticsearch, how do I group by a value inside a nested array?
Say I have the following documents:
1st doc:
{ "productname": "product1", "tags": [ { "name": "key1", "value": "value1" }, { "name": "key2", "value": "value2" } ] }
2nd doc:
{ "productname": "product2", "tags": [ { "name": "key1", "value": "value1" }, { "name": "key2", "value": "value3" } ] }
I know that if I want to group by productname, I can use a terms aggregation:
"terms": { "field": "productname" }
which gives me 2 buckets with 2 different keys, "product1" and "product2".
However, what should the query be if I want to group by a tag key? That is, when grouping by the tag with name == key1, I expect 1 bucket with key "value1"; when grouping by the tag with name == key2, I expect 2 buckets with keys "value2" and "value3".
In other words, what should the query be if I want to group by the 'value' inside the nested array rather than by the 'key'? Any suggestions?
It sounds like a nested terms aggregation is what you're looking for.
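One prerequisite worth spelling out: the nested aggregation only treats each tag object as its own document if tags is mapped with type nested. The sketch below builds such a mapping fragment as a plain Python dict. The helper name tags_mapping and the keyword type for the leaf fields are assumptions, not taken from the original post (older Elasticsearch versions used a not_analyzed string instead), so adjust them to your version.

```python
import json

def tags_mapping(leaf_field=None):
    """Build a hypothetical 'properties' fragment that maps tags as nested."""
    # Assumption: "keyword" leaf fields; pass your own definition for older versions,
    # e.g. {"type": "string", "index": "not_analyzed"}.
    leaf = leaf_field or {"type": "keyword"}
    return {
        "properties": {
            "productname": dict(leaf),
            "tags": {
                "type": "nested",  # required for the nested aggregation to see each tag
                "properties": {"name": dict(leaf), "value": dict(leaf)},
            },
        }
    }

print(json.dumps(tags_mapping(), indent=2))
```

How this fragment is wrapped when creating the index depends on the Elasticsearch version, so that part is left out here.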
With the two documents you posted, this query:
POST /test_index/_search
{
  "size": 0,
  "aggs": {
    "product_name_terms": {
      "terms": { "field": "product_name" }
    },
    "nested_tags": {
      "nested": { "path": "tags" },
      "aggs": {
        "tags_name_terms": {
          "terms": { "field": "tags.name" }
        },
        "tags_value_terms": {
          "terms": { "field": "tags.value" }
        }
      }
    }
  }
}
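returns this (note that product_name_terms comes back empty, most likely because the query aggregates on product_name while the documents as posted use productname):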
{
  "took": 67,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 2, "max_score": 0, "hits": [] },
  "aggregations": {
    "product_name_terms": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": []
    },
    "nested_tags": {
      "doc_count": 4,
      "tags_name_terms": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "key1", "doc_count": 2 },
          { "key": "key2", "doc_count": 2 }
        ]
      },
      "tags_value_terms": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          { "key": "value1", "doc_count": 2 },
          { "key": "value2", "doc_count": 1 },
          { "key": "value3", "doc_count": 1 }
        ]
      }
    }
  }
}
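To see where these counts come from, here is a small pure-Python sketch that mimics what the nested and terms aggregations compute on the two documents from the question. It does not talk to Elasticsearch; the docs list and the nested_terms helper are illustrative names, not part of any API.

```python
from collections import Counter

# The two documents from the question.
docs = [
    {"productname": "product1",
     "tags": [{"name": "key1", "value": "value1"},
              {"name": "key2", "value": "value2"}]},
    {"productname": "product2",
     "tags": [{"name": "key1", "value": "value1"},
              {"name": "key2", "value": "value3"}]},
]

def nested_terms(docs, path, field):
    """Count values of `field` across every nested object under `path`."""
    # Flatten: each nested object counts as its own document.
    nested = [obj for doc in docs for obj in doc.get(path, [])]
    return len(nested), Counter(obj[field] for obj in nested)

nested_count, name_buckets = nested_terms(docs, "tags", "name")
_, value_buckets = nested_terms(docs, "tags", "value")

print(nested_count)         # 4
print(dict(name_buckets))   # {'key1': 2, 'key2': 2}
print(dict(value_buckets))  # {'value1': 2, 'value2': 1, 'value3': 1}
```

The nested doc_count of 4 is simply the total number of tag objects across both documents, and each terms bucket counts tag objects rather than parent documents.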
Here is the code I used to test it:
http://sense.qbox.io/gist/a9a172f41dbd520d5e61063a9686055681110522
Edit: Filter by nested value
As per your comment, if you want to filter the nested results by a value (of the nested documents), you can add another "layer" of aggregation using the filter aggregation, as follows:
POST /test_index/_search
{
  "size": 0,
  "aggs": {
    "nested_tags": {
      "nested": { "path": "tags" },
      "aggs": {
        "filter_tag_name": {
          "filter": {
            "term": { "tags.name": "key1" }
          },
          "aggs": {
            "tags_name_terms": {
              "terms": { "field": "tags.name" }
            },
            "tags_value_terms": {
              "terms": { "field": "tags.value" }
            }
          }
        }
      }
    }
  }
}
which returns:
{
  "took": 10,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "failed": 0 },
  "hits": { "total": 2, "max_score": 0, "hits": [] },
  "aggregations": {
    "nested_tags": {
      "doc_count": 4,
      "filter_tag_name": {
        "doc_count": 2,
        "tags_name_terms": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            { "key": "key1", "doc_count": 2 }
          ]
        },
        "tags_value_terms": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            { "key": "value1", "doc_count": 2 }
          ]
        }
      }
    }
  }
}
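The same request can be parameterized by tag name, which covers both cases in the question (key1 and key2). This is a minimal sketch, assuming you build the request body as a Python dict and send it with whatever client you use. build_tag_value_query and local_value_buckets are hypothetical helpers; the second one only emulates the aggregation in memory to check the expected buckets.

```python
import json
from collections import Counter

def build_tag_value_query(tag_name, path="tags"):
    """Request body: nested -> filter on tag name -> terms on tag value."""
    return {
        "size": 0,
        "aggs": {
            "nested_tags": {
                "nested": {"path": path},
                "aggs": {
                    "filter_tag_name": {
                        "filter": {"term": {f"{path}.name": tag_name}},
                        "aggs": {
                            "tags_value_terms": {
                                "terms": {"field": f"{path}.value"}
                            }
                        },
                    }
                },
            }
        },
    }

def local_value_buckets(docs, tag_name):
    """In-memory emulation of the aggregation above, for sanity checks only."""
    return Counter(
        tag["value"]
        for doc in docs
        for tag in doc.get("tags", [])
        if tag["name"] == tag_name
    )

# The two documents from the question.
docs = [
    {"productname": "product1",
     "tags": [{"name": "key1", "value": "value1"},
              {"name": "key2", "value": "value2"}]},
    {"productname": "product2",
     "tags": [{"name": "key1", "value": "value1"},
              {"name": "key2", "value": "value3"}]},
]

print(json.dumps(build_tag_value_query("key2"), indent=2))
print(dict(local_value_buckets(docs, "key1")))  # {'value1': 2}
print(dict(local_value_buckets(docs, "key2")))  # {'value2': 1, 'value3': 1}
```

For key2 the emulation yields two buckets, value2 and value3, which is the result the question asks for.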
Here's the updated code:
http://sense.qbox.io/gist/507c3aabf36b8f6ed8bb076c8c1b8552097c5458