To search array data in an Elasticsearch JSON document effectively, it helps to know how Elasticsearch stores arrays behind the scenes.
Given a document which contains a ‘names’ array of objects with different name properties (e.g. firstname, lastname, …) using the following mapping:
"names": {
  "properties": {
    "active": {
      "type": "boolean"
    },
    "companyname": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "lastname": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "firstname": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    },
    "initials": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword",
          "ignore_above": 256
        }
      }
    }
  }
}
With the following sample data:
{
  "names": [
    {
      "firstname": "James",
      "lastname": "Bond"
    },
    {
      "firstname": "Ethan",
      "lastname": "Hunt"
    }
  ]
}
When Elasticsearch indexes this document, it is stored internally like this:
{
  "names.firstname": ["James", "Ethan"],
  "names.lastname": ["Bond", "Hunt"]
}
Lucene has no concept of inner objects, so Elasticsearch flattens object hierarchies into a simple list of field names and values.
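The practical consequence of this flattening is that the link between a first name and its last name is lost. The following is a minimal Python sketch of the idea, not Elasticsearch's actual implementation: the helper names `flatten` and `matches_flat` are hypothetical, and the matching is a plain exact-value check rather than real text analysis.

```python
from collections import defaultdict


def flatten(doc):
    """Collect the values of every inner-object field under a dotted field name."""
    flat = defaultdict(list)
    for field, objects in doc.items():
        for obj in objects:
            for key, value in obj.items():
                flat[f"{field}.{key}"].append(value)
    return dict(flat)


def matches_flat(flat, firstname, lastname):
    """Simplified stand-in for a query requiring both field values to be present."""
    return (
        firstname in flat.get("names.firstname", [])
        and lastname in flat.get("names.lastname", [])
    )


doc = {
    "names": [
        {"firstname": "James", "lastname": "Bond"},
        {"firstname": "Ethan", "lastname": "Hunt"},
    ]
}

flat = flatten(doc)
print(flat)

# "James Hunt" is not a person in the document, yet both values exist
# somewhere in the flattened arrays, so the combined condition matches.
print(matches_flat(flat, "James", "Hunt"))
```

In this sketch a search for first name "James" combined with last name "Hunt" succeeds, even though no single object in the array has that combination.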
If you don’t want this behavior, you have to map the array with the nested field type. Internally, Elasticsearch then indexes each object in the array as a separate hidden document, and you search those objects with a nested query.
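As a rough illustration, the sketch below builds a nested mapping and a nested query as plain Python dictionaries, which could be serialized to JSON and sent to Elasticsearch. The `"type": "nested"` mapping and the `nested` query with a `path` are part of the Elasticsearch query DSL; the helper `matches_nested` is hypothetical and only simulates the per-object matching with exact-value comparison.

```python
import json

# Mapping fragment: declaring "names" as nested keeps each object separate.
# Only two of the properties from the mapping above are repeated here.
nested_mapping = {
    "names": {
        "type": "nested",
        "properties": {
            "firstname": {"type": "text"},
            "lastname": {"type": "text"},
        },
    }
}

# Query body: both conditions must hold within the same nested object.
nested_query = {
    "query": {
        "nested": {
            "path": "names",
            "query": {
                "bool": {
                    "must": [
                        {"match": {"names.firstname": "James"}},
                        {"match": {"names.lastname": "Hunt"}},
                    ]
                }
            },
        }
    }
}


def matches_nested(doc, firstname, lastname):
    """Simulate nested semantics: each object is checked on its own."""
    return any(
        obj.get("firstname") == firstname and obj.get("lastname") == lastname
        for obj in doc.get("names", [])
    )


doc = {
    "names": [
        {"firstname": "James", "lastname": "Bond"},
        {"firstname": "Ethan", "lastname": "Hunt"},
    ]
}

print(json.dumps(nested_query, indent=2))
print(matches_nested(doc, "James", "Hunt"))  # no single object has both values
print(matches_nested(doc, "James", "Bond"))  # one object has both values
```

With nested semantics, the "James" and "Hunt" combination no longer matches, because the two values live in different objects.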