Newly indexed documents in Elasticsearch are not searchable until a refresh occurs. By default, every shard is refreshed once per second, as controlled by the dynamic index-level setting index.refresh_interval. While documents keep arriving, this forces Elasticsearch to create a new segment every second.
During bulk indexing, it is recommended to increase this value. Less frequent refreshes produce fewer, larger segments and reduce later merge pressure. (Replace $INDEX$ with your index name.)
PUT /$INDEX$/_settings
{
  "index" : {
    "refresh_interval" : "60s"
  }
}
Setting index.number_of_replicas to 0 can also help. This is a tradeoff: without replicas, the loss of any shard causes data loss, but indexing is faster because each document is indexed only once rather than once per shard copy.
PUT /$INDEX$/_settings
{
  "index" : {
    "number_of_replicas" : "0"
  }
}
Once the initial loading is finished, you can set index.refresh_interval and index.number_of_replicas back to their original values:
PUT /$INDEX$/_settings
{
  "index" : {
    "refresh_interval" : "1s",
    "number_of_replicas" : "1"
  }
}
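
The three requests above can be tied together so that the original values are read from the index rather than assumed to be 1s and 1. The following is a minimal Python sketch of that idea, not a definitive implementation: the helper names (original_settings, settings_body, call, bulk_load), the base_url parameter, and the load callback are hypothetical, and it assumes an unsecured cluster reachable over plain HTTP and a concrete index name (not an alias).

```python
import json
import urllib.request

# Settings used while bulk loading; values taken from the requests above.
BULK_SETTINGS = {"refresh_interval": "60s", "number_of_replicas": 0}


def original_settings(settings_response, index):
    """Extract both settings from a GET /<index>/_settings response body.

    Assumes `index` is a concrete index name, since the response is keyed by it.
    """
    current = settings_response[index]["settings"]["index"]
    return {
        # Absent if never set explicitly; None becomes JSON null,
        # which resets the setting to its default.
        "refresh_interval": current.get("refresh_interval"),
        # Settings come back as strings, e.g. "1".
        "number_of_replicas": int(current.get("number_of_replicas", 1)),
    }


def settings_body(values):
    """Wrap values in the {"index": {...}} envelope used by PUT /<index>/_settings."""
    return json.dumps({"index": values}).encode("utf-8")


def call(base_url, method, path, body=None):
    """Send one JSON request and decode the response (no auth, no retries)."""
    req = urllib.request.Request(
        base_url + path,
        data=body,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def bulk_load(base_url, index, load):
    """Relax the settings, run load(), then restore the saved values."""
    saved = original_settings(call(base_url, "GET", f"/{index}/_settings"), index)
    call(base_url, "PUT", f"/{index}/_settings", settings_body(BULK_SETTINGS))
    try:
        load()  # hypothetical callback that performs the actual bulk indexing
    finally:
        # Restore even if loading fails, so the index is not left without replicas.
        call(base_url, "PUT", f"/{index}/_settings", settings_body(saved))
```

A call such as bulk_load("http://localhost:9200", "my-index", run_import) would then wrap a hypothetical run_import function. Restoring in a finally block is a deliberate choice: an index left at zero replicas after a failed load stays exposed to the data-loss risk described above.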