Newly indexed documents in Elasticsearch are not searchable until a refresh occurs. By default, every shard is refreshed once per second; this is controlled by a dynamic index-level setting named refresh_interval. While documents are being indexed, this makes Elasticsearch create a new segment every second.
During bulk indexing it is recommended to increase this value. Fewer refreshes produce larger segments and reduce later merge pressure. (Replace $INDEX$ with your index name.)
PUT /$INDEX$/_settings
{
  "index": {
    "refresh_interval": "60s"
  }
}
Setting index.number_of_replicas to 0 can also help. This is a trade-off: while replicas are disabled, the loss of any shard causes data loss, but indexing is faster because each document is indexed only once.
PUT /$INDEX$/_settings
{
  "index": {
    "number_of_replicas": "0"
  }
}
Once the initial load is finished, set index.refresh_interval and index.number_of_replicas back to their original values:
PUT /$INDEX$/_settings
{
  "index": {
    "refresh_interval": "1s",
    "number_of_replicas": "1"
  }
}
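
The values 1s and 1 above are the defaults, and an index may have been configured differently. One way to avoid hard-coding them is to read the current settings before the load and write the same values back afterwards. The following Python sketch illustrates that idea using only the standard library. The names bulk_load_settings, http_send and Send are hypothetical helpers introduced here, not part of any Elasticsearch client, and the sketch assumes a concrete index name (not an alias or pattern) and an unauthenticated cluster.

```python
import json
from contextlib import contextmanager
from typing import Callable, Optional
from urllib import request

# Hypothetical transport type: (method, path, body or None) -> parsed JSON response.
Send = Callable[[str, str, Optional[dict]], dict]


def http_send(base_url: str) -> Send:
    """Build a transport for a cluster at base_url (assumed: no authentication)."""
    def send(method: str, path: str, body: Optional[dict] = None) -> dict:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        req = request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with request.urlopen(req) as resp:
            return json.loads(resp.read().decode("utf-8"))
    return send


@contextmanager
def bulk_load_settings(send: Send, index: str, refresh_interval: str = "60s"):
    """Relax refresh and replica settings for a bulk load, then restore them."""
    # GET /<index>/_settings returns only explicitly configured settings.
    current = send("GET", f"/{index}/_settings", None)
    index_settings = current[index]["settings"]["index"]
    original = {
        # A missing key becomes None; sending null resets a setting to its default.
        "refresh_interval": index_settings.get("refresh_interval"),
        "number_of_replicas": index_settings.get("number_of_replicas"),
    }
    send(
        "PUT",
        f"/{index}/_settings",
        {"index": {"refresh_interval": refresh_interval, "number_of_replicas": 0}},
    )
    try:
        yield original
    finally:
        # Runs even if the bulk load raises, so the index is not left without replicas.
        send("PUT", f"/{index}/_settings", {"index": original})
```

A possible use, assuming a local cluster at http://localhost:9200, is to wrap the loading code in "with bulk_load_settings(http_send("http://localhost:9200"), "my-index"):" and run the bulk requests inside that block. Passing the transport as a parameter keeps the helper independent of any particular HTTP client and makes it easy to exercise with a stub.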