Setting up ElasticSearch for Linux systems (advanced)
If you are running ElasticSearch on a Linux system, typically a server, some extra setup is required to gain performance and to avoid production problems when managing many indices.
Getting ready
You need a working ElasticSearch installation.
How to do it...
To improve performance on Linux systems, perform the following steps:
- First, you need to change the current limits for the user who runs the ElasticSearch server. In these examples, we call this user elasticsearch.
- To allow ElasticSearch to manage a large number of files, you need to increase the number of file descriptors (the number of open files) that a user can have. To do so, edit /etc/security/limits.conf and add the following lines at the end:

    elasticsearch - nofile 999999
    elasticsearch - memlock unlimited
Then restart the machine to ensure that the changes are applied.
- To control memory swapping, you need to set this parameter in elasticsearch.yml:

    bootstrap.mlockall: true
- To fix the memory usage of the ElasticSearch server, set ES_MIN_MEM and ES_MAX_MEM to the same value in $ES_HOME/bin/elasticsearch.in.sh. Alternatively, you can set ES_HEAP_SIZE, which automatically initializes ES_MIN_MEM and ES_MAX_MEM to the same provided value, as shown in the sketch after this list.
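Taken together, the steps above can be applied from a shell as in the following sketch; the installation path /opt/elasticsearch and the 2g heap size are assumptions for illustration and should be adapted to your system:

    # (run as root) raise the per-user limits for the elasticsearch user
    cat >> /etc/security/limits.conf <<'EOF'
    elasticsearch - nofile 999999
    elasticsearch - memlock unlimited
    EOF

    # lock the ElasticSearch process memory in RAM
    echo 'bootstrap.mlockall: true' >> /opt/elasticsearch/config/elasticsearch.yml

    # ES_HEAP_SIZE initializes ES_MIN_MEM and ES_MAX_MEM to the same value
    export ES_HEAP_SIZE=2g
    /opt/elasticsearch/bin/elasticsearch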
How it works...
The standard limit on file descriptors (the maximum number of open files for a user) is typically 1024. When you store a lot of records in several indices, you run out of file descriptors very quickly, the ElasticSearch server becomes unresponsive, and your indices may become corrupted, causing data loss.
Raising the limit to a very high number ensures that ElasticSearch never hits the maximum number of open files.
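To check whether the new limits are actually in effect, you can inspect them from a shell; this is a minimal check, assuming the server runs as the elasticsearch user set up earlier:

    # limits of the current shell session
    ulimit -n    # maximum open files
    ulimit -l    # maximum locked memory

    # limits of the running ElasticSearch process
    cat /proc/$(pgrep -f elasticsearch | head -n 1)/limits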
The other settings prevent the operating system from swapping ElasticSearch's memory to disk, which gives a significant performance boost in a production environment. These settings are required because, during indexing and searching, ElasticSearch creates and destroys a lot of objects in memory; this churn fragments the heap and forces frequent garbage collection. If you don't set bootstrap.mlockall: true, the operating system may swap parts of the heap out to disk, and every garbage collection pass then has to read those pages back from disk. With this setting, the whole heap stays locked in RAM and garbage collection runs at memory speed, with a huge performance boost.
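You can verify that the memory lock succeeded by querying the nodes info API; the endpoint shown here is the ElasticSearch 1.x form, so adapt it if you run an older release:

    curl -s 'http://localhost:9200/_nodes/process?pretty'

If the process section reports mlockall as false, double-check the memlock entry in /etc/security/limits.conf and restart the node.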
There's more...
This recipe covers two common errors that happen in production:
- "Too many open files", that can corrupt your indices and your data
- Slow performance in search and indexing due to garbage collector
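When you suspect the first error, a rough way to see how close you are to the limit is to count the descriptors held by the elasticsearch user (lsof output includes a header line, so treat the figure as approximate):

    lsof -u elasticsearch | wc -l

Compare this count against the nofile value configured earlier to see how much headroom remains.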