Index Management

NOTE: This section and the following sections on Index Management give a high-level overview with suggested examples. For a full review, engage SIEMonster support to establish a working plan for managing index lifecycle policies based on your specific needs.

Document Purpose

This document describes how to manage indices and their health in the SIEMonster environment. The basic requirement for optimal system health is to specify the retention period by policy. Some example policies have been included. A policy can perform more than one action, depending on how it was formulated. It is recommended to start from the supplied policies and modify them for your own needs.

Shards and Replicas Background

A basic understanding of sharding is required to perform maintenance on the environment. An index is split into a number of pieces named shards; together, the shards hold the data that is returned when the index is queried. The system has been configured to limit shards to 1000 per data node. Keep an eye on the shard count, as the system will not allow a new index to be created once the maximum shard count is reached. For the CE and Professional Editions, this limit means only 333 indices can be on the data node at any given time, as each index has 3 shards (333 x 3 = 999). On the Enterprise Edition, more data nodes can be added, and each additional data node can hold a further 333 indices. This does not mean that two data nodes give you 665 to 667 days of indices, because the Enterprise Edition distributes a copy of each index between the data nodes. This is to facilitate searches of large volumes of data without experiencing timeouts. Once a third or further data node is added, the indices become more widely distributed and the count of unique indices increases.
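The shard arithmetic above can be checked with a short sketch. The function names and parameters are illustrative only and are not part of SIEMonster; the figures of 1000 shards per data node and 3 shards per index are taken from the text, and replica copies are not modelled.

```python
def max_indices_per_node(shard_limit: int = 1000, shards_per_index: int = 3) -> int:
    """Return how many whole indices fit under the per-node shard limit."""
    if shards_per_index <= 0:
        raise ValueError("shards_per_index must be positive")
    # Integer division: a partially allocated index does not count.
    return shard_limit // shards_per_index


def shards_remaining(current_indices: int,
                     shard_limit: int = 1000,
                     shards_per_index: int = 3) -> int:
    """Return how many shards are still free on a node holding current_indices."""
    return shard_limit - current_indices * shards_per_index


print(max_indices_per_node())   # 333
print(shards_remaining(300))    # 100
```

A check like this is useful when choosing a retention period: if one index is created per day, the retention period in days should stay below the value returned by max_indices_per_node.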

Index Management

To properly maintain the indices, a component named Index Management can be used to automatically handle maintenance items such as snapshots and deletion of indices beyond the retention period. For the automation to work, the following items must be in place:

A repository (NFS or S3; S3 is preferred)
- This requires either an additional volume added to the pods that contain the Elasticsearch instance, or credentials added for S3 access. It also requires an S3 or block storage area that has been set up and configured with sufficient privileges for the snapshot activities.
A retention policy that includes snapshot and deletion configurations.
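The two items above can be sketched as request bodies. This is a minimal sketch under stated assumptions, not the supplied SIEMonster policy: it assumes the Index Management component accepts policies in the Open Distro for Elasticsearch ISM JSON format, and that the repository is registered through the Elasticsearch snapshot repository API (an S3 repository needs the repository-s3 plugin). The bucket name, repository name, state names, and the 1d and 90d ages are hypothetical placeholders.

```python
import json


def s3_repository_body(bucket: str) -> dict:
    """Body for registering an S3 snapshot repository (PUT _snapshot/<name>)."""
    return {"type": "s3", "settings": {"bucket": bucket}}


def retention_policy(repository: str,
                     snapshot_after: str = "1d",
                     retention: str = "90d") -> dict:
    """Build an ISM-style policy: snapshot the index, then delete it after retention."""
    return {
        "policy": {
            "description": "Snapshot indices, then delete them after the retention period",
            "default_state": "hot",
            "states": [
                {
                    # New indices start here and take no action.
                    "name": "hot",
                    "actions": [],
                    "transitions": [
                        {"state_name": "snapshot",
                         "conditions": {"min_index_age": snapshot_after}}
                    ],
                },
                {
                    # Copy the index into the repository before it can be deleted.
                    "name": "snapshot",
                    "actions": [
                        {"snapshot": {"repository": repository,
                                      "snapshot": "retention-snapshot"}}
                    ],
                    "transitions": [
                        {"state_name": "delete",
                         "conditions": {"min_index_age": retention}}
                    ],
                },
                {
                    # Final state: remove the index to free its shards.
                    "name": "delete",
                    "actions": [{"delete": {}}],
                    "transitions": [],
                },
            ],
        }
    }


# Hypothetical names, for illustration only.
repo_body = s3_repository_body("example-siem-snapshots")
policy = retention_policy("example-repo")
print(json.dumps(policy, indent=2))
```

Ordering the states so that the snapshot always precedes the delete means an index is never removed before a copy exists in the repository. The actual ages and names should be agreed with SIEMonster support.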