To avoid accumulating too many open indexes, we recommend combining Elasticsearch’s native features to back up your indexes to S3 in your own AWS account:

Configuring a snapshot repository in S3

Step 1: Create an S3 bucket. We will use “aptible_logs” as the bucket name for this example.

Step 2: Create a dedicated IAM user to minimize the permissions of the access key, which will be stored in the database. Elasticsearch recommends creating an IAM policy with the minimum access level required, and provides a recommended policy here.

Step 3: Register the snapshot repository using the Elasticsearch API directly, because the Kibana UI does not provide a way to specify your IAM keypair. In this example, we’ll call the repository “s3_repository” and configure it to use the “aptible_logs” bucket created above:
curl -X PUT "https://username:password@localhost:9200/_snapshot/s3_repository?pretty" -H 'Content-Type: application/json' -d'
{
  "type": "s3",
  "settings": {
    "bucket" : "aptible_logs",
    "access_key": "AWS_ACCESS_KEY_ID",
    "secret_key": "AWS_SECRET_ACCESS_KEY",
    "protocol": "https",
    "server_side_encryption": true
  }
}
'
Be sure to provide the correct username, password, host, and port needed to connect to your database, likely as provided by the database tunnel if you’re connecting that way. The full documentation of available options is here.
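Once the repository is registered, you can confirm that all nodes can reach the bucket using Elasticsearch’s repository verify API. A minimal sketch, assuming the same connection details and repository name as above:

```shell
# Verify that the "s3_repository" snapshot repository is accessible.
# Replace username, password, host, and port with your own connection details.
curl -X POST "https://username:password@localhost:9200/_snapshot/s3_repository/_verify?pretty"
```

A successful response lists the nodes that verified the repository; an error here usually indicates an IAM permissions or bucket configuration problem.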

Backing up your indexes

To back up your indexes, use Elasticsearch’s Snapshot Lifecycle Management to automate daily backups of your indexes. In Kibana, you’ll find these settings under Elasticsearch Management > Snapshot and Restore. Snapshots are incremental, so you can set the schedule as frequently as you like, but at least daily is recommended. You can find the full documentation for creating a policy here.
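If you prefer the API over the Kibana UI, an SLM policy can also be created directly. A sketch, assuming a policy named “daily-snapshots” that uses the “s3_repository” repository registered earlier (the schedule and retention values are illustrative):

```shell
# Create an SLM policy that snapshots all indexes daily at 1:30 AM UTC
# and expires snapshots after 90 days. Adjust schedule and retention to taste.
curl -X PUT "https://username:password@localhost:9200/_slm/policy/daily-snapshots?pretty" -H 'Content-Type: application/json' -d'
{
  "schedule": "0 30 1 * * ?",
  "name": "<daily-snap-{now/d}>",
  "repository": "s3_repository",
  "config": {
    "indices": ["*"]
  },
  "retention": {
    "expire_after": "90d"
  }
}
'
```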

Limiting the live retention

Now that you have a Snapshot Lifecycle policy configured to back up your data to S3, the final step is to ensure you delete indexes from Elasticsearch after a set time. Deleting indexes keeps both RAM and disk space requirements relatively fixed, given a fixed volume of logs. For example, you might keep only 30 days of indexes in Elasticsearch; if you need older indexes, you can retrieve them by restoring the snapshot from S3.

Step 1: Create a new policy by navigating to Elasticsearch Management > Index Lifecycle Policies. Under “Hot phase”, disable rollover - we’re already creating a new index daily, which should be sufficient. Enable the “Delete phase” and set it to 30 days from index creation (or to your desired live retention).

Step 2: Tell Elasticsearch which new indexes this policy should apply to automatically. In Kibana, go to Elasticsearch Management > Index Management, then click Index Templates. Create a new template using the index pattern logstash-*, and add the lifecycle policy name to the template’s index settings. You can leave all other settings at their defaults. This template will ensure all new daily indexes get the lifecycle policy applied:
{
  "index.lifecycle.name": "rotation"
}
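Both steps can also be performed via the API instead of Kibana. A sketch, assuming the lifecycle policy is named “rotation” with a 30-day delete phase, and a template name of “logstash_rotation” (both names are illustrative):

```shell
# Step 1 equivalent: create an ILM policy named "rotation" that deletes
# indexes 30 days after creation.
curl -X PUT "https://username:password@localhost:9200/_ilm/policy/rotation?pretty" -H 'Content-Type: application/json' -d'
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {}
      },
      "delete": {
        "min_age": "30d",
        "actions": {
          "delete": {}
        }
      }
    }
  }
}
'

# Step 2 equivalent: create an index template that applies the policy
# to every new logstash-* index.
curl -X PUT "https://username:password@localhost:9200/_template/logstash_rotation?pretty" -H 'Content-Type: application/json' -d'
{
  "index_patterns": ["logstash-*"],
  "settings": {
    "index.lifecycle.name": "rotation"
  }
}
'
```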
Step 3: Apply the lifecycle policy to any existing indexes. Under Elasticsearch Management > Index Management, select each logstash-* index one by one, click Manage, and then Apply Lifecycle Policy. Choose the policy you created earlier. If you want to apply the policy in bulk, you’ll need to use the update settings API directly.
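For example, to apply the policy to all existing logstash-* indexes in a single request via the update index settings API (a sketch, assuming the policy is named “rotation”):

```shell
# Apply the "rotation" lifecycle policy to every existing logstash-* index.
curl -X PUT "https://username:password@localhost:9200/logstash-*/_settings?pretty" -H 'Content-Type: application/json' -d'
{
  "index.lifecycle.name": "rotation"
}
'
```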

Snapshot Lifecycle Management as an alternative to Aptible backups

Aptible database backups allow for the easy restoration of a backup to an Aptible database using a single CLI command. However, the data retained with Snapshot Lifecycle Management is sufficient to restore the Elasticsearch database in the event of corruption, and you can configure Elasticsearch to take much more frequent backups.
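If you do need to restore, snapshots can be restored directly from the S3 repository. A sketch, assuming the “s3_repository” repository from earlier; the snapshot and index names here are hypothetical, so list the available snapshots first and substitute real names:

```shell
# List the snapshots available in the repository.
curl -X GET "https://username:password@localhost:9200/_snapshot/s3_repository/_all?pretty"

# Restore a specific index from a snapshot. The snapshot name below is
# hypothetical; use one returned by the request above.
curl -X POST "https://username:password@localhost:9200/_snapshot/s3_repository/daily-snap-2024.01.01/_restore?pretty" -H 'Content-Type: application/json' -d'
{
  "indices": "logstash-2024.01.01"
}
'
```

Note that an index cannot be restored over an open index of the same name; delete or close the existing index first, or restore it under a different name.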