Great tools are essential to small teams like ours at Postmark, where we have to balance maintaining our infrastructure with creating new features. Infrastructure tooling gives us a foundation, and with it the time and confidence to push Postmark forward and add more value for our customers.
What is Curator?
Curator is a tool from Elastic (the company behind Elasticsearch) to help manage your Elasticsearch cluster. One of Postmark’s main uses for Elasticsearch is storing the emails that you have sent and received. We use one Elasticsearch index per day and keep a rolling 45-day window of history. This means that every day we need to create, back up, and delete some indices. Curator helps make this process automated and repeatable.
Curator is written in Python, so it is well supported on almost all operating systems. Installation is a breeze with pip install elasticsearch-curator, which provides the curator command. There’s also a Python API that you can call from your own Python programs, but we only use the command line interface.
Removing time-series indices
Elasticsearch is a great choice for storing time-series data for a number of reasons. The features that Postmark uses include templates that automatically create indices and aliases that seamlessly search across many indices.
But one piece of the puzzle that Elasticsearch doesn’t solve out of the box is how to remove data. For our message activity, a message should be searchable for up to 45 days after it is sent. This logic is very simple to implement using Curator. Our indices are named eventtype_YYYYMMDD, so we have to tell Curator the date format and which indices to remove.
curator --host <ip address> delete indices --time-unit days --older-than 45 --timestring '%Y%m%d'
And that will delete the indices older than 45 days!
But automation that deletes data should be verified carefully. Luckily, Curator provides a dry-run flag that only prints what Curator would have executed.
curator --dry-run --host <ip address> delete indices --time-unit days --older-than 45 --timestring '%Y%m%d'
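To make the selection rule above concrete, here is a minimal Python sketch of the same idea: parse the date suffix from each index name and keep the names that fall outside the 45-day window. It is an illustration only, not Curator's implementation, and the exact cutoff boundary may differ from what Curator does. The function name indices_older_than and the index names (opens_..., kibana-int) are hypothetical.

```python
from datetime import date, datetime, timedelta


def indices_older_than(names, days, today, timestring="%Y%m%d"):
    """Return the index names whose date suffix is more than `days` days before `today`."""
    cutoff = today - timedelta(days=days)
    expired = []
    for name in names:
        # Assumption: the date is everything after the last underscore,
        # as in eventtype_YYYYMMDD.
        suffix = name.rsplit("_", 1)[-1]
        try:
            index_day = datetime.strptime(suffix, timestring).date()
        except ValueError:
            # Names without a matching date suffix are left alone.
            continue
        if index_day < cutoff:
            expired.append(name)
    return expired


# Hypothetical index names, for illustration only.
names = ["opens_20151020", "opens_20151024", "opens_20151207", "kibana-int"]
print(indices_older_than(names, 45, today=date(2015, 12, 8)))
```

Running a check like this against a list of real index names is another cheap way to sanity-check a retention rule before any delete job is trusted with it.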
Managing snapshots
Another task that Curator helps us automate is using Elasticsearch snapshots.
curator --host <ip address> snapshot --repository <repository name> indices --all-indices
This will create a snapshot of all your indices with a name such as curator-20151208153000, which works fine for our use case (of course, you can customize the name and date format).
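The snapshot name is a prefix followed by a timestamp, which makes it easy to generate and to parse back. As a small sketch, assuming the format string curator-%Y%m%d%H%M%S (inferred from the example name above, not taken from Curator's source), the mapping looks like this in Python. The helper names snapshot_name and snapshot_time are hypothetical.

```python
from datetime import datetime

# Assumed format, inferred from the example name curator-20151208153000.
SNAPSHOT_FORMAT = "curator-%Y%m%d%H%M%S"


def snapshot_name(moment):
    """Build a snapshot name from a datetime."""
    return moment.strftime(SNAPSHOT_FORMAT)


def snapshot_time(name):
    """Recover the datetime encoded in a snapshot name."""
    return datetime.strptime(name, SNAPSHOT_FORMAT)


name = snapshot_name(datetime(2015, 12, 8, 15, 30, 0))
print(name)
print(snapshot_time(name))
```

Because the timestamp is zero-padded and ordered from year down to second, sorting these names alphabetically also sorts them chronologically.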
You’ll want to remove snapshots after a certain time as well. One problem we ran into was that snapshot performance dropped dramatically as the number of snapshots grew. Luckily, this was fixed in subsequent versions, but you’ll still want to remove snapshots that you no longer need.
curator --host <ip address> delete snapshots --repository <repository name> --older-than 10 --time-unit days
Optimizing static indices
For indices that aren’t being actively written to, you can optimize them to reduce and merge the segments that represent the index’s data on disk. Optimizing an index is a similar concept to defragmenting your hard drive. When an index is being written to, the segment merge process happens automatically so you don’t want to explicitly call optimize on an active index. With date-based indices, only the current index is being written to, so it is safe to optimize older indices.
Performing this operation with Curator is simple as well.
curator --host <ip address> optimize indices --time-unit days --older-than 2 --timestring '%Y%m%d'
The importance of tooling
Tooling is an essential part of operations and managing a product. There needs to be tooling to automate tasks or else your team (and product) will just tread water. Most of the time we write this tooling ourselves. But the fact that Curator is provided by Elastic is an incredible time saver for our team. Curator does more than what is described above and you can read the great documentation to discover all it can do.