Simple Cassandra Database Monitoring

Sunday, July 23, 2017

Monitoring a database is important because it can alert us to a potential calamity before it happens. For example, the database server's disk could be about to fill up, which would leave the database unable to store more data. With monitoring we can set simple triggers that alert us whenever, say, disk usage on the database server reaches 80%, so we can rectify the situation before it causes any problems. Many similar metrics should be tracked to keep the database running smoothly.

I recently started working with Cassandra and, as with any production database deployment, I was looking for ways to monitor my database cluster. As I was just starting out with Cassandra, I didn't want a large architecture just for monitoring, as it would have meant more things to manage. I was looking for a very simple initial way to monitor my Cassandra cluster without using any external services or additional servers. That's when I started exploring the JMX interface exposed by Cassandra.

Each of the Cassandra nodes in the cluster exposes a JMX interface on port 7199 (the default), to which a client can connect and query for objects called MBeans. Each MBean provides some specific information, such as performance, resource usage, or problems associated with our application. So I decided to write a simple client application that would connect to each of the nodes in our Cassandra cluster and fetch certain predefined MBeans to check whether all is well.
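To make the idea concrete, here is a minimal Java sketch of such a client. It is not code from the application itself: the class name JmxProbe and its helper methods are hypothetical, and it assumes JMX is reachable on the default port 7199 over the standard RMI connector with no authentication or SSL configured.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper for illustration; not part of the application described here.
class JmxProbe {
    // Builds the standard RMI-based JMX service URL for a host and port.
    static JMXServiceURL serviceUrl(String host, int port) throws Exception {
        return new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
    }

    // Reads one attribute of one MBean over an already-open connection.
    static Object readAttribute(MBeanServerConnection conn,
                                String objectName,
                                String attribute) throws Exception {
        return conn.getAttribute(new ObjectName(objectName), attribute);
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the node address is passed as the first argument.
        String host = args.length > 0 ? args[0] : "127.0.0.1";
        // The connector is closed automatically when the block exits.
        try (JMXConnector connector = JMXConnectorFactory.connect(serviceUrl(host, 7199))) {
            MBeanServerConnection conn = connector.getMBeanServerConnection();
            Object value = readAttribute(
                    conn, "java.lang:type=OperatingSystem", "FreePhysicalMemorySize");
            System.out.println(host + " FreePhysicalMemorySize = " + value);
        }
    }
}
```

In a sketch like this, a connection failure (an IOException from JMXConnectorFactory.connect) is one simple signal that a node may be down or unreachable.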

The application is called Cassandra Monitor and can be found on GitHub.

It is a Java application which connects to each of the nodes and fetches the MBeans to evaluate. First and most importantly, it tracks whether any of the nodes in the cluster is down. It then compares each MBean value against a threshold specified by the user. If the value crosses that threshold, the user is alerted through a Slack notification. Slack notifications are an easy way to alert users, and the application can easily be extended to send an email instead.
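The threshold check and the Slack alert could look roughly like the sketch below. This is an illustration under assumptions rather than the application's actual code: the class ThresholdAlert, its method names, and the alertWhenAbove flag are all hypothetical (some metrics, such as disk usage, are worrying when high, while others, such as free RAM, are worrying when low). It posts to a Slack incoming webhook, which accepts a JSON body with a "text" field; the webhook URL is something you would supply yourself.

```java
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; names and structure are assumptions.
class ThresholdAlert {
    // True when the value has crossed the threshold in the alerting direction.
    static boolean crossed(double value, double threshold, boolean alertWhenAbove) {
        return alertWhenAbove ? value > threshold : value < threshold;
    }

    // Builds the JSON body for a Slack incoming webhook: {"text":"..."}.
    // Only backslashes, double quotes and newlines are escaped in this sketch.
    static String slackPayload(String message) {
        String escaped = message
                .replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
        return "{\"text\":\"" + escaped + "\"}";
    }

    // Posts the message to the given webhook URL and returns the HTTP status code.
    static int postToSlack(String webhookUrl, String message) throws Exception {
        HttpURLConnection http = (HttpURLConnection) new URL(webhookUrl).openConnection();
        http.setRequestMethod("POST");
        http.setRequestProperty("Content-Type", "application/json");
        http.setDoOutput(true);
        try (OutputStream out = http.getOutputStream()) {
            out.write(slackPayload(message).getBytes(StandardCharsets.UTF_8));
        }
        return http.getResponseCode();
    }
}
```

Keeping the check and the payload builder separate from the HTTP call makes it straightforward to swap the Slack post for an email sender later.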

Some of the important metrics that I like to track are -

Read request rate over the last 5 minutes -

{
      "objectName": "org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency",
      "attribute": "FiveMinuteRate"
}

Write request rate over the last 5 minutes -

{
      "objectName": "org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency",
      "attribute": "FiveMinuteRate"
}

Total disk space used -

{
      "objectName": "org.apache.cassandra.metrics:type=ColumnFamily,name=TotalDiskSpaceUsed",
      "attribute": "Value"
}

Free RAM -

{
      "objectName": "java.lang:type=OperatingSystem",
      "attribute": "FreePhysicalMemorySize"
}
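The metric definitions above can be read in a simple loop. The following is a minimal sketch, assuming an already-open MBeanServerConnection to a node; the class MetricReader and the "objectName#attribute" key format are hypothetical choices for this example. MBeans or attributes that are not present on a node are skipped rather than treated as errors, since the available names can differ between Cassandra versions.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.AttributeNotFoundException;
import javax.management.InstanceNotFoundException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

// Hypothetical helper for illustration; not part of the application described here.
class MetricReader {
    // The four metric definitions listed above, as {objectName, attribute} pairs.
    static final String[][] METRICS = {
        {"org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency", "FiveMinuteRate"},
        {"org.apache.cassandra.metrics:type=ClientRequest,scope=Write,name=Latency", "FiveMinuteRate"},
        {"org.apache.cassandra.metrics:type=ColumnFamily,name=TotalDiskSpaceUsed", "Value"},
        {"java.lang:type=OperatingSystem", "FreePhysicalMemorySize"}
    };

    // Reads every metric that exists on the connected node, in the order given.
    static Map<String, Object> readAll(MBeanServerConnection conn,
                                       String[][] metrics) throws Exception {
        Map<String, Object> values = new LinkedHashMap<>();
        for (String[] metric : metrics) {
            String objectName = metric[0];
            String attribute = metric[1];
            try {
                Object value = conn.getAttribute(new ObjectName(objectName), attribute);
                values.put(objectName + "#" + attribute, value);
            } catch (InstanceNotFoundException | AttributeNotFoundException e) {
                // The MBean or attribute is not registered on this node; skip it.
            }
        }
        return values;
    }
}
```

Each value returned by readAll can then be compared against its user-specified threshold.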