Full Monitoring System: Installing Graphite, collectd and StatsD – Part 1

Introduction

Collecting data (statistics about servers, applications, daily site traffic, etc.) is the first step when troubleshooting site issues, planning performance improvements or doing analysis of any kind. Of course, raw data alone is not enough to draw conclusions, so the second step is organizing that data. Together, these two steps are what let you change server configuration and implement improvements with confidence.
There are a lot of tools which can be used for this purpose, but in this series of tutorials we are going to look at three that combine nicely into a fully automated monitoring system: Graphite, StatsD and collectd. Here are the basics:

  • Graphite is a graphing library, made up of several components, used to render visual representations of collected data;
  • collectd is a system statistics daemon for collecting data periodically, which also provides mechanisms to store the values in a variety of ways;
  • StatsD is a statistics aggregator that helps organize arbitrary data.

In this first guide (part 1) we’ll look at installing Graphite on an Ubuntu 16.04 server. Stay tuned for the other parts so that you can integrate all three into a fully automated monitoring system!

Installing Graphite

First of all, update the server:

# apt update

Next, install two components of Graphite: graphite-web and graphite-carbon; these are both present in Ubuntu repositories:

# apt install graphite-web graphite-carbon

During installation you will be asked whether Carbon’s database files should be removed if the package is ever purged. Choosing ‘No’ will keep the collected stats.
This command installs Graphite completely, but a little additional configuration is still needed.

Configure a database for Django

Graphite-web is a Django application written in Python, and it needs a database for its own data (such as user accounts and saved graphs), even though the metric data itself is handled by Carbon and the Whisper library.
By default it is configured to use SQLite3, but it’s preferable to use PostgreSQL.

To install PostgreSQL:

# apt install postgresql libpq-dev python-psycopg2

After installation is complete, create a new user and database for Graphite. To do this, execute:

# systemctl start postgresql
# sudo -u postgres psql

The PostgreSQL shell will be started.

Next, create a database user account:

postgres=# CREATE USER graphiteusr WITH PASSWORD 'user_strong_password';

Then, create the new database that will be used by Graphite:

postgres=# CREATE DATABASE graphite_db WITH OWNER graphiteusr;
postgres=# \q

The database is now ready to use.

Configure Graphite Web

In order to use the database you just created, you’ll need to configure the Graphite Web Application. To do this, edit its configuration file:

# $EDITOR /etc/graphite/local_settings.py

For security reasons, you’ll want to uncomment the SECRET_KEY line and then replace this parameter’s value with a string that would be very difficult to guess:

SECRET_KEY = 'my_very_long_and_difficult_key'
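
One way to get such a value is to let Python generate it. The following is only a minimal sketch, assuming Python 3.6 or later is available on the machine where you run it (the standard-library secrets module is needed); the helper name make_secret_key is illustrative and is not part of Graphite.

```python
import secrets


def make_secret_key(nbytes=48):
    """Return a random URL-safe string suitable for pasting into SECRET_KEY."""
    # token_urlsafe() encodes nbytes random bytes as URL-safe Base64 text,
    # so the result is somewhat longer than nbytes characters.
    return secrets.token_urlsafe(nbytes)


if __name__ == "__main__":
    print(make_secret_key())
```

Copy the printed string into local_settings.py in place of the example value, and keep it private.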

Next, configure the timezone as this will affect the time displayed on your graphs. In my case, for instance, it would be:

TIME_ZONE = 'Europe/Rome'

Configure authentication for saving graph data by uncommenting the following line:

USE_REMOTE_USER_AUTHENTICATION = True

Then fill in the DATABASES section as follows:

DATABASES = {
    'default': {
        'NAME': 'graphite_db',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'graphiteusr',
        'PASSWORD': 'user_strong_password',
        'HOST': '127.0.0.1',
        'PORT': ''
    }
}

Lastly, you’ll want to save, exit and sync the database.
First:

# graphite-manage migrate auth

Then:

# graphite-manage syncdb

During this process you will be prompted to create a superuser account for Graphite-web (it is stored in the database you just configured).

Configure Carbon

Carbon is the Graphite storage backend. You’ll want to configure it to start at boot time by editing one configuration file:

# $EDITOR /etc/default/graphite-carbon

Make the following change:

CARBON_CACHE_ENABLED=true

Save, exit and edit the Carbon configuration file:

# $EDITOR /etc/carbon/carbon.conf

Configure log rotation:

ENABLE_LOGROTATION = True

Save and exit.

Configure Carbon Storage Schemas

Storage schemas tell Carbon how long to store values and at what level of detail. Edit the schema file:

# $EDITOR /etc/carbon/storage-schemas.conf

Inside this file there are entries, like:

[carbon]
pattern = ^carbon\.
retentions = 1m:1h

Create a new entry:

[test]
pattern = ^test\.
retentions = 20s:20m,1m:1h,5m:1d

This will match any metrics beginning with ‘test.’ and will store data three times in various levels of detail. The first archive definition will create a data point every 20 seconds, and it will store the values for 20 minutes. The second will create a data point every minute and store values for one hour. The last will create a point every 5 minutes, storing it for one day.
Save and close the file.
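
To make the arithmetic behind a retentions line concrete, here is a small sketch that turns the string from the [test] entry into the number of points each archive holds. It is not Carbon's own parser: the function names are hypothetical, and only the s, m, h and d suffixes used in this tutorial are handled.

```python
import re

# Seconds per unit suffix, limited to the suffixes used in this tutorial.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(text):
    """Convert a value such as '20s' or '5m' to a number of seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", text)
    if match is None:
        raise ValueError("unsupported retention value: %r" % text)
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def parse_retentions(retentions):
    """Return one (seconds_per_point, number_of_points) pair per archive."""
    archives = []
    for archive in retentions.split(","):
        precision, duration = archive.strip().split(":")
        step = to_seconds(precision)
        archives.append((step, to_seconds(duration) // step))
    return archives


for step, points in parse_retentions("20s:20m,1m:1h,5m:1d"):
    print("one point every %d s, %d points kept" % (step, points))
```

For the [test] entry this gives 60 points at 20 seconds, 60 points at one minute and 288 points at five minutes.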

Storage Aggregation

Aggregation happens when Graphite makes a less detailed version of a metric; its default behavior is to take an average of the collected data points. To modify the way in which Carbon aggregates points, edit the storage-aggregation.conf file. First, copy the example file into place:

# cp /usr/share/doc/graphite-carbon/examples/storage-aggregation.conf.example /etc/carbon/storage-aggregation.conf

Then:

# $EDITOR /etc/carbon/storage-aggregation.conf

This file is structured similarly to the previous one, with entries like:

[default_average]
pattern = .*
xFilesFactor = 0.5
aggregationMethod = average

The xFilesFactor specifies the minimum fraction of values that Carbon must have in order to do the aggregation. 0.5 means that it will need at least 50% of the more detailed data points to create an aggregated point.
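
The xFilesFactor rule can be illustrated with a few lines of Python. This is a simplified sketch of the idea, not the code that Whisper actually runs; the function name is hypothetical, and None stands for a slot in which no value was received.

```python
def aggregate_average(points, x_files_factor=0.5):
    """Average the known values, or return None if too few are known.

    `points` holds the finer-grained values that fall into one aggregated
    slot; None marks a slot for which no value was received.
    """
    known = [p for p in points if p is not None]
    if not known:
        return None
    if len(known) < x_files_factor * len(points):
        # Not enough data: the aggregated point stays empty.
        return None
    return sum(known) / len(known)


# Three 20-second points roll up into one 1-minute point.
print(aggregate_average([4, None, 6]))     # 2 of 3 known, enough for 0.5
print(aggregate_average([4, None, None]))  # 1 of 3 known, too few
```

With the default of 0.5, the first call yields the average 5.0, while the second yields None because only a third of the slots hold a value.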

Note: sending Graphite data points more frequently than the shortest archive interval could cause data loss.

Save these changes and exit. Then:

# systemctl start carbon-cache
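
Once carbon-cache is running, you can feed it a test value. The sketch below assumes that Carbon's plaintext listener is on its usual default of port 2003 on the local machine (check LINE_RECEIVER_PORT in carbon.conf if yours differs). The metric name test.count is a hypothetical example chosen so that it falls under the [test] schema created earlier, and the function names are illustrative.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of Carbon's plaintext protocol: 'path value timestamp'."""
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (path, value, timestamp)


def send_metric(path, value, host="127.0.0.1", port=2003):
    """Send a single data point to carbon-cache over TCP."""
    line = format_metric(path, value)
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(line.encode("ascii"))


# Example call (needs a running carbon-cache, so it is left commented out):
# send_metric("test.count", 4)
```

Each line sent to Carbon carries the metric path, the value and a Unix timestamp, separated by spaces and terminated by a newline.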

Configure Apache

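An Apache web server is needed in order to use the Graphite web interface. If Apache and its WSGI module are not installed yet, install them first:

# apt install apache2 libapache2-mod-wsgi

Then, to configure Apache, disable the default virtual host file: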

# a2dissite 000-default

Then copy the virtual host file shipped with Graphite into Apache’s available sites:

# cp /usr/share/graphite-web/apache2-graphite.conf /etc/apache2/sites-available

Enable this file, executing:

# a2ensite apache2-graphite

Reload Apache:

# systemctl reload apache2

Conclusions

At this point, Graphite is correctly installed and configured. It’s possible to test it by using your web browser to access http://localhost.
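
Besides the browser, Graphite-web can also return data as JSON through its render API. As a hedged sketch, the helper below only builds such a URL; it assumes the web interface is served at http://localhost as above and reuses the hypothetical test.count metric name. The function name is illustrative.

```python
from urllib.parse import urlencode


def render_url(target, base="http://localhost", since="-20min"):
    """Build a Graphite render API URL that asks for one metric as JSON."""
    # A list of pairs keeps the parameter order fixed.
    query = urlencode([("target", target), ("from", since), ("format", "json")])
    return "%s/render?%s" % (base, query)


print(render_url("test.count"))
# http://localhost/render?target=test.count&from=-20min&format=json
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the stored points for that metric once some data has been sent.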
Of course, this is just the first step in configuring a full monitoring data system, because Graphite itself can only do a handful of tasks.
The next guides will talk about integrating it with collectd and StatsD for automating your whole monitoring system.