Sensu Deployment

In this guide I suggest best practices for deploying Sensu to monitor server health. You can read about Sensu and what it does on the Sensu website.

Note: I use the word template in this guide, but Sensu has no template object; I’m simply used to working with templates in other monitoring systems. When I talk about a template here, I mean a set of check commands that share the same subscription, which is very similar to a template in other monitoring systems.

My best-practice list for deploying Sensu:

Sensu Flow

 

  1. Installation:

    1. To install Sensu, use a configuration management tool such as Chef. I used Chef in my deployment and wrote a cookbook that uses the sensu-chef and uchiwa cookbooks.
      If you have a small environment, you can install all the Sensu services and prerequisites on the same server: Redis, RabbitMQ, sensu-server and sensu-api. For medium to large environments, you can use a separate server for each role and scale horizontally as needed.
    2. Create a template configuration file for each role in your environment. For each role in my environment I created a recipe that configures all the check and metric commands for that role.
    3. Use Chef roles, environments and server names as node subscriptions. This helps you assign checks and handlers per environment, per role or per specific node.
    4. For the dashboard, install Uchiwa, the community dashboard project for Sensu.
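
The subscription idea from step 3 can be sketched in a few lines of Python. This is only an illustration, assuming the JSON client definition format of Sensu Core (the RabbitMQ and Redis based version described here). The subscription naming scheme (`base_os`, `role_...`, `env_...`, `node_...`), the node name and the address are hypothetical; in a real deployment the Chef cookbook would produce this file.

```python
import json


def build_subscriptions(roles, environment, node_name):
    """Return an ordered, de-duplicated subscription list for one node."""
    candidates = ["base_os"]  # hypothetical name for the template shared by all servers
    candidates += [f"role_{role}" for role in roles]
    candidates.append(f"env_{environment}")
    candidates.append(f"node_{node_name}")

    seen = set()
    subscriptions = []
    for name in candidates:
        if name not in seen:  # keep the first occurrence only
            seen.add(name)
            subscriptions.append(name)
    return subscriptions


def client_definition(node_name, address, roles, environment):
    """Build a Sensu client definition as a plain dictionary."""
    return {
        "client": {
            "name": node_name,
            "address": address,
            "subscriptions": build_subscriptions(roles, environment, node_name),
        }
    }


# Hypothetical node: a production web server that also runs a Java service.
example = client_definition("web01", "10.0.0.11", ["web_server", "java"], "production")
print(json.dumps(example, indent=2))
```

With a scheme like this, a check that targets `env_production` reaches every production node, while one that targets `node_web01` reaches a single server.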
  2. Templates:

    1. Base OS – I created a template called Base OS for basic operating system checks on my servers, such as CPU, memory, disk utilization, disk IO and networking. This is the first template I apply to all of my servers.
    2. For each server role I have, I created a separate template. Examples: Couchbase checks, Redis checks, MySQL checks, web server checks, Java checks and more.
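
To make the template idea concrete, here is a minimal Python sketch that groups check commands by subscription and renders them as Sensu Core check definitions. The template names, check names, thresholds and the 60 second interval are assumptions made for illustration, and the command strings are examples in the style of the community plugins; verify them against the plugins you actually install.

```python
import json

# Hypothetical templates: subscription name -> {check name: command}.
TEMPLATES = {
    "base_os": {
        "check_cpu": "check-cpu.rb -w 80 -c 90",
        "check_memory": "check-memory-percent.rb -w 80 -c 90",
        "check_disk": "check-disk-usage.rb -w 80 -c 90",
    },
    "role_redis": {
        "check_redis_process": "check-process.rb -p redis-server",
    },
}


def render_checks(templates, interval=60):
    """Turn templates into a Sensu-style {"checks": {...}} dictionary."""
    checks = {}
    for subscription, commands in templates.items():
        for name, command in commands.items():
            if name in checks:
                # Check names must be unique across all templates.
                raise ValueError(f"duplicate check name: {name}")
            checks[name] = {
                "command": command,
                "subscribers": [subscription],  # every check in a template shares one subscription
                "interval": interval,
            }
    return {"checks": checks}


rendered = render_checks(TEMPLATES)
print(json.dumps(rendered, indent=2))
```

Any node subscribed to `base_os` would then run the three operating system checks, and a node that also subscribes to `role_redis` would run the Redis process check as well.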
  3. Handlers:

    1. Mail – To send email alerts I used the mailer plugin. With this plugin you can configure default email addresses that receive all notifications, as well as email addresses per subscription.
    2. Graphite – I use Graphite and metric checks to see trends on my servers and to get as much visibility as I can. To send data to Graphite I use WizardVan, a Sensu metrics relay. What is good about WizardVan is that it opens a persistent TCP connection to the Graphite server and can handle a high volume of metric data.
    3. ELK – I use ELK with sensu-plugins-logstash to see the history of events. The Sensu logstash plugin sends Sensu events to ELK, so you can search for events that happened in the past instead of trying to read them from the Sensu logs.
    4. Remediation – For automatic responses that run commands such as a service restart when a warning or critical event occurs, I use the remediation plugin.
    5. Automatic deregistration of Sensu clients – If you run a dynamic environment with autoscaling, as in AWS, you can use a handler that deletes a client from the Sensu registry once it has been stopped or terminated. Example plugins: ec2-node, chef.
    6. Default – I created a default handler in Chef that combines the following handlers: debug, mailer and logstash. The default handler runs all of these handlers for checks that don’t define a specific handler.
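
The default handler from item 6 can be sketched as a Sensu Core handler set. The Python below is a rough illustration rather than the cookbook's actual output: it builds the set definition and mimics, in simplified form, how a check with no handler of its own ends up on `default`. The helper names `handlers_for` and `expand` are hypothetical, and `expand` resolves only one level of sets.

```python
import json

# A handler of type "set" fans an event out to the handlers it lists.
HANDLERS = {
    "handlers": {
        "default": {
            "type": "set",
            "handlers": ["debug", "mailer", "logstash"],
        }
    }
}


def handlers_for(check):
    """Return the handler names a check definition asks for."""
    if "handlers" in check:
        return list(check["handlers"])
    if "handler" in check:
        return [check["handler"]]
    return ["default"]  # checks without an explicit handler fall back to "default"


def expand(names, handler_defs):
    """Replace set handlers by their members (one level only)."""
    expanded = []
    for name in names:
        definition = handler_defs.get(name, {})
        if definition.get("type") == "set":
            expanded.extend(definition["handlers"])
        else:
            expanded.append(name)
    return expanded


plain_check = {"command": "check-cpu.rb -w 80 -c 90", "interval": 60}
metric_check = {"command": "metrics-cpu.rb", "type": "metric", "handlers": ["graphite"]}

print(json.dumps(HANDLERS, indent=2))
print(expand(handlers_for(plain_check), HANDLERS["handlers"]))
print(expand(handlers_for(metric_check), HANDLERS["handlers"]))
```

Here the plain check resolves to the three handlers in the set, while the metric check keeps only the handler it names explicitly.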