Monitor server resources with checks

Sensu checks are commands or scripts the Sensu agent executes that output data and produce an exit code to indicate a state.

You can use checks to monitor server resources, services, and application health, such as remaining disk space and whether NGINX is running. This guide includes two check examples to help you monitor server resources: specifically, CPU usage and NGINX status.
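To make the "output data and produce an exit code" contract concrete, here is a minimal shell sketch of the kind of logic a check command performs. The function name check_threshold and its arguments are hypothetical, invented for illustration; this is not how any Sensu plugin is implemented. It assumes the conventional mapping of exit code 0 to OK, 1 to WARNING, and 2 to CRITICAL.

```shell
# Hypothetical helper (not part of Sensu): compare an integer percentage
# against warning and critical thresholds, print a one-line summary, and
# return the matching status code.
# A standalone check script would call `exit` where this function uses `return`.
check_threshold() {
  value=$1
  warn=$2
  crit=$3
  if [ "$value" -ge "$crit" ]; then
    echo "CRITICAL: ${value}% is at or above ${crit}%"
    return 2
  elif [ "$value" -ge "$warn" ]; then
    echo "WARNING: ${value}% is at or above ${warn}%"
    return 1
  fi
  echo "OK: ${value}% is below ${warn}%"
  return 0
}
```

For example, check_threshold 80 75 90 prints a WARNING line and returns 1, which an agent would report as a warning state.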

Requirements

To follow this guide, install the Sensu backend, make sure at least one Sensu agent is running, and configure sensuctl to connect to the backend as the admin user.

Configure a Sensu entity

Every Sensu agent has a defined set of subscriptions that determine which checks the agent will execute. For an agent to execute a specific check, you must specify the same subscription in the agent configuration and the check definition. To run the CPU and NGINX webserver checks, you’ll need a Sensu entity with the subscriptions system and webserver.

NOTE: In production, your CPU and NGINX servers would be different entities, with the system subscription specified for the CPU entity and the webserver subscription specified for the NGINX entity. To keep things streamlined, this guide uses one entity to represent both.
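The matching rule described above (an agent runs a check only when the two share at least one subscription) can be sketched in a few lines of shell. The function shares_subscription is a hypothetical illustration of the rule, not a Sensu command, and it assumes subscriptions are passed as comma-separated strings.

```shell
# Hypothetical helper (not part of Sensu): succeed if any subscription in the
# first comma-separated list also appears in the second.
shares_subscription() {
  agent_subs=$1
  check_subs=$2
  old_ifs=$IFS
  IFS=,                      # split both lists on commas
  for a in $agent_subs; do
    for c in $check_subs; do
      if [ "$a" = "$c" ]; then
        IFS=$old_ifs
        return 0             # at least one subscription in common
      fi
    done
  done
  IFS=$old_ifs
  return 1                   # no overlap: the agent would not run the check
}
```

With the entity in this guide, shares_subscription "system,webserver" "system" succeeds, while an agent subscribed only to system would not match a check that lists only webserver.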

To add the system and webserver subscriptions to the entity the Sensu agent is observing, first find your agent entity name:

sensuctl entity list

The ID is the name of your entity.

Replace <ENTITY_NAME> with the name of your agent entity in the following sensuctl command. Run:

sensuctl entity update <ENTITY_NAME>
  • For Entity Class, press enter.
  • For Subscriptions, type system,webserver and press enter.

Confirm both Sensu services are running:

systemctl status sensu-backend && systemctl status sensu-agent

The response should indicate active (running) for both the Sensu backend and agent.

Register dynamic runtime assets

You can write shell scripts in the command field of your check definitions, but we recommend using existing check plugins instead. Check plugins must be available on the host where the agent is running for the agent to execute the check. This guide uses dynamic runtime assets to manage plugin installation.

Register the sensu/check-cpu-usage asset

The sensu/check-cpu-usage dynamic runtime asset includes the check-cpu-usage command, which your CPU check will rely on.

To register the sensu/check-cpu-usage dynamic runtime asset, run:

sensuctl asset add sensu/check-cpu-usage:0.2.2 -r check-cpu-usage

The response will confirm that the asset was added:

fetching bonsai asset: sensu/check-cpu-usage:0.2.2
added asset: sensu/check-cpu-usage:0.2.2

You have successfully added the Sensu asset resource, but the asset will not get downloaded until
it's invoked by another Sensu resource (ex. check). To add this runtime asset to the appropriate
resource, populate the "runtime_assets" field with ["check-cpu-usage"].

This example uses the -r (rename) flag to specify a shorter name for the dynamic runtime asset: check-cpu-usage.

Register the sensu/sensu-processes-check asset

Then, use this command to register the sensu/sensu-processes-check dynamic runtime asset, which you’ll use later for your webserver check:

sensuctl asset add sensu/sensu-processes-check:0.2.0 -r sensu-processes-check

To confirm that both dynamic runtime assets are ready to use, run:

sensuctl asset list

The response should list the renamed check-cpu-usage and sensu-processes-check dynamic runtime assets:

          Name                                                 URL                                         Hash    
──────────────────────── ─────────────────────────────────────────────────────────────────────────────── ──────────
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_windows_amd64.tar.gz         900cfdf  
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_darwin_amd64.tar.gz          db81ee7  
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_linux_armv7.tar.gz           400aacc  
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_linux_arm64.tar.gz           bef7802  
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_linux_386.tar.gz             a2dcb53  
  check-cpu-usage         //assets.bonsai.sensu.io/.../check-cpu-usage_0.2.2_linux_amd64.tar.gz           2453973  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_windows_amd64.tar.gz   42e2d71  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_darwin_amd64.tar.gz    957c008  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_linux_armv7.tar.gz     20cc5b1  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_linux_arm64.tar.gz     c68b5f0  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_linux_386.tar.gz       4c47caa  
  sensu-processes-check   //assets.bonsai.sensu.io/.../sensu-processes-check_0.2.0_linux_amd64.tar.gz     70e830f

Because plugins are published for multiple platforms, including Linux and Windows, the output will include multiple entries for each of the dynamic runtime assets.

NOTE: Sensu does not download and install dynamic runtime asset builds onto the system until they are needed for command execution.

Create a check to monitor a server

Now that the dynamic runtime assets are registered, create a check named check_cpu. The check runs the command check-cpu-usage -w 75 -c 90 with the check-cpu-usage dynamic runtime asset every 60 seconds on all entities that have the system subscription. It generates a warning event (-w) when CPU usage reaches 75% and a critical event (-c) when CPU usage reaches 90%.

sensuctl check create check_cpu \
--command 'check-cpu-usage -w 75 -c 90' \
--interval 60 \
--subscriptions system \
--runtime-assets check-cpu-usage

You should receive a confirmation message:

Created

To view the complete resource definition for check_cpu, run one of the following commands, depending on whether you want YAML or wrapped JSON output:

sensuctl check info check_cpu --format yaml
sensuctl check info check_cpu --format wrapped-json

The sensuctl response will include the complete check_cpu resource definition in the specified format:

---
type: CheckConfig
api_version: core/v2
metadata:
  name: check_cpu
spec:
  check_hooks: null
  command: check-cpu-usage -w 75 -c 90
  env_vars: null
  handlers: []
  high_flap_threshold: 0
  interval: 60
  low_flap_threshold: 0
  output_metric_format: ""
  output_metric_handlers: null
  pipelines: []
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets:
  - check-cpu-usage
  secrets: null
  stdin: false
  subdue: null
  subscriptions:
  - system
  timeout: 0
  ttl: 0

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "check_cpu"
  },
  "spec": {
    "check_hooks": null,
    "command": "check-cpu-usage -w 75 -c 90",
    "env_vars": null,
    "handlers": [],
    "high_flap_threshold": 0,
    "interval": 60,
    "low_flap_threshold": 0,
    "output_metric_format": "",
    "output_metric_handlers": null,
    "pipelines": [],
    "proxy_entity_name": "",
    "publish": true,
    "round_robin": false,
    "runtime_assets": [
      "check-cpu-usage"
    ],
    "secrets": null,
    "stdin": false,
    "subdue": null,
    "subscriptions": [
      "system"
    ],
    "timeout": 0,
    "ttl": 0
  }
}

Validate the CPU check

The Sensu agent uses WebSocket to communicate with the Sensu backend, sending event data as JSON messages. As your checks run, the Sensu agent captures check standard output (stdout) or standard error (stderr). This data will be included in the JSON payload the agent sends to your Sensu backend as the event data.

It might take a few moments after you create the check for the check to be scheduled on the entity and the event to return to the Sensu backend. Use sensuctl to view the event data and confirm that Sensu is monitoring CPU usage:

sensuctl event list

The response should list the check_cpu check, returning an OK status (0):

     Entity        Check                                                                                                      Output                                                                                                    Status   Silenced             Timestamp                             UUID                  
─────────────── ─────────── ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── ──────── ────────── ─────────────────────────────── ───────────────────────────────────────
  sensu-centos   check_cpu   check-cpu-usage OK: 1.02% CPU usage | cpu_idle=98.98, cpu_system=0.51, cpu_user=0.51, cpu_nice=0.00, cpu_iowait=0.00, cpu_irq=0.00, cpu_softirq=0.00, cpu_steal=0.00, cpu_guest=0.00, cpu_guestnice=0.00        0   false      2021-10-06 19:25:43 +0000 UTC   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  
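Because the first event can take a few moments to arrive, a small polling loop can stand in for rerunning sensuctl event list by hand. The sketch below is one possible approach, not an official workflow: wait_for_event is a hypothetical helper, and the default of 12 attempts with a 5-second delay is an arbitrary assumption. It relies only on sensuctl event list and a plain-text match on the check name.

```shell
# Hypothetical helper: poll `sensuctl event list` until a line mentioning the
# given check name appears, or give up after a number of attempts.
# Usage: wait_for_event CHECK_NAME [ATTEMPTS] [DELAY_SECONDS]
wait_for_event() {
  check_name=$1
  attempts=${2:-12}          # assumed default: 12 attempts
  delay=${3:-5}              # assumed default: 5 seconds between attempts
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -w matches whole words only, -F treats the name as a literal string
    if sensuctl event list | grep -qwF -- "$check_name"; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (requires a configured sensuctl):
#   wait_for_event check_cpu && sensuctl event list
```

The match is textual, so it only confirms that an event for the check exists; read the Status column in the sensuctl event list output to see whether the check returned OK.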

Create a check to monitor a webserver

In this section, you’ll create a check to monitor an NGINX webserver, similar to the CPU check you created in the previous section but using the webserver subscription rather than system.

Install and configure NGINX

The webserver check requires a running NGINX service, so you’ll need to install and configure NGINX.

NOTE: You may need to install and update the EPEL repository with sudo yum install epel-release and sudo yum update before you can install NGINX.

Install NGINX:

sudo yum install nginx

Enable and start the NGINX service:

sudo systemctl enable nginx && sudo systemctl start nginx

Verify that NGINX is serving webpages:

curl -sI http://localhost

The response should include HTTP/1.1 200 OK to indicate that NGINX processed your request as expected:

HTTP/1.1 200 OK
Server: nginx/1.20.1
Date: Wed, 06 Oct 2021 19:35:14 GMT
Content-Type: text/html
Content-Length: 4833
Last-Modified: Fri, 16 May 2014 15:12:48 GMT
Connection: keep-alive
ETag: "xxxxxxxx-xxxx"
Accept-Ranges: bytes

With your NGINX service running, you can configure the webserver check.

Create the webserver check definition

Create a check that uses sensu-processes-check in the command to search for the string nginx. The nginx_service check will run at an interval of 15 seconds and determine whether the nginx service is among the running processes for all entities subscribed to the webserver subscription.
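As a rough illustration of what "search the running processes for a string" means, the sketch below counts matching lines in a process list and compares the count with a required minimum. It is an assumption-laden stand-in, not the implementation of sensu-processes-check: check_process_count is a hypothetical function, it reads the process list from standard input, and its output merely imitates the general shape of a found-versus-required comparison.

```shell
# Hypothetical helper (not sensu-processes-check): read a process list on
# stdin, count lines containing SEARCH, and compare with REQUIRED (default 1).
# Usage: ps -e -o comm= | check_process_count nginx 1
check_process_count() {
  search=$1
  required=${2:-1}
  # grep -c prints 0 and exits non-zero when nothing matches, so ignore its status
  found=$(grep -cF -- "$search" || true)
  if [ "$found" -ge "$required" ]; then
    echo "OK | $found >= $required (found >= required) evaluated true for \"$search\""
    return 0
  fi
  echo "CRITICAL | $found >= $required (found >= required) evaluated false for \"$search\""
  return 2
}
```

A plain substring match like this also counts unrelated processes whose names happen to contain the search string, which is one reason to prefer a maintained plugin over a hand-written script.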

To create the nginx_service check, run one of the following commands. The first uses a YAML definition and the second uses the equivalent JSON definition:

cat << EOF | sensuctl create
---
type: CheckConfig
api_version: core/v2
metadata:
  name: nginx_service
spec:
  command: >
    sensu-processes-check
    --search
    '[{"search_string": "nginx"}]'
  subscriptions:
  - webserver
  interval: 15
  publish: true
  runtime_assets:
  - sensu-processes-check
EOF

cat << EOF | sensuctl create
{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "nginx_service"
  },
  "spec": {
    "command": "sensu-processes-check --search '[{\"search_string\": \"nginx\"}]'\n",
    "subscriptions": [
      "webserver"
    ],
    "interval": 15,
    "publish": true,
    "runtime_assets": [
      "sensu-processes-check"
    ]
  }
}
EOF

You should receive a confirmation message:

Created

To view the complete resource definition for nginx_service, run one of the following commands, depending on whether you want YAML or wrapped JSON output:

sensuctl check info nginx_service --format yaml
sensuctl check info nginx_service --format wrapped-json

The sensuctl response will include the complete nginx_service resource definition in the specified format:

---
type: CheckConfig
api_version: core/v2
metadata:
  name: nginx_service
spec:
  check_hooks: null
  command: |
        sensu-processes-check --search '[{"search_string": "nginx"}]'
  env_vars: null
  handlers: []
  high_flap_threshold: 0
  interval: 15
  low_flap_threshold: 0
  output_metric_format: ""
  output_metric_handlers: null
  pipelines: []
  proxy_entity_name: ""
  publish: true
  round_robin: false
  runtime_assets:
  - sensu-processes-check
  secrets: null
  stdin: false
  subdue: null
  subscriptions:
  - webserver
  timeout: 0
  ttl: 0

{
  "type": "CheckConfig",
  "api_version": "core/v2",
  "metadata": {
    "name": "nginx_service"
  },
  "spec": {
    "check_hooks": null,
    "command": "sensu-processes-check --search '[{\"search_string\": \"nginx\"}]'\n",
    "env_vars": null,
    "handlers": [],
    "high_flap_threshold": 0,
    "interval": 15,
    "low_flap_threshold": 0,
    "output_metric_format": "",
    "output_metric_handlers": null,
    "pipelines": [],
    "proxy_entity_name": "",
    "publish": true,
    "round_robin": false,
    "runtime_assets": [
      "sensu-processes-check"
    ],
    "secrets": null,
    "stdin": false,
    "subdue": null,
    "subscriptions": [
      "webserver"
    ],
    "timeout": 0,
    "ttl": 0
  }
}

Validate the webserver check

It might take a few moments after you create the check for the check to be scheduled on the entity and the event to return to the Sensu backend. Use sensuctl to view event data and confirm that Sensu is monitoring the NGINX webserver status:

sensuctl event list

The response should list the nginx_service check, returning an OK status (0):

     Entity          Check                                       Output                                   Status   Silenced             Timestamp                             UUID                  
─────────────── ─────────────── ──────────────────────────────────────────────────────────────────────── ──────── ────────── ─────────────────────────────── ───────────────────────────────────────
  sensu-centos   nginx_service   OK       | 2 >= 1 (found >= required) evaluated true for "nginx"              0   false      2021-11-08 16:59:34 +0000 UTC   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  
                                 Status - OK     

Simulate a critical event

To manually generate a critical event for your nginx_service check, stop the NGINX service. Run:

sudo systemctl stop nginx

When you stop the service, the check will generate a critical event. After a few moments, run:

sensuctl event list

The response should list the nginx_service check, returning a CRITICAL status (2):

     Entity          Check                                       Output                                   Status   Silenced             Timestamp                             UUID                  
─────────────── ─────────────── ──────────────────────────────────────────────────────────────────────── ──────── ────────── ─────────────────────────────── ───────────────────────────────────────
  sensu-centos   nginx_service   CRITICAL | 0 >= 1 (found >= required) evaluated false for "nginx"             2   false      2021-11-08 17:02:04 +0000 UTC   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  
                                 Status - CRITICAL             

Restart the NGINX service to clear the event:

sudo systemctl start nginx

After a moment, you can verify that the event cleared:

sensuctl event list

The response should list the nginx_service check with an OK status (0).

What’s next

Now that you know how to create checks to monitor CPU usage and NGINX webserver status, read the checks reference and assets reference for more detailed information. Or, learn how to collect and analyze metrics or monitor external resources with proxy checks and entities.

You can also create pipelines to send alerts to email, PagerDuty, or Slack based on the status events your checks are generating. Or, send status and metrics data to Sumo Logic. Read the pipelines reference for information about configuring observability event processing workflows with event filters, mutators, and handlers.

To share, reuse, and maintain the checks you created in this guide just like you would code, save the check definitions to a file and start building a monitoring as code repository.
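One way to start is to export each check definition to its own YAML file with the sensuctl check info command used earlier in this guide. The helper below is a minimal sketch: export_check and the default checks directory are hypothetical choices, not a layout Sensu requires.

```shell
# Hypothetical helper: write a check's definition to <dir>/<check>.yaml.
# Usage: export_check CHECK_NAME [OUTPUT_DIR]
# Note: if sensuctl fails, an empty file may be left behind; the function
# returns sensuctl's exit status so callers can detect that case.
export_check() {
  check_name=$1
  out_dir=${2:-checks}       # assumed default directory name
  mkdir -p "$out_dir" &&
    sensuctl check info "$check_name" --format yaml > "$out_dir/$check_name.yaml"
}

# Example (requires a configured sensuctl):
#   export_check check_cpu
#   export_check nginx_service
```

Files saved this way can be committed to version control and applied again later with sensuctl create --file followed by the file path.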

Learn more about the dynamic runtime assets this guide uses: sensu/check-cpu-usage and sensu/sensu-processes-check.