Welcome to the conn-check documentation.

Contents:

About conn-check

conn-check allows for checking connectivity with external services.

You can write a config file that defines services that you need to have access to, and conn-check will check connectivity with each.

It supports various types of services, all of which allow for basic network checks, but some allow for confirming credentials work also.

Configuration

The configuration is done via a yaml file. The file defines a list of checks to do:

- type: tcp
  host: localhost
  port: 80
- type: tls
  host: localhost
  port: 443
  disable_tls_verification: false

Each check defines a type, and then options as appropriate for that type.

For a step by step guide on configuring conn-check for your application see the tutorial.

Check Types

tcp

A simple tcp connectivity check.

host
The host.
port
The port.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

tls

A check that uses TLS (ssl is a deprecated alias for this type).

host
The host.
port
The port.
disable_tls_verification
Optional flag to disable verification of TLS certs and handshake. Default: false.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

udp

Check that sending a specific UDP packet gets a specific response.

host
The host.
port
The port.
send
The string to send.
expect
The string to expect in the response.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

http

Check that a HTTP/HTTPS request succeeds (https also works).

url
The URL to fetch.
method
Optional HTTP method to use. Default: “GET”.
expected_code
Optional status code that defines success. Default: 200.
proxy_url
Optional HTTP/HTTPS proxy URL to connect via, including protocol, if set proxy_{host,port} are ignored.
proxy_host
Optional HTTP/HTTPS proxy to connect via.
proxy_port
Optional port to use with proxy_host. Default: 8000.
headers:
Optional headers to send, as a dict of key-values. Multiple values can be given as a list/tuple of lists/tuples, e.g.: [('foo', 'bar'), ('foo', 'baz')]
body:
Optional raw request body string to send.
disable_tls_verification:
Optional flag to disable verification of TLS certs and handshake. Default: false.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).
allow_redirects
Optional flag to Follow 30x redirects. Default: false.
params
Optional dict of params to URL encode and pass in the querystring.
cookies
Optional dict of cookies to pass in the request headers.
auth
Optional basic HTTP auth credentials, as a tuple/list: (username, password).
digest_auth
Optional digest HTTP auth credentials, as a tuple/list: (username, password).

amqp

Check that an AMQP server can be authenticated against.

host
The host.
port
The port.
username
The username to authenticate with.
password
The password to authenticate with.
use_tls
Optional flag whether to connect with TLS. Default: true.
vhost
Optional vhost name to connect to. Default ‘/’.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

postgres

Check that a PostgreSQL db can be authenticated against (postgresql also works).

host
The host.
port
The port.
username
The username to authenticate with.
password
The password to authenticate with.
database
The database to connect to.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

redis

Check that a redis server is present, optionally checking authentication.

host
The host.
port
The port.
password
Optional password to authenticatie with.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

memcache

Check that a memcached server is present (memcached also works).

host
The host.
port
The port.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

mongodb

Check that a MongoDB server is present (mongo also works).

host
The host.
port
Optional port. Default: 27017.
username
Optional username to authenticate with.
password
Optional password to authenticate with.
database
Optional database name to connect to, if not set the test database will be used, if this database does not exist (or is not available to the user) you will need to provide a database name.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

smtp

Check that we can reach, authenticate with and send an email using an SMTP server.

Note 1: if this check succeeds an email is actually sent to the email defined in to_address, be careful how this is check is configured so it doesn’t unintentionally spam anyone.

Note 2: only EHLO/HELO over a TLS connection is supported with the use_tls flag, this check cannot currently create new TLS connection using the STARTTLS Extension.

host
The host.
port
The port, normally 465 for TLS and 25 for plaintext.
username
Username to authenticate with.
password
Password to authenticate with.
from_address:
Email address to send from.
to_address:
Email address to send to.
message:
Optional email body.
subject:
Optional email subject.
helo_fallback:
Optional flag that defines whether to fall back to HELO if the EHLO extended command set fails.
use_tls:
Optional flag to enable TLS security on connection. Default: true.
timeout
Optional connection timeout in seconds. default: 5 (or value from --connect-timeout).

Timeouts

By default conn-check’s global timeout (max-timeout) is set to 9 seconds, this is because when used with Nagios the maximum timeout for NRPE commands is 10 seconds, so we need to ensure checks execute with enough time to output any errors (if you hit the NRPE timeout no output will be returned, just a socket error from Nagios).

If you need longer timeouts you can always set max-timeout yourself (it is settings, but accepts floats for sub-second values).

You can also set a different connect timeout, which is time taken to open an inidivual connection (without doing anything else) per check, which is set globally with --connect-timeout, or per check using the timeout argumnent that most check types accept.

Tags

Every check type also supports a tags field, which is a list of tags that can be used with the --include-tags and --exclude-tags arguments to conn-check.

Example YAML:

- type: http
  url: http://google.com/
  tags:
    - external

To run just “external” checks:

conn-check --include-tags=external ...

To run all the checks except external:

conn-check --exclude-tags=external

Buffered/Ordered output

conn-check normally executes with output to STDOUT buffered so that the output can be ordered, with failed checks being printed first, grouping by destination etc.

If you’d rather see results as they available you can use the -U/--unbuffered-output option to disable buffering.

Generating firewall rules

conn-check includes the conn-check-export-fw utility which takes the same arguments as conn-check but runs using --dry-run mode and outputs a set of egress firewall rules in an easy to parse YAML format, for example:

# Generated from the conn-check demo.yaml file
egress:
- from_host: mydevmachine
  ports: [8080]
  protocol: udp
  to_host: localhost
- from_host: mydevmachine
  ports: [80, 443]
  protocol: tcp
  to_host: login.ubuntu.com
- from_host: mydevmachine
  ports: [6379, 11211]
  protocol: tcp
  to_host: 127.0.0.1

You can then use this output to generate your environments firewall rules (e.g. with EC2 security groups, OpenStack Neutron, iptables etc.).

conn-check-convert-fw is a utility that does just this, it accepts multiple firewall rule YAML files, merges/de-dupes them, and outputs commands for AWS, Openstack Neutron, OpenStack Nova (client), iptables, and ufw (mostly for testing purposes).

It is designed for this workflow:

  • On each host you run conn-check from, you run conn-check-export-fw to generate a YAML file containing egress firewall rules.
  • Each of these files is transfered to a host with the correct DNS entries for the egress hosts.
  • On this host conn-check-convert-fw is run to generate a set of commands for your firewall.
  • These commands are audited by a human / possibly merged with other rules, such as adding ingress rules, and then run to update your environment’s firewall.

Building wheels

To allow for easier/more portable distribution of this tool you can build conn-check and all its dependencies as Python wheels:

make clean-wheels
make build-wheels
make build-wheels-extra EXTRA=amqp
make build-wheels-extra EXTRA=redis

The build-wheels make target will build conn-check and its base dependencies, but to include the optional extra dependencies for other checks such as amqp, redis or postgres you need to use the build-wheels-extra target with the EXTRA env value.

By default all the wheels will be placed in ./wheels.

Automatically generating conn-check YAML configurations

The conn-check-configs package contains utilities/libraries for generating checks from existing application configurations and environments, e.g. from Django settings modules and Juju environments.

Tutorial Part 1: Checking connections for a basic web app

Hello World

Suppose you have the basic webapp HWaaS (Hello World as a Service, naturally).

It returns a different translation of “Hello World” on every request, and accepts new translations via POST requests.

  • The translations are stored in a PostgreSQL database.
  • memcached is used to keep a cache of pre-rendered “Hello World” HTML pages.
  • Optionally requests are sent to the Google Translate API to get an automatically translated version of the page in the user’s language if they push a certain button and a translation in their language isn’t available in the PostgreSQL DB.
  • The Squid HTTP proxy is sat between it and the Translate API to cache requests (varied by language), to avoid hitting Google’s rate limiting.

Why use conn-check?

Our HWaaS example service depends on not only 3 internal services, but also a completely external service (the Google Translate API), and any number of issues from network routing, firewall configuration and bad service configuration to external outages could cause issues after a new deployment (or at any time really, but we’ll address that later in Nagios).

conn-check can verify connections to these dependencies using not just basic TCP/UDP connects, but also service specific ones, with authentication where needed, timeouts, and even permissions (e.g. can user A access DB schema B).

Yet another YAML file

conn-check is configured using a YAML file containing a list of checks to perform in parallel (by default, but this too is configurable with a CLI option).

Here’s an example file (it could be called hwaas-cc.yaml):

- type: postgresql
  host: gibson.hwaas.internal
  port: 5432
  username: hwaas
  password: 123456asdf
  database: hwaas_production
- type: memcached
  host: freeside.hwaas.internal
  port: 11211
- type: http
  url: https://www.googleapis.com/language/translate/v2?q=Hello%20World&target=de&source=en&key=BLAH
  proxy_host: countzero.hwaas.internal
  proxy_port: 8080
  expected_code: 200

Let’s examine those checks..

PostgreSQL

- type: postgresql
  host: gibson.hwaas.internal
  port: 5432
  username: hwaas
  password: 123456asdf
  database: hwaas_production

type: This one doesn’t require much explanation, except the fact that you can use either postgresql` or postgres (many checks have aliases), see the readme..

host, port: The host to connect to is always, understandably, required, but if not supplied the default psql port of 5432 will be used.

username, password: Auth details are required and important when used with…

database: This is the psql schema to attempt to switch to use, and username has permission to access.

memcached

- type: memcached
  host: freeside.hwaas.internal
  port: 11211

type: memcache or memcached are valid, see the readme.

host, port: If port isn’t supplied the memcached default 11211 is used instead.

HTTP

- type: http
  url: https://www.googleapis.com/language/translate/v2?q=Hello%20World&target=de&source=en&key=BLAH
  proxy_host: countzero.hwaas.internal
  proxy_port: 8080
  expected_code: 200

type: http or https are valid, see the readme.

url: As we’re doing a simple GET to the Translate API I’ve included the key in the querystring, but you could also include auth defailts as HTTP headers using the headers check option.

proxy_host, proxy_port: We supply the host/port to our Squid proxy here, we could also use the proxy_url check option instead to define the proxy as a standard HTTP URL (makes it possible to define a HTTPS proxy).

expected_code: This is the status code we expect to get back from the service if the request was successful, anything other than 200 in this case will cause the check to fail.

Using conn-check with Nagios

conn-check output tries to stay as close as possible to the Nagios plugin guidelines so that it can be used as a regular Nagios check for more constant monitoring of your service deployment (not just ad-hoc at deploy time).

Example NRPE config files, assuming conn-check is system installed:

# /etc/nagios/nrpe.d/check_conn_check.cfg
command[conn_check]=/usr/bin/conn-check --max-timeout=9  --exclude-tags=no-nagios /var/conn-check/hwaas-cc.yaml


# /var/lib/nagios/export/service__hwaas_conn_check.cfg
define service {
    use                             active-service
    host_name                       hwaas-web1.internal
    service_description             connection checks with conn-check
    check_command                   check_nrpe!conn_check
    servicegroups                   web,hwaas
}

A few arguments to note:

--max-timeout=10: This sets the global timeout to 10 seconds, which means it will error if the total time for all checks combined goes above 9s, which will execute under the default max time allowed by Nagios for a plugin to run, 10s.

This way we still get all the individual check results back even if one of them went above the threshold.

--exclude-tags=no-nagios: Although optional, this allows you to exclude any check tagged with no-nagios, which is especially handy for checks to external/third-party services that you don’t want to be hit constantly by Nagios.

For example if we didn’t want Nagios to hit Google every few minutes:

- type: http
  url: https://www.googleapis.com/language/translate/v2?q=Hello%20World&target=de&source=en&key=BLAH
  proxy_host: countzero.hwaas.internal
  proxy_port: 8080
  expected_code: 200
  tags: [no-nagios]

Tutorial Part 2: Auto-generating conn-check config for a Django app

Hello World (again)

Let’s assume that you’ve actually created the Hello World service from part 1 as a Django app, and you think to yourself:

“Hang on, aren’t all these connections I want conn-check to check for me already defined in my Django settings module?”

conn-check-configs

Yes, yes they are, and with the handy-dandy conn-check-configs package you can automatically generate conn-check config YAML from a range of standard Django settings values (in theory from other environments too, such as Juju, but for now just Django).

exempli gratia

Given the following settings.py in our HWaaS app:

INSTALLED_APPS = [
    'hwaas'
]
DATABASES = {
    'default': {
            'ENGINE': 'django.db.backends.postgresql_psycopg2',
            'HOST': 'gibson.hwass.internal',
            'NAME': 'hwaas_production',
            'PASSWORD': '123456asdf',
            'PORT': 11211,
            'USER': 'hwaas',
}
CACHES = {
    'default': {
        'LOCATION': 'freeside.hwaas.internal:11211',
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
    },
}
PROXY_HOST = 'countzero.hwaas.internal'
PROXY_PORT = 8080
TRANSLATE_API_KEY = 'BLAH'

We can create a settings-to-conn-check.py script with the least possible effort like so:

#!/usr/bin/env python
from conn_check_configs.django import run

if __name__ == '__main__':
    run()

This will output postgresql and memcached checks to similar our hand-written config:

$ chmod +x settings-to-conn-check.py
$ ./settings-to-conn-check.py -f cc-config.yaml -m hwaas.settings
$ cat cc-config.yaml
- type: postgresql
  database: hwaas_production
  host: gibson.hwaas.internal
  port: 5432
  username: hwaas
  password: 123456asdf
- type: memcached
  host: freeside.hwaas.internal
  port: 11211

Customising generated checks

In order to generate the checks we need for Squid / Google Translate API, we can add some custom callbacks:

#!/usr/bin/env python
from conn_check_configs.django import run, EXTRA_CHECK_MAKERS


def make_proxied_translate_check(settings, options):
    checks = []
    if settings['PROXY_HOST']:
        checks.append({
            'type': 'http',
            'url': 'https://www.googleapis.com/language/translate/v2?q='
                   'Hello%20World&target=de&source=en&key={}'.format(
                       settings['TRANSLATE_API_KEY']),
            'proxy_host': settings['PROXY_HOST'],
            'proxy_port': int(settings.get('PROXY_PORT', 8080)),
            'expected_code': 200,
        })
    return checks

EXTRA_CHECK_MAKERS.append(make_proxied_translate_check)


if __name__ == '__main__':
    run()

In the above we define a callable which takes 2 params, settings which is a wrapper around the Django settings module, and options which is an object containing the command line arguments that were passed to the script.

The settings module is not the direct settings module but a dict-like wrapper so that you can access the settings just a like a dict (using indices, .get method, etc.)

To ensure make_proxied_translate_check is collected and called by the main run function we add it to the EXTRA_CHECK_MAKERS list.

The above generates our required HTTP check:

- type: http
  url: https://www.googleapis.com/language/translate/v2?q=Hello%20World&target=de&source=en&key=BLAH
  proxy_host: countzero.hwaas.internal
  proxy_port: 8080
  expected_code: 200

A note on statstd checks

Getting more operational visibility on how HWaaS runs would be great, wouldn’t it?

So let’s add some metrics collection using StatsD, and as luck would have it we can get a lot for nearly free with the django-statsd, after adding it to our dependencies we update our settings.py to include:

INSTALLED_APPS = [
    'hwaas'
    'django_statsd',
]
MIDDLEWARE_CLASSES = [
    'django_statsd.middleware.GraphiteMiddleware',
]
STATSD_CLIENT = 'django_statsd.clients.normal'
STATSD_HOST = 'bigend.hwaas.internal'
STATSD_PORT = 10021

Note: You don’t actually need the django-statsd app to have conn-check-configs generate statsd checks, only the use of STATSD_HOST and STATSD_PORT in your settings module matters.

Another run of our settings-to-conn-check.py script will result in the extra statsd check:

- type: udp
  host: bigend.hwaas.internal
  port: 10021
  send: conncheck.test:1|c
  expect:

As you can see this is just a generic UDP check that attempts to send an incremental counter metric to the statsd host.

Unfortunately the fire-and-forget nature of this use of statsd/UDP will not error in a number of common situations (the simplest being that statsd is not running on the target host, or even a routing issue along the way).

It will catch simple problems such as not being able to open up the local UDP port to send from, but that’s usually not enough.

If you use a third-party implementation of statsd, such as txStatsD then you might have the ability to define a pair of health check strings, for example by changing the send and expect values in the STATSD_CHECK dict we can send and expect different strings:

#!/usr/bin/env python
from conn_check_configs.django import run, STATSD_CHECK

STATSD_CHECK['send'] = 'Hakuna'
STATSD_CHECK['expect'] = 'Matata'

if __name__ == '__main__':
    run()

Which generates this check:

- type: udp
  host: bigend.hwaas.internal
  port: 10021
  send: Hakuna
  expect: Matata

In the above we would configure our txStatD (for example) instance to respond to the string Hakuna with the string Matata, which would catch pretty much all the possible issues with contacting our metrics service.

Tutorial Part 3: Adding conn-check to Juju deployed services

Juju

Juju is an open source service orientated framework and deployment toolset from Canonical, given conn-check is also by Canonical you might expect there is an easy yet flexible way to add conn-check to your Juju environment.

You’d be right…

Adding conn-check charm support to your apps charm

The conn-check charm is a subordinate charm that can be added alongside your applications charm, and will install/configure conn-check on your application units.

To enable support for the conn-check subordinate in your applications charm you need to implement the conn-check-relation-changed hook, e.g.:

#!/bin/bash
set -e
CONFIG_PATH=/var/conn-check.yaml

juju-log "Writing conn-check config to ${CONFIG_PATH}"
/path/to/hwaas/settings-to-conn-check.py -f ${CONFIG_PATH} -m hwaas.settings

# Ensure conn-check and nagios can both access the config file
chown conn-check:nagios ${CONFIG_PATH}
chmod 0660 ${CONFIG_PATH}

# Set the config path, we could also tell the conn-check charm
# to write the config file for us by setting the "config" option
# but this is deprecated in favour of writing the file ourselves
# and setting "config_path"
relation-set config_path="${CONFIG_PATH}"

You may note that we set the user to conn-check and the group to nagios, you can actually get away with just setting the group to nagios as this will give both conn-check and nagios access to the config file, but you might as well set the user anyway otherwise it’s likely to be root.

You’ll also need to tell Juju your charm provides the conn-check relation in your metadata.yaml:

provides:
    conn-check:
        interface: conn-check
        scope: container

When deploying conn-check with your service you then deploy the subordinate, relate it to your service (you can also optionally set it as a Nagios provider):

$ juju deploy cs:~ubuntuone-hackers/trusty/conn-check my-service-conn-check
$ juju set my-service-conn-check revision=108 # pin to the rev of conn-check you want to use
$ juju add-relation my-service my-service-conn-check

Nagios

The conn-check charm provides the nrpe-external-master relation which means it can act as a Nagios plugin executor, so if you have a Nagios master in your environment for monitoring then conn-check can be regularly run along with your other monitoring checks to ensure your environments connections are as you expect them to be.

To set this up you need to relate the deployed subordinate to your servie nrpe:

$ # assuming something like:
$ # juju deploy nagios nagios-master
$ # juju deploy nrpe my-service-nrpe
$ # juju add-relation my-service:monitors my-service-nrpe:monitors
$ juju add-relation my-service-conn-check my-service-nrpe

For more details on Juju and Nagios you can see this handy blog post.

Actions

To manually run conn-check on all units, or a single unit, you can use the supplied run-check and run-nagios-check actions:

$ # all checks on all units
$ juju run --service my-service-conn-check 'actions/run-check'
$ # all checks on just unit 0
$ juju run --service my-service-conn-check/0 'actions/run-check'
$ # nagios (not including no-nagios) checks on all units
$ juju run --service my-service-conn-check 'actions/run-nagios-check'
$ # nagios (not including no-nagios) checks on just unit 0
$ juju run --service my-service-conn-check/0 'actions/run-nagios-check'

Note: before Juju 1.21 there is a bug which prevents juju-run from working with subordinate charms, you can work around this with juju-ssh:

$ # all checks on just unit 0
$ juju ssh my-service-conn-check/0 'juju-run my-service-conn-check/0 actions/run-check'

ChangeLog for conn-check

1.3.2 (2016-08-15)

  • Changed global and connect timeouts to 9s and 5s, respectively, to ensure execution within Nagios’ 10s NRPE timeout, out of the box.
  • Changed Twisted version pinning for wheel to <16 due to incompatibility with latest TLS connection handling.

1.3.1 (2015-08-11)

  • Added guards for port numbers and the HTTP checks expected_code to cast any given value to an int.

1.3.0 (2015-07-15)

  • Added new conn-check-convert-fw tool to generate aws/neutron/nova/iptables rule commands from YAML exported by conn-check-export-fw.

1.2.0 (2015-06-19)

  • Added new smtp check to test auth/sending with SMTP servers.

1.1.0 (2015-06-05)

  • Added new conn-check-export-fw tool to export firewall egress rules in a YAML format.
  • Refactored CLI command handling code to make it easier to extend/override.

1.0.18 (2015-04-13)

1.0.17 (2015-04-08)

  • Unpin python-requests for wider distribution (e.g. precise).

1.0.16 (2015-03-06)

  • Add –include-tags and –exclude-tags args with support for the tags YAML check field.

1.0.15 (2015-02-24)

  • Package manifest fixes for debian package release.

1.0.13 (26-11-2014)

  • Output is not buffered and ordered, with FAILED checks first, skipped last, and each check grouped by {type}:{host/url}.
  • TCP subchecks triggered by a HTTP check are prefixed as such.
  • There is now a -U/–unbuffered-output option to disable buffered/ordered output and write out to STDOUT as soon as a result is collected.

1.0.12 (17-11-2014)

  • Command aliasing refactored, and more aliases added.

1.0.11 (04-11-2014)

  • Disabled 30x redirects in HTTP checks by default, fixing regression introduced by requests switch.
  • Added python-requests specific options for proxy, param, cookie and auth control in HTTP checks.

1.0.10 (30-10-2014)

  • Added a mongodb check type.

1.0.9 (23-10-2014)

  • Added –max-timeout CLI option to restrict maximum execution time.
  • Added connection timeouts to HTTP and PostgreSQL checks.
  • Added –connect-timeout CLI option to set global connection timeout.
  • Added timeout option to each individual check to override global connection timeout.

1.0.8 (22-10-2014)

  • Switched to using txrequests for HTTP requests with better proxy support.
  • Fixed UDP checks targetting host rather than IP if available.
  • Fixed initial TCP check for HTTP checks targetting upstream instead of proxy.

1.0.7 (09-10-2014)

  • Fixed HTTP proxy error in HTTP checks due to typo.

1.0.6 (06-10-2014)

  • Fixed dependencies when installing from local dir.
  • Made improvements to readme.

1.0.5 (03-10-2014)

  • Added optional headers and body arguments to HTTP checks.

1.0.4 (29-09-2014)

  • Added HTTP proxy support to http checks
  • Fixed issue with loading duplicate SSL CA certificates, and added flag to load from a custom dir

1.0.3 (24-09-2014)

  • Removed config_generators module to it’s own package: conn-check-configs

1.0.2 (22-09-2014)

  • Added a script to auto-generate conn-check config YAML from a Django settings module

1.0.1 (18-09-2014)

  • Trivial release to fix setup.py tags

1.0.0 (18-09-2014)

  • Initial release
  • Broken free of UbuntuOne
  • Nagios compatible output
  • YAML configuration

Indices and tables