Prometheus discovery with Brightbox server groups

Prometheus: Titan god of fire and distributed metrics collection.

Prometheus is a metrics gathering system used for monitoring and alerting. It discovers your systems, pulls in metrics from them and can generate alerts when certain thresholds or patterns are met. For example, it can alert you when a filesystem is getting full, or your web service response times have increased unexpectedly.

It’s been gathering momentum in the Kubernetes world lately, but it isn’t exclusive to that ecosystem and can be put to great use anywhere you need to monitor systems.

As opposed to systems like statsd or collectd, Prometheus works primarily on a pull model and connects out to your systems (which it calls targets) to read metrics. So you need to ensure that Prometheus knows about your servers (discovery) and is able to connect to them (access).

Dynamic discovery and access

Prometheus can be statically configured with each server you want to monitor, but modern cloud-driven systems are often dynamic, so auto-discovery of servers is preferable. You can easily get this working on Brightbox, while automatically setting up firewalling too: the trick is to use a server group to set up the firewall access for Prometheus, and then have Prometheus use that same server group’s DNS name to discover the servers.

Then all you have to do to start monitoring a server with Prometheus is to put it in a specific server group, and the rest happens automatically.

Set up a Prometheus server

First, let’s quickly set up a Prometheus server. We’ll use our CLI here, but you can do all this with the Brightbox Manager GUI, or Terraform if you prefer.

Create a server group and associated firewall policy for your Prometheus servers:

$ brightbox group create --name="prometheus servers"

Creating a new server group

 id         server_count  name              
 grp-czxmr  0             prometheus servers

And create some firewall rules to allow you to SSH in and allow the server to connect out:

$ brightbox firewall-policies create --name="prometheus servers" grp-czxmr

 id         server_group  name              
 fwp-wkigv  grp-czxmr     prometheus servers

$ brightbox firewall-rules create --source=any --protocol=tcp --dport=22 fwp-wkigv

 id         protocol  source  sport  destination  dport  icmp_type  description
 fwr-rynxl  tcp       any     -      -            22     -                     

$ brightbox firewall-rules create --destination=any fwp-wkigv

 id         protocol  source  sport  destination  dport  icmp_type  description
 fwr-cckb8  -         -       -      any          -      -                     

Build an Ubuntu 20.04 (Focal) server in that group:

$ brightbox images | grep focal

 img-uitch  brightbox  official  2020-09-08  public   2252   ubuntu-focal-20.04-amd64-server (x86_64)         

$ brightbox server create --server-groups=grp-czxmr --name="prometheus server" img-uitch

Creating a 1gb.ssd (typ-w0hf9) server with image ubuntu-focal-20.04-amd64-server (img-uitch) in groups grp-czxmr

 id         status    type     zone   created_on  image_id   cloud_ip_ids  name             
 srv-vzcpl  creating  1gb.ssd  gb1-a  2020-09-14  img-uitch                prometheus server

SSH in (if you don’t have IPv6 then you’ll need to create and map a Cloud IP to the server first to give it a public IPv4 address) and install the Prometheus server package:

$ ssh ubuntu@ipv6.srv-vzcpl.gb1.brightbox.com

ubuntu@srv-vzcpl:~$ sudo apt-get install -qy prometheus

Set up a targets server group and firewall policy

Now to set up the targets. I’m going to assume you’re using the Prometheus Node exporter (install the prometheus-node-exporter package), which listens on TCP port 9100. If you’re using other exporters, set up the appropriate firewall rules for their ports instead.
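Before involving Prometheus at all, you can read the exporter's output on the target itself as a quick sanity check. This is a minimal sketch, assuming the exporter serves the standard text format at /metrics on port 9100; the metric_value helper is a hypothetical name used for illustration, and it only handles metrics without labels.

```shell
# Hypothetical helper: print the value of an unlabelled metric from
# Prometheus text-format output read on stdin.
metric_value() {
  awk -v name="$1" '$1 == name { print $2 }'
}

# On a target, assuming the Node exporter is listening on localhost:9100:
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

If that prints a number, the exporter is up and anything that can reach port 9100 can scrape it.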

So, create another server group and an associated firewall policy:

$ brightbox group create --name="prometheus targets"

Creating a new server group

 id         server_count  name              
 grp-4zxb6  0             prometheus targets

$ brightbox firewall-policies create --name="prometheus targets" grp-4zxb6

 id         server_group  name              
 fwp-rfzwh  grp-4zxb6     prometheus targets

Then create a firewall rule to allow access to your exporter from your Prometheus servers (use the “Prometheus servers” server group created earlier):

$ brightbox firewall-rules create --source=grp-czxmr --protocol=tcp --dport=9100 fwp-rfzwh

 id         protocol  source     sport  destination  dport  icmp_type  description
 fwr-6chx0  tcp       grp-czxmr  -      -            9100   -                     
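If your targets run more than one exporter, you need one rule like this per port. Here is a rough sketch of a loop around the same command; the allow_exporter_ports name and the extra port numbers are placeholders for illustration, so substitute the ports your exporters actually listen on.

```shell
# Hypothetical helper: allow the Prometheus server group (grp-czxmr) to reach
# each listed TCP port on servers covered by the given firewall policy.
allow_exporter_ports() {
  policy="$1"
  shift
  for port in "$@"; do
    brightbox firewall-rules create --source=grp-czxmr --protocol=tcp --dport="$port" "$policy"
  done
}

# Example: open two further exporter ports (port numbers are placeholders):
#   allow_exporter_ports fwp-rfzwh 9104 9113
```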

Configure Prometheus to discover targets by server group DNS

Now hop back onto your Prometheus server and edit the config file /etc/prometheus/prometheus.yml. Add a job under scrape_configs that tells it to check the target server group DNS for new targets:

  - job_name: grp-4zxb6
    dns_sd_configs:
      - port: 9100
        type: 'A'
        names:
          - grp-4zxb6.gb1.brightbox.com

and reload Prometheus:

$ sudo service prometheus reload

At this point, Prometheus will start periodically resolving the grp-4zxb6.gb1.brightbox.com DNS name to check for new members and will start monitoring them.
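If you want to be a bit more careful, you can validate the file before reloading and then look at what the group name resolves to. This is only a sketch: check_and_reload and targets_from_a_records are hypothetical helper names, and the second one simply mirrors how an A-record job pairs each returned address with the configured port. It also assumes dig is installed on the Prometheus server.

```shell
# Hypothetical helper: validate the config file, and only reload if it passes.
check_and_reload() {
  promtool check config "$1" && sudo service prometheus reload
}

# Hypothetical helper: turn A-record IPs on stdin into host:port targets,
# mirroring how the job pairs each address with its configured port.
targets_from_a_records() {
  port="$1"
  while read -r ip; do
    if [ -n "$ip" ]; then
      printf '%s:%s\n' "$ip" "$port"
    fi
  done
}

# Usage on the Prometheus server:
#   check_and_reload /etc/prometheus/prometheus.yml
#   dig +short A grp-4zxb6.gb1.brightbox.com | targets_from_a_records 9100
```

An empty list here just means the group has no members yet.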

Start monitoring a target

So, to start monitoring a server, just add it to the “prometheus targets” group. The appropriate firewall rules will be immediately applied so Prometheus can reach it, and Prometheus will discover the new server within a minute or so and start collecting metrics.
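From the CLI, that step might look something like the sketch below. It assumes the CLI's groups add_servers subcommand (check the CLI's built-in help for your version); monitor_servers is a hypothetical wrapper name and srv-xxxxx is a placeholder for your own server identifier.

```shell
# Hypothetical helper: add one or more servers to the "prometheus targets"
# group (grp-4zxb6). Assumes the CLI provides `groups add_servers`.
monitor_servers() {
  brightbox groups add_servers grp-4zxb6 "$@"
}

# Example (srv-xxxxx is a placeholder):
#   monitor_servers srv-xxxxx
```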

You can run some queries on the Prometheus server itself to confirm:

Check system load:

$ promtool query instant http://localhost:9090 'node_load1{job="grp-4zxb6"}'

node_load1{instance="", job="grp-4zxb6"} => 0 @[1600101658.181]
node_load1{instance="", job="grp-4zxb6"} => 0 @[1600101658.181]

Check root filesystem free space:

$ promtool query instant http://localhost:9090 'node_filesystem_free{job="grp-4zxb6",mountpoint="/"}'

node_filesystem_free{device="/dev/vda1", fstype="ext4", instance="", job="grp-4zxb6", mountpoint="/"} => 59861299200 @[1600100976.016]
node_filesystem_free{device="/dev/vda1", fstype="ext4", instance="", job="grp-4zxb6", mountpoint="/"} => 59625115648 @[1600100976.016]


As Prometheus is using our group DNS system and obtaining A records, the instance labels have IP addresses rather than the more convenient server identifiers. Prometheus can instead query SRV records to retrieve server names, so we’re looking at adding support for that in future.

And remember you can customize labels in the job using relabel configs. We use that to label things as staging or production, and sometimes to add explicit project and role labels:

  - job_name: grp-4zxb6
    dns_sd_configs:
      - port: 9100
        type: 'A'
        names:
          - grp-4zxb6.gb1.brightbox.com
    relabel_configs:
      - target_label: environment
        replacement: staging
      - target_label: project
        replacement: myapp
      - target_label: role
        replacement: webserver
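Once those labels are attached, you can use them as selectors in queries. Here is a small sketch reusing the promtool command from earlier; query_by_label is a hypothetical helper name, and it assumes Prometheus is listening on localhost:9090 as above.

```shell
# Hypothetical helper: instant query for a metric filtered by a single label.
query_by_label() {
  promtool query instant http://localhost:9090 "$1{$2=\"$3\"}"
}

# Example: load average for everything labelled as staging:
#   query_by_label node_load1 environment staging
```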

More on Prometheus

Now, Prometheus can do a lot more; it keeps time-series metric data so you can see the data over time, but you’ll probably want a UI for that - take a look at Grafana.

And Prometheus is usually paired with Alertmanager: Prometheus periodically evaluates defined queries, and Alertmanager sends alerts when certain conditions are met. There are packages for that in Ubuntu too, so it’s not too hard to get up and running.

So there is lots more for you to explore.

Give Prometheus a go on Brightbox

If you want to play with Prometheus, you can sign up for Brightbox in just a couple of minutes and use your £50 free credit to give it a go.

Managed services

If instead you want us to run Prometheus for you, or anything else for that matter, we offer managed services and hands-on support. Drop us a line.

