Setting Up A Comprehensive Monitoring System With Prometheus

Monitoring and alerting are key components of any system. Identifying and fixing errors in a live system quickly is essential to a reliable, uninterrupted user experience. This is the first of several articles on implementing a complete monitoring and alerting setup for your service using some of the leading open source products available.

What is Prometheus?

Prometheus is an all-in-one package for monitoring your entire system: resource utilization, performance, availability, errors and more. It collects metrics in real time over an HTTP pull model, stores them in a time series database, and offers flexible queries and real-time alerting.

To simplify further, Prometheus stores all data as streams of timestamped values belonging to the same metric and the same set of labeled dimensions. Every time series is uniquely identified by its metric name and optional key-value pairs called labels.
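To make the idea of series identity concrete, here is a minimal sketch in plain Java. The class `SeriesId` and the metric name `http_requests_total` are hypothetical and used only for illustration; they are not part of any Prometheus library. The sketch renders a metric name and its labels in the familiar `name{key="value"}` notation.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only; not part of any Prometheus library.
final class SeriesId {

    // Renders a metric name and its labels in the usual name{key="value",...} notation.
    static String render(String metricName, Map<String, String> labels) {
        if (labels.isEmpty()) {
            return metricName;
        }
        // Sort labels by key so the same label set always renders the same way.
        String body = new TreeMap<>(labels).entrySet().stream()
                .map(e -> e.getKey() + "=\"" + e.getValue() + "\"")
                .collect(Collectors.joining(","));
        return metricName + "{" + body + "}";
    }
}
```

For example, `SeriesId.render("http_requests_total", Map.of("method", "GET", "status", "200"))` and the same call with `"status", "500"` produce two different identifiers, which is why Prometheus treats them as two separate time series.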

Prometheus includes a local on-disk time series database and can optionally integrate with remote storage systems. The time series data is stored on disk in a custom format, with a directory structure like the one below.

./data
├── 01BKGV7JBM69T2GIITC864HV
│   └── meta.json
├── 01BKGTZQ1SYQJTRKJJLLKS85HF6
│   ├── chunks
│   │   └── 000001
│   ├── tombstones
│   ├── index
│   └── meta.json
├── 01BKGTZQ1HHWHV8FB75GC75JHC
│   └── meta.json
├── 01BKGV7JC0RY8A864HCE24HD09
│   ├── chunks
│   │   └── 000001
│   ├── tombstones
│   ├── index
│   └── meta.json
└── wal
    ├── 00000002
    └── checkpoint.000001

Prometheus provides a functional query language called PromQL (Prometheus Query Language) that lets the user select and aggregate time series data in real time. The result of an expression can be shown as a basic graph (the built-in graphing is neither sophisticated nor especially user friendly), viewed as tabular data in Prometheus's expression browser, or consumed by external systems via the HTTP API.

The most compelling things about Prometheus are its reliable alerting, handled by the Alertmanager service, and its numerous client libraries and exporters for integrating with external systems. It collects metrics from configured targets at given intervals, evaluates alerting rule expressions, displays the results, and can trigger alerts when certain conditions are satisfied.

Different metrics in Prometheus

There are four main types of metrics in Prometheus:

  • Counter - A cumulative metric that represents a single monotonically increasing counter whose value can only increase or be reset to zero on restart.
  • Gauge - A metric that represents a single numerical value that can arbitrarily go up and down.
  • Histogram - A histogram samples observations (usually things like request duration or response sizes) and counts them in configurable buckets. It also provides a sum of all observed values.
  • Summary - This metric samples observations (usually things like request duration and response sizes). While it also provides a total count of observations and a sum of all observed values, it calculates configurable quantiles over a sliding time window.
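The histogram type is the least obvious of the four, so here is a minimal plain-Java sketch of its bookkeeping. `SimpleHistogram` is a hypothetical, simplified model written for this article, not the implementation used by any client library. It shows that buckets are cumulative (an observation is counted in every bucket whose upper bound it does not exceed) and that a total count and a sum are tracked alongside them.

```java
import java.util.Arrays;

// Hypothetical, simplified model of a histogram; real client libraries do more.
final class SimpleHistogram {
    private final double[] upperBounds;  // sorted bucket upper bounds
    private final long[] bucketCounts;   // cumulative count per bucket
    private long count;                  // total number of observations
    private double sum;                  // sum of all observed values

    SimpleHistogram(double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        Arrays.sort(this.upperBounds);
        this.bucketCounts = new long[this.upperBounds.length];
    }

    void observe(double value) {
        // Buckets are cumulative: count the value in every bucket whose bound is >= value.
        for (int i = 0; i < upperBounds.length; i++) {
            if (value <= upperBounds[i]) {
                bucketCounts[i]++;
            }
        }
        count++;
        sum += value;
    }

    long bucketCount(int index) { return bucketCounts[index]; }
    long count() { return count; }
    double sum() { return sum; }
}
```

With bounds of 0.1, 0.5 and 1.0 seconds, observing request durations of 0.05, 0.3 and 2.0 leaves the three buckets at 1, 2 and 2, while the total count is 3. The slow 2.0 request appears only in the total, which is the role the implicit +Inf bucket plays in Prometheus.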

Steps to integrate your application with Prometheus

Below are the steps to integrate your Java (Spring Boot) application with a Prometheus server. You can integrate applications written in other languages by following the same steps with the corresponding client libraries and packages.

1) First, add the Actuator and Prometheus dependencies to your application's pom.xml file so that the application generates and exposes metrics. This exposes a set of default metrics for your application.

<!-- Actuator dependency -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<!-- Prometheus dependency -->
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

2) Then expose the Prometheus metrics endpoint of the application by adding the configuration below to application.yml.

management:
  endpoints:
    web:
      exposure:
        include: health,prometheus
      base-path: /monitor
    enabled-by-default: true

Now the application is running with Prometheus metrics exposed for external consumption, and you should be able to view all the default metrics of your application in a web browser at the /monitor/prometheus path, as below.

Prometheus default metrics created for your application
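If you prefer to check the endpoint from code rather than the browser, the following is a rough sketch using only the JDK's built-in HTTP client (Java 11 or later). The class `MetricsEndpointCheck` is hypothetical, and the URL you pass in is an assumption: it combines the /monitor base path configured above with the host and port your application actually listens on (this article uses localhost:8090 in the scrape configuration).

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
final class MetricsEndpointCheck {

    // Fetches the raw metrics text from the given URL,
    // for example http://localhost:8090/monitor/prometheus (an assumed address).
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Keeps only the sample lines, dropping blank lines and "#" comment lines
    // such as the HELP and TYPE descriptions.
    static List<String> sampleLines(String body) {
        return body.lines()
                .map(String::trim)
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .collect(Collectors.toList());
    }
}
```

Calling `MetricsEndpointCheck.sampleLines(MetricsEndpointCheck.fetch(url))` against a running application should return one entry per exposed sample, which is a quick way to confirm the endpoint works before configuring the Prometheus server.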

3) Now download the Prometheus server build for your OS and point it at your application so it can consume the exposed metrics. To do that, update the server's prometheus.yml file and add a new scrape job for the application under scrape_configs, as given below.

# The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
- job_name: 'your_service'
  scrape_interval: '10s'
  metrics_path: '/monitor/prometheus'
  static_configs:
    - targets: ['localhost:8090']

4) Prometheus will now start scraping the metrics exposed by your application and storing them in its time series database. We can use Prometheus queries to view the results. (These queries can also be used directly in Grafana dashboards.)

Prometheus tracks service availability using a metric that it generates for each target in your configuration.

Your service connectivity with Prometheus

You can also check the service availability using the query below. It should return 1 if your application is up and running.

up{job="your_service"}
Your service availability metric in Prometheus
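The same availability query can be run from code through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below uses only the JDK's HTTP client (Java 11 or later). The class `PromQuery` is hypothetical, and the base URL http://localhost:9090 is an assumption based on the default port of a local Prometheus install.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class PromQuery {

    // Builds an instant-query URI for the Prometheus HTTP API.
    // The PromQL expression must be URL-encoded because it contains braces and quotes.
    static URI queryUri(String baseUrl, String promQl) {
        String encoded = URLEncoder.encode(promQl, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/api/v1/query?query=" + encoded);
    }

    // Runs the query and returns the raw JSON response body.
    static String run(String baseUrl, String promQl) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(queryUri(baseUrl, promQl)).GET().build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }
}
```

For instance, `PromQuery.run("http://localhost:9090", "up{job=\"your_service\"}")` returns a JSON document in which the sample value should be 1 while the application is reachable. Parsing that JSON is left out here to keep the sketch free of third-party dependencies.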

Custom Metrics for your application

You can also create your own custom metrics in your application and expose them to the Prometheus server. The example below demonstrates how to create a Counter metric in your Spring Boot application.

  • First you need to create a metrics manager in your application. All your custom metrics will be implemented inside this class.
import org.springframework.stereotype.Component;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.MeterRegistry;

@Component
public class MetricsManager {

    private static final String METRIC_TAG = "filter";

    private final MeterRegistry meterRegistry;

    // Spring injects the MeterRegistry through this single constructor,
    // so no field-level @Autowired is needed.
    public MetricsManager(MeterRegistry meterRegistry) {
        this.meterRegistry = meterRegistry;
    }

    // Looks up the counter for the given tag value (registering it on first use)
    // and increments it by one.
    public void initMetricsCounter(String filterTag) {
        Counter counter = this.meterRegistry.counter("api.request.", METRIC_TAG, filterTag);
        counter.increment(1.0);
    }
}
  • Now you can use your metrics manager to instantiate and generate values for your metrics.
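import org.apache.camel.Exchange;
import org.apache.camel.Processor;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

// Assumes an Apache Camel Processor, whose process(Exchange) signature matches the method below.
@Component
public class RequestProcessor implements Processor {

    @Autowired
    MetricsManager metricsManager;

    @Override
    public void process(Exchange exchange) throws Exception {
        // Count every processed request under the "TCP" filter tag.
        metricsManager.initMetricsCounter("TCP");
    }
}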

You can view the metrics in the Prometheus server as below.

Your custom metric in Prometheus server
Your custom metric in Prometheus dashboard
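When searching for a custom counter in Prometheus, keep in mind that the name you see there usually differs from the dotted name used in the Java code: the Prometheus naming convention exposes dotted meter names in snake_case, counters get a `_total` suffix, and each tag becomes a label. The sketch below is a hypothetical, simplified illustration of that mapping in plain Java. It is not Micrometer's actual implementation, and it uses the name `api.request` without the trailing dot to keep the example simple.

```java
// Hypothetical, simplified illustration; not Micrometer's actual implementation.
final class PromNameSketch {

    // Converts a dotted counter name to a snake_case name with a _total suffix.
    static String counterName(String dottedName) {
        String snake = dottedName.replace('.', '_');
        return snake.endsWith("_total") ? snake : snake + "_total";
    }

    // Renders the series selector for a counter that carries a single tag.
    static String series(String dottedName, String tagKey, String tagValue) {
        return counterName(dottedName) + "{" + tagKey + "=\"" + tagValue + "\"}";
    }
}
```

Under these assumptions, `PromNameSketch.series("api.request", "filter", "TCP")` yields the kind of expression you would type into the Prometheus expression browser to find the counter incremented by the request processor above.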

In the next article we will create some alerts from these metrics and send them to email and MS Teams. You can read more about Prometheus here.

Thanks!

Senior Software Engineer | BSc (Hons) Engineering | CIMA | Autodidact | Knowledge-Seeker