Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. Originally developed at SoundCloud and now part of the Cloud Native Computing Foundation, Prometheus has become a widely used choice for system and application monitoring. This guide walks you through installing, configuring, and using Prometheus.
What is Prometheus?
Prometheus is a powerful system monitoring and alerting toolkit that:
- Collects and stores metrics as time-series data.
- Uses a powerful query language called PromQL to aggregate and query metrics.
- Supports multiple modes of graphing and dashboarding.
- Integrates with numerous third-party tools and services.
Getting Started with Prometheus
1. Installation and Setup
Step 1: Download Prometheus
- Visit the Prometheus download page and download the latest release for your operating system.
Step 2: Install Prometheus
- Extract the downloaded archive and navigate to the directory.
- You should see binaries like prometheus and promtool.
Step 3: Configure Prometheus
- Create a configuration file named prometheus.yml. Here’s an example configuration:
global:
  scrape_interval: 15s # Set the scrape interval to 15 seconds.
  evaluation_interval: 15s # Evaluate rules every 15 seconds.
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090'] # The Prometheus server itself.
Step 4: Start Prometheus
- Run the Prometheus server:
./prometheus --config.file=prometheus.yml
- Access the Prometheus web UI at http://localhost:9090.
2. Collecting Metrics
Prometheus scrapes metrics from HTTP endpoints, by default the /metrics path of each target. Applications need to expose their metrics in the Prometheus text exposition format.
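To make the format concrete, here is a minimal sketch that renders one metric family as text using only the standard library. The helper names (escape_label_value, render_metric) and the sample values are hypothetical, chosen for illustration; in practice a client library produces this output for you.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family as text.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical sample data for illustration only.
text = render_metric(
    "http_requests_total",
    "Total HTTP requests handled.",
    "counter",
    [
        ({"method": "get", "status": "200"}, 1027),
        ({"method": "post", "status": "500"}, 3),
    ],
)
print(text)
```

Each scrape returns the current value of every series, and Prometheus attaches the timestamp itself when it stores the sample.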
Step 1: Exporting Metrics
- Use client libraries available for various programming languages to instrument your code.
- Go: prometheus/client_golang
- Java: prometheus/client_java
- Python: prometheus/client_python
- Node.js: siimon/prom-client (community-maintained)
Example (Python)
- Install the client library:
pip install prometheus-client
- Instrument your application:
from prometheus_client import start_http_server, Summary
import random
import time

# Create a metric to track time spent and requests made.
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')

# Decorate function with metric.
@REQUEST_TIME.time()
def process_request(t):
    time.sleep(t)

if __name__ == '__main__':
    start_http_server(8000)
    while True:
        process_request(random.random())
Step 2: Configure Prometheus to Scrape Your Application
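- Add a job for your application to the scrape_configs section of prometheus.yml (append it to the existing list rather than adding a second scrape_configs key), then restart Prometheus: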
scrape_configs:
  - job_name: 'python_app'
    static_configs:
      - targets: ['localhost:8000']
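Before pointing Prometheus at the application, it can help to confirm that the endpoint returns what you expect. The following is a rough sketch, assuming the example app above is listening on port 8000; metric_names and fetch_metrics are hypothetical helper names, and the parser handles only the simple cases shown here.

```python
import urllib.request


def metric_names(exposition_text: str) -> set:
    """Return the set of sample names found in text-format metrics output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The sample name ends at the first '{' (labels) or space (value).
        cut = len(line)
        for sep in ("{", " "):
            pos = line.find(sep)
            if pos != -1:
                cut = min(cut, pos)
        names.add(line[:cut])
    return names


def fetch_metrics(url: str = "http://localhost:8000/metrics") -> str:
    """Fetch the raw metrics text; requires the example app to be running."""
    with urllib.request.urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


# A hand-written sample so the parser can be tried without a running app.
SAMPLE = """\
# HELP request_processing_seconds Time spent processing request
# TYPE request_processing_seconds summary
request_processing_seconds_count 12.0
request_processing_seconds_sum 5.7
"""

names = metric_names(SAMPLE)
print(sorted(names))
```

With the app running, metric_names(fetch_metrics()) lists what Prometheus will see on its next scrape.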
3. Querying Metrics with PromQL
PromQL is the query language Prometheus uses to select and aggregate time-series data.
Basic Queries
- Instant Vector: up
- Range Vector: up[5m]
- Aggregation: sum(rate(http_requests_total[1m]))
- Label Filtering: http_requests_total{job="python_app"}
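As a mental model for label filtering and aggregation, the sketch below treats an instant vector as a list of (labels, value) pairs and applies equality matchers to it. The sample data and the select helper are hypothetical, and real PromQL supports far more (regex matchers, grouping, and so on). The metric name is stored under the __name__ label, which is how Prometheus represents it internally.

```python
# Hypothetical instant vector: one (labels, value) pair per time series.
samples = [
    ({"__name__": "http_requests_total", "job": "python_app", "status": "200"}, 120.0),
    ({"__name__": "http_requests_total", "job": "python_app", "status": "500"}, 4.0),
    ({"__name__": "http_requests_total", "job": "other_app", "status": "200"}, 50.0),
]


def select(vector, name, **matchers):
    """Keep series whose metric name and labels equal the given matchers."""
    return [
        (labels, value)
        for labels, value in vector
        if labels.get("__name__") == name
        and all(labels.get(key) == val for key, val in matchers.items())
    ]


# Roughly what http_requests_total{job="python_app"} selects.
selected = select(samples, "http_requests_total", job="python_app")

# Roughly what sum(...) does: collapse the selected series into one value.
total = sum(value for _, value in selected)
print(len(selected), total)
```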
Step 1: Access Prometheus UI
- Navigate to the Graph tab in the Prometheus web UI.
Step 2: Run a Query
- Enter a query in the query box and click “Execute”. For example:
rate(http_requests_total[5m])
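- This query calculates the per-second rate of HTTP requests, averaged over the last 5 minutes. It assumes your application exports a counter named http_requests_total; the Python example above exports request_processing_seconds_count and request_processing_seconds_sum instead.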
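The arithmetic behind a per-second rate can be sketched in a few lines. This is a simplified illustration, not Prometheus's implementation: it handles counter resets by treating a drop as a restart from zero, but it does not extrapolate to the edges of the window as the real rate() function does. The simple_rate name and the sample window are hypothetical.

```python
def simple_rate(samples):
    """Per-second increase of a counter over (timestamp_seconds, value) samples.

    A value lower than its predecessor is treated as a counter reset, so the
    new value counts as the increase since the restart.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        increase += (curr - prev) if curr >= prev else curr
    return increase / elapsed


# One sample every 15 seconds; the counter resets between t=30 and t=45.
window = [(0, 100.0), (15, 130.0), (30, 160.0), (45, 10.0), (60, 40.0)]
result = simple_rate(window)
print(result)
```

Here the counter increases by 30 + 30 + 10 + 30 = 100 over 60 seconds, which is why a rate stays meaningful across application restarts.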
4. Setting Up Alerts
Prometheus allows you to define alerting rules and integrates with Alertmanager for handling alerts.
Step 1: Define Alerting Rules
- Create a file named alert.rules.yml:
groups:
  - name: example
    rules:
      - alert: HighErrorRate
        # Ratio of HTTP 500 responses to all responses, so 0.05 means 5%.
        expr: sum(rate(http_requests_total{status="500"}[5m])) / sum(rate(http_requests_total[5m])) > 0.05
        for: 10m
        labels:
          severity: page
        annotations:
          summary: "High error rate detected"
          description: "More than 5% of requests have returned HTTP 500 for the last 10 minutes."
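The for clause means the expression must stay true across consecutive evaluations for the whole duration before the alert fires; until then the alert is pending. The sketch below simulates that behaviour under simplifying assumptions (a single series, evenly spaced evaluations); alert_states and the sample values are hypothetical.

```python
def alert_states(values, threshold, for_seconds, interval_seconds):
    """Return the alert state after each evaluation of a series of values."""
    states = []
    active_since = None  # time at which the condition first became true
    for index, value in enumerate(values):
        now = index * interval_seconds
        if value > threshold:
            if active_since is None:
                active_since = now
            held = now - active_since
            states.append("firing" if held >= for_seconds else "pending")
        else:
            active_since = None  # condition cleared, so the timer resets
            states.append("inactive")
    return states


# Evaluate once a minute with a 3-minute "for" (shortened from 10m for brevity).
states = alert_states(
    [0.01, 0.08, 0.09, 0.02, 0.07, 0.07, 0.07, 0.07],
    threshold=0.05,
    for_seconds=180,
    interval_seconds=60,
)
print(states)
```

Note how the brief dip below the threshold at the fourth evaluation resets the timer, which is what keeps short spikes from paging anyone.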
Step 2: Configure Prometheus to Use the Alerting Rules
- Update your prometheus.yml:
rule_files:
  - "alert.rules.yml"
Step 3: Install and Configure Alertmanager
- Download Alertmanager from the Prometheus download page.
- Create a configuration file for Alertmanager, alertmanager.yml:
global:
  resolve_timeout: 5m
route:
  receiver: 'email'
receivers:
  - name: 'email'
    email_configs:
      - to: 'you@example.com'
        from: 'alertmanager@example.com'
        smarthost: 'smtp.example.com:587'
        auth_username: 'alertmanager@example.com'
        auth_identity: 'alertmanager@example.com'
        auth_password: 'password'
Step 4: Start Alertmanager
- Run Alertmanager:
./alertmanager --config.file=alertmanager.yml
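Once Alertmanager is running, one way to check the routing and email settings is to send it a hand-made alert. The sketch below builds a payload for Alertmanager's v2 alerts endpoint and, optionally, posts it. It assumes Alertmanager is listening on its default port 9093; the helper names and the TestAlert label values are hypothetical.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_test_alert(name="TestAlert", severity="page"):
    """Build a one-element alert list for the v2 alerts endpoint."""
    return [
        {
            "labels": {"alertname": name, "severity": severity},
            "annotations": {"summary": "Manual test alert"},
            "startsAt": datetime.now(timezone.utc).isoformat(),
        }
    ]


def post_alerts(alerts, base_url="http://localhost:9093"):
    """POST alerts to Alertmanager; requires Alertmanager to be running."""
    request = urllib.request.Request(
        f"{base_url}/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


payload = build_test_alert()
print(json.dumps(payload, indent=2))
# post_alerts(payload)  # uncomment once Alertmanager is listening on port 9093
```

If the receiver is configured correctly, the test alert should show up in the Alertmanager UI and trigger a notification.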
Step 5: Configure Prometheus to Send Alerts to Alertmanager
- Update your prometheus.yml:
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
5. Visualizing Metrics
Prometheus includes only a basic expression browser for ad hoc graphs. For dashboards, it is commonly paired with Grafana, which supports Prometheus as a data source.
Step 1: Install Grafana
- Download Grafana from the Grafana website.
Step 2: Start Grafana
- Follow the installation instructions and start the Grafana server.
Step 3: Add Prometheus as a Data Source
- Log in to Grafana (default http://localhost:3000, admin/admin).
- Go to “Configuration” > “Data Sources”.
- Click “Add data source” and select “Prometheus”.
- Configure the URL (e.g., http://localhost:9090) and save.
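The same data source can also be created through Grafana's HTTP API instead of the UI, which is handy for scripted setups. This is a minimal sketch, assuming a local Grafana with the default admin/admin credentials mentioned above; the helper names are hypothetical, and the payload covers only the basic fields.

```python
import base64
import json
import urllib.request


def datasource_payload(name="Prometheus", url="http://localhost:9090"):
    """Describe a Prometheus data source that Grafana queries server-side."""
    return {
        "name": name,
        "type": "prometheus",
        "url": url,
        "access": "proxy",
        "isDefault": True,
    }


def create_datasource(payload, grafana_url="http://localhost:3000",
                      user="admin", password="admin"):
    """POST the data source to Grafana; requires Grafana to be running."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{grafana_url}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


payload = datasource_payload()
print(json.dumps(payload, indent=2))
# create_datasource(payload)  # uncomment once Grafana is running
```

Change the default password before exposing Grafana beyond your own machine.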
Step 4: Create a Dashboard
- Go to “Dashboards” > “New Dashboard”.
- Click “Add new panel” and use PromQL to query Prometheus metrics.
- Customize the panel with different visualization options and save the dashboard.