Monitoring Magic: How Prometheus and cAdvisor Keep Your App in Check!
Turning app monitoring into a smooth, data-driven ride.
Monitoring (WHY is your application slow, down, etc.?)
Definition: Collecting and analyzing metrics over time to identify system performance trends, availability, and resource utilization.
Purpose: Alerts when predefined thresholds are breached, ensuring system reliability.
Example Metrics: CPU usage, memory consumption, request latency, error rates.
Tools: Prometheus, Datadog, Grafana.
Today, we will explore Prometheus.
What is Prometheus?
Prometheus is a Timeseries Database.
Timeseries DB: A time series database (TSDB) stores data points collected over time, with each entry having a timestamp. It is designed to analyze trends or patterns, such as CPU usage or stock prices, over time.
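For intuition, here is roughly what time series data looks like inside a TSDB such as Prometheus: every series is identified by a metric name plus a set of labels, and each sample is a timestamp paired with a value. The metric name and numbers below are purely illustrative, not taken from our setup.

#One time series = metric name + labels; samples are (timestamp, value) pairs
container_cpu_usage_seconds_total{name="notes-app"}
  2025-01-01T10:00:00Z   12.5
  2025-01-01T10:01:00Z   13.1
  2025-01-01T10:02:00Z   13.9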
Prometheus has a few core components: a scraper that collects data from various sources and writes it into the data store, and a query server that reads from that store and returns results to the user.
SoundCloud originally developed Prometheus as an open-source time series database for monitoring. It was later contributed to the Cloud Native Computing Foundation (CNCF), where it became an official hosted project.
Prometheus is a time series database that gathers data from various sources. It includes a query server that processes this data and sends the output to visualization tools like Grafana.
What is Grafana?
- Grafana is a tool for visualizing and monitoring data. It creates dashboards and graphs to help you see and understand your data from sources like Prometheus or databases.
Create a Prometheus and Grafana stack, deploy an application, and monitor it.
Let's get into the practical part:
Steps:
- Follow the steps below to create an EC2 instance:
- Assign any name to the instance and select the Ubuntu AMI.
- Select the t2.medium instance type, because running both Prometheus and Grafana requires at least this level of resources. Additionally, create a .pem key for secure access.
- Allow HTTP and HTTPS traffic, allocate 15 GB of storage, and click Launch instance.
- Connect to the instance via SSH.
- Update the system and install Docker. Then, create a directory named "observability."
#Update the system
sudo apt-get update
#Install docker and docker compose
sudo apt-get install docker.io docker-compose-v2 -y
#Add the current user to the docker group and refresh group membership
sudo usermod -aG docker $USER && newgrp docker
#Check the containers
docker ps
#Check docker compose version
docker compose version
#Create a directory
mkdir observability
cd observability
- To begin monitoring, we need an application. We will use our django-notes-app for this purpose.
git clone https://github.com/Chetan-Mohod/django-notes-app.git
cd django-notes-app/
#Switch from main branch to dev branch
git checkout dev
- Now create a Docker Image:
#Create an image
docker build -t notes-app .
#Create and start a container
docker run -d -p 8000:8000 notes-app
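Before exposing the port to the internet, you can sanity-check the container from inside the instance. This is a quick optional check; the exact response body depends on the app.

#Confirm the notes-app container is up
docker ps
#The Django app should answer on port 8000
curl -I http://localhost:8000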
- Go to our AWS EC2 instance → Security Rules → Edit Inbound Rules → Add an entry for port 8000, as shown in the image below.
Take the public IP address and enter it into the browser, followed by port number 8000.
You can see the app is running:
#Stop the container (we will manage it with Docker Compose next)
docker stop container_ID
- Create a docker-compose.yml and run it:
version: "3.8"

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
docker compose up
You can see that the application is running with Docker Compose. From now on, we will manage the application entirely through Docker Compose.
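If you want a quick look at the raw container logs, Docker Compose has a built-in command for that; run it from a second terminal (the service name notes-app comes from our docker-compose.yml):

#Follow the logs of the notes-app service
docker compose logs -f notes-app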
- Now we will move on to checking the logs, as shown in the image below.
- Where do we monitor these containers and their metrics over time? → The answer is Prometheus.
Let's proceed with the following steps:
Where can we find Prometheus?
Open your browser and search for Prometheus Docker Compose.
We need to add the Prometheus service/container to our docker-compose.yml file.
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
Explanation:
Prometheus uses the prometheus.yml file to configure scrape jobs, which define where Prometheus collects metrics from (targets and intervals). By default, this file is expected at /etc/prometheus/prometheus.yml inside the container. Using the volumes line in docker-compose.yml, you map your custom prometheus.yml from the host into the container, so you can define or update scrape configurations without modifying the container image.
First, we need to create this file. How do we write it? Refer to this documentation.
#prometheus.yml
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:8080']
Every minute, the prometheus job scrapes localhost:9090 and gathers the data.
- Now that our configuration file prometheus.yml has been created, we can mount it into our Prometheus container via docker-compose.yml.
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"
docker compose up -d
Docker Compose is working. To verify this, expose port 9090 in the inbound rules of your AWS EC2 instance.
Then copy your public_ip, paste it into your browser followed by port 9090, and check the results. You should see the Prometheus time series database UI; the Prometheus server is up and running.
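Two quick ways to confirm this from the instance and from the UI; the /-/healthy endpoint and the up metric are built into Prometheus:

#Should return a short "Healthy" message
curl http://localhost:9090/-/healthy
#In the Prometheus query page: up is 1 for every target Prometheus can scrape, 0 for targets that are down
up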
How do we get data into this server?
- To access the data, navigate to http://public_ip:9090/metrics. Here you will find the metrics that Prometheus exposes.
So, if you want to see this data in the Prometheus UI, click on Status → Targets. The Targets page shows the health of all endpoints.
- It will display that localhost:9090 is up, which indicates that Prometheus is running. However, our docker job on localhost:8080 is down because there is no Docker service listening there: inside the Prometheus container, localhost refers only to services running in that same container. If a service runs in a different container within the same network, you have to reach it using its container_name.
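A minimal way to see this container-name DNS in action, assuming the prom/prometheus image ships BusyBox wget (recent images do): exec into the Prometheus container and reach the notes-app service by its container name.

#Resolve and call notes-app by container name from inside the prometheus container
docker exec prometheus wget -qO- http://notes-app:8000 | head -n 5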
Here's an important point to consider: different Docker containers may run on various ports, such as 80, 8080, 9000, and 3000. How many endpoints can you realistically add? If you have 6 containers, each on a different port, you would need to add 6 endpoints; with 100 containers you would need 100 endpoints, which is not feasible. Ideally, there should be a single solution, and that is why we will use cAdvisor.
cAdvisor (short for Container Advisor) is a tool from Google. It collects metrics for each container on a host and exports them to Prometheus, which centralizes and visualizes the data for all containers across multiple hosts.
- Let's create another service for cAdvisor. Refer to this documentation:
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

volumes: #Prometheus volume
  prometheus_data:

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus #Store Prometheus data in the volume declared at the top
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro #Read-only access to the host root filesystem, so cAdvisor can see all container data
      - /var/run:/var/run:rw #Access to data of all running processes and the Docker socket
      - /sys:/sys:ro #Read-only access to system metrics (CPU, memory, cgroups)
      - /var/lib/docker/:/var/lib/docker:ro #Docker-related files
    depends_on:
      - redis
    networks:
      - monitoring

  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379" #Redis port
cAdvisor directly accesses the root file system. This allows it to gather all container data effectively.
Here's a concise explanation of the cAdvisor configuration:
1. cadvisor: defines the cAdvisor service in Docker Compose.
2. image (gcr.io/cadvisor/cadvisor:latest): uses the latest cAdvisor image from Google Container Registry.
3. container_name: cadvisor: names the container cadvisor for easy identification.
4. ports ("8080:8080"): exposes cAdvisor on port 8080 for access to the UI and metrics.
5. volumes: mounts host directories so cAdvisor can gather system and container data:
- /:/rootfs:ro: read-only access to the root filesystem for container stats.
- /var/run:/var/run:rw: access to process runtime data and the Docker socket.
- /sys:/sys:ro: read-only access to system performance metrics (e.g., CPU/memory).
- /var/lib/docker/:/var/lib/docker:ro: access to Docker data for metadata and resource tracking.
This setup allows cAdvisor to monitor all containers on the host effectively.
- cAdvisor utilizes a Redis cache to operate efficiently, which means you need to set up a Redis server. Redis is a key-value pair data store, ideal for storing temporary data. Since cAdvisor handles a large amount of container data, it uses Redis for caching purposes.
- Let's run it now. First bring the containers down, then apply the changes:
docker compose down
docker compose up
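Once the stack is back up, you can confirm from the instance that cAdvisor is already exporting Prometheus-format metrics; container_cpu_usage_seconds_total is one of its standard metric names:

#cAdvisor exposes its metrics at /metrics on port 8080
curl -s http://localhost:8080/metrics | grep -m 5 container_cpu_usage_seconds_total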
- cAdvisor runs on port 8080, so add an entry for port 8080 to our EC2 inbound rules. Then copy public_ip:8080 and paste it into your browser. You can see cAdvisor is running:
- Click on Docker Containers; you can see notes-app is available there.
- Here you can see real-time monitoring:
Now, how does cAdvisor provide this real-time data to Prometheus?
In Prometheus, you may notice that it is not displaying data for our other containers, indicating those targets are down. To resolve this, add the cAdvisor container name to the prometheus.yml file. Currently Prometheus only scrapes localhost; by including the container names of other services in the file, it will be able to fetch their data as well.
- To view the endpoint in Prometheus, we will add the container name to the prometheus.yml file.
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['cadvisor:8080'] #cadvisor = container name
- To restart the container, first stop it and then start it again:
docker compose down
docker compose up -d
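Alternatively, because we started Prometheus with --web.enable-lifecycle, a change that only touches prometheus.yml can be hot-reloaded without restarting the container (if your editor replaces the file rather than editing it in place, the bind mount may not pick it up, in which case restart as above):

#Ask the running Prometheus to re-read prometheus.yml
curl -X POST http://localhost:9090/-/reload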
Now, you can observe that our Docker container is running successfully.
This is the method for adding any Docker container to Prometheus.
Now that we have gathered all the data from our target, the next step is to monitor this information. How can we do this?
- To monitor the gathered data, open the query page in Prometheus and enter some queries; you can look up example queries in your browser.
#Queries
#CPU usage (%) of the notes-app container, averaged over the last 5 minutes
rate(container_cpu_usage_seconds_total{name="notes-app"}[5m]) * 100
#Network traffic (received + transmitted bytes per second) for notes-app over the last 30 minutes
rate(container_network_receive_bytes_total{name="notes-app"}[30m]) + rate(container_network_transmit_bytes_total{name="notes-app"}[30m])
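A couple of additional cAdvisor metrics you can try in the same way; these are standard cAdvisor metric names, and the name="notes-app" label matches our container name:

#Current memory usage of the notes-app container (bytes)
container_memory_usage_bytes{name="notes-app"}
#Filesystem reads per second by notes-app over the last 5 minutes
rate(container_fs_reads_total{name="notes-app"}[5m])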
- Now, if we have a server, such as a Kubernetes cluster node or any other machine, and we want to collect system-level metrics from it, we need to add a node exporter to our docker-compose.yml file. For detailed instructions, please refer to this documentation.
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

volumes: #Prometheus volume
  prometheus_data:

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus #Store Prometheus data in the volume declared at the top
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro #Read-only access to the host root filesystem, so cAdvisor can see all container data
      - /var/run:/var/run:rw #Access to data of all running processes and the Docker socket
      - /sys:/sys:ro #Read-only access to system metrics (CPU, memory, cgroups)
      - /var/lib/docker/:/var/lib/docker:ro #Docker-related files
    depends_on:
      - redis
    networks:
      - monitoring

  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379"

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - "9100:9100"
    networks:
      - monitoring
cAdvisor gathers container data and exposes it on port 8080. In a similar way, the node exporter collects all system- or node-level information and makes it available on port 9100.
- To connect our node exporter target on port 9100 to Prometheus, we need to add an entry for node-exporter in the prometheus.yml file and then apply the changes.
#vim prometheus.yml
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['cadvisor:8080'] #cadvisor = container name

  - job_name: 'node'
    scrape_interval: 1m
    static_configs:
      - targets: ['node-exporter:9100'] #node-exporter = container name
docker compose up -d
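After the stack comes up, you can also confirm that all three scrape jobs are healthy straight from the Prometheus HTTP API (the /api/v1/targets endpoint is built into Prometheus; the grep simply trims the JSON):

#Each target should report "health":"up"
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"'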
- The node exporter is now visible in Prometheus.
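With the node job being scraped, here are a few standard Node Exporter metrics you can try in the query page:

#Overall CPU utilisation (%) across all cores over the last 5 minutes
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
#Memory currently available on the node (bytes)
node_memory_MemAvailable_bytes
#Disk space still available per filesystem (bytes)
node_filesystem_avail_bytes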
- Now you can run node-level queries in the same way, either by typing them directly or by browsing the available metrics. When you are done exploring, stop the containers:
#Stop our containers
docker compose down
You might be wondering how to remember these queries; it's not easy to memorize them. That's why we will use Grafana to display all this data. Grafana does not require you to remember queries and makes it easy to build impressive dashboards for everything.
We will explore the integration of Prometheus and Grafana in our next blog.
Happy Learning :)
Chetan Mohod ✨
For more DevOps updates, you can follow me on 👇