Monitoring Magic: How Prometheus and cAdvisor Keep Your App in Check!
Turning app monitoring into a smooth, data-driven ride.
Monitoring (WHY is your application slow, down, etc.?)
Definition: Collecting and analyzing metrics over time to identify system performance trends, availability, and resource utilization.
Purpose: Alerts when predefined thresholds are breached, ensuring system reliability.
Example Metrics: CPU usage, memory consumption, request latency, error rates.
Tools: Prometheus, Datadog, Grafana.
Today, we will explore Prometheus.
What is Prometheus?
Prometheus is a Timeseries Database.
Timeseries DB: A time series database (TSDB) stores data points collected over time, with each entry having a timestamp. It is designed to analyze trends or patterns, such as CPU usage or stock prices, over time.
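For intuition, here is roughly what time series data looks like inside a TSDB such as Prometheus: every series is identified by a metric name plus a set of labels, and each sample is a timestamp paired with a value. The metric name and numbers below are purely illustrative, not taken from our setup.

#One time series = metric name + labels; samples are (timestamp, value) pairs
container_cpu_usage_seconds_total{name="notes-app"}
  2025-01-01T10:00:00Z   12.5
  2025-01-01T10:01:00Z   13.1
  2025-01-01T10:02:00Z   13.9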
Prometheus has a few core components: a scraper that collects data from various sources and writes it into the data store, and a query server that reads from that store and returns results to the user.
SoundCloud originally developed Prometheus as an open-source time series database for monitoring. It was later contributed to the Cloud Native Computing Foundation (CNCF), where it became an official hosted project.
Prometheus is a time series database that gathers data from various sources. It includes a query server that processes this data and sends the output to visualization tools like Grafana.
What is Grafana?
- Grafana is a tool for visualizing and monitoring data. It creates dashboards and graphs to help you see and understand your data from sources like Prometheus or databases.
Create a Prometheus and Grafana stack, deploy an application, and monitor it.
Let's get into the practical part:
Steps:
- Follow the steps below to create an EC2 instance:
- Assign any name to the instance and select the Ubuntu AMI.
- Select the t2.medium instance type, because running both Prometheus and Grafana requires at least this level of resources. Additionally, create a .pem key for secure access.
- Allow HTTP and HTTPS traffic, allocate 15 GB of storage, and click Launch instance.
- Connect to the instance via SSH.
- Update the system and install Docker. Then, create a directory named "observability."
#Update the system
sudo apt-get update
#Install docker and docker compose
sudo apt-get install docker.io docker-compose-v2 -y
#Add the current user to the docker group and refresh group membership
sudo usermod -aG docker $USER && newgrp docker
#Check the containers
docker ps
#Check docker compose version
docker compose version
#Create a directory
mkdir observability
cd observability
- To begin monitoring, we need an application. We will use our django-notes-app for this purpose.
git clone https://github.com/Chetan-Mohod/django-notes-app.git
cd django-notes-app/
#Switch from main branch to dev branch
git checkout dev
- Now create a Docker Image:
#Create an image
docker build -t notes-app .
#Create and start a container
docker run -d -p 8000:8000 notes-app
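Before exposing the port to the internet, you can sanity-check the container from inside the instance. This is a quick optional check; the exact response body depends on the app.

#Confirm the notes-app container is up
docker ps
#The Django app should answer on port 8000
curl -I http://localhost:8000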
- Go to our AWS EC2 instance → Security Rules → Edit Inbound Rules → Add an entry for port 8000, as shown in the image below.
Take the public IP address and enter it into the browser, followed by port number 8000.
You can see the app is running:
#Stop the container (we will manage it with Docker Compose next)
docker stop container_ID
- Create a docker-compose.yml and run it:
version: "3.8"

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
docker compose up
You can see that the application is running with Docker Compose. From now on, we will manage the application entirely through Docker Compose.
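If you want a quick look at the raw container logs, Docker Compose has a built-in command for that; run it from a second terminal (the service name notes-app comes from our docker-compose.yml):

#Follow the logs of the notes-app service
docker compose logs -f notes-app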
- Now we will move on to checking the logs, as shown in the image below.
- Where do we monitor these containers and their metrics over time? → The answer is Prometheus.
Let's proceed with the following steps:
Where can we find Prometheus?
Open your browser and search for Prometheus Docker Compose.
We need to add the Prometheus service/container to our docker-compose.yml file.
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
Explanation:
Prometheus uses the prometheus.yml file to configure scrape jobs, which define where Prometheus collects metrics from (targets and intervals). By default, this file is expected at /etc/prometheus/prometheus.yml inside the container. Using the volumes line in docker-compose.yml, you map your custom prometheus.yml from the host into the container, so you can define or update scrape configurations without modifying the container image.
First, we need to create this file. How do we write it? Refer to this documentation.
#prometheus.yml
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:8080']
Every minute, the prometheus job scrapes localhost:9090 and gathers the data.
- Now that our configuration file prometheus.yml has been created, we can mount it into our Prometheus container via docker-compose.yml.
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"
docker compose up -d
Docker Compose is working. To verify this, expose port 9090 in the inbound rules of your AWS EC2 instance.
Then copy your public_ip, paste it into your browser followed by port 9090, and check the results. You should see the Prometheus time series database UI; the Prometheus server is up and running.
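Two quick ways to confirm this from the instance and from the UI; the /-/healthy endpoint and the up metric are built into Prometheus:

#Should return a short "Healthy" message
curl http://localhost:9090/-/healthy
#In the Prometheus query page: up is 1 for every target Prometheus can scrape, 0 for targets that are down
up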
How do we get data into this server?
- To access the data, navigate to http://public_ip:9090/metrics. Here you will find the metrics that Prometheus exposes.
So, if you want to see this data in the Prometheus UI, click on Status → Targets. The Targets page shows the health of all endpoints.
- It will display that localhost:9090 is up, which indicates that Prometheus is running. However, our docker job on localhost:8080 is down because there is no Docker service listening there: inside the Prometheus container, localhost refers only to services running in that same container. If a service runs in a different container within the same network, you have to reach it using its container_name.
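A minimal way to see this container-name DNS in action, assuming the prom/prometheus image ships BusyBox wget (recent images do): exec into the Prometheus container and reach the notes-app service by its container name.

#Resolve and call notes-app by container name from inside the prometheus container
docker exec prometheus wget -qO- http://notes-app:8000 | head -n 5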
Here's an important point to consider: different Docker containers may run on various ports, such as 80, 8080, 9000, and 3000. How many endpoints can you realistically add? If you have 6 containers, each on a different port, you would need to add 6 endpoints; with 100 containers you would need 100 endpoints, which is not feasible. Ideally, there should be a single solution, and that is why we will use cAdvisor.
cAdvisor (short for Container Advisor) is a tool from Google. It collects metrics for each container on a host and exports them to Prometheus, which centralizes and visualizes the data for all containers across multiple hosts.
- Let's create another service for cAdvisor. Refer to this documentation:
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

volumes: #Prometheus volume
  prometheus_data:

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus #Store Prometheus data in the volume declared at the top
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro #Read-only access to the host root filesystem, so cAdvisor can see all container data
      - /var/run:/var/run:rw #Access to data of all running processes and the Docker socket
      - /sys:/sys:ro #Read-only access to system metrics (CPU, memory, cgroups)
      - /var/lib/docker/:/var/lib/docker:ro #Docker-related files
    depends_on:
      - redis
    networks:
      - monitoring

  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379" #Redis port
cAdvisor directly accesses the root file system. This allows it to gather all container data effectively.
Here's a concise explanation of the cAdvisor configuration:
1. cadvisor: defines the cAdvisor service in Docker Compose.
2. image (gcr.io/cadvisor/cadvisor:latest): uses the latest cAdvisor image from Google Container Registry.
3. container_name: cadvisor: names the container cadvisor for easy identification.
4. ports ("8080:8080"): exposes cAdvisor on port 8080 for access to the UI and metrics.
5. volumes: mounts host directories so cAdvisor can gather system and container data:
- /:/rootfs:ro: read-only access to the root filesystem for container stats.
- /var/run:/var/run:rw: access to process runtime data and the Docker socket.
- /sys:/sys:ro: read-only access to system performance metrics (e.g., CPU/memory).
- /var/lib/docker/:/var/lib/docker:ro: access to Docker data for metadata and resource tracking.
This setup allows cAdvisor to monitor all containers on the host effectively.
- cAdvisor utilizes a Redis cache to operate efficiently, which means you need to set up a Redis server. Redis is a key-value pair data store, ideal for storing temporary data. Since cAdvisor handles a large amount of container data, it uses Redis for caching purposes.
- Let's run it now. First bring the containers down, then apply the changes:
docker compose down
docker compose up
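Once the stack is back up, you can confirm from the instance that cAdvisor is already exporting Prometheus-format metrics; container_cpu_usage_seconds_total is one of its standard metric names:

#cAdvisor exposes its metrics at /metrics on port 8080
curl -s http://localhost:8080/metrics | grep -m 5 container_cpu_usage_seconds_total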
- cAdvisor runs on port 8080, so add an entry for port 8080 to our EC2 inbound rules. Then copy public_ip:8080 and paste it into your browser. You can see cAdvisor is running:
- Click on Docker Containers; you can see notes-app is available there.
- Here you can see real-time monitoring:
Now, how does cAdvisor provide this real-time data to Prometheus?
In Prometheus, you may notice that it is not displaying data for our other containers, indicating those targets are down. To resolve this, add the cAdvisor container name to the prometheus.yml file. Currently Prometheus only scrapes localhost; by including the container names of other services in the file, it will be able to fetch their data as well.
- To view the endpoint in Prometheus, we will add the container name to the prometheus.yml file.
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['cadvisor:8080'] #cadvisor = container name
- To restart the container, first stop it and then start it again:
docker compose down
docker compose up -d
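Alternatively, because we started Prometheus with --web.enable-lifecycle, a change that only touches prometheus.yml can be hot-reloaded without restarting the container (if your editor replaces the file rather than editing it in place, the bind mount may not pick it up, in which case restart as above):

#Ask the running Prometheus to re-read prometheus.yml
curl -X POST http://localhost:9090/-/reload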
Now, you can observe that our Docker container is running successfully.
This is the method for adding any Docker container to Prometheus.
Now that we have gathered all the data from our target, the next step is to monitor this information. How can we do this?
- To monitor the gathered data, open the query page in Prometheus and enter some queries; you can look up example queries in your browser.
#Queries
#CPU usage (%) of the notes-app container, averaged over the last 5 minutes
rate(container_cpu_usage_seconds_total{name="notes-app"}[5m]) * 100
#Network traffic (received + transmitted bytes per second) for notes-app over the last 30 minutes
rate(container_network_receive_bytes_total{name="notes-app"}[30m]) + rate(container_network_transmit_bytes_total{name="notes-app"}[30m])
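A couple of additional cAdvisor metrics you can try in the same way; these are standard cAdvisor metric names, and the name="notes-app" label matches our container name:

#Current memory usage of the notes-app container (bytes)
container_memory_usage_bytes{name="notes-app"}
#Filesystem reads per second by notes-app over the last 5 minutes
rate(container_fs_reads_total{name="notes-app"}[5m])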
- Now, if we have a server, such as a Kubernetes cluster node or any other machine, and we want to collect system-level metrics from it, we need to add a node exporter to our docker-compose.yml file. For detailed instructions, please refer to this documentation.
#vim docker-compose.yml
version: "3.8"

networks: #For the communication of the two containers (notes-app & prometheus)
  monitoring:
    driver: bridge

volumes: #Prometheus volume
  prometheus_data:

services:
  notes-app:
    build:
      context: django-notes-app/. #Dockerfile is inside django-notes-app
    container_name: notes-app
    ports:
      - "8000:8000"
    networks:
      - monitoring

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    networks:
      - monitoring
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - prometheus_data:/prometheus #Store Prometheus data in the volume declared at the top
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--web.console.libraries=/etc/prometheus/console_libraries'
      - '--web.console.templates=/etc/prometheus/consoles'
      - '--web.enable-lifecycle'
    ports:
      - "9090:9090"

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    container_name: cadvisor
    ports:
      - "8080:8080"
    volumes:
      - /:/rootfs:ro #Read-only access to the host root filesystem, so cAdvisor can see all container data
      - /var/run:/var/run:rw #Access to data of all running processes and the Docker socket
      - /sys:/sys:ro #Read-only access to system metrics (CPU, memory, cgroups)
      - /var/lib/docker/:/var/lib/docker:ro #Docker-related files
    depends_on:
      - redis
    networks:
      - monitoring

  redis:
    image: redis:latest
    container_name: redis
    ports:
      - "6379:6379"

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.rootfs=/rootfs'
      - '--path.sysfs=/host/sys'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - "9100:9100"
    networks:
      - monitoring
cAdvisor gathers container data and exposes it on port 8080. In a similar way, the node exporter collects all system- or node-level information and makes it available on port 9100.
- To connect our node exporter target on port 9100 to Prometheus, we need to add an entry for node-exporter in the prometheus.yml file and then apply the changes.
#vim prometheus.yml
global:
  scrape_interval: 1m

scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: 1m
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'docker'
    scrape_interval: 1m
    static_configs:
      - targets: ['cadvisor:8080'] #cadvisor = container name

  - job_name: 'node'
    scrape_interval: 1m
    static_configs:
      - targets: ['node-exporter:9100'] #node-exporter = container name
docker compose up -d
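After the stack comes up, you can also confirm that all three scrape jobs are healthy straight from the Prometheus HTTP API (the /api/v1/targets endpoint is built into Prometheus; the grep simply trims the JSON):

#Each target should report "health":"up"
curl -s http://localhost:9090/api/v1/targets | grep -o '"health":"[^"]*"'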
- The node exporter is now visible in Prometheus.
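With the node job being scraped, here are a few standard Node Exporter metrics you can try in the query page:

#Overall CPU utilisation (%) across all cores over the last 5 minutes
100 - (avg(rate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
#Memory currently available on the node (bytes)
node_memory_MemAvailable_bytes
#Disk space still available per filesystem (bytes)
node_filesystem_avail_bytes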
- Now you can run node-level queries in the same way, either by typing them directly or by browsing the available metrics. When you are done exploring, stop the containers:
#Stop our containers
docker compose down
You might be wondering how to remember these queries; it's not easy to memorize them. That's why we will use Grafana to display all this data. Grafana does not require you to remember queries and makes it easy to build impressive dashboards for everything.
We will explore the integration of Prometheus and Grafana in our next blog.
Happy Learning :)
Chetan Mohod ✨
For more DevOps updates, you can follow me on 👇