Avicted/metronome

Metronome

A lightweight Linux metrics collection agent built in Go for learning systems programming and time-series databases. Metronome collects host, cgroup, disk I/O, network, GPU, temperature, and process metrics and stores them in either TimescaleDB or memory.

Features

  • 7 Metric Collectors: Host (CPU, memory, load), cgroups v2, disk I/O, network, GPU (AMD), temperature sensors, processes
  • Dual Storage Modes: TimescaleDB for historical data or in-memory for real-time monitoring
  • HTTP API: JSON endpoints for programmatic access
  • Terminal UI: Real-time TUI viewer using Bubble Tea
  • Grafana Dashboards: Pre-configured visualizations for all metric types

Architecture

Metronome separates collection, parsing, storage, and presentation into distinct layers:

graph LR
    A[Linux Kernel] -->|/proc, /sys, cgroups| B[Collectors]
    B --> C[Parsers]
    C --> D{Storage Backend}
    D -->|TimescaleDB| E[PostgreSQL + TimescaleDB]
    D -->|Memory| F[In-Memory Store]
    E --> G[Grafana Dashboards]
    F --> H[HTTP API]
    H --> I[Terminal Viewer]

Data Flow:

  1. Collectors read metrics from kernel pseudo-filesystems at configured intervals
  2. Parsers transform raw kernel data into structured metrics
  3. Storage persists data in either TimescaleDB (historical) or memory (real-time)
  4. Visualization through Grafana dashboards or terminal UI
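
As a rough illustration of steps 1 and 2, the sketch below reads `/proc/loadavg` and parses it into a struct. The `LoadAvg` type and `parseLoadAvg` function are hypothetical names chosen for this example and are not taken from Metronome's source.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// LoadAvg is a hypothetical structured metric; Metronome's real types may differ.
type LoadAvg struct {
	One, Five, Fifteen float64
}

// parseLoadAvg turns the raw text of /proc/loadavg
// (for example "2.50 2.10 1.80 3/1234 56789") into a LoadAvg.
func parseLoadAvg(raw string) (LoadAvg, error) {
	fields := strings.Fields(raw)
	if len(fields) < 3 {
		return LoadAvg{}, fmt.Errorf("loadavg: want at least 3 fields, got %d", len(fields))
	}
	var vals [3]float64
	for i := range vals {
		v, err := strconv.ParseFloat(fields[i], 64)
		if err != nil {
			return LoadAvg{}, fmt.Errorf("loadavg: field %d: %w", i, err)
		}
		vals[i] = v
	}
	return LoadAvg{One: vals[0], Five: vals[1], Fifteen: vals[2]}, nil
}

func main() {
	// Step 1 (collect): read the kernel pseudo-file.
	raw, err := os.ReadFile("/proc/loadavg")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Step 2 (parse): convert raw text into a structured metric.
	la, err := parseLoadAvg(string(raw))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Steps 3 and 4 (store, visualize) are reduced to a print here.
	fmt.Printf("load averages: %.2f %.2f %.2f\n", la.One, la.Five, la.Fifteen)
}
```

Keeping the parser a pure function of its input string makes it testable without a live `/proc` filesystem.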

Key Components:

  • cmd/agent/: Main collection agent
  • cmd/viewer/: Terminal-based UI client
  • internal/collector/: 7 metric collectors (host, cgroup, disk, network, GPU, temperature, process)
  • internal/parser/: Kernel file parsers (/proc, /sys, cgroup v2)
  • internal/storage/: Pluggable storage backends (TimescaleDB, memory)
  • internal/api/: HTTP server for real-time metrics
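
One way to picture the pluggable storage layer is a small interface with an in-memory implementation. This is a minimal sketch under assumptions: the `Store` interface, `Snapshot` type, and `MemoryStore` are illustrative names, and the actual definitions in `internal/storage/` may look different.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
	"time"
)

// Snapshot is a hypothetical, trimmed-down metrics record.
type Snapshot struct {
	Timestamp       time.Time
	CPUUsagePercent float64
}

// Store is an assumed shape for a pluggable backend; a TimescaleDB
// implementation would satisfy the same interface with SQL inserts.
type Store interface {
	Save(s Snapshot) error
	Latest() (Snapshot, error)
}

// ErrEmpty is returned by Latest before anything has been saved.
var ErrEmpty = errors.New("no snapshot stored yet")

// MemoryStore keeps only the most recent snapshot and is safe for
// concurrent use by a collector goroutine and API handlers.
type MemoryStore struct {
	mu     sync.RWMutex
	latest Snapshot
	filled bool
}

// Save replaces the stored snapshot.
func (m *MemoryStore) Save(s Snapshot) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.latest, m.filled = s, true
	return nil
}

// Latest returns the most recent snapshot, or ErrEmpty if none exists.
func (m *MemoryStore) Latest() (Snapshot, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	if !m.filled {
		return Snapshot{}, ErrEmpty
	}
	return m.latest, nil
}

func main() {
	var store Store = &MemoryStore{} // another backend could be swapped in here
	if err := store.Save(Snapshot{Timestamp: time.Now(), CPUUsagePercent: 15.3}); err != nil {
		fmt.Println("save:", err)
		return
	}
	s, err := store.Latest()
	if err != nil {
		fmt.Println("latest:", err)
		return
	}
	fmt.Printf("cpu=%.1f%%\n", s.CPUUsagePercent)
}
```

Because collectors depend only on the interface, switching between memory and database storage would not require changes to collection code.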

Screenshots

Viewer TUI

Real-time selected metrics

Host Overview Dashboard

Real-time CPU, memory, and load average visualization

GPU Metrics Visualization

AMD GPU utilization, memory usage, temperature, and power consumption

Process Explorer Dashboard

Top processes by CPU and memory usage

Network Traffic Dashboard

Network interface bandwidth and packet rates

Quick Start

Memory Mode (no database required):

# Build binaries
make build-all

# Start agent in memory mode with API enabled
STORAGE_MODE=memory API_ENABLED=true ./bin/agent

# In another terminal, start the viewer
./bin/viewer

TimescaleDB Mode (full historical data):

# Start database
cp .env.example .env
make docker-up

# Build and run agent
make build
make run

# Access Grafana dashboards
xdg-open http://localhost:3000  # Login: admin/admin

API Documentation

The HTTP API is available when API_ENABLED=true (default port 8080).

Endpoints

GET /api/latest

Returns the latest collected metrics in JSON format.

Example Request:

curl http://localhost:8080/api/latest

Example Response:

{
  "version": "v1",
  "timestamp": "2026-02-04T15:41:22+02:00",
  "host": {
    "cpu_usage_percent": 15.3,
    "memory_used_mb": 8192,
    "memory_total_mb": 32768,
    "load_avg_1": 2.5,
    "load_avg_5": 2.1,
    "load_avg_15": 1.8
  },
  "disk": [
    {
      "device": "nvme0n1",
      "reads_completed": 12345,
      "read_mb": 512,
      "writes_completed": 6789,
      "write_mb": 256,
      "io_time_ms": 1500
    }
  ],
  "network": [
    {
      "interface": "eth0",
      "rx_mb": 1024,
      "rx_packets": 50000,
      "rx_errors": 0,
      "rx_dropped": 0,
      "tx_mb": 512,
      "tx_packets": 40000,
      "tx_errors": 0,
      "tx_dropped": 0
    }
  ],
  "gpu": [
    {
      "device": "card1",
      "vendor": "amd",
      "utilization_percent": 4,
      "memory_used_mb": 1337,
      "memory_total_mb": 16368,
      "temperature_c": 41,
      "power_draw_watts": 17
    }
  ],
  "temperature": [
    {
      "sensor_type": "cpu",
      "device": "hwmon2",
      "sensor_name": "k10temp",
      "label": "Tctl",
      "temperature_c": 46.0,
      "crit_temp_c": 95.0
    }
  ]
}
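
For programmatic access from Go, a client could decode the response into a struct that mirrors the example above. This sketch covers only a subset of the fields; the `Latest` type and the helper functions are names invented for this example, and the base URL assumes the default port.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// Latest mirrors a subset of the example response; JSON fields that
// are not listed here are ignored by encoding/json.
type Latest struct {
	Version   string    `json:"version"`
	Timestamp time.Time `json:"timestamp"`
	Host      struct {
		CPUUsagePercent float64 `json:"cpu_usage_percent"`
		MemoryUsedMB    float64 `json:"memory_used_mb"`
		MemoryTotalMB   float64 `json:"memory_total_mb"`
	} `json:"host"`
	GPU []struct {
		Device             string  `json:"device"`
		UtilizationPercent float64 `json:"utilization_percent"`
	} `json:"gpu"`
}

// decodeLatest parses a raw /api/latest body.
func decodeLatest(body []byte) (Latest, error) {
	var l Latest
	if err := json.Unmarshal(body, &l); err != nil {
		return Latest{}, fmt.Errorf("decode /api/latest: %w", err)
	}
	return l, nil
}

// memoryUsedPercent derives a percentage from the two memory fields.
func memoryUsedPercent(l Latest) float64 {
	if l.Host.MemoryTotalMB == 0 {
		return 0
	}
	return 100 * l.Host.MemoryUsedMB / l.Host.MemoryTotalMB
}

// fetchLatest performs the GET request and decodes the body.
func fetchLatest(client *http.Client, baseURL string) (Latest, error) {
	resp, err := client.Get(baseURL + "/api/latest")
	if err != nil {
		return Latest{}, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return Latest{}, fmt.Errorf("unexpected status %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return Latest{}, err
	}
	return decodeLatest(body)
}

func main() {
	client := &http.Client{Timeout: 5 * time.Second}
	l, err := fetchLatest(client, "http://localhost:8080") // assumes a running agent
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("cpu=%.1f%% mem=%.1f%% gpus=%d\n",
		l.Host.CPUUsagePercent, memoryUsedPercent(l), len(l.GPU))
}
```

The agent must be running with `API_ENABLED=true` for the request to succeed.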

GET /health

Health check endpoint for monitoring the storage backend.

Example Request:

curl http://localhost:8080/health

Example Response (Healthy):

{
  "status": "healthy"
}

Example Response (Unhealthy):

{
  "status": "unhealthy",
  "error": "failed to ping database"
}

HTTP Status Codes:

  • 200 OK: Storage backend is healthy
  • 503 Service Unavailable: Storage backend is unhealthy
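
The mapping from backend state to status code and body can be sketched as follows. This is not Metronome's actual handler: `healthResponse`, `healthHandler`, and the `ping` callback are assumptions made for illustration, and the example exercises the handler with `httptest` rather than a live server.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// healthBody matches the two example responses above.
type healthBody struct {
	Status string `json:"status"`
	Error  string `json:"error,omitempty"`
}

// errPing is a stand-in for a storage backend failure.
var errPing = errors.New("failed to ping database")

// healthResponse maps a ping result to an HTTP status and JSON body.
func healthResponse(pingErr error) (int, healthBody) {
	if pingErr != nil {
		return http.StatusServiceUnavailable,
			healthBody{Status: "unhealthy", Error: pingErr.Error()}
	}
	return http.StatusOK, healthBody{Status: "healthy"}
}

// healthHandler wraps a hypothetical ping callback in an HTTP handler.
func healthHandler(ping func() error) http.HandlerFunc {
	return func(w http.ResponseWriter, _ *http.Request) {
		code, body := healthResponse(ping())
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(code)
		// A write error here means the client disconnected; nothing to recover.
		_ = json.NewEncoder(w).Encode(body)
	}
}

func main() {
	pings := []func() error{
		func() error { return nil },     // healthy backend
		func() error { return errPing }, // unhealthy backend
	}
	for _, ping := range pings {
		rec := httptest.NewRecorder()
		req := httptest.NewRequest(http.MethodGet, "/health", nil)
		healthHandler(ping)(rec, req)
		fmt.Printf("%d %s", rec.Code, rec.Body.String())
	}
}
```

Returning 503 rather than 200 with an error body lets load balancers and uptime checks react to the status code alone.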

Configuration

Configure the agent and viewer via environment variables or a .env file:

Agent Settings:

STORAGE_MODE=timescale    # "timescale" or "memory"
API_ENABLED=true          # Enable HTTP API (default: false)
API_PORT=8080             # API server port

Viewer Settings:

METRONOME_API_URL=http://localhost:8080  # Agent API endpoint

Database (TimescaleDB mode only):

DB_HOST=localhost
DB_PORT=5432
DB_USER=metronome
DB_PASSWORD=metronome_password
DB_NAME=metronome
DB_SSLMODE=disable
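
A minimal sketch of how the agent settings might be read and validated in Go is shown below. The `Config` type and `loadConfig` function are hypothetical. The `timescale` default for `STORAGE_MODE` is an assumption, whereas the `API_ENABLED` and `API_PORT` defaults follow the values documented above. The sketch reads only the process environment and does not parse a .env file.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Config holds the agent settings; the type is illustrative only.
type Config struct {
	StorageMode string
	APIEnabled  bool
	APIPort     int
}

// loadConfig reads settings through getenv so that tests can supply
// a fake environment instead of os.Getenv.
func loadConfig(getenv func(string) string) (Config, error) {
	// STORAGE_MODE default is an assumption; the others are documented.
	cfg := Config{StorageMode: "timescale", APIEnabled: false, APIPort: 8080}

	if v := getenv("STORAGE_MODE"); v != "" {
		if v != "timescale" && v != "memory" {
			return Config{}, fmt.Errorf("STORAGE_MODE: unknown mode %q", v)
		}
		cfg.StorageMode = v
	}
	if v := getenv("API_ENABLED"); v != "" {
		b, err := strconv.ParseBool(v)
		if err != nil {
			return Config{}, fmt.Errorf("API_ENABLED: %w", err)
		}
		cfg.APIEnabled = b
	}
	if v := getenv("API_PORT"); v != "" {
		p, err := strconv.Atoi(v)
		if err != nil || p < 1 || p > 65535 {
			return Config{}, fmt.Errorf("API_PORT: invalid port %q", v)
		}
		cfg.APIPort = p
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", cfg)
}
```

Failing fast on an unknown storage mode or an out-of-range port surfaces misconfiguration at startup instead of at the first write.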

Testing

# Run all tests
make test

# Run with coverage report
make test-coverage

# Run integration tests (requires Docker)
make test-integration

# View detailed coverage
./coverage.sh                    # Show all coverage
./coverage.sh --zero             # Show only 0% coverage
INTEGRATION_TESTS=true ./coverage.sh  # Include integration tests

Troubleshooting

Cgroups v2 Not Available

Symptom: Agent fails to collect cgroup metrics

Solution:

# Check if cgroups v2 is mounted
mount | grep cgroup2

# Verify the kernel supports cgroups v2
grep cgroup2 /proc/filesystems

If cgroups v2 is not enabled, add systemd.unified_cgroup_hierarchy=1 to your kernel boot parameters.
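
The same mount check can be done from Go by scanning `/proc/mounts` for a `cgroup2` filesystem. This is a sketch for diagnostics, not the agent's own detection logic, and `cgroup2MountPoint` is a name made up for this example.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// cgroup2MountPoint scans /proc/mounts-style text and returns the first
// mount point whose filesystem type is cgroup2.
func cgroup2MountPoint(mounts string) (string, bool) {
	for _, line := range strings.Split(mounts, "\n") {
		// Each line is: device mountpoint fstype options dump pass
		fields := strings.Fields(line)
		if len(fields) >= 3 && fields[2] == "cgroup2" {
			return fields[1], true
		}
	}
	return "", false
}

func main() {
	data, err := os.ReadFile("/proc/mounts")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if mp, ok := cgroup2MountPoint(string(data)); ok {
		fmt.Println("cgroups v2 mounted at", mp)
	} else {
		fmt.Println("cgroups v2 is not mounted")
	}
}
```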

Permission Denied Errors

Symptom: Cannot read /proc or /sys files

Solution:

# Add your user to required groups
sudo usermod -a -G video,render $USER

# Log out and back in for changes to take effect

GPU Metrics Not Showing

Symptom: GPU metrics are empty or not collected

Supported: AMD GPUs only (reads /sys/class/drm/cardX/device/)

Not Supported: NVIDIA and Intel GPUs

# Check if GPU device files are readable
ls -la /sys/class/drm/card*/device/gpu_busy_percent

Docker Database Connection Failed

Symptom: Agent cannot connect to TimescaleDB

Solution:

# Check if containers are running
docker-compose ps

# View database logs
docker-compose logs timescaledb

# Restart stack
make docker-down
make docker-up

License

MIT License - see LICENSE for details.


Note: This is an educational project built for learning purposes. For production monitoring, use established tools like Prometheus node_exporter, cAdvisor, or commercial solutions.
