Unified Monitoring Stack
📖Introduction
Mango, located at 192.168.110.133 on the Infra network, is the unified successor to the Prometheus, Grafana, and Victoria triad. It serves as the central hub for the Home Lab's observability. Mango natively scrapes metrics from all virtual machines, the Proxmox host (Pear), and their services, stores them in a high-performance VictoriaMetrics time-series database, and provides a Grafana interface for visualization.
By consolidating these services, we reduce network overhead and simplify the management of our monitoring infrastructure while maintaining 12-month data retention on a dedicated 500GB storage pool.
🚦Security & Network Architecture
Mango sits within the Infra network. Because it aggregates data from every host in the lab, it is a high-value target.
- Web Interfaces: Grafana (Port 3000) and VictoriaMetrics VMUI (Port 8428) are restricted via pfSense to be accessible only from the MGT network (Cinnamon/Lemon).
- Scraping Flow: Mango acts as the source for all scrape requests. pfSense rules must allow Mango to reach out to Production, VPN, and Terminal networks on specific exporter ports (9100, 9113, 9117, etc.).
- Storage Pool: Data is stored on a dedicated 500GB virtual disk (PearPool), mounted at /mnt/metrics_data to ensure that metric growth never impacts the OS root partition.
🏛️Environment & Storage Setup
The VM was created using the Debian Gold Master template.
- Hostname: Mango
- IP/Gateway: 192.168.110.133 / 192.168.110.1
- Disk 1 (OS): 32GB
- Disk 2 (Data): 500GB (Added via Proxmox)
Storage Initialization
To handle the long-term metrics, the 500GB disk was initialized and mounted:
# Identify disk (sdb), format, and mount
sudo mkfs.ext4 /dev/sdb
sudo mkdir -p /mnt/metrics_data
sudo mount /dev/sdb /mnt/metrics_data
# Ensure persistence in /etc/fstab
/dev/sdb /mnt/metrics_data ext4 defaults 0 2
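The fstab entry above refers to the disk as /dev/sdb, and that name can change if disks are added or reordered in Proxmox. A minimal sketch of the same entry keyed on the filesystem UUID instead, assuming the disk is still /dev/sdb when the UUID is read. The helper name fstab_entry is hypothetical.

```shell
#!/bin/sh
# Build an /etc/fstab line that identifies the filesystem by UUID.
# Arguments: <uuid> <mountpoint>. Type and options mirror the original entry.
fstab_entry() {
    printf 'UUID=%s %s ext4 defaults 0 2\n' "$1" "$2"
}

# Intended use on Mango (needs root, so shown as comments only):
#   uuid=$(sudo blkid -s UUID -o value /dev/sdb)
#   fstab_entry "$uuid" /mnt/metrics_data | sudo tee -a /etc/fstab
#   sudo mount -a   # a clean return suggests the new entry parses
```

If this form is used, the /dev/sdb line should be removed from /etc/fstab so the mount point is not listed twice.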
🔧Installation
⚡VictoriaMetrics Installation
VictoriaMetrics was installed as a native binary (not Docker) to replace both the Prometheus scraper and the Victoria storage VM.
- User & Directory Setup
sudo useradd --no-create-home --shell /bin/false victoriametrics
sudo mkdir /etc/victoriametrics
sudo chown -R victoriametrics:victoriametrics /etc/victoriametrics /mnt/metrics_data
- Binary Installation
Binaries were retrieved from the VictoriaMetrics GitHub.
wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.xx.x/victoria-metrics-linux-amd64-v1.xx.x.tar.gz
tar -xvf victoria-metrics-linux-amd64-v1.xx.x.tar.gz
sudo mv victoria-metrics-prod /usr/local/bin/victoriametrics
sudo chown victoriametrics:victoriametrics /usr/local/bin/victoriametrics
- Service Configuration
sudo nano /etc/systemd/system/victoriametrics.service
[Service]
ExecStart=/usr/local/bin/victoriametrics \
    --storageDataPath=/mnt/metrics_data \
    --retentionPeriod=12 \
    --promscrape.config=/etc/victoriametrics/prometheus.yml \
    --httpListenAddr=0.0.0.0:8428
Note: --retentionPeriod=12 keeps one year of history. A value without a unit suffix is interpreted as months.
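Only the [Service] section is recorded above. A sketch of what a complete unit file could look like, with the ExecStart line taken from these notes and everything else an assumption: the description text, the network and mount ordering, running as the victoriametrics user created earlier, and the restart policy. The function name victoriametrics_unit is hypothetical.

```shell
#!/bin/sh
# Print a complete unit file to stdout; nothing is written to disk here.
victoriametrics_unit() {
    cat <<'EOF'
[Unit]
Description=VictoriaMetrics single-node server
After=network-online.target
Wants=network-online.target
# Assumption: do not start before the 500GB data disk is mounted
RequiresMountsFor=/mnt/metrics_data

[Service]
# Assumption: run as the service account created in the user setup step
User=victoriametrics
Group=victoriametrics
ExecStart=/usr/local/bin/victoriametrics \
    --storageDataPath=/mnt/metrics_data \
    --retentionPeriod=12 \
    --promscrape.config=/etc/victoriametrics/prometheus.yml \
    --httpListenAddr=0.0.0.0:8428
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Possible install steps (need root, so shown as comments only):
#   victoriametrics_unit | sudo tee /etc/systemd/system/victoriametrics.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now victoriametrics
```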
🔍Scraping Configuration (prometheus.yml)
VictoriaMetrics uses the standard Prometheus YAML format for its scraper. The file was copied from the older Prometheus host Pineapple and then edited in place:
sudo nano /etc/victoriametrics/prometheus.yml
Key Change: The evaluation_interval directive was removed. It controls the evaluation of alerting and recording rules, which the VictoriaMetrics single binary does not perform (that job belongs to vmalert).
🧪Target Jobs
The configuration includes the legacy fleet plus the new 2026 additions:
- Infrastructure: Mango (Self), CTNS1.
- Production:
- Reverse proxy (Nginx) Raisin
- Webservers (Apache) Plum, Satsuma, Fig
- Database server (MySQL) Mandarin
- New 2026 Hosts: Blackcurrant (Data & Archive), Quince (AI/Media), Tayberry (OpenAlex).
- Gaming: Apple & Cherry (Minecraft Servers).
- Terminals:
- (NoMachine) Kiwiberry
- (XRDP) Kapok
- (Windows, RDP) Wahoo/Walnut
Scrape Interval: Set to 120s to balance data resolution with disk I/O and longevity.
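The job list and interval above could map onto prometheus.yml roughly as follows. This is a skeleton, not the real file: the job name node and the FQDNs for Raisin and Plum are assumptions modelled on the seaoffate.net names used elsewhere in these notes. The -dryRun flag and the /-/reload endpoint in the comments are given as I understand them from the VictoriaMetrics documentation; check victoriametrics -help before relying on them.

```shell
#!/bin/sh
# Print a skeleton scrape config to stdout for review; nothing is written to disk.
scrape_config_skeleton() {
    cat <<'EOF'
global:
  scrape_interval: 120s   # the interval chosen for this lab

scrape_configs:
  - job_name: 'node'      # hypothetical job name
    static_configs:
      - targets:
          - 'raisin.seaoffate.net:9100'   # assumed FQDN
          - 'plum.seaoffate.net:9100'     # assumed FQDN
EOF
}

# Possible workflow on Mango (shown as comments only):
#   scrape_config_skeleton > /tmp/prometheus.yml
#   /usr/local/bin/victoriametrics -promscrape.config=/tmp/prometheus.yml -dryRun
#   curl -fsS http://localhost:8428/-/reload   # re-read the live config without a restart
```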
Adding the Dockge Targets
We had to update /etc/victoriametrics/prometheus.yml to include the Docker containers:
- job_name: 'docker_containers'
  static_configs:
    - targets:
        - 'quince.seaoffate.net:8080' # cAdvisor for AI Stack
        - 'blackberry.seaoffate.net:8080' # cAdvisor for Data Archive
        - 'tayberry.seaoffate.net:8080' # cAdvisor for OpenAlex
- job_name: 'gpu_metrics'
  static_configs:
    - targets: ['quince.seaoffate.net:9400']
- job_name: 'jellyfin'
  static_configs:
    - targets: ['quince.seaoffate.net:8096']
Target Agent Installation (Exporters)
For Mango to collect data, each target VM must run a specific exporter. Most Linux hosts use the node_exporter for OS metrics, while application-specific exporters are used for Nginx, Apache, and MySQL.
Linux Node Exporter (Standard for all VMs)
Installed on all Linux hosts (Raisin, Plum, Satsuma, Apple, Cherry, etc.) to monitor CPU, RAM, and Disk. Any hosts that don't show on the targets webpage need to have the agent installed.
- Install via APT
sudo apt update && sudo apt install -y prometheus-node-exporter
- Enable and Start
sudo systemctl enable --now prometheus-node-exporter
- Verification (Run on target VM)
curl http://localhost:9100/metrics
- Firewall Requirement: Target VM must allow Inbound TCP 9100 from Mango (192.168.110.133).
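The curl check above prints every metric the exporter exposes, which is a lot to read through. A small filter that pulls one sample out of that output is sketched below. The function name metric_value is hypothetical, and the filter assumes samples carry no trailing timestamp, which is typically the case for node_exporter output.

```shell
#!/bin/sh
# Read Prometheus text-format metrics on stdin and print the value of the
# first sample whose metric name equals $1 (with or without labels).
metric_value() {
    awk -v m="$1" '$1 == m || index($1, m "{") == 1 { print $NF; exit }'
}

# Possible use on a target VM (shown as a comment only):
#   curl -s http://localhost:9100/metrics | metric_value node_load1
```

An empty result means the metric was not found, which usually points at a collector that is disabled or an exporter that is not answering.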
Nginx (Raisin)
Used to monitor active connections and request rates. The Nginx exporter is a standalone binary that talks to Nginx's stub_status module.
- Enable Nginx Status: On Raisin, edit the Nginx config (e.g., /etc/nginx/sites-available/default) and add this block:
server {
    listen 127.0.0.1:8080;
    location /metrics {
        stub_status on;
        access_log off;
        allow 127.0.0.1;
        deny all;
    }
}
sudo nginx -s reload
- Install & Run Exporter
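# The release tag is pinned to match the file name; a /releases/latest/ URL only resolves while 0.11.0 is the newest release
wget https://github.com/nginx/nginx-prometheus-exporter/releases/download/v0.11.0/nginx-prometheus-exporter_0.11.0_linux_amd64.tar.gz
tar -xvf nginx-prometheus-exporter_*.tar.gz
sudo mv nginx-prometheus-exporter /usr/local/bin/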
- Create a Systemd Service
sudo nano /etc/systemd/system/nginx-exporter.service
Paste this into the service file:
[Unit]
Description=Nginx Prometheus Exporter
After=network.target

[Service]
Type=simple
User=prometheus
ExecStart=/usr/local/bin/nginx-prometheus-exporter \
    -nginx.scrape-uri=http://127.0.0.1:8080/metrics
Restart=always

[Install]
WantedBy=multi-user.target
Enable the service with:
sudo systemctl enable --now nginx-exporter
MySQL (Mandarin)
Used to monitor query throughput and database health.
- Database User: Create a mysqld_exporter user in MySQL with PROCESS, REPLICATION CLIENT, SELECT privileges.
- Configuration: Store credentials in /etc/.mysqld_exporter.cnf.
- Service: Install prometheus-mysqld-exporter via APT.
- Port: TCP 9104
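The database user and credentials file described above could be created along these lines. This is a sketch under assumptions: the exporter runs on Mandarin itself (hence 'localhost'), the connection limit follows the mysqld_exporter README, and the password is generated on the spot. The two function names are hypothetical.

```shell
#!/bin/sh
# Print the SQL for the exporter account.
# $1 is the password; it must not contain a single quote.
mysqld_exporter_sql() {
    cat <<EOF
CREATE USER 'mysqld_exporter'@'localhost' IDENTIFIED BY '$1' WITH MAX_USER_CONNECTIONS 3;
GRANT PROCESS, REPLICATION CLIENT, SELECT ON *.* TO 'mysqld_exporter'@'localhost';
EOF
}

# Print the matching client config for /etc/.mysqld_exporter.cnf.
mysqld_exporter_cnf() {
    printf '[client]\nuser=mysqld_exporter\npassword=%s\n' "$1"
}

# Possible use on Mandarin (needs root, so shown as comments only):
#   pw=$(openssl rand -hex 16)
#   mysqld_exporter_sql "$pw" | sudo mysql
#   mysqld_exporter_cnf "$pw" | sudo tee /etc/.mysqld_exporter.cnf > /dev/null
#   # then restrict the file to whichever user runs the exporter service
```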
Apache (Plum, Fig, Satsuma)
- Enable Mod Status:
sudo a2enmod status
- Install Exporter:
sudo apt install prometheus-apache-exporter
- Port: TCP 9117
Windows Exporter (Wahoo & Walnut)
For the Windows 11 desktops, we use windows_exporter.
- Download: Latest .msi from the Prometheus Community GitHub.
- Install: Run the installer; it defaults to port 9182.
- Firewall: The installer typically adds a "Windows Firewall" exception automatically.
Docker & Container App Exporters
Since we are using Dockge to manage our containers on hosts like Quince (AI), Blackcurrant (Archive), and Tayberry (OpenAlex), we should standardize how metrics are pulled from these environments. The most efficient way to do this is to add cAdvisor to each of our Dockge stacks. This allows Mango to "see" inside the Docker engine of that specific VM and report on the health of every individual container (Ollama, Jellyfin, etc.).
Docker Container Monitoring (The Dockge Layer)
For every VM running Dockge, you need to add a Monitoring Stack or add these services to your existing stacks. cAdvisor is the primary agent here; it scrapes resource usage from the Docker socket.
- Create a "Monitoring" Stack in Dockge for Blackberry and Tayberry: In the Dockge UI, create a new stack and use the following
- Container_name: cadvisor
version: "3.8"
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.0 # Use a version compatible with your kernel
    container_name: cadvisor
    restart: unless-stopped
    privileged: true
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
- Create a "Monitoring" Stack in Dockge for Quince: In the Dockge UI, create a new stack and use the following
- Container_name: cadvisor
services:
  cadvisor:
    image: gcr.io/cadvisor/cadvisor:v0.47.0
    container_name: cadvisor
    restart: unless-stopped
    privileged: true
    ports:
      - 8080:8080
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:rw
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
      - /dev/disk/:/dev/disk:ro
    devices:
      - /dev/kmsg
  nv-exporter:
    image: nvcr.io/nvidia/k8s/dcgm-exporter:3.3.5-3.4.0-ubuntu22.04
    container_name: nvidia_exporter
    restart: unless-stopped
    # Use 'command' to force it to listen on the network interface
    command:
      - -a
      - 0.0.0.0:9400
    ports:
      - 9400:9400 # Map host 9400 to container 9400
    cap_add:
      - SYS_ADMIN
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities:
                - gpu
networks: {}
Port Summary for Dockge Hosts (these ports must be opened in pfSense so that Mango can scrape the exporters):
- 8080: cAdvisor (Container CPU/RAM/Network)
- 9400: NVIDIA Exporter (GPU VRAM/Temp - Quince only)
Jellyfin monitoring
Our version of Jellyfin has no switch for metrics in the WebUI, so metrics have to be enabled in the XML config. The Jellyfin config is mapped to /mnt/docker_data/jellyfin/config, which makes the file easy to find. Because we use the official Jellyfin image, system.xml is the main server config file. On the host (Quince), the file to edit is:
/mnt/docker_data/jellyfin/config/config/system.xml
The file can be edited by hand from a terminal on Quince, but sed can flip the value from false to true without hunting through the XML. First stop the container (within Dockge) and make a backup:
cd /mnt/docker_data/jellyfin/config/config/
cp system.xml system.xml.bak
Change EnableMetrics from false to true:
sed -i 's/<EnableMetrics>false<\/EnableMetrics>/<EnableMetrics>true<\/EnableMetrics>/' system.xml
Now that metrics are switched on, restart the container in Dockge.
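The backup and sed steps above can be wrapped into one helper that also confirms the change took effect. This is a sketch: the function name enable_jellyfin_metrics is hypothetical, and it assumes the element appears in system.xml exactly as <EnableMetrics>false</EnableMetrics>, as the sed command in these notes does.

```shell
#!/bin/sh
# Back up the given system.xml, flip EnableMetrics to true, and verify.
# Returns non-zero if the backup fails or the flag is not true afterwards.
enable_jellyfin_metrics() {
    file=$1
    cp -p "$file" "$file.bak" || return 1
    # Rewrite from the backup into the original file so ownership is kept
    sed 's|<EnableMetrics>false</EnableMetrics>|<EnableMetrics>true</EnableMetrics>|' \
        "$file.bak" > "$file" || return 1
    grep -q '<EnableMetrics>true</EnableMetrics>' "$file"
}

# Possible use on Quince with the container stopped (shown as a comment only):
#   enable_jellyfin_metrics /mnt/docker_data/jellyfin/config/config/system.xml
```

A non-zero return with the backup in place usually means the element is missing or formatted differently, in which case the file needs a manual look.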
Proxmox Host (Pear)
To monitor the physical hardware and ZFS pools:
- Node Exporter: Installed directly on the Proxmox Debian host.
- SMART Metrics: Use the smartctl_exporter_script.sh (as detailed in legacy notes) to pipe drive health into the node_exporter's textfile collector.
Post-Installation validation on Mango
After installing an agent on a target, confirm Mango sees it:
- Open VMUI: http://mango:8428/targets.
- Search for the hostname.
- Status must be "UP". If the error is "Connection Refused," check the service on the target; if it is "Timeout," check the pfSense rules.
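The same refused-versus-timeout distinction can be checked from a shell on Mango before looking at the targets page. A minimal sketch, assuming curl is installed. Exit codes 6, 7, and 28 are curl's documented codes for DNS failure, failed connection, and timeout; code 7 typically means refused but also covers an unreachable host. Both function names are hypothetical.

```shell
#!/bin/sh
# Map a curl exit status to the troubleshooting hint from the checklist above.
explain_curl_status() {
    case "$1" in
        0)  echo "reachable" ;;
        6)  echo "name not resolved: check DNS for the target" ;;
        7)  echo "connection refused: check the exporter service on the target" ;;
        28) echo "timeout: check pfSense rules" ;;
        *)  echo "curl exit $1: see the exit code table in 'man curl'" ;;
    esac
}

# Probe one exporter from Mango with a 5 second limit. $1 is host:port.
probe_target() {
    curl -s -o /dev/null --max-time 5 "http://$1/metrics"
    explain_curl_status "$?"
}

# Possible use (shown as a comment only):
#   probe_target tayberry.seaoffate.net:8080
```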
📈 Grafana Installation
Grafana was installed on the same Mango host to provide the local visualization layer.
- Repository & App Setup
sudo apt install -y apt-transport-https software-properties-common wget
wget -q -O - https://apt.grafana.com/gpg.key | gpg --dearmor | sudo tee /usr/share/keyrings/grafana.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/grafana.gpg] https://apt.grafana.com stable main" | sudo tee /etc/apt/sources.list.d/grafana.list
sudo apt update && sudo apt install grafana -y
sudo systemctl enable --now grafana-server
🧩Network & Firewall Rules (pfSense)
To allow the new ports to function, pfSense was updated with a new Alias for Monitoring Ports:
- 3000: Grafana UI (unchanged from the previous Grafana installation)
- 8428: VictoriaMetrics UI/API (new; used to view scrape status, as the Prometheus web GUI was before)
- 9090: Removed (the old Prometheus web GUI port)
Critical Rules
- MGT -> Mango: Allow ports 3000 & 8428 (Access from Cinnamon or other management console).
- Mango -> All Networks:
- Allow port 9100 (Node)
- Allow port 9113 (Nginx)
- Allow port 9117 (Apache)
- Allow port 9104 (MySQL)
- Allow port 9182 (Windows)
🔦Verification Steps
- Check Dockge: Ensure the cadvisor container shows as "Green/Running" in the Dockge UI.
- Service Status: (Confirmed Active/Running).
sudo systemctl status victoriametrics
- Targets Check: Verify all hosts are Green/UP. The new entries for the Docker hosts (ports 8080, 9400, etc.) should appear.
http://mango:8428/targets
- Data Source: In Grafana, a Prometheus data source was added pointing to
http://localhost:8428
- Disk Write Check: Confirms ingestion of samples to the PearPool disk.
du -sh /mnt/metrics_data
- Agent Check: Each exporter can also be queried directly with curl. For example, to test that cAdvisor on Tayberry is responding:
curl http://tayberry.seaoffate.net:8080/metrics
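The Grafana data source in the checklist above was added through the UI. If it ever needs to be recreated, the same setting could be kept as a provisioning file instead. A sketch, assuming the default provisioning directory of the APT package. The data source name VictoriaMetrics, the isDefault setting, the file name, and the function name are all assumptions; only the type and URL come from these notes.

```shell
#!/bin/sh
# Print a Grafana data source provisioning file to stdout.
grafana_datasource_yaml() {
    cat <<'EOF'
apiVersion: 1
datasources:
  - name: VictoriaMetrics   # assumed display name
    type: prometheus
    access: proxy
    url: http://localhost:8428
    isDefault: true         # assumption
EOF
}

# Possible use on Mango (needs root, so shown as comments only):
#   grafana_datasource_yaml | sudo tee /etc/grafana/provisioning/datasources/victoriametrics.yaml
#   sudo systemctl restart grafana-server
```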
Summary of Legacy Retirement
With Mango fully operational:
- Pineapple (.130) services stopped.
- Granadilla (.131) services stopped.
- Victoria (.132) services stopped.
- Lychee identified as legacy and marked for rebuild via new Gold Master Template.
Build Complete: February 22, 2026