Prometheus & Grafana: Difference between revisions

Revision as of 10:04, 5 June 2025

Introduction

We decided that we needed more detailed knowledge of what is happening on the VMs and on the host, Pear. The Proxmox GUI gives an approximation of the state of resources on the host and the VMs, but it is somewhat vague and out of date. This lack of detail in Proxmox prompted the installation of Prometheus on Pineapple. However, Prometheus does not do justice to the data that it collects, so Grafana has also been installed on Granadilla, also on the Infra network. Long-term data storage needs something like VictoriaMetrics. So what started out as a simple data collection issue ended with a suite of VMs to gather, store and view the metrics of Pear and its collection of VMs and CTs.

Prometheus

Prometheus is a powerful, open-source monitoring system and time-series database. Developed originally at SoundCloud, it has become a cornerstone of modern cloud-native observability stacks, renowned for its flexible data model, powerful query language (PromQL), and efficient data collection mechanism. Unlike traditional monitoring systems that often rely on agents pushing data, Prometheus primarily uses a "pull" model, actively scraping metrics from configured endpoints. This pull model simplifies network setup in many scenarios and ensures that Prometheus controls the monitoring cadence.
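To make the pull model concrete, here is a minimal sketch of the kind of plain-text payload a scrape target returns when Prometheus fetches it over HTTP. The metric names (demo_up, demo_requests_total) and the label values are hypothetical and chosen only for illustration; a real exporter such as Node Exporter produces this output itself, and label values containing quotes or backslashes would need escaping, which this sketch does not do.

```python
def render_exposition(samples):
    """Render (name, labels, value) tuples as Prometheus-style text lines.

    Simplified: no HELP/TYPE comment lines and no escaping of label values.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            # Labels are sorted only to make the output deterministic.
            label_text = ",".join(
                '{}="{}"'.format(key, val) for key, val in sorted(labels.items())
            )
            lines.append("{}{{{}}} {}".format(name, label_text, value))
        else:
            lines.append("{} {}".format(name, value))
    return "\n".join(lines) + "\n"


# Hypothetical samples, as an exporter might hold them in memory.
demo_samples = [
    ("demo_up", {}, 1),
    ("demo_requests_total", {"host": "pear", "code": "200"}, 42),
]
print(render_exposition(demo_samples), end="")
```

Each scrape simply fetches this text again, which is why the scrape interval configured in Prometheus, rather than the target, sets the monitoring cadence.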

In our home lab environment, Prometheus is strategically positioned to act as the central nervous system for all operational metrics, providing a comprehensive view of the health and performance across various network segments and critical infrastructure.

Architectural Placement and Core Function

Prometheus is installed and running on the host named Pineapple, which resides within the dedicated infra network segment. This placement is deliberate, allowing Pineapple to securely access and scrape metrics from devices and services across all monitored networks without imposing undue load on production resources. As the central collector, Pineapple's Prometheus instance is responsible for:

Metric Collection: Actively reaching out to configured targets to pull metrics at regular intervals. These targets are typically "exporters" – small agents that expose metrics in a Prometheus-readable format (plain text over HTTP).

Time-Series Database: Storing these collected metrics locally in its optimized time-series database. This allows for efficient storage and retrieval of large volumes of numerical data indexed by timestamps and label sets.

PromQL Engine: Providing the PromQL query language, which is used for powerful and flexible data analysis, aggregation, and mathematical operations on the collected time-series data. This enables the creation of custom dashboards and alert conditions.

Alerting Rules

Evaluating predefined alerting rules against the collected data and sending notifications when thresholds are breached. While Prometheus evaluates the rules, the actual notification delivery is typically handled by its companion component, Alertmanager.
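The rule evaluation described above can be pictured with a small, simplified model. In Prometheus an alert whose condition has just become true is "pending", and it becomes "firing" once the condition has held for the rule's configured duration. The function below is only a sketch of that idea; the function name, the threshold and the sample values are assumptions for illustration and are not taken from our rule files.

```python
def alert_state(samples, threshold, for_seconds):
    """Return 'inactive', 'pending' or 'firing' as of the last sample.

    samples: list of (timestamp_seconds, value), one per rule evaluation,
    in time order. The condition modelled here is simply value > threshold.
    """
    active_since = None
    state = "inactive"
    for timestamp, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = timestamp  # condition just became true
            held_for = timestamp - active_since
            state = "firing" if held_for >= for_seconds else "pending"
        else:
            # Condition cleared: the hold timer starts again from scratch.
            active_since = None
            state = "inactive"
    return state


# Hypothetical CPU percentages evaluated once a minute.
print(alert_state([(0, 95), (60, 96)], threshold=90, for_seconds=120))
print(alert_state([(0, 95), (60, 96), (120, 97)], threshold=90, for_seconds=120))
```

Only alerts that reach the firing state are handed on for notification, which is the point at which Alertmanager takes over.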

Monitoring Scope Across the Home Lab

From its vantage point on pineapple, Prometheus is configured to monitor a wide array of systems across different network segments, ensuring holistic observability:

Production Network Devices: Monitoring application servers, web services, and databases that host critical services. This involves deploying specific exporters (e.g., Node Exporter for Linux hosts, various application-specific exporters for databases or web servers) on these production machines.
Infra Network Devices: Monitoring core infrastructure components like DNS servers, directory services, and network appliances within the infra network itself.
Management Network Devices: Keeping an eye on systems dedicated to managing the lab, such as configuration management servers, backup solutions, or other utility services.
VPNNet Network: Monitoring VPN gateways and tunnels, ensuring connectivity and performance for remote access. Both the OpenVPN and WireGuard VMs are monitored.
Terminals Network: For systems like thin clients or specific workstations, basic uptime and resource utilization can be monitored to ensure availability. The desktops monitored are Walnut, Wahoo and Lychee.
Proxmox Host (Pear): Critically, Prometheus monitors the physical Proxmox host named Pear. This typically involves deploying Node Exporter on Pear itself, providing low-level system metrics such as CPU usage, memory consumption, disk I/O, and network traffic for the hypervisor. Proxmox Exporter is an optional dedicated exporter that provides metrics specific to Proxmox VE, such as VM/LXC container states, storage pool usage, and cluster health.

By strategically deploying various "exporters" on target systems across these diverse networks, Prometheus on Pineapple aggregates a centralized stream of performance and health data.
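One way to keep a per-network list of scrape targets manageable is Prometheus's file-based service discovery, which reads a JSON list of target groups, each with "targets" and "labels". The sketch below generates such a list. It is an illustration rather than our actual configuration: the lowercase DNS names, the "hypervisor" grouping for Pear, the "network" label and the use of Node Exporter's default port 9100 are all assumptions.

```python
import json

NODE_EXPORTER_PORT = 9100  # Node Exporter's default port (assumed unchanged here)


def build_target_groups(hosts_by_network, port=NODE_EXPORTER_PORT):
    """Build file-based service discovery target groups, one per network."""
    groups = []
    for network, hosts in sorted(hosts_by_network.items()):
        groups.append(
            {
                "targets": ["{}:{}".format(host, port) for host in hosts],
                # Hypothetical label so dashboards can filter by segment.
                "labels": {"network": network},
            }
        )
    return groups


# Host names from this page; the grouping itself is an assumption.
lab_hosts = {
    "terminals": ["walnut", "wahoo", "lychee"],
    "hypervisor": ["pear"],
}
print(json.dumps(build_target_groups(lab_hosts), indent=2))
```

Writing the output to a file referenced from a scrape job would let targets be added or removed without editing the main Prometheus configuration.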

Data Flow and Integration with Other Tools

Prometheus on Pineapple is not a standalone solution for data visualization or long-term storage in our setup. It integrates seamlessly with other specialized tools to form a comprehensive observability stack:

Data Presentation to Grafana (granadilla): Prometheus serves as the primary data source for our visualization platform, Grafana, which is running on the host granadilla. When we access a dashboard in Grafana, it queries Prometheus (on pineapple) using PromQL to retrieve the necessary time-series data. This separation allows Grafana to focus purely on presenting compelling dashboards without needing to manage data collection or storage.

Data Storage on VictoriaMetrics (victoria): For efficient long-term storage and scalability, Prometheus on pineapple is configured to remotely write all its collected data to the VictoriaMetrics VM named victoria. This offloads the responsibility of high-volume, long-term data retention from pineapple's local storage and Prometheus's internal TSDB. VictoriaMetrics, optimized for large-scale time-series data, acts as a robust, scalable backend, ensuring that historical metrics are readily available for analysis, even years into the future. This architecture allows us to leverage Prometheus's excellent scraping and querying capabilities while benefiting from VictoriaMetrics' superior storage efficiency and scalability.

This integrated approach, with Prometheus as the data collection and querying engine, provides a flexible, powerful, and scalable monitoring solution for our home lab, paving the way for advanced visualization with Grafana and robust long-term data retention with VictoriaMetrics.

Grafana: The Visualization Hub of Our Home Lab

Following the collection and storage of metrics by Prometheus, Grafana emerges as the critical component for transforming raw time-series data into actionable insights. Grafana is an open-source platform for data visualization, analytics, and monitoring, providing a highly customizable and interactive web-based interface. Crucially, it is not a database itself, but rather a powerful frontend designed to query, visualize, and alert on data from a multitude of data sources. Its strength lies in its intuitive dashboarding capabilities, allowing users to create rich, dynamic, and shareable views of their system's health and performance.

In our home lab, Grafana serves as the central hub for all operational dashboards, making the complex interplay of services and infrastructure easily comprehensible.

Architectural Placement and Core Function

Grafana is installed and operating on the host named Granadilla. So that administrators and other services can reach it, Granadilla is placed within the infra network segment and has pfSense rules in place that allow reliable access for data visualization and management. As the visualization layer, Grafana's primary functions are:

Data Source Connection

Establishing secure connections to various data sources, primarily our Prometheus instance.

Querying: Submitting queries to the configured data sources to retrieve specific time-series data.

Visualization: Rendering the retrieved data into a wide array of customizable panel types, including graphs, gauges, heatmaps, tables, and more.

Dashboarding

Organizing multiple visualization panels into logical and interactive dashboards, providing a consolidated view of different aspects of the infrastructure.

Alert Display

While Prometheus is responsible for evaluating alert rules, Grafana can display the current status of alerts and provide visual cues on dashboards when issues arise.

Visualizing Data Across the Home Lab

Grafana on Granadilla brings together the vast array of metrics collected by Prometheus, offering comprehensive dashboards that span all critical network segments and infrastructure components:

Network-Specific Dashboards: Dedicated dashboards visualize the health and performance of devices and services within the production, infra, mgt, vpnnet, and terminals networks. These include panels showing network traffic volumes, latency, error rates for critical devices, and the uptime status of key services on each segment. For instance, the vpnnet dashboard could display active VPN connections and tunnel throughput.
Proxmox Host (Pear) Monitoring: One of the most critical sets of dashboards focuses on the Proxmox host, Pear. Grafana leverages the metrics scraped by Prometheus (from Node Exporter and Proxmox Exporter on Pear) to create detailed visualizations of:
- Resource Utilization: CPU load, memory usage, disk I/O, and network activity of Pear itself.
- Virtual Machine/Container Health: Overview of running VMs and LXC containers, their individual resource consumption, and uptime.
- Storage Pool Health: Performance and capacity trends for Pear's ZFS storage pool.
Application-Specific Dashboards: Beyond infrastructure, Grafana also provides granular insights into applications, showing metrics from web servers, databases, and other services across the monitored networks.
Unified Views: Grafana's ability to combine data from different sources and query types allows for composite dashboards that provide a holistic view of the entire home lab's health on a single screen, breaking down traditional silos between network segments.

Data Flow and Integration with Prometheus and VictoriaMetrics

Grafana's role in our observability stack is fundamentally as the query initiator and visualizer:

Primary Data Source: Prometheus (pineapple): Grafana is primarily configured to use Prometheus (running on pineapple) as its data source. When a dashboard is loaded, Grafana sends PromQL queries directly to the Prometheus API endpoint on pineapple.

Leveraging VictoriaMetrics through Prometheus: While Grafana directly queries Prometheus, Prometheus itself is configured to remotely write its data to the VictoriaMetrics VM, Victoria, for long-term storage. The intent is that when Grafana requests historical data (e.g., performance trends from weeks or months ago), that data is retrieved from VictoriaMetrics on Victoria and served back to Grafana. Note that remote write on its own is one-way: Prometheus answers queries from its local TSDB and only reads back from a remote store if a remote read path is also configured. Where that is not available, Grafana can query VictoriaMetrics directly, since it exposes a Prometheus-compatible query API. Either way, Grafana can access both recent and deep historical data without needing to manage the complexities of long-term storage itself.

In essence, Grafana on Granadilla is the window into the operational state of our entire home lab. It translates the raw numbers from Prometheus into intuitive charts and graphs, enabling us to quickly understand performance trends, diagnose issues, and ensure the stability of all our services and infrastructure. Its flexible visualization capabilities complement Prometheus's robust data collection and VictoriaMetrics' scalable storage, forming a powerful monitoring triumvirate.
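The query path described in this section, where Grafana sends a PromQL expression to the Prometheus API, can be sketched by building the same kind of request by hand. Prometheus serves instant queries at /api/v1/query with the expression in a "query" parameter. The base URL below assumes the host name pineapple resolves and that Prometheus listens on its default port 9090; the job name "node" is hypothetical.

```python
from urllib.parse import urlencode


def instant_query_url(base_url, promql):
    """Build the URL for a Prometheus instant query (GET /api/v1/query)."""
    query_string = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + query_string


# Assumed address of Prometheus on Pineapple; adjust to the real one.
PROMETHEUS_URL = "http://pineapple:9090"

# 'up' is the per-target health series that Prometheus records for each scrape.
print(instant_query_url(PROMETHEUS_URL, 'up{job="node"}'))
```

Fetching that URL returns a JSON document, which is essentially what a Grafana panel receives before rendering it.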

Setup and Installation

It seemed logical to install Prometheus and Grafana at the same time, as both are required for the monitoring to be meaningful, but they run on two separate hosts, so the details are listed separately. VictoriaMetrics is not actually required to make monitoring possible, but it is a useful addition to the setup, and its installation is covered under a different heading in the Infra network's hosts.

Prometheus

The Prometheus install consists of installing the rule processing part of the suite and its supporting functions on Pineapple, followed by the lengthy installation of the agents that supply the data from each individual client. A more detailed step-by-step guide can be found on the Pineapple page.

Grafana

The Grafana install consists of installing the web server on Granadilla and collecting the data to be viewed. Further details can be found on the Granadilla page.
