Monitoring and alerting using InfluxDB
AIStor Server publishes cluster and node metrics using the Prometheus Data Model. InfluxDB supports scraping MinIO AIStor metrics data for monitoring and alerting.
The procedure on this page documents the following:
- Configuring an InfluxDB service to scrape and display metrics from a MinIO AIStor deployment
- Configuring an Alert on a MinIO AIStor metric
This tutorial uses metrics version 2. You can also use metrics version 3, which is recommended for new deployments. For more information about version 3, see Metrics and alerts.
For MinIO AIStor deployments on Kubernetes, this procedure assumes that all necessary network control components, such as Ingress or Load Balancers, are in place to facilitate access between the object store and the InfluxDB service.
Configure InfluxDB to collect and alert using MinIO AIStor metrics
IMPORTANT
This procedure specifically uses the InfluxDB UI to create a scraping endpoint.
The InfluxDB UI does not provide the same level of configuration as using Telegraf and the corresponding Prometheus plugin. Specifically:
- You cannot enable authenticated access to the MinIO AIStor metrics endpoint via the InfluxDB UI
- You cannot set a tag for collected metrics (e.g. url_tag) for uniquely identifying the metrics for a given deployment
The Telegraf Prometheus plugin also supports Kubernetes-specific features, such as scraping the minio service for a given object store.
Configuring Telegraf is out of scope for this procedure. You can use this procedure as general guidance for configuring Telegraf to scrape MinIO AIStor metrics.
- Configure public access to MinIO AIStor metrics
Set the MINIO_PROMETHEUS_AUTH_TYPE environment variable to "public" for all nodes in the MinIO AIStor deployment. You can then restart the deployment to allow public access to the metrics.
You can validate the change by attempting to curl the metrics endpoint:
    curl https://HOSTNAME/minio/v2/metrics/cluster
Replace HOSTNAME with the URL of the load balancer or reverse proxy through which you access the deployment. You can alternatively specify any single node as HOSTNAME:PORT, specifying the Object Store API port in addition to the node hostname.
The response body should include a list of collected metrics.
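As a complement to the curl check, the following is a minimal Python sketch that inspects a metrics response and reports which of the metric names used later in this procedure are absent. It assumes the endpoint returns the standard Prometheus text format and that public access is already enabled. The function names and the EXPECTED_METRICS tuple are illustrative and not part of MinIO AIStor or InfluxDB.

```python
import urllib.request

# Metric names referenced elsewhere in this procedure.
EXPECTED_METRICS = (
    "minio_cluster_capacity_usable_total_bytes",
    "minio_cluster_capacity_usable_free_bytes",
    "minio_cluster_nodes_offline_total",
    "minio_cluster_drive_offline_total",
)


def fetch_metrics(url, timeout=10.0):
    """Return the raw response body of a publicly readable metrics endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def parse_metric_names(text):
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        # The name is everything before the label set or the first space.
        parts = line.split("{", 1)[0].split()
        if parts:
            names.add(parts[0])
    return names


def missing_metrics(text, expected=EXPECTED_METRICS):
    """Return the expected metric names absent from the response, sorted."""
    return sorted(set(expected) - parse_metric_names(text))
```

For example, missing_metrics(fetch_metrics("https://HOSTNAME/minio/v2/metrics/cluster")) returns an empty list when every expected metric is present. Replace HOSTNAME as described above.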
- Log in to the InfluxDB UI and create a bucket
Select the Organization under which you want to store MinIO AIStor metrics.
Create a New Bucket in which to store metrics for the deployment.
- Create a new scraping source
Create a new InfluxDB Scraper.
Specify the full URL to the MinIO AIStor deployment, including the metrics endpoint:
    https://HOSTNAME/minio/v2/metrics/cluster
Replace HOSTNAME with the URL of the load balancer or reverse proxy through which you access the deployment. You can alternatively specify any single node as HOSTNAME:PORT, specifying the Object Store API port in addition to the node hostname.
- Validate the data
Use the Data Explorer to visualize the collected data.
For example, you can set a filter on minio_cluster_capacity_usable_total_bytes and minio_cluster_capacity_usable_free_bytes to compare the total usable space against the total free space on the deployment.
- Configure a check
Create a new Check on a metric.
The following example check rules provide a baseline of alerts for a deployment. You can modify or otherwise use these examples for guidance in building your own checks.
- Create a Threshold Check named MINIO_NODE_DOWN.
  Set the filter for the minio_cluster_nodes_offline_total key.
  Set the Thresholds to WARN when the value is greater than 1.
- Create a Threshold Check named MINIO_QUORUM_WARNING.
  Set the filter for the minio_cluster_drive_offline_total key.
  Set the thresholds to CRITICAL when the value is one less than your configured Erasure Code Parity setting.
  For example, a deployment using EC:4 should set this value to 3.
Configure your Notification endpoints and Notification rules such that checks of each type trigger an appropriate response.
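The threshold logic of the two example checks can be sketched in Python to sanity-check the values before entering them in the InfluxDB UI. This is an illustration under stated assumptions, not how InfluxDB evaluates checks: the function names are hypothetical, and treating the quorum check as CRITICAL once the offline drive count reaches or exceeds the threshold is an assumption, since the text above only names the threshold value itself.

```python
def quorum_critical_threshold(parity):
    """Offline-drive value for MINIO_QUORUM_WARNING: one less than parity."""
    # Sketch limitation: with parity below 2 the threshold would be 0 or less.
    if parity < 2:
        raise ValueError("this sketch expects an erasure code parity of 2 or more")
    return parity - 1


def evaluate_checks(nodes_offline, drives_offline, parity):
    """Return a status per example check for the given metric values."""
    statuses = {}
    # MINIO_NODE_DOWN: WARN when the offline node count is greater than 1.
    statuses["MINIO_NODE_DOWN"] = "WARN" if nodes_offline > 1 else "OK"
    # MINIO_QUORUM_WARNING: CRITICAL at the threshold (assumed: at or above it).
    threshold = quorum_critical_threshold(parity)
    statuses["MINIO_QUORUM_WARNING"] = (
        "CRITICAL" if drives_offline >= threshold else "OK"
    )
    return statuses
```

For a deployment using EC:4, quorum_critical_threshold(4) gives 3, matching the example value above.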