Healthcheck Probes
Each AIStor server process exposes unauthenticated endpoints for simple healthchecks that probe server uptime and deployment high availability. These endpoints return an HTTP status code indicating whether the underlying resource is healthy or satisfies read/write quorum. The server exposes no other data through these endpoints.
AIStor liveness
Use the following endpoint to test if the specified AIStor server is up and ready to serve requests:
curl -I https://aistor.example.net:9000/minio/health/live
Replace https://aistor.example.net:9000 with the DNS hostname and port of the server to check.
A response code of 200 OK indicates the server is online and functional.
Any other HTTP code indicates an issue with reaching the server, such as a transient network issue or potential downtime.
The healthcheck probe alone cannot determine if the server is offline, only that the current host machine cannot reach the server.
Consider configuring a Prometheus alert using the minio_cluster_servers_offline_total metric to detect whether one or more AIStor servers are offline.
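The liveness check above can be wrapped in a small script. The following is a minimal sketch, not an official tool: the function names probe_status and interpret_live are hypothetical, and the 5 second timeout is an assumed value. It relies on curl printing 000 for the status code when no HTTP response arrives, which separates "could not reach the server" from "the server answered with something other than 200".

```shell
# Hypothetical helpers; only the /minio/health/live path comes from the text above.

# Print the HTTP status code of a HEAD request, or 000 if no response arrived.
# The 5 second limit is an assumed value, not a documented recommendation.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 -I "$1" || true
}

# Translate a liveness status code into a short verdict.
interpret_live() {
  case "$1" in
    200) echo "online" ;;
    000) echo "unreachable" ;;      # no HTTP response: DNS, TCP, TLS, or timeout
    *)   echo "unexpected:$1" ;;    # the server answered, but not with 200 OK
  esac
}

# Usage (not executed here):
#   interpret_live "$(probe_status https://aistor.example.net:9000/minio/health/live)"
```

As noted above, an "unreachable" verdict only describes the path from the probing host to the server, so it is best treated as a signal to investigate rather than proof of downtime.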
Cluster write quorum
Use the following endpoint to test if an AIStor deployment has write quorum:
curl -I https://aistor.example.net:9000/minio/health/cluster
Replace https://aistor.example.net:9000 with the DNS hostname and port of any server in the deployment to check.
For clusters using a load balancer to manage incoming connections, specify the hostname for the load balancer.
A response code of 200 OK indicates that the deployment has sufficient servers online to meet write quorum.
A response code of 503 Service Unavailable indicates the deployment does not currently have write quorum.
The healthcheck probe alone cannot determine whether a server is offline or whether write operations are processing normally. It reports only whether enough servers are online to meet write quorum requirements based on the configured erasure code parity.
Consider configuring a Prometheus alert using one of the following metrics to detect potential issues or errors on the cluster:
minio_cluster_servers_offline_total to alert if one or more servers are offline.
minio_server_drive_free_bytes to alert if the deployment is running low on free drive space.
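Because any server in the deployment can answer the write quorum check, a script can fall back to another server when the first one does not respond. The sketch below illustrates that idea under stated assumptions: probe_status and cluster_write_quorum are hypothetical names, the timeout is an assumed value, and treating the first 200 or 503 answer as definitive is a simplification rather than documented behavior.

```shell
# Hypothetical helpers; only the /minio/health/cluster path comes from the text above.

# Print the HTTP status code of a HEAD request, or 000 if no response arrived.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 -I "$1" || true
}

# Args: one or more base URLs, for example https://aistor.example.net:9000
# Prints quorum, no-quorum, or unknown, and returns 0, 1, or 2 respectively.
cluster_write_quorum() {
  local base status
  for base in "$@"; do
    status="$(probe_status "$base/minio/health/cluster")"
    case "$status" in
      200) echo "quorum"; return 0 ;;
      503) echo "no-quorum"; return 1 ;;
    esac
    # Any other result (including 000) is inconclusive: try the next server.
  done
  echo "unknown"
  return 2
}

# Usage (not executed here):
#   cluster_write_quorum https://aistor.example.net:9000
```

When a load balancer fronts the deployment, passing only the load balancer URL is the simpler choice, since it already selects a reachable server.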
Cluster read quorum
Use the following endpoint to test if an AIStor deployment has read quorum:
curl -I https://aistor.example.net:9000/minio/health/cluster/read
Replace https://aistor.example.net:9000 with the DNS hostname and port of a server in the deployment to check.
For clusters using a load balancer to manage incoming connections, specify the hostname for the load balancer.
A response code of 200 OK indicates that the deployment has sufficient servers online to meet read quorum.
A response code of 503 Service Unavailable indicates the deployment does not currently have read quorum.
The healthcheck probe alone cannot determine whether a server is offline or whether read operations are processing normally. It reports only whether enough servers are online to meet read quorum requirements based on the configured erasure code parity.
Consider configuring a Prometheus alert using the minio_cluster_servers_offline_total metric to detect whether one or more servers are offline.
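A common use of the read quorum endpoint is to wait for a deployment to become readable, for example after a restart. The following is a minimal sketch of such a polling loop. The names probe_status and wait_for_read_quorum are hypothetical, and the defaults of 10 attempts and a 3 second delay are assumed values chosen for illustration.

```shell
# Hypothetical helpers; only the /minio/health/cluster/read path comes from the text above.

# Print the HTTP status code of a HEAD request, or 000 if no response arrived.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 -I "$1" || true
}

# Args: URL, max attempts (default 10), delay in seconds between attempts (default 3).
# Returns 0 as soon as the endpoint answers 200, or 1 once the attempts run out.
wait_for_read_quorum() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i=1 status=""
  while [ "$i" -le "$attempts" ]; do
    status="$(probe_status "$url")"
    if [ "$status" = "200" ]; then
      echo "read quorum available after $i attempt(s)"
      return 0
    fi
    # Pause before retrying, but not after the final attempt.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "no read quorum after $attempts attempt(s), last status: $status" >&2
  return 1
}

# Usage (not executed here):
#   wait_for_read_quorum https://aistor.example.net:9000/minio/health/cluster/read 20 5
```

The loop treats every non-200 result the same way, so a 503 and an unreachable host both lead to a retry; the last status is reported on failure to help tell them apart.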
Cluster maintenance check
Use the following endpoint to test if an AIStor deployment can maintain both read and write quorum if the specified server is taken down for maintenance:
curl -I "https://aistor.example.net:9000/minio/health/cluster?maintenance=true"
Replace https://aistor.example.net:9000 with the DNS hostname and port of a server in the deployment to check.
For clusters using a load balancer to manage incoming connections, specify the hostname for the load balancer.
A response code of 200 OK indicates that the deployment would still have sufficient servers online to meet write quorum with the server offline.
A response code of 412 Precondition Failed indicates the deployment will lose quorum if the server goes offline.
The healthcheck probe alone cannot determine if the server is offline, only whether enough servers will remain online after taking the server down for maintenance to meet read and write quorum requirements based on the configured erasure code parity.
Consider configuring a Prometheus alert using the minio_cluster_servers_offline_total metric to detect whether one or more servers are offline.
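The maintenance check lends itself to a gate that runs before a server is taken down. The sketch below shows one possible shape for such a gate. The names probe_status, maintenance_verdict, and safe_to_take_down are hypothetical, and the decision to block on any answer other than 200 is an assumption made to err on the side of caution.

```shell
# Hypothetical helpers; only the endpoint path and the 200/412 codes come from the text above.

# Print the HTTP status code of a HEAD request, or 000 if no response arrived.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 -I "$1" || true
}

# Translate a maintenance-check status code into a verdict.
maintenance_verdict() {
  case "$1" in
    200) echo "safe" ;;      # quorum holds with the server offline
    412) echo "unsafe" ;;    # taking the server down would lose quorum
    *)   echo "unknown" ;;   # no answer or an unexpected code
  esac
}

# Arg: base URL of the server planned for maintenance.
# Succeeds only on a 200 answer; "unsafe" and "unknown" both block.
safe_to_take_down() {
  local status
  # The URL is quoted so the shell does not treat "?" as a glob character.
  status="$(probe_status "$1/minio/health/cluster?maintenance=true")"
  [ "$(maintenance_verdict "$status")" = "safe" ]
}

# Usage (not executed here):
#   if safe_to_take_down https://aistor.example.net:9000; then
#     echo "proceed with maintenance"
#   else
#     echo "abort: quorum not guaranteed" >&2
#   fi
```

Running the gate immediately before each server is stopped, rather than once for a whole batch, keeps the answer current as other servers go offline and come back.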