Decommission Aged Hardware
AIStor supports decommissioning and removing server pools from a deployment with two or more pools. To decommission, there must be at least one remaining pool with sufficient available space to receive the objects from the decommissioned pools.
AIStor supports queueing multiple pools in a single decommission command. Each listed pool immediately enters a read-only status, but draining occurs one pool at a time.
Decommissioning is designed for removing an older server pool whose hardware is no longer sufficient or performant compared to the pools in the deployment. AIStor automatically migrates data from the decommissioned pools to the remaining pools in the deployment based on the ratio of free space available in each pool.
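As a rough illustration of the free-space ratio described above, the sketch below splits a drained pool's data across the remaining pools in proportion to each pool's free space. It is a simplified model rather than AIStor's actual placement logic, which works per object; the function name, pool names, and capacity figures are all hypothetical.

```python
def split_by_free_space(data_tb, free_tb_by_pool):
    """Split data_tb across pools in proportion to each pool's free space."""
    total_free = sum(free_tb_by_pool.values())
    if total_free <= 0:
        raise ValueError("no free space on the remaining pools")
    return {
        pool: data_tb * free / total_free
        for pool, free in free_tb_by_pool.items()
    }


# Hypothetical figures: 90 TB to drain, remaining pools with 100 TB and 200 TB free.
shares = split_by_free_space(90, {"pool-2": 100, "pool-3": 200})
print(shares)  # {'pool-2': 30.0, 'pool-3': 60.0}
```

In this model the pool with twice the free space receives twice the data, so the remaining pools fill at a similar relative rate.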
During the decommissioning process, AIStor routes read operations (e.g. GET, LIST, HEAD) normally.
AIStor routes write operations (e.g. PUT, versioned DELETE) to the remaining “active” pools in the deployment.
Versioned objects maintain their ordering throughout the migration process.
The procedures on this page decommission and remove one or more server pools from an AIStor deployment with at least two server pools.
Prerequisites
Back up cluster settings first
Use the mc admin cluster bucket export and mc admin cluster iam export commands to take a snapshot of the bucket metadata and IAM configurations, respectively, before starting decommissioning.
You can use these snapshots to restore bucket/IAM settings to recover from user or process errors as necessary.
Networking and firewalls
Each node should have full bidirectional network access to every other node in the deployment.
For containerized or orchestrated infrastructures, this may require specific configuration of networking and routing components such as ingress or load balancers.
Certain operating systems may also require setting firewall rules.
For example, the following commands explicitly open the default Object Store API port 9000 on servers using firewalld:
firewall-cmd --permanent --zone=public --add-port=9000/tcp
firewall-cmd --reload
If you set a static AIStor Console port (e.g. :9001), you must also grant access to that port to ensure connectivity from external clients.
MinIO strongly recommends using a load balancer to manage connectivity to the cluster. The load balancer should use a “Least Connections” algorithm for routing requests to the AIStor deployment, since any AIStor node in the deployment can receive, route, or process client requests.
The following load balancers are known to work well with MinIO:
Configuring firewalls or load balancers to support AIStor is out of scope for this procedure.
Deployment must have sufficient storage
The decommissioning process migrates objects from the target pool to other pools in the deployment. The available storage on the remaining pools must exceed the used storage on the decommissioned pool.
Use the Erasure Code Calculator to determine the usable storage capacity. Then reduce that by the size of the objects already on the deployment.
For example, consider a deployment with the following distribution of used and free storage:
Pool | Used Capacity | Total Capacity |
---|---|---|
Pool 1 | 100TB | 200TB |
Pool 2 | 100TB | 200TB |
Pool 3 | 100TB | 200TB |
Decommissioning Pool 1 requires distributing the 100TB of used storage across the remaining pools. Pool 2 and Pool 3 each have 100TB of unused storage space and can safely absorb the data stored on Pool 1.
However, if Pool 1 were full (that is, 200TB of used space), decommissioning would completely fill the remaining pools and potentially prevent any further write operations.
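The arithmetic in this example can be checked with a short script. The following is a minimal sketch that uses the figures from the table above; the function and field names are hypothetical, and it treats the capacities as usable capacity (that is, after erasure coding overhead).

```python
def can_decommission(pools, target):
    """Return (ok, headroom_tb) for draining `target` into the other pools.

    headroom_tb is the free space left on the remaining pools after they
    absorb the target pool's used data.
    """
    used_to_move = pools[target]["used_tb"]
    free_elsewhere = sum(
        p["total_tb"] - p["used_tb"]
        for name, p in pools.items()
        if name != target
    )
    headroom = free_elsewhere - used_to_move
    return headroom > 0, headroom


pools = {
    "Pool 1": {"used_tb": 100, "total_tb": 200},
    "Pool 2": {"used_tb": 100, "total_tb": 200},
    "Pool 3": {"used_tb": 100, "total_tb": 200},
}

# Pool 1 half full: the remaining pools keep 100 TB free afterwards.
ok, headroom = can_decommission(pools, "Pool 1")

# Pool 1 completely full: the remaining pools would be left with no free space.
full = {**pools, "Pool 1": {"used_tb": 200, "total_tb": 200}}
ok_full, headroom_full = can_decommission(full, "Pool 1")

print(ok, headroom)            # True 100
print(ok_full, headroom_full)  # False 0
```

The check requires strictly positive headroom, so the full-pool case is rejected even though the data would technically fit.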
Considerations
Replacing a server pool
For hardware upgrade cycles where you replace old pool hardware with a new pool, you should complete the expansion before starting the decommissioning of the old pool. Adding the new pool first allows the decommission process to transfer objects in a balanced way across all available pools, both existing and new.
Decommissioning requires that a cluster’s topology remain stable throughout the pool draining process. Do not attempt to perform expansion and decommission changes in a single step.
Decommissioning is resumable
AIStor resumes decommissioning if interrupted by transient issues such as deployment restarts or network failures.
For manually cancelled or failed decommissioning attempts, AIStor resumes only after you manually re-initiate the decommissioning operation.
The pool remains in the decommissioning state regardless of the interruption. A pool can never return to active status after decommissioning begins.
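The resumption rules above can be pictured as a small state machine. This is only an illustrative model: the state names and the transition table are hypothetical and do not reflect AIStor's internal representation.

```python
from enum import Enum


class PoolState(Enum):
    ACTIVE = "active"
    DRAINING = "draining"    # decommissioning in progress
    CANCELED = "canceled"    # stopped manually
    FAILED = "failed"
    COMPLETE = "complete"


# Allowed transitions in this model. No state leads back to ACTIVE.
ALLOWED = {
    PoolState.ACTIVE: {PoolState.DRAINING},
    PoolState.DRAINING: {PoolState.CANCELED, PoolState.FAILED, PoolState.COMPLETE},
    PoolState.CANCELED: {PoolState.DRAINING},  # only by manual re-initiation
    PoolState.FAILED: {PoolState.DRAINING},    # only by manual re-initiation
    PoolState.COMPLETE: set(),
}


def transition(current, new):
    """Return the new state, or raise ValueError if the move is not allowed."""
    if new not in ALLOWED[current]:
        raise ValueError(f"cannot move pool from {current.value} to {new.value}")
    return new


state = transition(PoolState.ACTIVE, PoolState.DRAINING)
# A restart or network failure leaves the state at DRAINING, so draining resumes.
state = transition(state, PoolState.CANCELED)
state = transition(state, PoolState.DRAINING)
```

In this model a transient interruption is not a transition at all, which is why draining continues automatically, while a canceled or failed pool needs an explicit move back to draining.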
Decommissioning is non-disruptive
Removing a decommissioned server pool requires restarting all AIStor nodes in the deployment at around the same time.
MinIO strongly recommends restarting all Object Store processes in a deployment simultaneously. AIStor operations are atomic and strictly consistent. As such, the restart procedure is non-disruptive to applications and ongoing operations.
Do not perform “rolling” (that is, one node at a time) restarts.
Decommissioning ignores expired objects and trailing DeleteMarker
Decommissioning ignores objects where the only remaining version is a DeleteMarker.
This avoids creating empty metadata on the remaining server pool(s) for objects that are effectively fully deleted.
Decommissioning also ignores object versions which have expired based on the configured lifecycle rules for the parent bucket.
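The two skip rules can be sketched as a filter over an object's versions. This is a simplified model under stated assumptions: the Version type and its expires_at field are hypothetical stand-ins for whatever the bucket's lifecycle rules determine, and the function is not AIStor's implementation.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Version:
    version_id: str
    is_delete_marker: bool = False
    # Hypothetical field: when lifecycle rules expire this version, if ever.
    expires_at: Optional[datetime] = None


def versions_to_migrate(versions: List[Version], now: datetime) -> List[Version]:
    """Return the versions this model would copy to the remaining pools."""
    # Rule 1: an object whose only remaining version is a DeleteMarker is skipped.
    if len(versions) == 1 and versions[0].is_delete_marker:
        return []
    # Rule 2: versions already expired under lifecycle rules are skipped.
    return [v for v in versions if v.expires_at is None or v.expires_at > now]


now = datetime(2024, 1, 1)
deleted_only = [Version("v1", is_delete_marker=True)]
mixed = [
    Version("v3"),
    Version("v2", expires_at=datetime(2023, 6, 1)),  # already expired
    Version("v1", expires_at=datetime(2025, 1, 1)),  # still live
]

print(versions_to_migrate(deleted_only, now))                 # []
print([v.version_id for v in versions_to_migrate(mixed, now)])  # ['v3', 'v1']
```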
You can monitor ignored delete markers and expired objects during the decommission process with mc admin trace --call decommission.
Once the decommissioning process completes, you can safely shut down that pool.
Since the only remaining data was scheduled for deletion or was only a DeleteMarker, you can safely clear or destroy those drives as per your internal procedures.
Behavior
Final listing check
At the end of the decommission process, AIStor lists the items remaining on the pool. If the list returns empty, AIStor marks the decommission as successfully completed. If the list returns any objects, AIStor reports an error that the decommission process failed.
If the decommission fails, customers should open a SUBNET issue for further assistance before retrying the decommission.
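The final listing check amounts to "succeed only if nothing is left". A minimal sketch of that rule follows; the function name, return value, and error message are hypothetical, and the object keys are made up for illustration.

```python
def finalize_decommission(remaining_keys):
    """Mark the decommission complete only if the pool lists no objects."""
    leftovers = list(remaining_keys)
    if leftovers:
        raise RuntimeError(
            f"decommission failed: {len(leftovers)} object(s) remain on the pool"
        )
    return "complete"


status = finalize_decommission([])  # empty listing, so the check passes

try:
    finalize_decommission(["bucket/a.txt", "bucket/b.txt"])
except RuntimeError as err:
    failure = str(err)
```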
Decommissioning a server with tiering enabled
For deployments with tiering enabled and active, decommissioning moves the object references to a new active pool.
Applications can continue issuing GET requests against those objects, and AIStor transparently retrieves them from the remote tier.