Spread Zone Configuration

Added in Operator RELEASE.2025-10-16T16-46-38Z
Spread zone configuration requires Operator version RELEASE.2025-10-16T16-46-38Z or later.

Spread zone configuration ensures that AIStor distributes Pods across configured zones to maximize availability and resiliency in the event that a zone or any single worker node fails.

Overview

The Operator supports spread zone configuration at the pool level, allowing you to specify the zones that make up your cluster topology.

When you configure spread zones, the Operator automatically distributes the pool's Pods, and with them your objects' data and parity shards, across the specified zones using a round-robin algorithm. This ensures that if one zone fails, the loss impacts the minimum possible number of data or parity shards, maintaining high availability.

When configuring your cluster, consider the number of zones needed for your erasure coding scheme. For example, an EC:3 configuration with a total stripe size of 8 should have at least 8 zones so that each data and parity shard of an object resides in a different failure domain.

How it works

The Operator uses node labels to identify which zone each node belongs to and distributes Pods across those zones in a round-robin pattern. By default, the Operator looks for the standard Kubernetes topology.kubernetes.io/zone label. You can configure a custom label using the matchLabel key. For example, you might use the label topology.kubernetes.io/rack to configure zones for a rack-based distribution.
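
As a minimal sketch, assuming the pool-level field placement shown in the SpreadZoneConfig reference at the end of this page, a rack-based configuration might look like:

spreadZoneConfig:
  enabled: true
  matchLabel: topology.kubernetes.io/rack  # custom label key; the default is topology.kubernetes.io/zone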

When you enable spread zones for a pool, the Operator assigns each Pod to a specific zone. The first Pod goes to zone 1, the second to zone 2, and so on, cycling back to zone 1 after going through all zones. The Operator enforces this assignment using Kubernetes node affinity rules to ensure Pods only schedule on nodes in their assigned zone.

This assignment is permanent and consistent across Pod restarts.
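
For illustration, the affinity injected into a Pod assigned to zone-2 might resemble the following sketch; the exact structure the webhook generates may differ:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - zone-2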

Label nodes before deployment
Label all nodes with the appropriate zone identifier before deploying the ObjectStore. The Operator cannot assign Pods to zones that do not exist in the cluster.

Set up spread zones

Prerequisites

Before deploying an object store with spread zones configured, you must add zone labels to your nodes.

  1. Label your nodes with the topology label:

    Modify the following examples to reflect your unique infrastructure.

    For zone-based topology (default):

    kubectl label node node1 topology.kubernetes.io/zone=zone-1
    kubectl label node node2 topology.kubernetes.io/zone=zone-2
    kubectl label node node3 topology.kubernetes.io/zone=zone-3
    kubectl label node node4 topology.kubernetes.io/zone=zone-4
    

    For rack-based topology (custom):

    kubectl label node node1 rack-label=rack-1
    kubectl label node node2 rack-label=rack-2
    kubectl label node node3 rack-label=rack-3
    kubectl label node node4 rack-label=rack-4
    
  2. Verify node labels:

    kubectl get nodes --show-labels | grep topology
    

    Or, for the rack-based labels:

    kubectl get nodes --show-labels | grep rack
    

Configuration

Configure spread zones in the ObjectStore custom resource at the pool level. The same fields can be set through the ObjectStore Helm chart values file.

The following minimal sketch shows a pool-level configuration. Field placement follows the spec.pools.[].spreadZoneConfig paths in the SpreadZoneConfig reference below; all other pool fields are elided:
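
spec:
  pools:
    - # ... other pool fields elided
      spreadZoneConfig:
        enabled: true
        matchLabel: topology.kubernetes.io/zone               # default, shown explicitly
        matchValues: ["zone-1", "zone-2", "zone-3", "zone-4"] # optional; discovered from node labels if omitted

The matchValues list in this sketch mirrors the four zone labels applied under Prerequisites above.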

Verify the mutating webhook

The Operator uses a mutating webhook to inject zone assignments into Pods. Verify that the webhook is deployed and running in the Operator namespace:

# Check webhook configuration
kubectl get mutatingwebhookconfigurations object-store-operator-webhook

# Check webhook deployment (in the operator namespace, default is 'aistor')
kubectl get deployment -n aistor object-store-webhook

# Check webhook logs
kubectl logs -n aistor deployment/object-store-webhook

Verify pod distribution

After deploying an ObjectStore with spread zone configuration, verify that AIStor distributes Pods correctly across zones:

# Check pod distribution across zones
kubectl get pods -l aistor.min.io/objectStore=<objectstore-name> -o wide

# Check nodeAffinity rules injected by webhook
kubectl get pod <pod-name> -o yaml | grep -A 10 nodeAffinity

Replace <objectstore-name> and <pod-name> with your actual ObjectStore name and pod name.
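
To cross-reference that placement with your topology, you can list nodes with the zone label displayed as a column, assuming the default topology.kubernetes.io/zone label:

# Show each node's zone label as a column
kubectl get nodes -L topology.kubernetes.io/zone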

Best practices

Match zones to erasure set size
Configure the number of zones to match or exceed your erasure set stripe size for optimal fault tolerance.
Explicitly set matchValues
While auto-discovery is convenient, explicitly setting matchValues provides better control and predictability: it fixes the exact zone order and prevents unexpected changes when you add new zones to the cluster.
spreadZoneConfig:
  enabled: true
  matchValues: ["rack-1", "rack-2", "rack-3", "rack-4"]  # Explicit is better than implicit
Do not change zone labels after initial scheduling
Do not modify zone labels on existing nodes or add new nodes carrying the matchLabel label after the StatefulSet has been scheduled for the first time. The webhook determines zone assignments during initial Pod creation, and changing the topology afterwards can lead to an inconsistent state.
Understand nodeAffinity behavior
The webhook sets requiredDuringSchedulingIgnoredDuringExecution nodeAffinity rules. This means:
  • The Operator schedules Pods on nodes with the correct zone label during initial creation.
  • After initial scheduling, the affinity rules do not force Pod eviction if labels change.
  • If you accidentally add more matchValues or change node labels after deployment, existing Pods remain unaffected.
  • Only new Pods or Pods that need to be rescheduled use the updated topology.
Use consistent labeling
Maintain consistent topology labels across your cluster to avoid scheduling issues.
Test topology changes in staging
If you must modify the spread zone configuration, test the changes thoroughly in a staging environment first, as Pods will need to be recreated to adopt new zone assignments.

Reference

For complete field definitions, see the ObjectStore CRD reference.

SpreadZoneConfig fields

enabled
Whether to enable the spread zone feature.
matchLabel
The node label key used to select the nodes that belong to each zone. Defaults to topology.kubernetes.io/zone.
matchValues
The list of allowed values for the label specified in spec.pools.[].spreadZoneConfig.matchLabel. The Operator only considers nodes whose label matches one of these values when spreading Pods. If you omit this field, the Operator dynamically discovers the list of zones by querying all nodes for the spec.pools.[].spreadZoneConfig.matchLabel label.
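
For example, a minimal sketch that relies on auto-discovery, assuming the default matchLabel:

spreadZoneConfig:
  enabled: true
  # matchLabel defaults to topology.kubernetes.io/zone
  # matchValues omitted: the Operator discovers zones by querying node labels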