Site Replication

Site replication configures multiple independent AIStor deployments as a cluster of replicas called peer sites.

Figure: A site replication deployment with two sites.

Site replication assumes the use of either the included AIStor identity provider (IDP) or an external IDP. All configured deployments must use the same IDP. Deployments using an external IDP must use the same configuration across sites.

AIStor does not recommend using macOS, Windows, or non-orchestrated containerized deployments for site replication except for early development, evaluation, or general experimentation.

Prerequisites

Back up cluster settings first

Run the following commands to take snapshots of important information before you configure site replication:

mc admin cluster bucket export for bucket metadata
mc admin cluster iam export for IAM configurations

You can use these snapshots to restore bucket or IAM settings in the event of misconfiguration during site replication configuration.
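The backup step can be sketched as a small shell function. This is a minimal sketch, assuming an mc alias (here the hypothetical name site1) already points at the deployment that holds data. The function name backup_site_settings is illustrative and not part of mc.

```shell
# Export bucket metadata and IAM configuration for one deployment.
# "$1" is the mc alias of the deployment to snapshot (hypothetical example: site1).
backup_site_settings() {
  mc admin cluster bucket export "$1" &&
  mc admin cluster iam export "$1"
}
```

Calling backup_site_settings site1 runs both exports in order and stops if the first one fails. Store whatever the exports produce outside the deployment before you continue.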

One site with data at setup

Only one site can have data at the time of setup. The other sites must be empty of buckets and objects.

After configuring site replication, any data on the first deployment replicates to the other sites.

All sites must use the same IDP

All sites must use the same Identity Provider. Site replication supports the included AIStor IDP, OIDC, or LDAP.

All sites must use the same object store version

All sites must have a matching and consistent AIStor version. Configuring replication between sites with mismatched AIStor versions may result in unexpected or undesired replication behavior.

You should also ensure the mc version used to configure replication closely matches the AIStor server version.
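One way to compare versions before linking sites is to print the client version and then the server details for each alias. This is a rough sketch, assuming aliases such as site1 and site2 (hypothetical names) are already configured. The function name show_versions is illustrative, and the exact fields that mc admin info prints may differ between releases.

```shell
# Print the mc client version, then the server details for each alias given.
# Arguments: one or more mc aliases (hypothetical examples: site1 site2).
show_versions() {
  mc --version
  for site in "$@"; do
    mc admin info "$site"
  done
}
```

For example, show_versions site1 site2 lets you check by eye that every site reports the same server version.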

All sites must enable encryption

All sites must enable encryption using a Key Management Service (KMS) as part of the site replication configuration. See the Linux and Kubernetes tutorials for more information.

Replication requires versioning

Site replication requires Bucket Versioning and enables it for all created buckets automatically. You cannot disable versioning in site replication deployments.

AIStor cannot replicate objects in prefixes in the bucket that you excluded from versioning.

Load balancers installed on each site

Specify the URL or IP address of the site’s load balancer, reverse proxy, or similar network control plane component. Requests are automatically routed to nodes in the deployment.

AIStor recommends against using a single node hostname for a peer site. This creates a single point of failure: if that node goes offline, replication fails.
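The following sketch shows the idea of addressing each peer through its load balancer rather than a single node. The URLs, alias names, and credentials in the usage note are placeholders, and the function names register_site and link_sites are illustrative only.

```shell
# Register one mc alias for a peer site, pointing at that site's load balancer.
# Arguments: alias, load balancer URL, access key, secret key.
register_site() {
  mc alias set "$1" "$2" "$3" "$4"
}

# Link the registered aliases as peer sites.
# Only one of the sites may hold data at setup.
link_sites() {
  mc admin replicate add "$@"
}
```

A hypothetical invocation would be register_site site1 https://lb.site1.example.net ACCESS_KEY SECRET_KEY, repeated for each site, followed by link_sites site1 site2.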

Switch to site replication from bucket replication

Bucket replication and multi-site replication are mutually exclusive. You cannot use both replication methods on the same deployments.

If you previously set up bucket replication and wish to now use site replication, you must first delete all of the bucket replication rules on the deployment that has data when initializing site replication. Use mc replicate rm to remove bucket replication rules.
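Removing the existing rules can be sketched as a loop over the affected buckets. This is a sketch only, assuming the alias and bucket names you pass in exist. The function name remove_bucket_replication is illustrative. Review the output of mc replicate ls before relying on the removal, because the --all and --force flags delete every rule on the bucket without prompting.

```shell
# Remove every bucket replication rule from the listed buckets.
# "$1" is the mc alias; the remaining arguments are bucket names.
remove_bucket_replication() {
  site="$1"
  shift
  for bucket in "$@"; do
    mc replicate ls "$site/$bucket"               # show the rules about to be removed
    mc replicate rm --all --force "$site/$bucket" # delete all rules on this bucket
  done
}
```

For example, remove_bucket_replication site1 invoices reports clears the rules on two hypothetical buckets named invoices and reports.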

Behaviors

What replicates across all sites

Each AIStor deployment (“peer site”) synchronizes the following changes across the other peer sites:

Creation, modification, and deletion of buckets and objects, including:
- Bucket and Object Configurations
- Policies
- Object tags
- WORM Locking, including retention and legal hold configurations
- Encryption settings
Creation and deletion of IAM users, groups, policies, and policy mappings to users or groups (for LDAP users or groups)
Creation of Security Token Service (STS) credentials for session tokens verifiable from the local root credentials
Creation and deletion of access keys (except those owned by the root user)

Site replication enables bucket versioning for all new and existing buckets on all replicated sites.

Optional site replication settings

You can choose to replicate ILM expiration rules across peer sites. For new site replication configurations, use mc admin replicate add with the --replicate-ilm-expiry flag. For existing site replication configurations, enable or disable the behavior using mc admin replicate update with the --enable-ilm-expiry-replication or --disable-ilm-expiry-replication flag, respectively.
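The two cases can be sketched as follows, using only the flags named above. The function names and the alias names in the usage note are illustrative assumptions.

```shell
# New configuration: link sites and replicate ILM expiration rules from the start.
# Arguments: two or more mc aliases.
add_sites_with_ilm_expiry() {
  mc admin replicate add "$@" --replicate-ilm-expiry
}

# Existing configuration: turn ILM expiration rule replication on or off.
# "$1" is a peer site alias, "$2" is "enable" or "disable".
set_ilm_expiry_replication() {
  case "$2" in
    enable|disable)
      mc admin replicate update "$1" "--$2-ilm-expiry-replication"
      ;;
    *)
      echo "usage: set_ilm_expiry_replication ALIAS enable|disable" >&2
      return 1
      ;;
  esac
}
```

For example, add_sites_with_ilm_expiry site1 site2 covers a new setup, while set_ilm_expiry_replication site1 disable turns the behavior off on an existing one.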

What does not replicate across sites

AIStor deployments in a site replication configuration do not replicate the creation or modification of the following items:

Encryption key replication

AIStor by default replicates the key ID used to encrypt an object on the source site to all peers. This configuration assumes that all peer sites use a centralized or synchronized Key Management Service (KMS), where all peers can successfully request a key with that ID.

For site replication configurations where each peer site has a fully independent KMS, replication may fail if an object replicates with a key ID that does not exist on a peer KMS. You can set MINIO_KMS_REPLICATE_KEYID=off to disable this behavior and direct the source site to omit the key ID during replication. The peer sites then use their own encryption configurations to encrypt the replicated object.

Disabling key replication applies to all sources of key IDs: client-specified, bucket default, and cluster default. For workloads that require peer-local KMS solutions, consider deploying an AIStor Key Manager cluster where each peer has one or more local Key Manager hosts for supporting cryptographic operations. This topology can satisfy High Availability requirements around KMS deployment while removing the need to manage per-peer key ID configurations.
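As a small illustration, the setting is an ordinary environment variable. Where you set it depends on how the deployment is managed. The sketch below assumes a shell-style environment that the AIStor server process inherits, which is an assumption and not something this page specifies.

```shell
# Omit the source key ID when replicating objects to peer sites.
# Assumption: this environment is inherited by the AIStor server process.
MINIO_KMS_REPLICATE_KEYID=off
export MINIO_KMS_REPLICATE_KEYID
```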

Initial site replication process

After enabling site replication, identity and access management (IAM) settings sync in the following order:

AIStor IDP

Policies
User accounts (for local users)
Groups
Access Keys

Access Keys for root do not sync.
Policy mapping for synced user accounts
Policy mapping for Security Token Service (STS) users

OIDC

Policies
Access Keys associated with OIDC accounts that have a valid AIStor Policy. Access Keys for root do not sync.
Policy mapping for synced user accounts
Policy mapping for Security Token Service (STS) users

LDAP

Policies
Groups
Access Keys associated with LDAP accounts that have a valid AIStor Policy. Access Keys for root do not sync.
Policy mapping for synced user accounts
Policy mapping for Security Token Service (STS) users

After the initial synchronization of data across peer sites, AIStor continually replicates changes to replicable data among all sites as those changes occur on any site.

Site healing

Any AIStor deployment in the site replication configuration can resynchronize damaged replica-eligible data from the peer with the most updated (“latest”) version of that data.

AIStor dequeues replication operations that fail to replicate after three attempts. The scanner picks up those affected objects at a later time and requeues them for replication.

Failed or pending replications also requeue automatically when a client performs any GET or HEAD API operation. For example, running mc stat, mc cat, or mc ls after a site comes back online prompts AIStor to requeue those replications.

If one site loses data for any reason, resynchronize the data from another healthy site with mc admin replicate resync. This launches an active process that resynchronizes the data without waiting for the passive AIStor scanner to recognize the missing data.
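A resynchronization can be sketched as below, assuming the start and status subcommands of mc admin replicate resync and two configured aliases. The function name resync_site and the alias names in the usage note are illustrative.

```shell
# Resynchronize a peer from a healthy site, then report progress.
# "$1" is the healthy source alias, "$2" is the alias of the site that lost data.
resync_site() {
  mc admin replicate resync start "$1" "$2" &&
  mc admin replicate resync status "$1" "$2"
}
```

For example, resync_site site1 site2 starts copying from the hypothetical healthy site1 to site2 and then prints the current status. Run the status command again later to follow progress.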

You can adjust how AIStor balances the scanner performance with read/write operations using either the MINIO_SCANNER_SPEED environment variable or the scanner speed configuration setting.
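The configuration-setting route can be sketched with mc admin config set. This sketch assumes the scanner subsystem accepts a speed key with the values fastest, fast, default, slow, and slowest. Confirm the accepted values for your release before using it. The function name set_scanner_speed is illustrative.

```shell
# Set the scanner speed on a deployment.
# "$1" is the mc alias, "$2" is the desired speed (for example: slow).
# The environment variable equivalent would be MINIO_SCANNER_SPEED=<speed>.
set_scanner_speed() {
  mc admin config set "$1" scanner speed="$2"
}
```

For example, set_scanner_speed site1 slow favors client read/write operations over scanner throughput on the hypothetical alias site1.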

Synchronous vs asynchronous replication

AIStor supports specifying either asynchronous (default) or synchronous replication for a given remote target.

With asynchronous replication, AIStor completes the originating PUT operation before placing the object into a queue for replication. The originating client may therefore see a successful PUT operation before the object is replicated. While this may result in stale or missing objects on the remote, it mitigates the risk of slow write operations due to replication load.

With synchronous replication, AIStor attempts to replicate the object before completing the originating PUT operation. AIStor returns a successful PUT operation whether or not the replication attempt succeeds. Returning success regardless of the replication outcome reduces the risk of slow write operations, at a possible cost of stale or missing objects on the remote location.

AIStor strongly recommends using the default asynchronous site replication. Synchronous site replication performance depends strongly on latency between sites, where higher latency can result in lower PUT performance and replication lag. To configure synchronous site replication, use the --sync flag of mc admin replicate update.
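A possible invocation is sketched below. It assumes that --sync takes the value enable or disable and that the command identifies the target peer with --deployment-id. Both are assumptions about the mc release in use, so check mc admin replicate update --help first. The function name set_sync_replication is illustrative.

```shell
# Switch synchronous replication on or off for one peer.
# "$1" is a peer site alias, "$2" is the deployment ID of the target peer,
# "$3" is "enable" or "disable" (assumed values for --sync).
set_sync_replication() {
  mc admin replicate update "$1" --deployment-id "$2" --sync "$3"
}
```

Passing disable as the third argument would return the peer to the default asynchronous behavior under the same assumptions.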

Proxy to other sites

AIStor peer sites can proxy GET/HEAD requests for an object to other peers to check if it exists. This allows a site that is healing or lagging behind other peers to still return an object persisted to other sites.

For example:

A client issues GET("data/invoices/january.xls") to Site1.
Site1 cannot locate the object.
Site1 proxies the request to Site2.
Site2 returns the latest version of the requested object.
Site1 returns the proxied object to the client.

For GET/HEAD requests that do not include a unique version ID, the proxy request returns the latest version of that object on the peer site. This may result in retrieval of a non-current version of an object, for example when the responding peer site is also experiencing replication lag.
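When a client needs one particular version, it can request that version explicitly. The sketch below uses the --version-id flag of mc stat. The function name stat_object_version is illustrative, and the version ID is a placeholder you would obtain from a site known to hold the object.

```shell
# Ask for one specific version instead of whatever the responding peer considers latest.
# "$1" is the object path (for example: site1/data/invoices/january.xls),
# "$2" is the version ID to request.
stat_object_version() {
  mc stat --version-id "$2" "$1"
}
```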

AIStor does not proxy LIST, DELETE, and PUT operations.

Remove all versions of an object on delete

Using --purge-on-delete on individual buckets in a deployment with site replication enabled requires AIStor RELEASE.2025-03-31T22-43-59Z or later.

In applications where site replication is necessary but versioning is not critical, you can set individual buckets to delete all versions of an object when the object is deleted.

Use mc version enable with the --purge-on-delete flag on the desired bucket. For example, the following command enables deletion of all versions on the bucket called bucket1 on the deployment at alias aistor:

mc version enable aistor/bucket1 --purge-on-delete "on"