Bucket Replication
AIStor server-side bucket replication is an automatic bucket-level configuration that synchronizes objects between a source and destination bucket. Server-side replication requires that the source and destination buckets reside on two separate AIStor clusters running the same Object Store version.
For each write operation to the bucket, AIStor checks all configured replication rules for the bucket and applies the matching rule with highest configured priority. AIStor synchronizes new objects and object mutations, such as new object versions or changes to object metadata. This includes metadata operations such as enabling or modifying object locking or retention settings.
AIStor server-side bucket replication is functionally similar to Amazon S3 replication, with the addition of the following AIStor-only features:
- Source and destination bucket names can match, supporting site-to-site use cases such as Splunk or Veeam BC/DR.
- Simpler implementation than S3 bucket replication, removing the need to configure settings such as AccessControlTranslation, Metrics, and SourceSelectionCriteria.
- Active-Active (Two-Way) replication of objects between source and destination buckets.
- Multi-Site replication of objects between three or more AIStor deployments.
Replication Behaviors
Replication of Delete Operations
AIStor supports replicating delete operations, where AIStor synchronizes deleting specific object versions and new delete markers. Delete operation replication uses the same replication process as all other replication operations.
AIStor requires explicitly enabling versioned deletes and delete marker replication. Use the `mc replicate add --replicate` flag to specify either or both of `delete` and `delete-marker` to enable versioned delete and delete marker replication, respectively. To enable both, specify both strings separated by a comma: `delete,delete-marker`.
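As a sketch, a rule enabling both delete and delete marker replication might look like the following. The alias `myminio`, the bucket names, and the credentialed target URL are placeholders; substitute values for your own deployments.

```shell
# Create a replication rule that also replicates versioned deletes
# and delete markers. "myminio" is a hypothetical alias for the
# source deployment; the --remote-bucket URL embeds credentials for
# the destination deployment.
mc replicate add myminio/mybucket \
   --remote-bucket "https://ACCESSKEY:SECRETKEY@replica.example.net/mybucket" \
   --replicate "delete,delete-marker"
```

Omitting either string from `--replicate` leaves the corresponding delete behavior disabled for that rule.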
For delete marker replication, AIStor begins the replication process after a delete operation creates the delete marker.
AIStor uses the `X-Minio-Replication-DeleteMarker-Status` metadata field to track delete marker replication status.
In active-active replication configurations, AIStor may produce duplicate delete markers if both clusters concurrently create a delete marker for an object or if one or both clusters were down before the replication event synchronized.
For replicating the deletion of a specific object version, AIStor marks the object version as `PENDING` until replication completes. Once the remote target deletes that object version, AIStor deletes the object on the source. While this process ensures near-synchronized version deletion, it may result in listing operations returning the object version after the initial delete operation. AIStor uses the `X-Minio-Replication-Delete-Status` metadata field to track delete version replication status.
AIStor only replicates explicit client-driven delete operations. AIStor does not replicate objects deleted by the application of lifecycle management expiration rules. For active-active configurations, set the same expiration rules on all of the replicated buckets to ensure consistent application of object expiration.
Replication of existing objects
AIStor by default replicates existing objects in the source bucket to the configured remote, similar to the AWS feature Replicating existing objects between S3 buckets, but without the overhead of contacting technical support to enable it.
AIStor marks all objects or object prefixes that satisfy the replication rules as eligible for synchronization to the remote cluster and bucket. AIStor only excludes those objects without a version ID, such as those objects written before enabling versioning on the bucket.
You can disable existing object replication while configuring or modifying the bucket replication rule. You must specify all desired replication features during creation or modification:

- For new replication rules, exclude `existing-objects` from the list of replication features specified to `mc replicate add --replicate`.
- For existing replication rules, remove `existing-objects` from the list of replication features using `mc replicate update --replicate`. The new rule replaces the previous rule.
Disabling existing object replication does not remove any objects already replicated to the remote bucket.
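A minimal sketch of modifying a rule to stop replicating existing objects follows. The alias and rule ID are hypothetical; note that the `--replicate` list fully replaces the features on the existing rule, so repeat any features you want to keep.

```shell
# Find the ID of the rule to modify.
mc replicate ls myminio/mybucket

# Rewrite the rule's feature list without "existing-objects".
# "xxxx-xxxx" is a placeholder for the rule ID from the listing above.
mc replicate update myminio/mybucket --id "xxxx-xxxx" \
   --replicate "delete,delete-marker"
```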
Synchronous vs asynchronous replication
AIStor supports specifying either asynchronous (default) or synchronous replication for a given remote target.
With asynchronous replication, AIStor completes the originating `PUT` operation before placing the object into a replication queue. The originating client may therefore see a successful `PUT` operation before the object is replicated. While this may result in stale or missing objects on the remote, it mitigates the risk of slow write operations due to replication load.

With synchronous replication, AIStor attempts to replicate the object before completing the originating `PUT` operation. AIStor returns a successful `PUT` operation whether or not the replication attempt succeeds. This reduces the risk of stale or missing objects on the remote location at a possible cost of slower write operations.
You can configure synchronous replication after creating the bucket replication rule using the `mc replicate update --sync` command.
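A sketch of enabling synchronous replication on an existing rule, assuming `--sync` accepts an `enable`/`disable` value as in the `mc` reference; the alias and rule ID are placeholders:

```shell
# Switch an existing rule from the asynchronous default to
# synchronous replication. "xxxx-xxxx" is a hypothetical rule ID.
mc replicate update myminio/mybucket --id "xxxx-xxxx" --sync "enable"
```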
Replicating from the edge
AIStor Object Store supports replicating objects from a bucket on a cluster located at the “edge”, where compute resources are placed closer to users to reduce latency.
Create bucket replication rules on the edge cluster using `mc replicate add --edge`.
To modify an existing rule, use the same flag with `mc replicate update`. With the `--edge` flag enabled on a rule, objects replicate from the edge cluster to their target with a status of `REPLICA-EDGE`. This status allows the receiving cluster to further replicate the object to additional targets, based on the rules of the receiving cluster.
Use caution when configuring replication rules to prevent creating a loop where objects re-replicate back to the original source cluster at the edge.
If you also have ILM expiration rules defined on the edge cluster, you can prevent those rules from deleting objects from the edge until after they have replicated to another cluster. Use the `--edge-sync-before-expiry` flag on the rule to prevent automatic deletion until replication has completed.
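Combining the two edge flags described above, an edge-to-core rule might be sketched as follows. The `edge` alias, bucket name, and target URL are placeholders:

```shell
# Replicate from a hypothetical edge deployment to a core deployment.
# --edge marks replicated objects as REPLICA-EDGE so the core can
# fan them out further; --edge-sync-before-expiry defers ILM
# expiration on the edge until replication completes.
mc replicate add edge/mybucket \
   --remote-bucket "https://ACCESSKEY:SECRETKEY@core.example.net/mybucket" \
   --edge --edge-sync-before-expiry
```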
Resynchronization (Disaster Recovery)
Resynchronization primarily supports recovery after partial or total loss of the data on an AIStor deployment, using a healthy deployment in the replica configuration. Use the `mc replicate resync` command to completely resynchronize the remote target using the specified source bucket.
The resynchronization process checks all objects in the source bucket against all configured replication rules that include existing object replication. For each object which matches a rule, the resynchronization process places the object into the replication queue regardless of the object’s current replication status.
AIStor skips synchronizing those objects whose remote copy exactly matches the source, including object metadata. AIStor otherwise does not prioritize or modify the queue with regard to the existing contents of the target.
`mc replicate resync` operates at the bucket level and does not support prefix-level granularity. Initiating resynchronization on a large bucket may result in a significant increase in replication-related load and traffic. Use this command with caution and only when necessary.
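A sketch of starting and monitoring a resynchronization; the alias and target ARN are placeholders, and the ARN for a configured target can be found in the rule listing:

```shell
# Identify the remote target's ARN from the existing rules.
mc replicate ls myminio/mybucket

# Start a full resynchronization to that target, then check progress.
# The ARN below is a hypothetical placeholder.
mc replicate resync start myminio/mybucket \
   --remote-bucket "arn:minio:replication::UUID:mybucket"
mc replicate resync status myminio/mybucket
```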
For buckets with object transition (Tiering) configured, replication resynchronization restores objects in a non-transitioned state with no associated transition metadata. Any data previously transitioned to the remote storage is therefore permanently disconnected from the remote AIStor deployment. For tiering configurations which specify an explicit human-readable prefix as part of the remote configuration, you can safely purge the transitioned data in that prefix to avoid costs associated with the "lost" data.
Replication internals
This section documents internal replication behavior and is not critical to using or implementing replication. This documentation is provided strictly for learning and educational purposes.
AIStor uses a replication queuing system with multiple concurrent replication workers operating on that queue. AIStor continuously works to replicate and remove objects from the queue while scanning for new unreplicated objects to add to the queue.
AIStor queues failed replication operations and retries those operations up to three (3) times. AIStor dequeues replication operations that fail to replicate after three attempts. The scanner can pick up those affected objects at a later time and requeue them for replication.
Failed or pending replications requeue automatically when performing a list operation or any `GET` or `HEAD` API method. For example, using the `mc stat`, `mc cat`, or `mc ls` commands after a site comes back online prompts healing to requeue the affected objects.
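For instance, a simple stat of an object both reports its current replication status and, after an outage, triggers the requeue behavior described above. The alias, bucket, and object names are placeholders:

```shell
# Inspect an object's metadata, including X-Amz-Replication-Status.
# Reading the object after a remote site recovers also prompts
# healing to requeue any failed or pending replication.
mc stat myminio/mybucket/myobject
```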
AIStor sets the `X-Amz-Replication-Status` metadata field according to the replication state of the object:
| Replication State | Description |
|---|---|
| `PENDING` | The object has not yet been replicated. AIStor applies this state if the object meets one of the configured replication rules on the bucket. AIStor continuously scans for `PENDING` objects not yet in the replication queue and adds them to the queue as space is available. For multi-site replication, objects remain in the `PENDING` state until replicated to all configured remotes for that bucket or bucket prefix. |
| `COMPLETED` | The object has successfully replicated to the remote cluster. |
| `FAILED` | The object failed to replicate to the remote cluster. AIStor continuously scans for `FAILED` objects not yet in the replication queue and adds them to the queue as space is available. |
| `REPLICA` | The object is itself a replica from a remote source. |
| `REPLICA-EDGE` | The object has been replicated from an edge source and is available to replicate to other targets. |
The replication process generally follows one of these flows:

- `PENDING` -> `COMPLETED`
- `PENDING` -> `FAILED` -> `COMPLETED`