# Deploy MinIO in Distributed Mode

## Overview

A distributed MinIO deployment consists of 4 or more drives/volumes managed by one or more minio server processes, where the processes pool the compute and storage resources into a single aggregated object storage resource. Each MinIO server has a complete picture of the distributed topology, such that an application can connect to any node in the deployment and perform S3 operations.

Distributed deployments implicitly enable erasure coding, MinIO’s data redundancy and availability feature that allows deployments to automatically reconstruct objects on-the-fly despite the loss of multiple drives or nodes in the cluster. Erasure coding provides object-level healing with less overhead than adjacent technologies such as RAID or replication.

Depending on the configured erasure code parity, a distributed deployment with m servers and n disks per server can continue serving read and write operations with only m/2 servers or m*n/2 drives online and accessible.

Distributed deployments also support the following features:

## Prerequisites

### Networking and Firewalls

Each node should have full bidirectional network access to every other node in the deployment. For containerized or orchestrated infrastructures, this may require specific configuration of networking and routing components such as ingress or load balancers. Certain operating systems may also require setting firewall rules. For example, the following commands explicitly open the default MinIO server API port 9000 for servers running firewalld. Rules added with --permanent take effect only after the firewall configuration is reloaded:

firewall-cmd --permanent --zone=public --add-port=9000/tcp
firewall-cmd --reload


All MinIO servers in the deployment must use the same listen port.

If you set a static MinIO Console port (e.g. :9001) you must also grant access to that port to ensure connectivity from external clients.

MinIO strongly recommends using a load balancer to manage connectivity to the cluster. The load balancer should use a “Least Connections” algorithm for routing requests to the MinIO deployment, since any MinIO node in the deployment can receive, route, or process client requests.

The following load balancers are known to work well with MinIO:

Configuring firewalls or load balancers to support MinIO is out of scope for this procedure.

### Sequential Hostnames

MinIO requires using expansion notation {x...y} to denote a sequential series of MinIO hosts when creating a server pool. MinIO therefore requires using sequentially-numbered hostnames to represent each minio server process in the deployment.

Create the necessary DNS hostname mappings prior to starting this procedure. For example, the following hostnames would support a 4-node distributed deployment:

• minio1.example.com

• minio2.example.com

• minio3.example.com

• minio4.example.com

You can specify the entire range of hostnames using the expansion notation minio{1...4}.example.com.
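As a quick illustration of what the expansion notation denotes, the following sketch prints the hostnames covered by a range. The function name `expand_range` is hypothetical and is not part of MinIO. Note that MinIO's notation uses three dots, while shell brace expansion uses two, so the loop is written out explicitly.

```shell
#!/bin/sh
# expand_range PREFIX START END SUFFIX
# Prints one name per line for each integer from START to END (inclusive),
# mirroring what the notation PREFIX{START...END}SUFFIX denotes.
# Hypothetical helper for illustration only.
expand_range() {
    prefix=$1
    start=$2
    end=$3
    suffix=$4
    i=$start
    while [ "$i" -le "$end" ]; do
        printf '%s%s%s\n' "$prefix" "$i" "$suffix"
        i=$((i + 1))
    done
}

# Names denoted by minio{1...4}.example.com
expand_range minio 1 4 .example.com
```

The printed list can be fed to whatever name-resolution check your environment provides to confirm that every hostname in the range resolves before starting the deployment.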

Configuring DNS to support MinIO is out of scope for this procedure.

### Local JBOD Storage with Sequential Mounts

MinIO strongly recommends direct-attached JBOD arrays with XFS-formatted disks for best performance.

• Direct-Attached Storage (DAS) has significant performance and consistency advantages over networked storage (NAS, SAN, NFS).

• Deployments using non-XFS filesystems (ext4, btrfs, zfs) tend to have lower performance while exhibiting unexpected or undesired behavior.

• RAID or similar technologies do not provide additional resilience or availability benefits when used with distributed MinIO deployments, and typically reduce system performance.

Ensure all nodes in the deployment use the same type (NVMe, SSD, or HDD) of drive with identical capacity (e.g. N TB). MinIO does not distinguish drive types and does not benefit from mixed storage types. Additionally, MinIO limits the size used per disk to the smallest drive in the deployment. For example, if the deployment has fifteen 10TB disks and one 1TB disk, MinIO limits the per-disk capacity to 1TB.
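The effect of a single undersized drive can be estimated with simple arithmetic. The sketch below assumes that every drive is capped at the size of the smallest drive, as described above, and reports the raw capacity that would then be in use. The function name `used_raw_tb` is hypothetical.

```shell
#!/bin/sh
# used_raw_tb SIZE_TB...
# Prints the raw capacity (in TB) in use when every drive is capped at the
# size of the smallest drive passed in. Hypothetical helper for illustration.
used_raw_tb() {
    min=$1
    count=0
    for size in "$@"; do
        [ "$size" -lt "$min" ] && min=$size
        count=$((count + 1))
    done
    echo $((min * count))
}

# Fifteen 10TB drives and one 1TB drive: 151TB installed
used_raw_tb 10 10 10 10 10  10 10 10 10 10  10 10 10 10 10  1
```

For the example in the text this prints 16, meaning only 16TB of the 151TB of installed raw capacity would be used.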

MinIO requires using expansion notation {x...y} to denote a sequential series of disks when creating the new deployment, where all nodes in the deployment have an identical set of mounted drives. MinIO also requires that the ordering of physical disks remain constant across restarts, such that a given mount point always points to the same formatted disk. MinIO therefore strongly recommends using /etc/fstab or a similar file-based mount configuration to ensure that drive ordering cannot change after a reboot. For example:

$ mkfs.xfs /dev/sdb -L DISK1
$ mkfs.xfs /dev/sdc -L DISK2
$ mkfs.xfs /dev/sdd -L DISK3
$ mkfs.xfs /dev/sde -L DISK4

$ nano /etc/fstab

# <file system>  <mount point>  <type>  <options>         <dump>  <pass>
LABEL=DISK1      /mnt/disk1     xfs     defaults,noatime  0       2
LABEL=DISK2      /mnt/disk2     xfs     defaults,noatime  0       2
LABEL=DISK3      /mnt/disk3     xfs     defaults,noatime  0       2
LABEL=DISK4      /mnt/disk4     xfs     defaults,noatime  0       2

You can then specify the entire range of disks using the expansion notation /mnt/disk{1...4}. If you want to use a specific subfolder on each disk, specify it as /mnt/disk{1...4}/minio.

MinIO does not support arbitrary migration of a drive with existing MinIO data to a new mount position, whether intentional or as the result of OS-level behavior.

Network File System Volumes Break Consistency Guarantees

MinIO’s strict read-after-write and list-after-write consistency model requires local disk filesystems. MinIO cannot provide consistency guarantees if the underlying storage volumes are NFS or a similar network-attached storage volume.

For deployments that require using network-attached storage, use NFSv4 for best results.

## Considerations

### Homogeneous Node Configurations

MinIO strongly recommends selecting substantially similar hardware configurations for all nodes in the deployment. Ensure the hardware (CPU, memory, motherboard, storage adapters) and software (operating system, kernel settings, system services) is consistent across all nodes.

Deployment may exhibit unpredictable performance if nodes have heterogeneous hardware or software configurations. Workloads that benefit from storing aged data on lower-cost hardware should instead deploy a dedicated “warm” or “cold” MinIO deployment and transition data to that tier.

### Erasure Coding Parity

MinIO erasure coding is a data redundancy and availability feature that allows MinIO deployments to automatically reconstruct objects on-the-fly despite the loss of multiple drives or nodes in the cluster. Erasure Coding provides object-level healing with less overhead than adjacent technologies such as RAID or replication.

Distributed deployments implicitly enable and rely on erasure coding for core functionality. Erasure Coding splits objects into data and parity blocks, where parity blocks support reconstruction of missing or corrupted data blocks. The number of parity blocks in a deployment controls the deployment’s relative data redundancy. Higher levels of parity allow for higher tolerance of drive loss at the cost of total available storage.

MinIO defaults to EC:4, or 4 parity blocks per erasure set. You can set a custom parity level by setting the appropriate MinIO Storage Class environment variable. Consider using the MinIO Erasure Code Calculator for guidance in selecting the appropriate erasure code parity level for your cluster.

### Capacity-Based Planning

MinIO generally recommends planning capacity such that server pool expansion is only required after 2+ years of deployment uptime.

For example, consider an application suite that is estimated to produce 10TB of data per year. The MinIO deployment should provide at minimum:

10TB + 10TB + 10TB = 30TB

MinIO recommends adding buffer storage to account for potential growth in stored data (e.g. 40TB of total usable storage). As a rule-of-thumb, more capacity initially is preferred over frequent just-in-time expansion to meet capacity requirements.

Since MinIO erasure coding requires some storage for parity, the total raw storage must exceed the planned usable capacity. Consider using the MinIO Erasure Code Calculator for guidance in planning capacity around specific erasure code settings.

### Pre-Existing Data

When starting a new MinIO server in a distributed environment, the storage devices must not have existing data.

Once you start the MinIO server, all interactions with the data must be done through the S3 API. Use the MinIO Client, the MinIO Console, or one of the MinIO Software Development Kits to work with the buckets and objects.

Warning

Modifying files on the backend drives can result in data corruption or data loss.

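Returning to the Capacity-Based Planning example, the arithmetic can be sketched as follows. This is a rough estimate only: it assumes a single erasure set in which the data portion is (set size - parity) out of every (set size) blocks, and both function names are hypothetical. The MinIO Erasure Code Calculator remains the better source for real planning.

```shell
#!/bin/sh
# usable_target_tb GROWTH_TB_PER_YEAR YEARS BUFFER_TB
# Planned usable capacity: yearly growth over the planning window plus buffer.
usable_target_tb() {
    echo $(( $1 * $2 + $3 ))
}

# raw_needed_tb USABLE_TB SET_SIZE PARITY
# Raw capacity needed so that the data portion of each erasure set
# (SET_SIZE - PARITY out of SET_SIZE blocks) covers USABLE_TB.
# Rounds up to a whole TB. Assumes one erasure set size for the deployment.
raw_needed_tb() {
    usable=$1
    set_size=$2
    parity=$3
    data=$((set_size - parity))
    echo $(( (usable * set_size + data - 1) / data ))
}

# 10TB per year over three years plus 10TB of buffer
usable_target_tb 10 3 10

# Raw capacity for 40TB usable with 16 drives per erasure set and EC:4
raw_needed_tb 40 16 4
```

With these inputs the sketch prints 40 and then 54: roughly 54TB of raw storage would be needed to provide 40TB of usable capacity under the stated assumptions.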
## Deploy Distributed MinIO

The following procedure creates a new distributed MinIO deployment consisting of a single Server Pool.

All commands provided below use example values. Replace these values with those appropriate for your deployment.

Review the Prerequisites before starting this procedure.

### 1) Install the MinIO Binary on Each Node

The following examples install MinIO onto 64-bit Linux operating systems using RPM, DEB, or binary. The RPM and DEB packages automatically install MinIO to the necessary system paths and create a systemd service file for running MinIO automatically. MinIO strongly recommends using RPM or DEB installation routes.

Use the following commands to download the latest stable MinIO RPM and install it:

wget https://dl.min.io/server/minio/release/linux-amd64/minio-20220611195532.0.0.x86_64.rpm -O minio.rpm
sudo dnf install minio.rpm

Use the following commands to download the latest stable MinIO DEB and install it:

wget https://dl.min.io/server/minio/release/linux-amd64/minio_20220611195532.0.0_amd64.deb -O minio.deb
sudo dpkg -i minio.deb

Use the following commands to download the latest stable MinIO binary and install it to the system $PATH:

wget https://dl.min.io/server/minio/release/linux-amd64/minio
chmod +x minio
sudo mv minio /usr/local/bin/


### 2) Create the systemd Service File

The .deb or .rpm packages install the following systemd service file to /etc/systemd/system/minio.service. For binary installations, create this file manually on all MinIO hosts:

[Unit]
Description=MinIO
Documentation=https://docs.min.io
Wants=network-online.target
After=network-online.target
AssertFileIsExecutable=/usr/local/bin/minio

[Service]
WorkingDirectory=/usr/local

User=minio-user
Group=minio-user
ProtectProc=invisible

EnvironmentFile=-/etc/default/minio
ExecStartPre=/bin/bash -c "if [ -z \"${MINIO_VOLUMES}\" ]; then echo \"Variable MINIO_VOLUMES not set in /etc/default/minio\"; exit 1; fi"
ExecStart=/usr/local/bin/minio server $MINIO_OPTS $MINIO_VOLUMES

# Let systemd restart this service always
Restart=always

# Specifies the maximum file descriptor number that can be opened by this process
LimitNOFILE=65536

# Specifies the maximum number of threads this process can create
TasksMax=infinity

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=infinity
SendSIGKILL=no

[Install]
WantedBy=multi-user.target

# Built for ${project.name}-${project.version} (${project.name})


The minio.service file runs as the minio-user User and Group by default. You can create the user and group using the groupadd and useradd commands. The following example creates the group and user, then gives that user ownership of the folder paths intended for use by MinIO. These commands typically require root (sudo) permissions.

groupadd -r minio-user
useradd -M -r -g minio-user minio-user
chown minio-user:minio-user /mnt/disk1 /mnt/disk2 /mnt/disk3 /mnt/disk4


The specified disk paths are provided as an example. Change them to match the path to those disks intended for use by MinIO.

Alternatively, change the User and Group values to another user and group on the system host with the necessary access and permissions.

MinIO publishes additional startup script examples on github.com/minio/minio-service.

### 3) Create the Service Environment File

Create an environment file at /etc/default/minio. The MinIO service uses this file as the source of all environment variables used by MinIO and the minio.service file.

The following example assumes that:

• The deployment has a single server pool consisting of four MinIO server hosts with sequential hostnames.

minio1.example.com   minio3.example.com
minio2.example.com   minio4.example.com

• All hosts have four locally-attached disks with sequential mount-points:

/mnt/disk1/minio   /mnt/disk3/minio
/mnt/disk2/minio   /mnt/disk4/minio

• The deployment has a load balancer running at https://minio.example.net that manages connections across all four MinIO hosts.

Modify the example to reflect your deployment topology:

# Set the hosts and volumes MinIO uses at startup
# The command uses MinIO expansion notation {x...y} to denote a
# sequential series.
#
# The following example covers four MinIO hosts
# with 4 drives each at the specified hostname and drive locations.
# The command includes the port that each MinIO server listens on
# (default 9000)

MINIO_VOLUMES="https://minio{1...4}.example.com:9000/mnt/disk{1...4}/minio"

# Set all MinIO server options
#
# The following explicitly sets the MinIO Console listen address to
# port 9001 on all network interfaces. The default behavior is dynamic
# port selection.

MINIO_OPTS="--console-address :9001"

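# Set the root username. This user has unrestricted permissions to
# perform S3 and administrative API operations on any resource in the
# deployment.
#
# The value below is an example placeholder. Replace it.

MINIO_ROOT_USER=minioadmin

# Set the root password
#
# Use a long, random, unique string that meets your organization's
# requirements for passwords. The value below is an example placeholder.

MINIO_ROOT_PASSWORD=minio-secret-key-CHANGE-ME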
# Set to the URL of the load balancer for the MinIO deployment
# This value *must* match across all MinIO servers. If you do
# not have a load balancer, set this value to any *one* of the
# MinIO hosts in the deployment as a temporary measure.
MINIO_SERVER_URL="https://minio.example.net:9000"


You may specify other environment variables or server commandline options as required by your deployment. All MinIO nodes in the deployment should include the same environment variables with the same values for each variable.

### 4) Add TLS/SSL Certificates

MinIO enables Transport Layer Security (TLS) 1.2+ automatically upon detecting a valid x.509 certificate (.crt) and private key (.key) in the MinIO ${HOME}/.minio/certs directory. For systemd-managed deployments, use the $HOME directory of the user that runs the MinIO server process. The provided minio.service file runs the process as minio-user, and the paths below assume that this user's home directory is /home/minio-user. The example useradd command in step 2 uses -M, which does not create a home directory, so either create this path manually or specify a custom certificate directory as described below.

• Place TLS certificates into /home/minio-user/.minio/certs.

• If any MinIO server or client uses certificates signed by an unknown Certificate Authority (self-signed or internal CA), you must place the CA certs in the /home/minio-user/.minio/certs/CAs on all MinIO hosts in the deployment. MinIO rejects invalid certificates (untrusted, expired, or malformed).

If the minio.service file specifies a different user account, use the $HOME directory for that account. Alternatively, specify a custom certificate directory using the minio server --certs-dir commandline argument. Modify the MINIO_OPTS variable in /etc/default/minio to set this option. The systemd user which runs the MinIO server process must have read and listing permissions for the specified directory.

For more specific guidance on configuring MinIO for TLS, including multi-domain support via Server Name Indication (SNI), see Network Encryption (TLS). You can optionally skip this step to deploy without TLS enabled. MinIO strongly recommends against non-TLS deployments outside of early development.

### 5) Run the MinIO Server Process

Issue the following commands on each node in the deployment to start the MinIO service:

sudo systemctl start minio.service


Use the following commands to confirm the service is online and functional:

sudo systemctl status minio.service
journalctl -f -u minio.service


MinIO may log an increased number of non-critical warnings while the server processes connect and synchronize. These warnings are typically transient and should resolve as the deployment comes online.
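Beyond systemctl and journalctl, each node can be probed over HTTP through the MinIO liveness healthcheck endpoint at /minio/health/live. The following is a minimal sketch, assuming the example hostnames, port 9000, and TLS from this procedure. The function names `liveness_url` and `check_nodes` are hypothetical.

```shell
#!/bin/sh
# liveness_url HOST [PORT] [SCHEME]
# Builds the liveness probe URL for one MinIO node.
# Defaults (port 9000, https) follow the example values in this procedure.
liveness_url() {
    host=$1
    port=${2:-9000}
    scheme=${3:-https}
    printf '%s://%s:%s/minio/health/live\n' "$scheme" "$host" "$port"
}

# check_nodes HOST...
# Requests the liveness endpoint on each host and reports the result.
# --fail makes curl exit non-zero on HTTP error responses.
check_nodes() {
    for node in "$@"; do
        if curl --silent --fail --max-time 5 --output /dev/null "$(liveness_url "$node")"; then
            echo "$node: live"
        else
            echo "$node: not responding"
        fi
    done
}

# Example invocation (requires network access to the nodes):
#   check_nodes minio1.example.com minio2.example.com minio3.example.com minio4.example.com
```

If the deployment uses certificates signed by an internal CA, curl also needs to trust that CA (for example through its --cacert option) for the probe to succeed.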

### 6) Open the MinIO Console

Open your browser and access any of the MinIO hostnames at port :9001 to open the MinIO Console login page. For example, https://minio1.example.com:9001.

You can use the MinIO Console for general administration tasks like Identity and Access Management, Metrics and Log Monitoring, or Server Configuration. Each MinIO server includes its own embedded MinIO Console.

## Deployment Recommendations

### Minimum Nodes per Deployment

For all production deployments, MinIO recommends a minimum of 4 nodes per server pool with 4 drives per server. With the default erasure code parity setting of EC:4, this topology can continue serving read and write operations despite the loss of up to 4 drives or one node.

The minimum recommendation reflects MinIO’s experience with assisting enterprise customers in deploying on a variety of IT infrastructures while maintaining the desired SLA/SLO. While MinIO may run on less than the minimum recommended topology, any potential cost savings come at the risk of decreased reliability.
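The drive and node tolerance quoted for the minimum topology follows from simple arithmetic. The sketch below assumes a single erasure set whose drives are spread evenly across all nodes, so that losing one node removes a fixed number of drives from the set. The function name `tolerance` is hypothetical, and the result is an estimate rather than a guarantee for every topology.

```shell
#!/bin/sh
# tolerance NODES DRIVES_PER_NODE PARITY
# Prints the total drive count, the number of drives and whole nodes that can
# be lost, and the share of raw capacity left for data.
# Assumes one erasure set spanning every drive, spread evenly across nodes.
tolerance() {
    nodes=$1
    per_node=$2
    parity=$3
    total=$((nodes * per_node))
    node_loss=$((parity / per_node))
    usable_pct=$(( (total - parity) * 100 / total ))
    echo "drives=$total drive_loss=$parity node_loss=$node_loss usable_pct=$usable_pct"
}

# 4 nodes with 4 drives each at the default EC:4
tolerance 4 4 4
```

For the recommended minimum this prints `drives=16 drive_loss=4 node_loss=1 usable_pct=75`, matching the statement that the topology tolerates the loss of up to 4 drives or one node.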

### Server Hardware

MinIO is hardware agnostic and runs on a variety of hardware architectures ranging from ARM-based embedded systems to high-end x64 and POWER9 servers.

The following recommendations match MinIO’s Reference Hardware for large-scale data storage:

• Processor: Dual Intel Xeon Scalable Gold CPUs with 8 cores per socket.

• Memory: 128GB of Memory per pod.

• Network: Minimum of 25GbE NIC and supporting network infrastructure between nodes. MinIO can make maximum use of drive throughput, which can fully saturate network links between MinIO nodes or clients. Large clusters may require 100GbE network infrastructure to fully utilize MinIO’s per-node performance potential.

• Drives: SATA/SAS NVMe/SSD with a minimum of 8 drives per server. Drives should be JBOD arrays with no RAID or similar technologies. MinIO recommends XFS formatting for best performance. Use the same type of disk (NVMe, SSD, or HDD) with the same capacity across all nodes in the deployment. MinIO does not distinguish drive types when using the underlying storage and does not benefit from mixed storage types. Additionally, MinIO limits the size used per disk to the smallest drive in the deployment. For example, if the deployment has fifteen 10TB disks and one 1TB disk, MinIO limits the per-disk capacity to 1TB.

### Networking

MinIO recommends high speed networking to support the maximum possible throughput of the attached storage (aggregated drives, storage controllers, and PCIe busses). The following table provides general guidelines for the maximum storage throughput supported by a given NIC:

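| NIC bandwidth (Gbps) | Estimated Aggregated Storage Throughput (GBps) |
|----------------------|------------------------------------------------|
| 10GbE                | 1GBps                                          |
| 25GbE                | 2.5GBps                                        |
| 50GbE                | 5GBps                                          |
| 100GbE               | 10GBps                                         |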
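The guideline values above work out to roughly 100 MBps of storage throughput per 1 Gbps of NIC bandwidth. The sketch below uses that ratio to pick the smallest listed NIC for a host. The per-drive throughput passed in is an assumption you must supply for your own hardware, and both function names are hypothetical.

```shell
#!/bin/sh
# nic_estimate_mbps GBE
# Estimated aggregate storage throughput (MBps) a NIC of the given speed can
# carry, using the ratio implied by the guideline table (10GbE -> 1GBps).
nic_estimate_mbps() {
    echo $(( $1 * 100 ))
}

# smallest_nic_gbe DRIVE_COUNT PER_DRIVE_MBPS
# Prints the smallest NIC speed from the table whose estimate covers the
# combined throughput of the drives, or "none" if even 100GbE falls short.
# PER_DRIVE_MBPS is an assumed figure, not a MinIO-published value.
smallest_nic_gbe() {
    needed=$(( $1 * $2 ))
    for gbe in 10 25 50 100; do
        if [ "$(nic_estimate_mbps "$gbe")" -ge "$needed" ]; then
            echo "$gbe"
            return 0
        fi
    done
    echo "none"
    return 1
}

# 8 drives at an assumed 500 MBps each need about 4000 MBps in aggregate
smallest_nic_gbe 8 500
```

With the assumed figures this prints 50, suggesting a 50GbE NIC for that host.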

### CPU Allocation

MinIO can perform well with consumer-grade processors. MinIO can take advantage of CPUs which support AVX-512 SIMD instructions for increased performance of certain operations.

MinIO benefits from allocating CPU based on the expected per-host network throughput. The following table provides general guidelines for allocating CPU for use by MinIO based on the total network bandwidth supported by the host:

| Host NIC Bandwidth | Recommended Pod vCPU |
|--------------------|----------------------|
| 10GbE or less      | 8 vCPU per pod.      |
| 25GbE              | 16 vCPU per pod.     |
| 50GbE              | 32 vCPU per pod.     |
| 100GbE             | 64 vCPU per pod.     |

### Memory Allocation

MinIO benefits from allocating memory based on the total storage of each host. The following table provides general guidelines for allocating memory for use by MinIO server processes based on the total amount of local storage on the host:

| Total Host Storage        | Recommended Host Memory |
|---------------------------|-------------------------|
| Up to 1 Tebibyte (Ti)     | 8GiB                    |
| Up to 10 Tebibyte (Ti)    | 16GiB                   |
| Up to 100 Tebibyte (Ti)   | 32GiB                   |
| Up to 1 Pebibyte (Pi)     | 64GiB                   |
| More than 1 Pebibyte (Pi) | 128GiB                  |

### Requests Per Node

You can calculate the maximum number of concurrent requests per host with this formula:

$$\frac{\text{totalRam}}{\text{ramPerRequest}}$$

To calculate the amount of RAM used for each request, use this formula:

$$\big((2\,\text{MiB} + 128\,\text{KiB}) \times \text{driveCount}\big) + (2 \times 10\,\text{MiB}) + (2 \times 1\,\text{MiB})$$

10MiB is the default erasure block size v1. 1 MiB is the default erasure block size v2.

The following table lists the maximum concurrent requests on a node based on the number of host drives and the free system RAM:

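| Number of Drives | 32 GiB of RAM | 64 GiB of RAM | 128 GiB of RAM | 256 GiB of RAM | 512 GiB of RAM |
|------------------|---------------|---------------|----------------|----------------|----------------|
| 4 Drives         | 1,074         | 2,149         | 4,297          | 8,595          | 17,190         |
| 8 Drives         | 840           | 1,680         | 3,361          | 6,722          | 13,443         |
| 16 Drives        | 585           | 1,170         | 2,341          | 4,681          | 9,362          |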
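The two formulas in this section can be evaluated with integer arithmetic by working in KiB. The sketch below is one way to do so. The function names are hypothetical, and rounding to the nearest whole request is an assumption chosen because it reproduces the values in the table.

```shell
#!/bin/sh
# ram_per_request_kib DRIVE_COUNT
# ((2MiB + 128KiB) * driveCount) + (2 * 10MiB) + (2 * 1MiB), expressed in KiB.
ram_per_request_kib() {
    drives=$1
    echo $(( (2 * 1024 + 128) * drives + 2 * 10 * 1024 + 2 * 1 * 1024 ))
}

# max_requests RAM_GIB DRIVE_COUNT
# totalRam / ramPerRequest, rounded to the nearest whole request.
max_requests() {
    total_kib=$(( $1 * 1024 * 1024 ))
    per_kib=$(ram_per_request_kib "$2")
    echo $(( (total_kib + per_kib / 2) / per_kib ))
}

# 4 drives with 32 GiB of RAM
max_requests 32 4

# 16 drives with 128 GiB of RAM
max_requests 128 16
```

These two calls print 1074 and 2341, which agree with the corresponding table entries. For 4 drives the per-request figure is 31,232 KiB, or 30.5 MiB.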