Network and storage performance testing

This guide covers a layered approach to performance validation for MinIO AIStor deployments using purpose-built tools. Run these tests before and after upgrades to establish baselines and detect regressions.

For built-in MinIO performance tests, see Benchmarking. For OS-level tuning, see Performance Tuning.

Toolset

MinIO provides three dedicated performance testing tools:

| Tool  | Description                                                                | Install                                  |
|-------|----------------------------------------------------------------------------|------------------------------------------|
| hperf | Network bandwidth and latency testing across nodes and data centers        | go install github.com/minio/hperf@latest |
| dperf | Drive I/O benchmarking for identifying storage bottlenecks and outliers    | go install github.com/minio/dperf@latest |
| warp  | S3 application-layer benchmarking (PUT, GET, LIST, STAT, mixed, multipart) | go install github.com/minio/warp@latest  |

Testing layers

Performance testing should cover all layers of the stack. Issues at lower layers (network, disk) directly impact application-layer performance.

| Layer            | Tests                                                   | Tools        |
|------------------|---------------------------------------------------------|--------------|
| Link (L1)        | Per-node NIC throughput, MTU validation, drive baseline | hperf, dperf |
| Internet (L2)    | Cross-DC latency, cross-DC bandwidth                    | hperf        |
| Transport (L3)   | TCP buffer tuning, concurrency scaling                  | hperf        |
| Application (L4) | S3 PUT, GET, mixed, LIST, STAT, multipart               | warp         |

Pre-check procedures

Run these checks on every node before starting performance tests.

NTP clock synchronization

ssh NODE_HOSTNAME "chronyc tracking"
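To apply this check across a fleet, the reported offset can be parsed and compared against a threshold. This is a sketch: the 10 ms threshold and the "System time" line format are assumptions based on typical chronyc output, and nodes.txt is a hypothetical host list.

```shell
#!/bin/sh
# Flag a node whose chrony clock offset exceeds a threshold.
# THRESHOLD (10 ms) is an assumption -- tighten or relax as needed.
THRESHOLD=0.01  # seconds

check_offset() {
    # $1: the output of `chronyc tracking` for one node
    offset=$(printf '%s\n' "$1" | awk '/^System time/ {print $4}')
    # Exit 0 when the absolute offset is within the threshold.
    awk -v o="$offset" -v t="$THRESHOLD" \
        'BEGIN { if (o < 0) o = -o; exit (o <= t) ? 0 : 1 }'
}

# Example (hypothetical nodes.txt listing one hostname per line):
# while read -r node; do
#     check_offset "$(ssh "$node" chronyc tracking)" || echo "$node: clock drift"
# done < nodes.txt
```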

MTU verification

ssh NODE_HOSTNAME "ip link show | grep mtu"

Drive health check

ssh NODE_HOSTNAME "df -h /data/disk{1..24}"

Adjust the disk path and count to match your deployment.
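To flag near-full drives across the fleet, the df output can be filtered. The 90% threshold and the standard six-column df layout are assumptions; adjust both to your environment.

```shell
# Print any mountpoint above a usage threshold from `df -h` output.
# Columns assumed: Filesystem, Size, Used, Avail, Use%, Mounted on.
full_drives() {
    printf '%s\n' "$1" | awk 'NR > 1 {
        pct = $5
        sub("%", "", pct)
        if (pct + 0 > 90) print $6
    }'
}

# Example:
# full_drives "$(ssh NODE_HOSTNAME 'df -h /data/disk{1..24}')"
```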

NIC ring buffer check

Check current ring buffer sizes:

ssh NODE_HOSTNAME "ethtool -g INTERFACE"

Set to maximum if not already:

ssh NODE_HOSTNAME "ethtool -G INTERFACE rx MAX_RX tx MAX_TX"

Confirm link speed, duplex, and link state, and review the interface error counters:

ssh NODE_HOSTNAME "ethtool INTERFACE | grep -E 'Speed|Duplex|Link detected'"
ssh NODE_HOSTNAME "ip -s link show INTERFACE"
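Comparing current ring sizes against the preset maximums can be automated. This sketch assumes the usual ethtool -g layout (preset maximums listed first, current hardware settings second); verify against your driver's output.

```shell
# Exit non-zero when the current RX or TX ring size is below the
# preset maximum reported by `ethtool -g`. Assumes the maximums
# block precedes the current-settings block, so the first RX:/TX:
# lines are maximums and the second are current values.
ring_at_max() {
    printf '%s\n' "$1" | awk '
        /^RX:/ { rx[++r] = $2 }
        /^TX:/ { tx[++t] = $2 }
        END { exit (rx[2] >= rx[1] && tx[2] >= tx[1]) ? 0 : 1 }'
}

# Example:
# ring_at_max "$(ssh NODE_HOSTNAME 'ethtool -g INTERFACE')" || echo "ring buffers below max"
```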

L1: Link layer tests

These tests validate the physical network: NIC throughput, link capacity, and MTU correctness. Run them intra-DC to measure raw link performance without cross-DC routing effects.

Intra-DC bandwidth per node (hperf)

Measure maximum bandwidth between nodes within the same data center:

hperf bandwidth \
    --hosts DC_HOSTS \
    --port 9010 \
    --duration 60 \
    --concurrency 16 \
    --payload-size 10000000 \
    --id link-bw-DC_NAME

Replace DC_HOSTS with comma-separated hostnames or file:/path/to/dc-hosts.txt. Run once per DC. Any node with significantly lower throughput than its peers warrants investigation.
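One way to spot a lagging node is to tabulate per-host throughput and flag anything well below the group mean. The two-column "hostname throughput" input format here is an assumption about how you tabulate hperf results, and the 80% cutoff is a suggested starting point.

```shell
# Flag hosts whose bandwidth falls below 80% of the group mean.
# Input: "hostname throughput" pairs, one per line, in consistent
# units (the format is an assumption; adapt to your tabulation).
bw_outliers() {
    printf '%s\n' "$1" | awk '
        { host[NR] = $1; bw[NR] = $2; sum += $2 }
        END {
            mean = sum / NR
            for (i = 1; i <= NR; i++)
                if (bw[i] < 0.8 * mean) print host[i]
        }'
}
```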

Intra-DC bandwidth via mc support perf (alternative)

If hperf is not yet deployed, mc support perf provides a quick network throughput test using the MinIO cluster’s internal mesh:

mc support perf net ALIAS

MTU validation (hperf)

Detect MTU misconfigurations by sending payloads sized to require jumbo frames:

hperf bandwidth \
    --hosts ALL_HOSTS \
    --port 9010 \
    --duration 30 \
    --concurrency 1 \
    --payload-size 8972 \
    --buffer-size 8972 \
    --id link-mtu-check

The payload size of 8972 bytes fills a 9000-byte jumbo frame: 9000 minus 28 bytes of IP and ICMP overhead, the same figure used in the ping check below. Any drops or significantly lower throughput on specific hosts indicate an MTU mismatch.

Point-to-point MTU check (Linux tools)

For targeted checks, use ping with the do-not-fragment flag:

ping -M do -s 8972 -c 3 REMOTE_HOST

If the path MTU is smaller than the packet size, the ping fails with “message too long”.
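The 28-byte offset (20 bytes of IPv4 header plus 8 bytes of ICMP header) generalizes to any target MTU; a tiny helper keeps the arithmetic straight.

```shell
# Ping payload that exactly fills a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
ping_payload() {
    echo $(( $1 - 28 ))
}

# Example: probe a standard 1500-byte path
# ping -M do -s "$(ping_payload 1500)" -c 3 REMOTE_HOST
```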

L2: Internet layer tests

These tests validate IP routing, cross-DC connectivity, and latency characteristics.

DC-to-DC latency (hperf)

Measure round-trip latency between data centers:

hperf latency \
    --hosts HOST1,HOST2 \
    --port 9010 \
    --duration 60 \
    --concurrency 1 \
    --request-delay 10 \
    --payload-size 1000 \
    --id inet-latency-dc1-dc2

Run for each DC pair. The small payload and low concurrency ensure you measure network latency, not throughput.

DC-to-DC bandwidth (hperf)

Measure maximum throughput between data centers:

hperf bandwidth \
    --hosts CROSS_DC_HOSTS \
    --port 9010 \
    --duration 60 \
    --concurrency 16 \
    --payload-size 10000000 \
    --id inet-bw-dc1-dc2

Full-mesh latency (hperf)

Measure latency between all nodes simultaneously to identify routing anomalies or asymmetric paths:

hperf latency \
    --hosts ALL_HOSTS \
    --port 9010 \
    --duration 120 \
    --concurrency 1 \
    --request-delay 50 \
    --payload-size 1000 \
    --id inet-latency-fullmesh

After completion, download and analyze the results:

hperf download --hosts ALL_HOSTS --id inet-latency-fullmesh --file inet-latency-fullmesh.json
hperf analyze --file inet-latency-fullmesh.json --print-stats

Any server with P99 latency significantly higher than the median indicates a misconfigured route, congested switch uplink, or NIC issue.
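A quick way to apply that rule: extract per-host P99 figures into "hostname p99" pairs and flag anything above twice the median. The input shape is an assumption, since the analyze output format varies by hperf version.

```shell
# Flag hosts whose P99 latency exceeds twice the group median.
# Input: "hostname p99_ms" pairs, one per line (assumed format).
p99_outliers() {
    printf '%s\n' "$1" | sort -k2 -n | awk '
        { host[NR] = $1; p99[NR] = $2 }
        END {
            if (NR % 2) m = p99[(NR + 1) / 2]
            else        m = (p99[NR / 2] + p99[NR / 2 + 1]) / 2
            for (i = 1; i <= NR; i++)
                if (p99[i] > 2 * m) print host[i]
        }'
}
```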

L3: Transport layer tests

These tests validate TCP behavior under load: buffer tuning, connection concurrency, and aggregate throughput.

Transport buffer size check (hperf)

Test throughput at different TCP buffer sizes to validate that MinIO AIStor’s default buffer configuration is optimal for the network:

hperf bandwidth \
    --hosts DC_HOSTS \
    --port 9010 \
    --duration 60 \
    --concurrency 8 \
    --payload-size 5000000 \
    --buffer-size 65536 \
    --id transport-buf-64k

Run the same test with buffer sizes of 64 KB, 128 KB, and 256 KB (adjusting the --id accordingly) to find the smallest buffer at which throughput plateaus.
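The sweep can be scripted. The buffer sizes below mirror the ones suggested above; the DRY_RUN switch is a convenience added here for previewing the generated commands before a long run.

```shell
# Run the transport buffer sweep at 64 KB, 128 KB, and 256 KB,
# tagging each run's --id with the buffer size.
# Set DRY_RUN=1 to print the commands instead of executing them.
run_buffer_sweep() {
    hosts=$1
    for buf in 65536 131072 262144; do
        set -- hperf bandwidth --hosts "$hosts" --port 9010 \
            --duration 60 --concurrency 8 --payload-size 5000000 \
            --buffer-size "$buf" --id "transport-buf-$buf"
        if [ "${DRY_RUN:-0}" = 1 ]; then echo "$@"; else "$@"; fi
    done
}

# Example preview:
# DRY_RUN=1 run_buffer_sweep file:/path/to/dc-hosts.txt
```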

Transport concurrency scaling (hperf)

Test whether throughput scales with increasing connection concurrency:

hperf bandwidth \
    --hosts DC_HOSTS \
    --port 9010 \
    --duration 60 \
    --concurrency 16 \
    --payload-size 5000000 \
    --id transport-conc-16

Run with concurrency values of 16, 32, 64, 128, and 256 (adjusting the --id accordingly) to find the point where aggregate throughput stops scaling.

L4: Application layer tests

Disk I/O baseline (dperf)

Establish a storage baseline before the S3 benchmarks; slow or failing drives cap application-layer throughput. Test all drives simultaneously to measure aggregate disk throughput:

dperf -v /data/disk{1..24}

The -v flag reports per-drive throughput. Any drive significantly slower than its peers may have a hardware issue.

Test drives serially to measure isolated performance without bus contention:

dperf -v --serial /data/disk{1..24}

If aggregate throughput is significantly less than the sum of serial throughputs, the storage bus or HBA is a bottleneck.
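To quantify the gap, compare the parallel aggregate against the sum of the serial runs. The helper below takes integer MiB/s figures read off the dperf output; that manual tabulation step is an assumption, since dperf prints human-readable units.

```shell
# Percent of the serial-sum throughput achieved by the parallel run.
# A result well under 100 suggests the HBA or storage bus is the
# bottleneck. Usage: bus_ratio AGGREGATE SERIAL1 SERIAL2 ...
# (all figures as integer MiB/s)
bus_ratio() {
    agg=$1; shift
    total=0
    for t in "$@"; do total=$((total + t)); done
    echo $(( 100 * agg / total ))
}
```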

S3 PUT benchmark (warp)

warp put \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --obj.size 64MiB \
    --concurrent 20 \
    --duration 15m \
    --autoterm \
    --benchdata warp-put-64mib

The --autoterm flag ends the test early if throughput stabilizes.

S3 GET benchmark (warp)

warp get \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --obj.size 64MiB \
    --objects 2500 \
    --concurrent 20 \
    --duration 5m \
    --autoterm \
    --benchdata warp-get-64mib

The --objects 2500 flag pre-creates a pool of 2,500 objects to read from.

S3 mixed workload (warp)

Simulate a realistic production workload with a configurable mix of operations:

warp mixed \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --obj.size 10MiB \
    --objects 2000 \
    --concurrent 20 \
    --duration 10m \
    --get-distrib 45 \
    --put-distrib 30 \
    --stat-distrib 15 \
    --delete-distrib 10 \
    --autoterm \
    --benchdata warp-mixed

Adjust the --*-distrib percentages to match your expected workload profile.
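When adjusting the percentages, a one-line guard catches typos before a long benchmark run, assuming you want the four shares to total 100 as in the example above.

```shell
# Verify that GET/PUT/STAT/DELETE distribution percentages sum to 100.
distrib_ok() {
    [ $(( $1 + $2 + $3 + $4 )) -eq 100 ]
}

# Example:
# distrib_ok 45 30 15 10 && echo "distribution ok"
```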

S3 LIST benchmark (warp)

warp list \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --objects 10000 \
    --concurrent 8 \
    --duration 3m \
    --benchdata warp-list

S3 STAT benchmark (warp)

warp stat \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --objects 5000 \
    --concurrent 8 \
    --duration 3m \
    --benchdata warp-stat

S3 multipart upload (warp)

warp multipart \
    --host ENDPOINT \
    --access-key ACCESS_KEY \
    --secret-key SECRET_KEY \
    --tls \
    --insecure \
    --parts 50 \
    --part.size 10MiB \
    --concurrent 4 \
    --duration 5m \
    --benchdata warp-multipart

Capturing production workload statistics

To tune the warp mixed distribution parameters, capture the actual S3 operation distribution from a production cluster:

mc admin trace ALIAS --verbose > operations-log.txt

Run for at least one hour during peak usage. Analyze the log to determine the GET/PUT/STAT/DELETE ratio and use those values for the --*-distrib flags.
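A rough tally can be pulled straight from the capture. This sketch assumes each request line carries an s3.<Operation> token (e.g. s3.GetObject); verify that against your mc version's trace format before relying on the counts.

```shell
# Tally S3 API calls in a trace capture, most frequent first.
# Assumes request lines contain tokens like "s3.GetObject".
op_distribution() {
    grep -o 's3\.[A-Za-z]*' "$1" | sort | uniq -c | sort -rn
}

# Example:
# op_distribution operations-log.txt
```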