Debugging and Troubleshooting

This guide covers tools and techniques for debugging MinIO AIStor deployments.

HTTP tracing

Use mc admin trace to capture real-time HTTP API calls to and from the MinIO AIStor server.

By default, the trace shows only the API operation name and the HTTP response status:

mc admin trace ALIAS

Trace entire HTTP requests including headers and body:

mc admin trace --verbose ALIAS

Trace all HTTP requests including internode communication:

mc admin trace --all --verbose ALIAS

Filter by API operation

Use --funcname to trace specific S3 API operations:

mc admin trace --call s3 --funcname s3.PutObject ALIAS

Combine multiple operations with a comma-separated list:

mc admin trace --call s3 --funcname s3.GetObject,s3.HeadObject ALIAS

Filter by path

Use --path to trace operations on a specific bucket or object:

mc admin trace --call s3 --path "mybucket/path/to/object" ALIAS

Filter by request or response size

Use --filter-request or --filter-response with --filter-size to trace only large operations:

mc admin trace --call s3 --filter-request --filter-size 100MB ALIAS

Filter by HTTP status code

Use --status-code to trace only requests that returned specific HTTP status codes, such as errors:

mc admin trace --call s3 --status-code 403,500,503 ALIAS
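A trace runs until it is interrupted, so for an intermittent problem it can be convenient to record a bounded window to a file. The following is a minimal shell sketch, not part of mc: capture_trace is a hypothetical helper name, the log file name is arbitrary, and it assumes the coreutils timeout command is available.

```shell
# capture_trace SECONDS OUTFILE COMMAND...
# Hypothetical helper: runs COMMAND for at most SECONDS and saves its
# output (stdout and stderr) to OUTFILE.
capture_trace() {
    ct_seconds=$1
    ct_outfile=$2
    shift 2
    ct_status=0
    timeout "$ct_seconds" "$@" > "$ct_outfile" 2>&1 || ct_status=$?
    # timeout exits with 124 when it had to stop the command, which is the
    # expected outcome for a long-running trace.
    if [ "$ct_status" -eq 124 ]; then
        return 0
    fi
    return "$ct_status"
}

# Example, using the ALIAS placeholder from this guide:
# capture_trace 60 trace-errors.log mc admin trace --call s3 --status-code 403,500,503 ALIAS
```

The helper only wraps whatever command it is given, so any of the filters described above can be passed through unchanged.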

Performance profiling

Use mc support profile to capture CPU, memory, and blocking profiles during a performance issue. Run the profile during the slow period for the most useful data.

mc support profile ALIAS --type cpu,mem,block

Additional profile types: mutex, trace, threads, goroutines, cpuio, runtime, metrics.

Reading profile output

Analyze the downloaded profile files with go tool pprof:

go tool pprof -top profile-NODE-cpu.pprof
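With several nodes there is one CPU profile per node, and checking them one at a time is tedious. The sketch below loops over files that follow the profile-NODE-cpu.pprof naming shown above and prints the first lines of each -top report. summarize_cpu_profiles is a hypothetical helper, and it assumes the downloaded profiles have already been extracted into a single directory.

```shell
# summarize_cpu_profiles DIR [LINES]
# Hypothetical helper: prints the first LINES lines (default 20) of the
# `go tool pprof -top` report for every CPU profile found in DIR.
summarize_cpu_profiles() {
    sp_dir=$1
    sp_lines=${2:-20}
    for sp_profile in "$sp_dir"/profile-*-cpu.pprof; do
        # Skip the unexpanded pattern when no file matches.
        [ -e "$sp_profile" ] || continue
        echo "== $sp_profile"
        if sp_report=$(go tool pprof -top "$sp_profile" 2>/dev/null); then
            # Keep only the top of the report.
            printf '%s\n' "$sp_report" | sed -n "1,${sp_lines}p"
        else
            echo "   (go tool pprof could not read this file)"
        fi
    done
}

# Example: summarize_cpu_profiles ./profiles 15
```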

Common bottleneck patterns:

| Profile pattern | Cause | Action |
|---|---|---|
| High time in syscall.openat / xlStorage.readMetadata | Metadata I/O saturation from many objects | Enable drive cache (MINIO_CACHE_ENABLE=on), upgrade for metadata path optimizations |
| High time in runtime.selectgo / sync.Mutex.Lock | Lock contention under high concurrency | Upgrade for lock contention fixes, check internode latency |
| High xlStorage.WalkDir | Listing operations scanning large prefixes | Reduce prefix depth, check scanner delay setting |
| runtime.newosproc in crash dump | OS thread exhaustion | Increase ulimit -u or container pids limit |
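The patterns listed above can be checked mechanically as a rough first pass. The sketch below reads a -top report (or a crash dump) on standard input and prints the matching cause for each function name it finds. classify_hotspots is a hypothetical helper; it only detects that a name is present, not how much time it accounts for, and its patterns are written to tolerate the receiver syntax Go usually prints, such as sync.(*Mutex).Lock.

```shell
# classify_hotspots
# Hypothetical helper: reads profile text on stdin and prints one line per
# known bottleneck pattern whose function names appear in the input.
classify_hotspots() {
    awk '
        /syscall\.openat|xlStorage\)?\.readMetadata/      { meta = 1 }
        /runtime\.selectgo|sync\.(\(\*)?Mutex\)?\.Lock/   { lock = 1 }
        /xlStorage\)?\.WalkDir/                           { walk = 1 }
        /runtime\.newosproc/                              { threads = 1 }
        END {
            if (meta)    print "Metadata I/O saturation from many objects"
            if (lock)    print "Lock contention under high concurrency"
            if (walk)    print "Listing operations scanning large prefixes"
            if (threads) print "OS thread exhaustion"
        }
    '
}

# Example: go tool pprof -top profile-NODE-cpu.pprof | classify_hotspots
```

A match is only a hint about where to look; the flat and cumulative percentages in the report still decide whether the function is a real hotspot.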

Drive cache metrics

When drive caching is enabled (MINIO_CACHE_ENABLE=on), monitor cache effectiveness with Prometheus:

Cache utilization:

sum(minio_system_drive_cache_used) / sum(minio_system_drive_cache_capacity)

Cache hit ratio:

sum(minio_system_drive_cache_hits) / (sum(minio_system_drive_cache_hits) + sum(minio_system_drive_cache_misses))
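To make the two expressions concrete, here is the same arithmetic as a small shell sketch. The function names are hypothetical and the numbers in the example are made up for illustration; they are not measurements from a real deployment.

```shell
# ratio PART TOTAL
# Prints PART / TOTAL with two decimals, or "n/a" when TOTAL is zero.
ratio() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        if (total == 0) { print "n/a"; exit }
        printf "%.2f\n", part / total
    }'
}

# cache_utilization USED CAPACITY: used / capacity
cache_utilization() {
    ratio "$1" "$2"
}

# cache_hit_ratio HITS MISSES: hits / (hits + misses), integer counters
cache_hit_ratio() {
    ratio "$1" "$(( $1 + $2 ))"
}

# Example with illustrative counters:
# cache_hit_ratio 900 100      prints 0.90
# cache_utilization 256 1024   prints 0.25
```

The zero check matters right after a restart or on an idle cache, when hits and misses are both zero and the ratio is undefined.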

Multipart upload troubleshooting

Multipart uploads can stall or accumulate if clients disconnect before completing the upload.

Trace multipart operations

Trace the full lifecycle of multipart uploads in real time:

mc admin trace --call s3 --funcname s3.NewMultipartUpload,s3.PutObjectPart,s3.CompleteMultipartUpload ALIAS

Find large parts being uploaded:

mc admin trace --call s3 --funcname s3.PutObjectPart --filter-request --filter-size 100MB ALIAS

List incomplete multipart uploads

mc ls ALIAS/BUCKET --incomplete --recursive

Clean up stale multipart uploads

Remove all incomplete multipart uploads older than a specified duration:

mc rm ALIAS/BUCKET --incomplete --recursive --older-than 7d --force

Automate multipart cleanup with lifecycle rules

Set a lifecycle rule to automatically abort incomplete multipart uploads after a number of days:

mc ilm rule add ALIAS/BUCKET --expire-days 7

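This prevents stale multipart uploads from accumulating and consuming drive space. Note that --expire-days sets an expiration on objects in the bucket as well, so review the resulting rule with mc ilm rule ls ALIAS/BUCKET before applying it to a bucket that holds long-lived data.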

SUBNET health diagnostics

SUBNET health diagnostics collect system information to help ensure the underlying infrastructure is configured correctly. Because the collection is resource-intensive, run it at initial deployment and when investigating failure scenarios.

mc support diagnostics ALIAS

The command collects:

  • Admin info
  • CPU information
  • Disk hardware details
  • OS information
  • Memory information
  • Process information
  • Configuration
  • Drive performance
  • Network performance

The output is saved as a compressed JSON file.

The diagnostics output file may contain sensitive information about your environment. Inspect the contents before sharing on any public forum.
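A quick keyword scan can help with that inspection. The sketch below is a coarse first pass, not a substitute for reading the file: scan_diagnostics is a hypothetical helper, the keyword list is an assumption about what counts as sensitive, and it assumes the output is gzip-compressed.

```shell
# scan_diagnostics FILE
# Hypothetical helper: prints short excerpts, prefixed with line numbers,
# around secret-like keywords in a gzip-compressed JSON file.
scan_diagnostics() {
    # -o with bounded context keeps the output readable even when the
    # JSON is written as one very long line. "|| true" keeps a scan with
    # no matches from being reported as a failure.
    gzip -dc "$1" \
        | grep -n -o -i -E '.{0,40}(password|secret|token|credential).{0,40}' \
        || true
}

# Example (the file name is a placeholder):
# scan_diagnostics diagnostics-output.json.gz
```

An empty result means only that none of the listed keywords appear; host names, IP addresses, and paths still need a manual look.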

Metadata inspection

MinIO AIStor stores object metadata in xl.meta files using a binary format. Use the xl-meta tool to decode these files.

Install xl-meta

Install the xl-meta tool using Go:

go install github.com/minio/minio/docs/debugging/xl-meta@latest

Use xl-meta

Decode an xl.meta file in the current directory:

xl-meta ./xl.meta

Decode multiple files recursively:

xl-meta ./**/xl.meta

View inline data stored in metadata:

xl-meta --data xl.meta

Export inline data to a file:

xl-meta --export xl.meta

Remote backend inspection

mc support inspect collects files from all backend drives based on a path pattern. Matching files are collected in a zip file with their respective host, drive, and path information.

Collect xl.meta from a specific object:

mc support inspect ALIAS/bucket/path/to/file.txt/xl.meta

Collect all xl.meta files recursively:

mc support inspect ALIAS/bucket/path/**/xl.meta

Collect part files for all versions of an object:

mc support inspect ALIAS/bucket/path/to/file.txt/*/part.*

The xl-meta tool accepts zip files as input and outputs all xl.meta files found within the archive:

xl-meta inspect.6f96b336.zip

Encrypted inspection

Use --encrypt to produce an encrypted output file:

mc support inspect --encrypt ALIAS/bucket/path/to/file.txt/xl.meta

This outputs an encrypted file and a one-time decryption key.

The decryption key is shown only once and cannot be recovered, so record it immediately. The encrypted file is safe to share as long as the decryption key is kept separate.

Decryption tool

Install the decryption tool:

go install github.com/minio/minio/docs/debugging/inspect@latest

Decrypt an inspection file:

inspect -key=<decryption-key> inspect.ad2b43d8.enc

If -key is not specified, an interactive prompt asks for it. The file name includes the beginning of the key, which helps confirm that the correct key is being used.
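When several encrypted files and keys are in circulation, that file name check can be scripted. The sketch below assumes the name follows the inspect.PREFIX.enc layout of the example above and that PREFIX is the leading part of the key; key_matches_file is a hypothetical helper, and the key in the example is made up.

```shell
# key_matches_file KEY FILE
# Hypothetical helper: succeeds when the identifier embedded in a name
# like inspect.ad2b43d8.enc is a prefix of KEY.
key_matches_file() {
    km_name=$(basename "$2")
    km_id=${km_name#inspect.}   # strip the leading "inspect."
    km_id=${km_id%.enc}         # strip the trailing ".enc"
    # An empty identifier would match every key, so reject it.
    [ -n "$km_id" ] || return 1
    case "$1" in
        "$km_id"*) return 0 ;;
        *)         return 1 ;;
    esac
}

# Example with a made-up key:
# key_matches_file ad2b43d8ffee00112233 inspect.ad2b43d8.enc && echo "key matches"
```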