CLI Reference

Complete reference for the memkv command-line interface — every subcommand, flag, default, and exit-code semantics.

The memkv binary bundles every operator command: drive initialization, server start, the docs site, the admin client, and offline drive maintenance.

memkv [GLOBAL OPTIONS] <COMMAND> [ARGS]

Commands:
  setup    Initialize drives and generate config (outputs YAML to stdout)
  start    Start the server
  doc      Serve the bundled MemKV documentation over HTTP
  admin    Query the admin HTTP endpoint of one or more running memkv servers
  drive    Offline drive maintenance (server stopped); see the `drive` section

The macOS build of memkv exposes only start. The setup, doc, admin, and drive commands are Linux-only because they depend on subsystems (JBOF, RDMA, hugepages) that are not compiled in on macOS.

Global

| Flag | Description |
| --- | --- |
| --version | Print the build version and exit. |
| --help | Print top-level help. Works after any subcommand too. |

memkv <command> --help prints the same help that ships with the binary, generated from the same source as this page.

setup

Format the listed NVMe drives in JBOF mode and emit a working config.yaml on stdout. Logs go to stderr so you can pipe stdout straight into a config file.

sudo memkv setup \
  --drives /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 \
  --rdma mlx5_0 \
  | sudo tee /etc/memkv/config.yaml

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --drives | PATH... | required | One or more NVMe block devices to format. All drives are owned exclusively by memkv after setup. |
| --shard-size | KB | 64 | Minimum shard size in kilobytes. Values ≥ 1024 set the per-block size to shard_size / 1024 MB; smaller values keep the default block size from StorageConfig. |
| --rdma | STRING | mlx5_0 | RDMA device name written into rdma.device of the generated YAML. |
| --force | flag | off | Reformat drives even if they contain existing memkv superblocks. Destroys all existing data on the listed devices. |
| --idempotent | flag | off | Exit 0 without rewriting if every drive is already initialised. Mixed state (some fresh, some initialised) still fails; combine with --force to reformat anyway. Designed for orchestrators (Helm, systemd) that re-run setup on every restart. |
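
The --shard-size rule is easy to misread, so here is a minimal sketch of the arithmetic as described in the table. The helper name block_size_mb is hypothetical and not part of memkv. Integer division is an assumption: the table does not say how values that are not multiples of 1024 are rounded.

```shell
# Hypothetical helper mirroring the documented --shard-size rule.
# Input: shard size in KB. Output: per-block size in MB, or "default"
# when the value is below 1024 and the StorageConfig default applies.
block_size_mb() {
  local shard_kb="$1"
  if [ "$shard_kb" -ge 1024 ]; then
    # Assumption: integer division; rounding is not documented.
    echo $(( shard_kb / 1024 ))
  else
    echo "default"
  fi
}
```

For example, block_size_mb 4096 prints 4, while block_size_mb 64 (the flag's default) prints default.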

Behavior

  • Logs each drive's model, optimal I/O size, and max_sectors_kb to stderr before formatting.
  • Picks the bounce-buffer pool size at 50% of available system memory (minimum 1 GB) and writes that into memory.max_size.
  • Without --force, fails fast on any drive that already contains memkv data, naming the first offender.
  • The generated YAML always sets storage.mode: direct. Other storage modes (file, memory) are reachable only by hand-editing the YAML — see Configuration.
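
To estimate what setup will write into memory.max_size before running it, the sizing rule above can be sketched as follows. Both helper names are hypothetical. Two assumptions are baked in: "1 GB" is taken to mean 2^30 bytes, and "available system memory" is taken to mean MemAvailable from /proc/meminfo. The value memkv actually computes may differ.

```shell
# Hypothetical helper: 50% of available memory, with a 1 GiB floor.
# Input and output are in bytes.
pool_size_bytes() {
  local available="$1"
  local floor=$(( 1024 * 1024 * 1024 ))   # assumption: 1 GB == 2^30 bytes
  local half=$(( available / 2 ))
  if [ "$half" -lt "$floor" ]; then
    echo "$floor"
  else
    echo "$half"
  fi
}

# Assumption: "available system memory" maps to MemAvailable (Linux only).
mem_available_bytes() {
  local kb
  kb=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
  echo $(( kb * 1024 ))
}
```

On a Linux host, pool_size_bytes "$(mem_available_bytes)" prints the estimate in bytes.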

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Drives formatted (or --idempotent matched and skipped) and YAML emitted on stdout. |
| 1 | Missing drive path, existing data without --force, mixed state with --idempotent, or formatting failure. Stderr carries the reason. |
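
When setup is piped directly into tee, the pipeline reports tee's exit status, and tee truncates the target file even if setup fails. A minimal sketch of a wrapper that acts on setup's own exit code is shown here. The function name ensure_config and the MEMKV override variable are hypothetical; the override exists only so the sketch can be exercised with a stub. The wrapper also keeps the existing file when stdout is empty, which guards against the case where an --idempotent skip emits nothing; whether that case occurs is not confirmed by this page.

```shell
# Hypothetical wrapper: ensure_config CONFIG_PATH DRIVE...
# Replaces CONFIG_PATH only when setup exits 0 and produced output.
ensure_config() {
  local config="$1"
  shift
  local tmp
  # Temp file next to the target so the final mv is a same-filesystem rename.
  tmp=$(mktemp "${config}.XXXXXX") || return 1
  if ! "${MEMKV:-memkv}" setup --idempotent --drives "$@" > "$tmp"; then
    rm -f "$tmp"            # exit 1: stderr carries the reason
    return 1
  fi
  if [ -s "$tmp" ]; then
    mv "$tmp" "$config"     # fresh YAML was emitted
  else
    rm -f "$tmp"            # nothing on stdout: keep the existing config
  fi
}
```

A typical call would be ensure_config /etc/memkv/config.yaml /dev/nvme0n1 /dev/nvme1n1, run as root since setup formats block devices.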

start

Run the memkv server: license verification, RDMA bring-up, storage recovery, data + admin listeners.

sudo memkv start --config /etc/memkv/config.yaml

| Flag | Type | Default | Description |
| --- | --- | --- | --- |
| --config | PATH | /etc/memkv/config.yaml | YAML config file. See Configuration for every key. |
| --license | PATH | unset | Optional license file. When set and non-empty, takes precedence over the standard license-lookup chain (MEMKV_LICENSE, AISTOR_LICENSE, MINIO_LICENSE, well-known paths). Accepted plans: Free, Enterprise, EnterpriseLite, EnterprisePlus. |
| --log-level | STRING | from config | Override logging.level. Accepts standard log-filter strings: trace, debug, info, warn, error, or fully-qualified per-target filters like memkv_server=debug,info. |
| --log-file | PATH | memkv.log | File to append logs to. Pass --log-file none to send logs to stderr instead. JSON or text format follows logging.format from the config. |
| --log-write-timing | SEC? | disabled | Enable per-request write-timing logs, rate-limited to the given interval in seconds. Bare --log-write-timing (no value) means 3600. Omit the flag entirely to disable. Useful for debugging tail latency without flooding the log when traffic is heavy. |

Behavior

  • Refuses to start on Linux kernels older than 6.8 (io_uring and RDMA features used by memkv are unstable on earlier kernels).
  • Validates that the requested bounce-buffer pool fits in either free hugepages or 80% of available system memory, and prints a remediation hint (sysctl -w vm.nr_hugepages=…) if it doesn't.
  • Brings up RDMA via the device named in rdma.device. If the device is missing or the QP setup fails, the server logs an error, sets memkv_rdma_active=0, and continues with an in-process memory pool — useful for development, never for production traffic.
  • Recovers existing blocks from the configured drives on startup; absence of prior data is logged and not an error.
  • Listens for SIGINT and SIGTERM and flushes storage metadata on shutdown.
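
The kernel check in the list above can be run ahead of time, for example in a provisioning script. This is a minimal sketch, not the check memkv itself performs. The helper name kernel_at_least is hypothetical, and it compares only the major and minor components of the release string.

```shell
# Hypothetical preflight: kernel_at_least RELEASE MAJOR MINOR
# Succeeds when RELEASE (as printed by uname -r) is >= MAJOR.MINOR.
kernel_at_least() {
  local release="$1" want_major="$2" want_minor="$3"
  local major minor
  major=${release%%.*}          # text before the first dot
  minor=${release#*.}           # text after the first dot
  minor=${minor%%[!0-9]*}       # keep only the leading digits
  # Numeric comparison, so 6.10 correctly counts as newer than 6.8.
  [ "$major" -gt "$want_major" ] ||
    { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

Usage: kernel_at_least "$(uname -r)" 6 8 || echo "kernel older than 6.8" >&2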

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Graceful shutdown after SIGINT / SIGTERM. |
| 1 | Config load error, license verification failure, kernel too old, insufficient memory/hugepages, storage open failure, admin TLS misconfiguration, or a runtime license error that fires the background license-watcher abort. |

doc

Serve the documentation site bundled into the binary. The site is the same Fumadocs build that ships at docs.min.io/memkv, packed into the binary as zstd-compressed assets at release time.

memkv doc

The server binds to an ephemeral loopback port; the resolved URL is logged on startup (http://127.0.0.1:<port>/). If the binary was built without docs/website/out/ on disk, this command serves a placeholder explaining how to rebuild.

admin

Query the admin HTTP endpoint of one or more running memkv servers. All routes are versioned under /v1/.

memkv admin [--servers <URLS>] <subcommand> [SUBCOMMAND ARGS]

Server resolution

Targets are resolved in this order, first match wins:

  1. --servers <URLS> (short -s) — comma-separated list. Whitespace and empty entries are stripped.
  2. $MEMKV_SERVERS environment variable, same comma-separated format.
  3. Fallback: http://127.0.0.1:9901.

Bare host:port entries are accepted; http:// is assumed. Use explicit https:// URLs against TLS-enabled servers (network.tls_cert + network.tls_key).

export MEMKV_SERVERS="coe02:9901,coe04:9901"
memkv admin status
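
For scripts that talk to the admin API directly and want the same target list the CLI would use, the resolution order can be approximated as follows. This is a sketch of the documented rules, not the CLI's actual code. The function name resolve_servers is hypothetical. One assumption is labeled in the code: a list that is set but contains only empty entries is treated as an error rather than triggering the fallback.

```shell
# Hypothetical helper: resolve_servers [SERVERS_FLAG_VALUE]
# Prints one URL per line, following the documented order:
# explicit value, then $MEMKV_SERVERS, then http://127.0.0.1:9901.
resolve_servers() {
  local raw="${1:-${MEMKV_SERVERS:-}}"
  if [ -z "$raw" ]; then
    echo "http://127.0.0.1:9901"
    return 0
  fi
  local entry count=0
  local IFS=','                       # split the list on commas only
  for entry in $raw; do
    entry=$(printf '%s' "$entry" | tr -d '[:space:]')
    [ -n "$entry" ] || continue       # drop empty entries
    case "$entry" in
      http://*|https://*) ;;          # explicit scheme: keep as-is
      *) entry="http://$entry" ;;     # bare host:port: assume http://
    esac
    echo "$entry"
    count=$(( count + 1 ))
  done
  # Assumption: a list with no usable entries is an error, not a fallback.
  [ "$count" -gt 0 ]
}
```

With MEMKV_SERVERS="coe02:9901,coe04:9901" set, resolve_servers prints http://coe02:9901 and http://coe04:9901 on separate lines.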

Subcommands

| Subcommand | HTTP | Notes |
| --- | --- | --- |
| health | GET /v1/health | Liveness probe. Renders a per-server table with status text. |
| ready | GET /v1/ready | Readiness probe. Servers reporting not ready are flagged in the output and counted as errors for the exit code. |
| status [--json] [--verbose] | GET /v1/status | Detailed status. See flags below. |
| metrics | GET /v1/metrics | Prometheus exposition. Streams the response body verbatim; multi-server output is separated by labelled rules. |
| drives failed [--json] | GET /v1/drives/failed | Failed-drive table per server. With --json, prints per-server JSON arrays (separator rule when multiple servers). |
| drives reset <device_id> [--secure] | POST /v1/drives/{device_id}/reset | BLKDISCARDs the drive and re-initialises superblock, journal, and B+ tree. Requires exactly one resolved server; pass --servers <one> if more are configured. Pass --secure to additionally zero-fill the data region. |

admin status flags

| Flag | Description |
| --- | --- |
| --json | Emit the raw status JSON instead of a formatted summary. Multi-server runs print one JSON document per server, separated by a labelled rule. |
| --verbose | Per-server breakdown: uptime, memory pool with usage bar, per-device storage table, RDMA connection count, block totals. |

admin drives failed flags

| Flag | Description |
| --- | --- |
| --json | Emit the raw failed-drives JSON instead of the formatted table. |

admin drives reset flags

| Flag | Description |
| --- | --- |
| --secure | Zero-fill the data region after BLKDISCARD. Slower but guarantees residual data is overwritten regardless of the drive's discard semantics. Sent as ?secure=true on the wire. |

Admin HTTP API

memkv admin wraps the same HTTP endpoints exposed by every server at 0.0.0.0:9901 (port + 1 from the data plane). All routes are versioned under /v1/. HTTPS is served when network.tls_cert and network.tls_key are configured; otherwise plain HTTP. Use the curl examples below from any tool that doesn't bundle the memkv binary (Helm hooks, external monitors).

| Endpoint | Method | Description |
| --- | --- | --- |
| /v1/health | GET | Liveness probe. |
| /v1/ready | GET | Readiness probe. |
| /v1/status | GET | Detailed status (uptime, memory, drives, RDMA). |
| /v1/metrics | GET | Prometheus exposition. |
| /v1/drives/failed | GET | List failed drives with failure counts and state. |
| /v1/drives/{device_id}/reset | POST | Wipe and re-initialize a failed drive. Accepts ?secure=true for an additional data-region zero-fill. |

curl http://localhost:9901/v1/drives/failed
[
  {
    "device_id": 2,
    "failure_count": 3,
    "first_failed_at": 1738800000,
    "last_failed_at": 1738810000,
    "blacklisted": false
  }
]

Reset BLKDISCARDs the drive and re-initializes the superblock, journal, and B+ tree. The drive rejoins the active pool after reset. By default the data region is left to BLKDISCARD; pass ?secure=true to additionally zero-fill it, at the cost of write time proportional to drive capacity.

curl -X POST http://localhost:9901/v1/drives/2/reset
curl -X POST 'http://localhost:9901/v1/drives/2/reset?secure=true'

Drives that exceed flap_threshold failures are blacklisted and cannot be reset via the running-server API above. Set storage.flap_threshold in config.yaml to tune the cutoff. The only path to reactivate a blacklisted drive is the offline memkv drive reset <path> command described in the next section, which opens the device directly with the server stopped. If the drive is physically failing, replace it rather than resetting it.

Exit codes

| Code | Meaning |
| --- | --- |
| 0 | Every targeted server responded successfully. |
| 1 | At least one server returned a transport or HTTP error, ready returned false, drives reset was issued against anything other than exactly one resolved server, or no servers were configured. |

The CLI continues querying remaining servers after an error — it surfaces all failures in one run and only flips the exit code at the end.
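
Because memkv admin ready exits 1 until every targeted server reports ready, it can gate a deployment step. The following is a minimal polling sketch under stated assumptions: the function name wait_ready and the MEMKV override variable are hypothetical, and the loop cannot tell "not ready yet" from a transport error, since both map to exit code 1.

```shell
# Hypothetical helper: wait_ready TIMEOUT_SEC [INTERVAL_SEC]
# Polls `memkv admin ready` until it exits 0 or the timeout elapses.
wait_ready() {
  local timeout="$1" interval="${2:-2}" waited=0
  while ! "${MEMKV:-memkv}" admin ready >/dev/null 2>&1; do
    waited=$(( waited + interval ))
    if [ "$waited" -ge "$timeout" ]; then
      return 1                 # still failing when the budget ran out
    fi
    sleep "$interval"
  done
}
```

Usage: wait_ready 60 2 || { echo "servers not ready" >&2; exit 1; }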

drive

Offline drive maintenance — operates on a drive directly, with the server stopped. For online resets against a running server, use memkv admin drives reset <device_id> instead.

memkv drive reset <PATH> [--secure]

Reads the drive's existing superblock to recover its device_id and the JBOF's num_devices, then wipes the superblock, journal, B+ tree, and bitmap regions before re-initialising them. Operators do not need to track the device index — the drive carries it. If the primary superblock is unreadable, the mirror is consulted before failing.

| Flag | Description |
| --- | --- |
| --secure | Zero-fill the data region after BLKDISCARD. Slower but guarantees residual data is overwritten regardless of the drive's discard semantics. Default: BLKDISCARD only. |

sudo memkv drive reset /dev/nvme2n1
sudo memkv drive reset /dev/nvme2n1 --secure

The server must be stopped before running memkv drive reset — the command opens the drive with O_DIRECT and a concurrent server process holding it open can lead to inconsistent state. Drives blacklisted by the running server are reset back to active on next start.

drive verify

Read-only integrity check of a single drive. Never writes — safe to run any time the server is stopped.

memkv drive verify <PATH> [--quick] [--json]

It validates, in order:

  • the superblock and its end-of-device mirror (CRC + cross-check);
  • every B+ tree page against its stored CRC, sweeping the region directly rather than following tree pointers, so a corrupt page cannot hide the entries beyond it;
  • the journal, replayed on top of the tree when the drive was not cleanly shut down (exactly as the server does on an unclean restart), so a post-crash drive is not reported as corrupt;
  • the allocation bitmap against the slots the index actually references — leaked slots (wasted space) and missing bits (an allocator hazard);
  • unless --quick, every stored block's CRC32, read back from the data region. This is the only bit-rot detector, since the server's verify_on_read is off by default.

| Flag | Description |
| --- | --- |
| --quick, -q | Skip the full data-region CRC scan; run metadata checks only. |
| --json | Emit the structured report as JSON instead of a formatted summary. |

sudo memkv drive verify /dev/nvme2n1
sudo memkv drive verify /dev/nvme2n1 --quick --json

The exit code is 0 when the drive is clean and 1 when any uncorrected error remains; in that case the message points you at drive fsck --repair.

drive fsck

Runs the same checks as verify. Without --repair it is a dry run that only reports what it would fix (like xfs_repair -n). With --repair it rewrites the metadata to restore consistency.

memkv drive fsck <PATH> [--repair] [--quick] [--json]

Repair rebuilds the B+ tree from the surviving entries (so corrupt pages can never be read again), reconciles the bitmap with that rebuilt index, resets the journal, and marks the drive cleanly shut down.

| Flag | Description |
| --- | --- |
| --repair | Apply repairs in place. Without it, the command only reports. |
| --quick, -q | Skip the full data-region CRC scan; metadata checks and repair only. |
| --json | Emit the structured report as JSON instead of a formatted summary. |

sudo memkv drive fsck /dev/nvme2n1            # dry run — report only
sudo memkv drive fsck /dev/nvme2n1 --repair   # apply repairs

MemKV stores no cross-drive redundancy, so a block whose bytes are physically corrupt cannot be reconstructed — the data exists nowhere else. Repair makes the drive consistent and safe to serve, not durable: a corrupt, duplicate, or dangling block is quarantined (dropped from the index) so it becomes a clean cache miss the inference layer recomputes. Repair never serves or preserves data that fails its checksum.
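
The verify and fsck commands combine naturally into a maintenance routine: verify first, preview with a dry run, and repair only on explicit opt-in. The sketch below is one way to arrange that, run with the server stopped. The function name check_drive and the MEMKV and REPAIR variables are hypothetical. Since this page documents the exit code of verify but not of fsck, the sketch ignores the dry run's status and confirms a repair by re-running verify.

```shell
# Hypothetical routine: check_drive PATH
# Set REPAIR=yes to allow in-place repair; the default only reports.
check_drive() {
  local dev="$1" memkv="${MEMKV:-memkv}"
  if "$memkv" drive verify "$dev"; then
    return 0                            # exit 0: drive is clean
  fi
  # verify exited non-zero: preview what a repair would change (dry run).
  "$memkv" drive fsck "$dev" || true
  if [ "${REPAIR:-no}" != "yes" ]; then
    return 1                            # leave the decision to the operator
  fi
  "$memkv" drive fsck "$dev" --repair
  # fsck's exit code is not documented here, so confirm with verify.
  "$memkv" drive verify "$dev"
}
```

Usage: REPAIR=yes check_drive /dev/nvme2n1, run as root. Repair quarantines blocks that fail their checksum, so previewing the dry-run report first is the safer default.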

drive health

Offline health snapshot of a drive — model, capacity, fill, clean-shutdown state, and the on-superblock failure registry. Reads only metadata; it does not scan the data region.

memkv drive health <PATH> [--json]
sudo memkv drive health /dev/nvme2n1
sudo memkv drive health /dev/nvme2n1 --json

verify, fsck, and health all open the drive with O_DIRECT, so the server must be stopped first — same constraint as drive reset. For a live-server view of drive health, use memkv admin status and memkv admin drives failed.

Environment variables

| Variable | Read by | Effect |
| --- | --- | --- |
| MEMKV_SERVERS | admin, client | Comma-separated list of admin/data endpoints. For admin it falls back to http://127.0.0.1:9901 if unset. For the client (NIXL plugin and LD_PRELOAD shim) it overrides servers: from MEMKV_CONFIG. |
| MEMKV_CONFIG | client | Path to a YAML file consumed by the NIXL plugin and LD_PRELOAD shim. Provides every MEMKV_* value (servers, rdma_devices, bind_addresses, cache_size_mb, connect_timeout_ms, gid_index, num_connections, license) in one place. Resolution order is defaults → MEMKV_CONFIG file → MEMKV_* env vars. See deploy/examples/client-config.yaml. |
| MEMKV_LICENSE | start, client | memkv-specific license lookup variable. Sits in the verify_license() chain on both the server and the memkv-client library used by the NIXL plugin and LD_PRELOAD shim. Value is either the JWT inline or a path to a file containing the JWT. The license: field in MEMKV_CONFIG (if set) wins over this. Accepted plans: Free, Enterprise, EnterpriseLite, EnterprisePlus. Free tier is limited to a single remote server in MEMKV_SERVERS: the engine refuses to start with multiple servers under a Free license. |
| AISTOR_LICENSE | start, client | Consulted after MEMKV_LICENSE in the unified license chain. Same value format (JWT inline or a path to a file). Used by both server and client. |
| MINIO_LICENSE | start, client | Standard MinIO license-lookup variable, consulted after AISTOR_LICENSE. Same value format. Used by both server and client. |
| MEMKV_AUTH_KEY | start, client, memkv-bench | Required. HMAC-SHA256 shared key as 64 hex chars (32 bytes). Used to sign and verify every wire message between client and server. Overrides network.auth_key: (server) / auth_key: (client MEMKV_CONFIG). The same value must be configured on both sides. Generate with openssl rand -hex 32. |
| MEMKV_TRANSPORT | client | Force-select the client transport: auto (default; RDMA preferred, TCP fallback on failure), rdma (strict: the client errors at startup if no RDMA NIC is present, and in-flight RDMA failures propagate), tcp (skip RDMA entirely; the configured server address is the TCP target). See Transport selection. |
| RUST_LOG | all | Standard tracing filter. setup and doc initialise tracing with info and ignore this; start honours --log-level first, then config, then this. |
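
Since MEMKV_AUTH_KEY is required and must be exactly 64 hex characters, a quick shape check before starting the server or client can catch copy-paste damage early. This is a minimal sketch. The function name is_valid_auth_key is hypothetical, and accepting uppercase hex digits is an assumption (openssl rand -hex emits lowercase).

```shell
# Hypothetical check: is_valid_auth_key KEY
# Succeeds when KEY is exactly 64 hexadecimal characters.
is_valid_auth_key() {
  case "$1" in
    *[!0-9a-fA-F]*) return 1 ;;   # contains a non-hex character
  esac
  [ "${#1}" -eq 64 ]              # 32 bytes encoded as hex
}
```

Usage: export MEMKV_AUTH_KEY="$(openssl rand -hex 32)" followed by is_valid_auth_key "$MEMKV_AUTH_KEY" || echo "bad key" >&2. The check covers format only; it cannot tell whether both sides hold the same value.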

Default ports

| Port | Process | Source |
| --- | --- | --- |
| 9900 | Data plane | network.address in config |
| 9901 | Admin HTTP/HTTPS | network.address port + 1 (auto, no separate knob) |

The data port carries the entire TCP wire protocol — control, RDMA bootstrap, and inline-bulk PUT/GET. Admin always lives at address + 1.
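
Because the admin port is always derived from the data address, tooling that knows only network.address can compute the admin endpoint itself. A minimal sketch, with the hypothetical helper name admin_addr:

```shell
# Hypothetical helper: admin_addr HOST:PORT
# Prints HOST:(PORT+1), following the "address + 1" rule.
admin_addr() {
  local addr="$1"
  local host="${addr%:*}"       # everything before the last colon
  local port="${addr##*:}"      # everything after the last colon
  printf '%s:%s\n' "$host" "$(( port + 1 ))"
}
```

For example, admin_addr 0.0.0.0:9900 prints 0.0.0.0:9901. Splitting on the last colon also keeps bracketed IPv6 literals such as [::1]:9900 intact.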

On macOS the binary does not expose RDMA — RDMA / JBOF / hugepages aren't compiled in there — but the TCP listener still binds on address, so the client targets the same address it would on Linux.