CLI Reference
Complete reference for the memkv command-line interface — every subcommand, flag, default, and exit-code semantics.
The memkv binary bundles every operator command: drive init, server start, docs site, admin client, offline drive maintenance. This page covers every subcommand and flag.
memkv [GLOBAL OPTIONS] <COMMAND> [ARGS]
Commands:
setup Initialize drives and generate config (outputs YAML to stdout)
start Start the server
doc Serve the bundled MemKV documentation over HTTP
admin Query the admin HTTP endpoint of one or more running memkv servers
drive Offline drive maintenance (server stopped); see the `drive` section

The macOS build of memkv only exposes `start`. `setup`, `doc`, `admin`, and
`drive` are Linux-only: they depend on subsystems (JBOF, RDMA, hugepages)
that are not compiled in on macOS.
Global
| Flag | Description |
|---|---|
| `--version` | Print the build version and exit. |
| `--help` | Print top-level help. Works after any subcommand too. |
memkv <command> --help prints the same help that ships with the binary, generated from the same source as this page.
setup
Format the listed NVMe drives in JBOF mode and emit a working config.yaml on stdout. Logs go to stderr so you can pipe stdout straight into a config file.
sudo memkv setup \
--drives /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 \
--rdma mlx5_0 \
| sudo tee /etc/memkv/config.yaml

| Flag | Type | Default | Description |
|---|---|---|---|
| `--drives` | PATH... | required | One or more NVMe block devices to format. All drives are owned exclusively by memkv after setup. |
| `--shard-size` | KB | 64 | Minimum shard size in kilobytes. Values ≥ 1024 set the per-block size to shard_size / 1024 MB; smaller values keep the default block size from StorageConfig. |
| `--rdma` | STRING | mlx5_0 | RDMA device name written into `rdma.device` of the generated YAML. |
| `--force` | flag | off | Reformat drives even if they contain existing memkv superblocks. Destroys all existing data on the listed devices. |
| `--idempotent` | flag | off | Exit 0 without rewriting if every drive is already initialised. Mixed state (some fresh, some initialised) still fails; combine with `--force` to reformat anyway. Designed for orchestrators (Helm, systemd) that re-run setup on every restart. |
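The `--shard-size` rule can be read as a small function. The sketch below is illustrative only and is not memkv's implementation: `default_block_mb` is a hypothetical stand-in for the default block size from StorageConfig (its actual value is not stated on this page), and integer division is an assumption.

```python
def block_size_mb(shard_size_kb: int, default_block_mb: int) -> int:
    """Return the per-block size in MB implied by --shard-size (given in KB)."""
    if shard_size_kb <= 0:
        raise ValueError("shard size must be positive")
    if shard_size_kb >= 1024:
        # Values >= 1024 KB map to shard_size / 1024 MB.
        return shard_size_kb // 1024
    # Smaller values keep the default block size (hypothetical parameter).
    return default_block_mb
```

With the default `--shard-size 64`, the function simply returns whatever default is passed in.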
Behavior
- Logs each drive's model, optimal I/O size, and `max_sectors_kb` to stderr before formatting.
- Picks the bounce-buffer pool size at 50% of available system memory (minimum 1 GB) and writes that into `memory.max_size`.
- Without `--force`, fails fast on any drive that already contains memkv data, naming the first offender.
- The generated YAML always sets `storage.mode: direct`. Other storage modes (`file`, `memory`) are reachable only by hand-editing the YAML; see Configuration.
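The pool-sizing rule above (half of available memory, with a 1 GB floor) amounts to a single `max`. A minimal sketch, assuming "1 GB" means 2**30 bytes and that sizes are tracked in bytes; neither detail is confirmed by this page.

```python
GIB = 1024 ** 3  # assumption: the documented "1 GB" minimum is 2**30 bytes

def bounce_pool_bytes(available_bytes: int) -> int:
    """Pool size that setup would write into memory.max_size, per the rule above."""
    # 50% of available system memory, but never below the 1 GB minimum.
    return max(available_bytes // 2, GIB)
```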
Exit codes
| Code | Meaning |
|---|---|
| 0 | Drives formatted (or `--idempotent` matched and skipped) and YAML emitted on stdout. |
| 1 | Missing drive path, existing data without `--force`, mixed state with `--idempotent`, or formatting failure. Stderr carries the reason. |
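The interaction between `--force` and `--idempotent` can be summarised as a decision function. This is a reading of the flag descriptions and the exit-code table above, not memkv's code. One case is not specified on this page and is an assumption here: when both flags are given and every drive is already initialised, the sketch skips rather than reformats. Formatting failures (also exit 1) are not modelled.

```python
def setup_outcome(initialised, force=False, idempotent=False):
    """Return (action, exit_code) for a list of per-drive 'already initialised' flags."""
    if not initialised:
        return ("error", 1)   # no drive paths given
    if idempotent and all(initialised):
        return ("skip", 0)    # every drive already initialised: nothing to rewrite
    if any(initialised) and not force:
        return ("error", 1)   # existing data (or mixed state) without --force
    return ("format", 0)      # all fresh, or --force given
```

An orchestrator that re-runs setup on every restart would normally hit the `("skip", 0)` branch after the first successful run.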
start
Run the memkv server: license verification, RDMA bring-up, storage recovery, data + admin listeners.
sudo memkv start --config /etc/memkv/config.yaml

| Flag | Type | Default | Description |
|---|---|---|---|
| `--config` | PATH | /etc/memkv/config.yaml | YAML config file. See Configuration for every key. |
| `--license` | PATH | unset | Optional license file. When set and non-empty, takes precedence over the standard license-lookup chain (MEMKV_LICENSE, AISTOR_LICENSE, MINIO_LICENSE, well-known paths). Accepted plans: Free, Enterprise, EnterpriseLite, EnterprisePlus. |
| `--log-level` | STRING | from config | Override `logging.level`. Accepts standard log-filter strings: trace, debug, info, warn, error, or fully-qualified per-target filters like `memkv_server=debug,info`. |
| `--log-file` | PATH | memkv.log | File to append logs to. Pass `--log-file none` to send logs to stderr instead. JSON or text format follows `logging.format` from the config. |
| `--log-write-timing` | SEC? | disabled | Enable per-request write-timing logs, rate-limited to the given interval in seconds. Bare `--log-write-timing` (no value) means 3600. Omit the flag entirely to disable. Useful for debugging tail latency without flooding the log when traffic is heavy. |
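The three states of `--log-write-timing` (omitted, bare, or with a value) can be mirrored with Python's `argparse`. This only illustrates the documented semantics; memkv's own argument parser is not shown on this page, and the `memkv-start-sketch` program name is made up.

```python
import argparse

parser = argparse.ArgumentParser(prog="memkv-start-sketch")  # hypothetical name
parser.add_argument(
    "--log-write-timing",
    nargs="?",       # the value after the flag is optional
    type=int,
    const=3600,      # bare flag: rate-limit interval of 3600 seconds
    default=None,    # flag omitted: write-timing logs disabled
    metavar="SEC",
)

def timing_interval(argv):
    """Return the interval in seconds, or None when the feature is disabled."""
    return parser.parse_args(argv).log_write_timing
```

For example, `timing_interval(["--log-write-timing", "30"])` yields an interval of 30 seconds, while an empty argument list leaves the feature off.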
Behavior
- Refuses to start on Linux kernels older than `6.8` (`io_uring` and RDMA features used by memkv are unstable on earlier kernels).
- Validates that the requested bounce-buffer pool fits in either free hugepages or 80% of available system memory, and prints a remediation hint (`sysctl -w vm.nr_hugepages=…`) if it doesn't.
- Brings up RDMA via the device named in `rdma.device`. If the device is missing or the QP setup fails, the server logs an error, sets `memkv_rdma_active=0`, and continues with an in-process memory pool, which is useful for development but never for production traffic.
- Recovers existing blocks from the configured drives on startup; absence of prior data is logged and not an error.
- Listens for `SIGINT` and `SIGTERM` and flushes storage metadata on shutdown.
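The first two startup checks above can be reproduced in a pre-flight script. The sketch below is an approximation under stated assumptions: the kernel release is taken to look like `uname -r` output, all sizes are in bytes, and the function names are hypothetical rather than part of memkv.

```python
import re

MIN_KERNEL = (6, 8)  # minimum Linux kernel documented above

def kernel_supported(release: str) -> bool:
    """Compare the major.minor prefix of a release string (e.g. from `uname -r`)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= MIN_KERNEL

def pool_fits(pool_bytes: int, free_hugepage_bytes: int, available_bytes: int) -> bool:
    """True when the pool fits in free hugepages or in 80% of available memory."""
    return (pool_bytes <= free_hugepage_bytes
            or pool_bytes <= available_bytes * 8 // 10)
```

Comparing version tuples rather than strings matters here: as text, "6.10" would sort before "6.8".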
Exit codes
| Code | Meaning |
|---|---|
| 0 | Graceful shutdown after SIGINT / SIGTERM. |
| 1 | Config load error, license verification failure, kernel too old, insufficient memory/hugepages, storage open failure, admin TLS misconfiguration, or a runtime license error that fires the background license-watcher abort. |
doc
Serve the documentation site bundled into the binary. The site is the same Fumadocs build that ships at docs.min.io/memkv, packed into the binary as zstd-compressed assets at release time.
memkv doc

The server binds to an ephemeral loopback port; the resolved URL is logged on startup (`http://127.0.0.1:<port>/`). If the binary was built without `docs/website/out/` on disk, this command serves a placeholder explaining how to rebuild.
admin
Query the admin HTTP endpoint of one or more running memkv servers. All routes are versioned under /v1/.
memkv admin [--servers <URLS>] <subcommand> [SUBCOMMAND ARGS]

Server resolution
Targets are resolved in this order, first match wins:
- `--servers <URLS>` (short `-s`): comma-separated list. Whitespace and empty entries are stripped.
- `$MEMKV_SERVERS` environment variable, same comma-separated format.
- Fallback: `http://127.0.0.1:9901`.
Bare host:port entries are accepted; http:// is assumed. Use explicit https:// URLs against TLS-enabled servers (network.tls_cert + network.tls_key).
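The resolution order and normalisation rules above can be expressed as a short function, which is handy when a monitoring script needs to target the same servers as the CLI. This is a sketch, not the CLI's code. It assumes that a source which is empty after stripping counts as "no match" and falls through to the next one, which this page does not spell out.

```python
import os

DEFAULT_SERVER = "http://127.0.0.1:9901"

def _parse(raw):
    """Split a comma-separated list, dropping whitespace and empty entries."""
    servers = []
    for entry in raw.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if "://" not in entry:
            entry = "http://" + entry  # bare host:port: http:// is assumed
        servers.append(entry)
    return servers

def resolve_servers(flag_value=None, environ=None):
    """Resolve targets: --servers value, then $MEMKV_SERVERS, then the fallback."""
    environ = os.environ if environ is None else environ
    for raw in (flag_value, environ.get("MEMKV_SERVERS")):
        if raw:
            servers = _parse(raw)
            if servers:  # assumption: an all-empty list falls through
                return servers
    return [DEFAULT_SERVER]
```

Passing `environ` explicitly keeps the function easy to test without touching the real environment.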
export MEMKV_SERVERS="coe02:9901,coe04:9901"
memkv admin status

Subcommands

| Subcommand | HTTP | Notes |
|---|---|---|
| `health` | `GET /v1/health` | Liveness probe. Renders a per-server table with status text. |
| `ready` | `GET /v1/ready` | Readiness probe. Servers reporting not ready are flagged in the output and counted as errors for the exit code. |
| `status [--json] [--verbose]` | `GET /v1/status` | Detailed status. See flags below. |
| `metrics` | `GET /v1/metrics` | Prometheus exposition. Streams the response body verbatim; multi-server output is separated by labelled rules. |
| `drives failed [--json]` | `GET /v1/drives/failed` | Failed-drive table per server. With `--json`, prints per-server JSON arrays (separator rule when multiple servers). |
| `drives reset <device_id> [--secure]` | `POST /v1/drives/{device_id}/reset` | BLKDISCARDs the drive and re-initialises superblock, journal, and B+ tree. Requires exactly one resolved server; pass `--servers <one>` if more are configured. Pass `--secure` to additionally zero-fill the data region. |
admin status flags
| Flag | Description |
|---|---|
| `--json` | Emit the raw status JSON instead of a formatted summary. Multi-server runs print one JSON document per server, separated by a labelled rule. |
| `--verbose` | Per-server breakdown: uptime, memory pool with usage bar, per-device storage table, RDMA connection count, block totals. |
admin drives failed flags
| Flag | Description |
|---|---|
| `--json` | Emit the raw failed-drives JSON instead of the formatted table. |
admin drives reset flags
| Flag | Description |
|---|---|
| `--secure` | Zero-fill the data region after BLKDISCARD. Slower but guarantees residual data is overwritten regardless of the drive's discard semantics. Sent as `?secure=true` on the wire. |
Admin HTTP API
memkv admin wraps the same HTTP endpoints exposed by every server at 0.0.0.0:9901 (port + 1 from the data plane). All routes are versioned under /v1/. HTTPS is served when network.tls_cert and network.tls_key are configured; otherwise plain HTTP. Use the curl examples below from any tool that doesn't bundle the memkv binary (Helm hooks, external monitors).
| Endpoint | Method | Description |
|---|---|---|
| `/v1/health` | GET | Liveness probe. |
| `/v1/ready` | GET | Readiness probe. |
| `/v1/status` | GET | Detailed status (uptime, memory, drives, RDMA). |
| `/v1/metrics` | GET | Prometheus exposition. |
| `/v1/drives/failed` | GET | List failed drives with failure counts and state. |
| `/v1/drives/{device_id}/reset` | POST | Wipe and re-initialize a failed drive. Accepts `?secure=true` for an additional data-region zero-fill. |
curl http://localhost:9901/v1/drives/failed

[
  {
    "device_id": 2,
    "failure_count": 3,
    "first_failed_at": 1738800000,
    "last_failed_at": 1738810000,
    "blacklisted": false
  }
]

Reset BLKDISCARDs the drive and re-initializes the superblock, journal, and B+ tree. The drive rejoins the active pool after reset. By default the data region is left to BLKDISCARD; pass `?secure=true` to additionally zero-fill it, at the cost of write time proportional to drive capacity.
curl -X POST http://localhost:9901/v1/drives/2/reset
curl -X POST 'http://localhost:9901/v1/drives/2/reset?secure=true'

Drives that exceed `flap_threshold` failures are blacklisted and cannot be
reset via the running-server API above. Set `storage.flap_threshold` in
config.yaml to tune the cutoff. The only path to reactivate a blacklisted
drive is the offline `memkv drive reset <path>` command described in the next
section, which opens the device directly with the server stopped. If the
drive is physically failing, replace it rather than resetting it.
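An external monitor that consumes `/v1/drives/failed` has to treat blacklisted and non-blacklisted drives differently, as described above. The following sketch takes the response body as a string and returns the online reset URLs plus the device IDs that need the offline path. The function name and return shape are hypothetical; only the JSON fields and the endpoint path come from this page.

```python
import json
from urllib.parse import urlencode

def reset_requests(base_url, failed_json, secure=False):
    """Split failed drives into online reset URLs and offline-only device IDs."""
    query = "?" + urlencode({"secure": "true"}) if secure else ""
    online, offline = [], []
    for drive in json.loads(failed_json):
        if drive["blacklisted"]:
            # Blacklisted drives cannot be reset through the running server.
            offline.append(drive["device_id"])
        else:
            online.append(
                f"{base_url.rstrip('/')}/v1/drives/{drive['device_id']}/reset{query}"
            )
    return online, offline
```

The sketch only builds URLs; issuing the POST requests, and deciding whether a drive should be replaced instead of reset, is left to the caller.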
Exit codes
| Code | Meaning |
|---|---|
| 0 | Every targeted server responded successfully. |
| 1 | At least one server returned a transport or HTTP error, `ready` returned false, `drives reset` was issued when the target list did not resolve to exactly one server, or no servers were configured. |
The CLI continues querying remaining servers after an error — it surfaces all failures in one run and only flips the exit code at the end.
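The "query everything, then decide" behavior can be sketched as a loop that records failures instead of returning early. Here `probe` is a hypothetical callable standing in for one HTTP request; it returns a truthy value on success and may raise `OSError` for transport errors. This illustrates the pattern and is not the CLI's implementation.

```python
def run_admin(servers, probe):
    """Query every server and return (exit_code, failed_servers)."""
    if not servers:
        return 1, []  # no servers configured
    failures = []
    for server in servers:
        try:
            ok = probe(server)
        except OSError:
            ok = False  # transport error: record it and keep going
        if not ok:
            failures.append(server)
    # The exit code flips only after every server has been queried.
    return (1 if failures else 0), failures
```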
drive
Offline drive maintenance — operates on a drive directly, with the server stopped. For online resets against a running server, use memkv admin drives reset <device_id> instead.
memkv drive reset <PATH> [--secure]

Reads the drive's existing superblock to recover its `device_id` and the JBOF's `num_devices`, then wipes the superblock, journal, B+ tree, and bitmap regions before re-initialising them. Operators do not need to track the device index because the drive carries it. If the primary superblock is unreadable, the mirror is consulted before failing.
| Flag | Description |
|---|---|
| `--secure` | Zero-fill the data region after BLKDISCARD. Slower but guarantees residual data is overwritten regardless of the drive's discard semantics. Default: BLKDISCARD only. |
sudo memkv drive reset /dev/nvme2n1
sudo memkv drive reset /dev/nvme2n1 --secure

The server must be stopped before running `memkv drive reset`: the command
opens the drive with O_DIRECT, and a concurrent server process holding it
open can lead to inconsistent state. Drives blacklisted by the running server
are reset back to active on next start.
drive verify
Read-only integrity check of a single drive. Never writes — safe to run any time the server is stopped.
memkv drive verify <PATH> [--quick] [--json]

It validates, in order:
- the superblock and its end-of-device mirror (CRC + cross-check);
- every B+ tree page against its stored CRC, sweeping the region directly rather than following tree pointers, so a corrupt page cannot hide the entries beyond it;
- the journal, replayed on top of the tree when the drive was not cleanly shut down (exactly as the server does on an unclean restart), so a post-crash drive is not reported as corrupt;
- the allocation bitmap against the slots the index actually references — leaked slots (wasted space) and missing bits (an allocator hazard);
- unless `--quick`, every stored block's CRC32, read back from the data region. This is the only bit-rot detector, since the server's `verify_on_read` is off by default.
| Flag | Description |
|---|---|
| `--quick`, `-q` | Skip the full data-region CRC scan; run metadata checks only. |
| `--json` | Emit the structured report as JSON instead of a formatted summary. |
sudo memkv drive verify /dev/nvme2n1
sudo memkv drive verify /dev/nvme2n1 --quick --json

Exit code is 0 when the drive is clean and 1 when any uncorrected error
remains (the message points you at `drive fsck --repair`).
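The bitmap check in the list above boils down to two set differences. The sketch below models slots as plain integers, which is an assumption made for illustration; the on-disk bitmap and index formats are not described on this page.

```python
def bitmap_report(bitmap_slots, index_slots):
    """Compare slots marked allocated in the bitmap with slots the index references."""
    allocated = set(bitmap_slots)
    referenced = set(index_slots)
    return {
        # Marked allocated but referenced by nothing: wasted space.
        "leaked": sorted(allocated - referenced),
        # Referenced but not marked allocated: an allocator hazard, since the
        # slot could be handed out again while still in use.
        "missing": sorted(referenced - allocated),
    }
```

A clean drive would produce two empty lists.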
drive fsck
Runs the same checks as verify. Without --repair it is a dry run that
only reports what it would fix (like xfs_repair -n). With --repair it
rewrites the metadata to restore consistency.
memkv drive fsck <PATH> [--repair] [--quick] [--json]

Repair rebuilds the B+ tree from the surviving entries (so corrupt pages can never be read again), reconciles the bitmap with that rebuilt index, resets the journal, and marks the drive cleanly shut down.
| Flag | Description |
|---|---|
| `--repair` | Apply repairs in place. Without it, the command only reports. |
| `--quick`, `-q` | Skip the full data-region CRC scan; metadata checks and repair only. |
| `--json` | Emit the structured report as JSON instead of a formatted summary. |
sudo memkv drive fsck /dev/nvme2n1 # dry run — report only
sudo memkv drive fsck /dev/nvme2n1 --repair # apply repairs

MemKV stores no cross-drive redundancy, so a block whose bytes are physically corrupt cannot be reconstructed; the data exists nowhere else. Repair makes the drive consistent and safe to serve, not durable: a corrupt, duplicate, or dangling block is quarantined (dropped from the index) so it becomes a clean cache miss the inference layer recomputes. Repair never serves or preserves data that fails its checksum.
drive health
Offline health snapshot of a drive — model, capacity, fill, clean-shutdown state, and the on-superblock failure registry. Reads only metadata; it does not scan the data region.
memkv drive health <PATH> [--json]

sudo memkv drive health /dev/nvme2n1
sudo memkv drive health /dev/nvme2n1 --json

`verify`, `fsck`, and `health` all open the drive with O_DIRECT, so the
server must be stopped first, the same constraint as `drive reset`. For a
live-server view of drive health, use `memkv admin status` and
`memkv admin drives failed`.
Environment variables
| Variable | Read by | Effect |
|---|---|---|
| `MEMKV_SERVERS` | admin, client | Comma-separated list of admin/data endpoints. For admin it falls back to `http://127.0.0.1:9901` if unset. For the client (NIXL plugin and LD_PRELOAD shim) it overrides `servers:` from `MEMKV_CONFIG`. |
| `MEMKV_CONFIG` | client | Path to a YAML file consumed by the NIXL plugin and LD_PRELOAD shim. Provides every `MEMKV_*` value (servers, rdma_devices, bind_addresses, cache_size_mb, connect_timeout_ms, gid_index, num_connections, license) in one place. Resolution order is defaults → `MEMKV_CONFIG` file → `MEMKV_*` env vars. See `deploy/examples/client-config.yaml`. |
| `MEMKV_LICENSE` | start, client | memkv-specific license lookup variable. Sits in the `verify_license()` chain on both the server and the memkv-client library used by the NIXL plugin and LD_PRELOAD shim. Value is either the JWT inline or a path to a file containing the JWT. The `license:` field in `MEMKV_CONFIG` (if set) wins over this. Accepted plans: Free, Enterprise, EnterpriseLite, EnterprisePlus. Free tier is limited to a single remote server in `MEMKV_SERVERS`: the engine refuses to start with multiple servers under a Free license. |
| `AISTOR_LICENSE` | start, client | Consulted after `MEMKV_LICENSE` in the unified license chain. Same value format (JWT inline or a path to a file). Used by both server and client. |
| `MINIO_LICENSE` | start, client | Standard MinIO license-lookup variable, consulted after `AISTOR_LICENSE`. Same value format. Used by both server and client. |
| `MEMKV_AUTH_KEY` | start, client, memkv-bench | Required. HMAC-SHA256 shared key as 64 hex chars (32 bytes). Used to sign and verify every wire message between client and server. Overrides `network.auth_key:` (server) / `auth_key:` (client `MEMKV_CONFIG`). The same value must be configured on both sides. Generate with `openssl rand -hex 32`. |
| `MEMKV_TRANSPORT` | client | Force-select the client transport: `auto` (default; RDMA preferred, TCP fallback on failure), `rdma` (strict: the client errors at startup if no RDMA NIC is present, and in-flight RDMA failures propagate), `tcp` (skip RDMA entirely; the configured server address is the TCP target). See Transport selection. |
| `RUST_LOG` | all | Standard tracing filter. `setup` and `doc` initialise tracing with `info` and ignore this; `start` honours `--log-level` first, then config, then this. |
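Because `MEMKV_AUTH_KEY` must be exactly 64 hex characters and identical on both sides, it can be worth validating before deployment. The sketch below generates a key in the same format as `openssl rand -hex 32`, validates one, and demonstrates HMAC-SHA256 signing in general terms. How memkv frames wire messages and which bytes it signs are not documented here, so `sign` and `verify` are generic illustrations, and accepting upper-case hex is an assumption.

```python
import hashlib
import hmac
import re
import secrets

def generate_auth_key() -> str:
    """32 random bytes rendered as 64 hex characters."""
    return secrets.token_hex(32)

def parse_auth_key(value: str) -> bytes:
    """Validate the documented format and return the raw 32-byte key."""
    if not re.fullmatch(r"[0-9a-fA-F]{64}", value):
        raise ValueError("MEMKV_AUTH_KEY must be exactly 64 hex characters")
    return bytes.fromhex(value)

def sign(key: bytes, message: bytes) -> bytes:
    """Generic HMAC-SHA256 tag; not memkv's wire format."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(sign(key, message), tag)
```

A mismatch between the client and server keys makes every `verify` call fail, which is why the same value has to be configured on both sides.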
Default ports
| Port | Process | Source |
|---|---|---|
| 9900 | Data plane | `network.address` in config |
| 9901 | Admin HTTP/HTTPS | `network.address` port + 1 (auto, no separate knob) |
The data port carries the entire TCP wire protocol — control,
RDMA bootstrap, and inline-bulk PUT/GET. Admin always lives at
address + 1.
On macOS the binary does not expose RDMA — RDMA / JBOF / hugepages
aren't compiled in there — but the TCP listener still binds on
address, so the client targets the same address it would on
Linux.
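Since the admin listener has no separate setting, tooling has to derive its address from `network.address`. A minimal sketch, assuming the address is written as `host:port` (with IPv6 hosts in brackets); the function name is hypothetical.

```python
def admin_address(data_address: str) -> str:
    """Derive the admin address from the data-plane address (port + 1)."""
    host, _, port = data_address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {data_address!r}")
    return f"{host}:{int(port) + 1}"
```

For the default data-plane port, `admin_address("0.0.0.0:9900")` gives `"0.0.0.0:9901"`.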