Configuration Reference
Overview
This reference provides complete documentation for inventory job configuration. Inventory jobs use YAML configuration files that define what objects to include, what metadata to report, how often to run, and where to store generated reports.
The configuration file contains top-level fields that control job behavior, destination settings for output location and format, optional fields for including additional metadata, and filters for selecting specific objects.
All configuration files must specify apiVersion, id, and destination as required fields.
Top-level fields
The following table describes all available fields in the inventory configuration:
| Field | Description |
|---|---|
| apiVersion | Required. The version of the inventory configuration API. Must be set to "v1". |
| id | Required. A unique identifier for this inventory job. The ID appears in the output folder path and status queries. Use descriptive names that indicate the job purpose. |
| destination | Required. An object that specifies where to store generated inventory reports. See destination fields below for required and optional subfields. |
| destination.bucket | Required. The name of the destination bucket for inventory reports. The bucket must exist and you must have write permissions. |
| destination.prefix | Optional. A prefix for the report file keys in the destination bucket. Use prefixes to organize reports from multiple inventory jobs. The prefix is prepended to the standard output path structure. |
| destination.format | Optional. The output format for the report files. Valid values are csv, json, or parquet. Defaults to csv. CSV format includes field names in the first row. JSON format uses JSON Lines with one object per line. Parquet format uses columnar storage optimized for analytics. |
| destination.compression | Optional. Specifies whether to compress the report files. Valid values are "on" or "off". Defaults to "on". Compression uses the ZSTD algorithm. Compressed files have the .zst extension. |
| schedule | Optional. Defines how often the inventory job runs. Valid values are once, hourly, daily, weekly, monthly, or yearly. Defaults to once, which creates a one-time job. The scheduler determines the next run time based on the completion time of the previous run, not fixed calendar intervals. |
| mode | Optional. The operational mode for the job. Valid values are fast or strict. Defaults to fast. Fast mode prioritizes speed over consistency. Strict mode prioritizes consistency over speed. In both modes, objects modified during job execution may not be included in the report. |
| versions | Optional. Specifies which object versions to include in the report. Valid values are all or current. Defaults to all. The all setting includes all versions of each object. The current setting includes only the most recent version. |
| includeFields | Optional. An array of additional object metadata fields to include in the report. See the output field list section for available field names. The report always includes default fields regardless of this setting. |
| filters | Optional. An object containing rules to filter which objects the report includes. You can combine multiple filter types. Included objects must match all specified filters. See the filters section for detailed filter syntax. |
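The required fields, allowed values, and defaults in the table above can be checked before a configuration is submitted. The following Python sketch is only an illustration: check_config and its helper are hypothetical names, the configuration is assumed to be already parsed into a dict, and the service's own validation may differ.

```python
# Allowed values and defaults, transcribed from the table above.
TOP_LEVEL_ENUMS = {
    "schedule": (("once", "hourly", "daily", "weekly", "monthly", "yearly"), "once"),
    "mode": (("fast", "strict"), "fast"),
    "versions": (("all", "current"), "all"),
}
DESTINATION_ENUMS = {
    "format": (("csv", "json", "parquet"), "csv"),
    "compression": (("on", "off"), "on"),
}


def _apply_enums(section, enums, path, errors):
    """Fill in defaults and record any value outside the allowed set."""
    resolved = dict(section)
    for field, (allowed, default) in enums.items():
        value = resolved.setdefault(field, default)
        if value not in allowed:
            errors.append(f"{path}{field}: invalid value {value!r}")
    return resolved


def check_config(config):
    """Hypothetical helper: return (config_with_defaults, list_of_errors)."""
    errors = []
    for field in ("apiVersion", "id", "destination"):
        if field not in config:
            errors.append(f"{field}: required field is missing")
    if "apiVersion" in config and config["apiVersion"] != "v1":
        errors.append('apiVersion: must be "v1"')

    resolved = _apply_enums(config, TOP_LEVEL_ENUMS, "", errors)
    destination = config.get("destination")
    if isinstance(destination, dict):
        if not destination.get("bucket"):
            errors.append("destination.bucket: required field is missing")
        resolved["destination"] = _apply_enums(
            destination, DESTINATION_ENUMS, "destination.", errors
        )
    return resolved, errors
```

A configuration that contains only apiVersion, id, and destination.bucket passes this check and comes back with every optional field set to its documented default.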
Output field list
Inventory reports include a set of default fields for every object.
You can optionally include additional metadata fields using the includeFields configuration.
Default fields
The inventory report always includes the following fields:
| Field | Description |
|---|---|
| Bucket | The name of the source bucket |
| Key | The object key (name) |
| SequenceNumber | The object sequence number |
| Size | Object size in bytes |
| LastModifiedDate | Last modified timestamp in RFC3339 format |
Version-specific fields
When versions is set to all, the report includes the following additional fields:
| Field | Description |
|---|---|
| VersionID | The version ID of the object |
| IsDeleteMarker | Boolean indicating if the object is a delete marker |
| IsLatest | Boolean indicating if this is the latest version |
Optional fields
Include any of the following fields by adding them to the includeFields array:
| Field | Description |
|---|---|
| ETag | The object’s ETag value |
| StorageClass | The object’s storage class |
| IsMultipart | Boolean indicating if the object was uploaded as multipart |
| EncryptionStatus | The server-side encryption status |
| IsBucketKeyEnabled | Boolean indicating if bucket key encryption is enabled |
| KmsKeyArn | The KMS key ARN used for server-side encryption |
| ChecksumAlgorithm | The checksum algorithm used for the object |
| Tags | The object’s tags, formatted as a query string (for example, key1=value1&key2=value2) |
| UserMetadata | User-defined metadata, formatted as a query string |
| AccessTime | The object’s last access time |
| ReplicationStatus | The object’s replication status |
| ObjectLockRetainUntilDate | The object lock retention date |
| ObjectLockMode | The object lock mode (for example, GOVERNANCE, COMPLIANCE) |
| ObjectLockLegalHoldStatus | The object lock legal hold status (on or off) |
| Tier | The object’s storage tier |
| TieringStatus | The status of the object’s tiering |
Example configuration:
includeFields:
  - ETag
  - StorageClass
  - Tags
  - UserMetadata
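Because CSV reports carry the field names in the first row and the Tags field is formatted as a query string, a report can be read with the Python standard library alone. The sketch below is illustrative: the sample rows are made up, the column order is an assumption, and read_report is a hypothetical helper name.

```python
import csv
import io
from urllib.parse import parse_qsl

# Made-up sample rows for illustration; a real report is read from the
# destination bucket, and its column order may differ.
SAMPLE_REPORT = (
    "Bucket,Key,SequenceNumber,Size,LastModifiedDate,Tags\n"
    "media,images/cat.jpg,1,2048,2024-03-01T12:00:00Z,env=prod&team=web\n"
    "media,images/dog.jpg,1,4096,2024-03-02T08:30:00Z,\n"
)


def read_report(text):
    """Yield one dict per report row, with Tags decoded into a dict."""
    for row in csv.DictReader(io.StringIO(text)):
        row["Size"] = int(row["Size"])
        # An empty Tags column decodes to an empty dict.
        row["Tags"] = dict(parse_qsl(row.get("Tags") or ""))
        yield row


rows = list(read_report(SAMPLE_REPORT))
```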
Filters
Filters select a subset of objects to include in the inventory report. You can specify multiple filter types in the same configuration. The report only includes objects that match all specified filters. Each filter type has specific syntax and operators appropriate for the data being filtered.
Prefix filters
Prefix filters include only objects whose keys start with one of the specified prefixes. This is the most efficient filter type for selecting subsets of a bucket.
Configuration uses an array of prefix strings:
filters:
  prefix:
    - "documents/"
    - "images/"
    - "videos/archive/"
Objects match if their key starts with any of the specified prefixes. Prefix filters are case-sensitive and must match exactly from the beginning of the key.
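The matching rule described above can be expressed in a line of Python. This is a sketch of the semantics only, with matches_prefix as a hypothetical helper name, not the service's implementation.

```python
def matches_prefix(key, prefixes):
    """True if the key starts with any configured prefix (case-sensitive)."""
    return key.startswith(tuple(prefixes))


prefixes = ["documents/", "images/", "videos/archive/"]
```

With these prefixes, images/cat.jpg and videos/archive/a.mp4 match, while Images/cat.jpg (different case) and videos/raw/a.mp4 do not.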
Age filters
Age filters select objects based on their last modified timestamp. You can specify relative time ranges using durations or absolute time ranges using timestamps.
Relative time
Relative time filters use duration strings with the following format:
- d for days (for example, 30d)
- h for hours (for example, 12h)
- m for minutes (for example, 45m)
- Combined (for example, 1d6h for 1 day and 6 hours)
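To make the duration format concrete, the following Python sketch converts a duration string into a timedelta. It is an illustration under assumptions: parse_duration is a hypothetical helper, and it accepts units in any order and repeated units, which the service may not.

```python
import re
from datetime import timedelta

_UNITS = {"d": "days", "h": "hours", "m": "minutes"}
_DURATION = re.compile(r"(?:\d+[dhm])+")
_PART = re.compile(r"(\d+)([dhm])")


def parse_duration(text):
    """Convert a string such as '30d' or '1d6h' into a timedelta."""
    if not _DURATION.fullmatch(text):
        raise ValueError(f"invalid duration: {text!r}")
    total = timedelta()
    for amount, unit in _PART.findall(text):
        total += timedelta(**{_UNITS[unit]: int(amount)})
    return total
```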
Available operators:
- olderThan - Objects last modified longer ago than the specified duration
- newerThan - Objects last modified within the specified duration
The following example includes objects last modified more than 30 days ago.
filters:
  lastModified:
    olderThan: 30d
The following example includes objects last modified within the past 7 days.
filters:
  lastModified:
    newerThan: 7d
Absolute time
Absolute time filters use RFC3339 format timestamps (for example, 2023-10-27T10:00:00Z).
Available operators:
- before - Objects last modified before the specified timestamp
- after - Objects last modified after the specified timestamp
The following example includes objects last modified during the year 2023.
filters:
  lastModified:
    before: "2023-12-31T23:59:59Z"
    after: "2023-01-01T00:00:00Z"
Combining age filters
You can combine multiple age filter operators to specify precise time ranges. An object must satisfy all specified conditions to match.
The following example includes objects last modified between 30 and 90 days ago.
filters:
  lastModified:
    olderThan: 30d
    newerThan: 90d
You can also combine relative and absolute time filters. The following example includes objects modified after January 1, 2024, but not within the past 7 days.
filters:
  lastModified:
    after: "2024-01-01T00:00:00Z"
    olderThan: 7d
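The way the four age operators combine can be sketched in Python as follows. The function name is hypothetical, durations are passed as timedelta values, and treating every boundary as exclusive is an assumption; the service may include boundary values.

```python
from datetime import datetime, timedelta, timezone


def matches_last_modified(last_modified, now, older_than=None, newer_than=None,
                          before=None, after=None):
    """Every supplied condition must hold for the object to match."""
    if older_than is not None and not last_modified < now - older_than:
        return False
    if newer_than is not None and not last_modified > now - newer_than:
        return False
    if before is not None and not last_modified < before:
        return False
    if after is not None and not last_modified > after:
        return False
    return True


# Illustrative evaluation time and object timestamp.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
modified = datetime(2024, 5, 1, tzinfo=timezone.utc)

# olderThan: 30d combined with newerThan: 90d
in_window = matches_last_modified(
    modified, now, older_than=timedelta(days=30), newer_than=timedelta(days=90)
)
```

Here the object was last modified 60 days before the evaluation time, so it falls inside the 30 to 90 day window and in_window is True.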
Numeric filters
Numeric filters select objects based on numeric properties such as size or version count. All numeric filters support three comparison operators.
Comparison operators
Available operators for all numeric filters:
- lessThan - Value is strictly less than the specified number
- greaterThan - Value is strictly greater than the specified number
- equalTo - Value equals the specified number
The following example includes objects that have more than 5 versions (that is, 6 or more).
filters:
  versionsCount:
    greaterThan: 5
Size units
The size filter accepts human-readable units for byte sizes:
- B - Bytes
- KiB - Kibibytes (1024 bytes)
- MiB - Mebibytes (1024 KiB)
- GiB - Gibibytes (1024 MiB)
The following example includes objects larger than 100 mebibytes.
filters:
  size:
    greaterThan: 100MiB
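The unit arithmetic can be made explicit with a short Python sketch. parse_size is a hypothetical helper; it assumes whole-number values and a mandatory unit suffix, and the service may accept other forms.

```python
import re

_FACTORS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3}
_SIZE = re.compile(r"(\d+)(B|KiB|MiB|GiB)")


def parse_size(text):
    """Convert a size string such as '100MiB' into a number of bytes."""
    match = _SIZE.fullmatch(text)
    if match is None:
        raise ValueError(f"invalid size: {text!r}")
    return int(match.group(1)) * _FACTORS[match.group(2)]
```

For example, 100MiB corresponds to 100 × 1024 × 1024 = 104857600 bytes.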
Combining numeric filters
You can combine multiple operators to specify ranges. An object must satisfy all specified conditions to match.
The following example includes objects between 10 MiB and 1 GiB in size.
filters:
  size:
    greaterThan: 10MiB
    lessThan: 1GiB
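The combination rule for numeric operators can be sketched in Python. matches_numeric is a hypothetical helper, and the limits are assumed to be already converted to plain numbers (bytes, in the size case).

```python
import operator

# lessThan and greaterThan are strict, as described under comparison operators.
_OPERATORS = {
    "lessThan": operator.lt,
    "greaterThan": operator.gt,
    "equalTo": operator.eq,
}


def matches_numeric(value, conditions):
    """True if the value satisfies every operator in the conditions mapping."""
    return all(_OPERATORS[name](value, limit) for name, limit in conditions.items())


MIB = 1024**2
size_range = {"greaterThan": 10 * MIB, "lessThan": 1024 * MIB}
```

Under this sketch a 50 MiB object matches size_range, while an object of exactly 10 MiB does not, because greaterThan excludes the boundary value.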
Name filters
Name filters select objects based on their key (name) using pattern matching. Three matching methods are available, and they are mutually exclusive within a single name filter configuration.
Glob patterns
The match operator uses glob-style pattern matching with wildcards:
- * matches any sequence of characters
- ? matches any single character
- [abc] matches any character in the set
The following example includes objects with keys ending in .jpg.
filters:
  name:
    match: "*.jpg"
The following example includes CSV files starting with report- followed by a year in the 2020s decade.
filters:
  name:
    match: "report-202?-*.csv"
match, contains, and regex filters are mutually exclusive.
You can specify only one in a single name filter configuration.
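Glob patterns can be tried out locally with Python's fnmatch module, which supports the same three wildcards. This only approximates the match operator: whether the service matches case-sensitively, or lets * span the / character as fnmatch does, is an assumption. name_matches is a hypothetical helper.

```python
from fnmatch import fnmatchcase


def name_matches(key, pattern):
    """Case-sensitive glob match of an object key against a pattern."""
    return fnmatchcase(key, pattern)


jpg = name_matches("photos/2024/cat.jpg", "*.jpg")
report = name_matches("report-2023-q1.csv", "report-202?-*.csv")
too_old = name_matches("report-2019-q1.csv", "report-202?-*.csv")
```

Note that ? matches any single character, not only a digit, so report-202X-q1.csv would also match the second pattern.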
Substring search
The contains operator performs simple substring matching.
Objects match if their key contains the specified string anywhere.
The following example includes objects with archive anywhere in the key.
filters:
  name:
    contains: "archive"
Regular expressions
The regex operator uses regular expression pattern matching.
The regular expression syntax follows the RE2 specification.
The following example includes objects matching the pattern data-YYYY-MM-DD.json.
filters:
  name:
    regex: "^data-\\d{4}-\\d{2}-\\d{2}\\.json$"
Note that inside a double-quoted YAML string, each backslash must be escaped, so \d is written as \\d.
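A pattern can be checked against sample keys before it goes into a configuration. The sketch below uses Python's re module, which is not RE2; the pattern from the example uses only constructs that both engines support, and re.ASCII restricts \d to ASCII digits as RE2 does. key_matches is a hypothetical helper.

```python
import re

# The pattern as the service receives it, after the YAML parser has turned
# each escaped "\\" in the double-quoted string into a single backslash.
PATTERN = re.compile(r"^data-\d{4}-\d{2}-\d{2}\.json$", re.ASCII)


def key_matches(key):
    """True if the key has the form data-YYYY-MM-DD.json."""
    return PATTERN.search(key) is not None
```

Because the pattern is anchored with ^ and $, a key such as logs/data-2024-01-31.json does not match; remove the anchors to match the pattern anywhere in the key.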
Key-value filters
Key-value filters select objects based on their tags or user metadata. These filters support complex logical combinations.
Logical operators
Key-value filters use arrays of match specifications combined with logical operators:
- and - All conditions in the array must be true
- or - At least one condition in the array must be true
Each match specification includes a key field and either valueString or valueNum for the value comparison.
The following example includes objects that have both an environment tag equal to production and a department tag containing engineering.
filters:
  tags:
    and:
      - key: environment
        valueString:
          match: "production"
      - key: department
        valueString:
          contains: "engineering"
String value matching
The valueString operator matches tag or metadata values using the same methods as name filters:
- match - Glob-style pattern matching
- contains - Substring search
- regex - Regular expression matching
The following example includes objects where the project tag starts with ares-.
filters:
  tags:
    and:
      - key: project
        valueString:
          match: "ares-*"
Numeric value matching
The valueNum operator matches numeric tag or metadata values using the same comparison operators as numeric filters:
- lessThan - Value is less than the specified number
- greaterThan - Value is greater than the specified number
- equalTo - Value equals the specified number
The following example includes objects where the x-amz-meta-priority user metadata value is greater than 5.
filters:
  userMetadata:
    and:
      - key: x-amz-meta-priority
        valueNum:
          greaterThan: 5
Complex combinations
You can combine multiple conditions and nest logical operators to create sophisticated filters.
The following example includes objects that meet both of these conditions:
- Status tag is active AND priority tag is greater than 3
- Urgent tag is true OR escalated tag is true
filters:
  tags:
    and:
      - key: status
        valueString:
          match: "active"
      - key: priority
        valueNum:
          greaterThan: 3
    or:
      - key: urgent
        valueString:
          match: "true"
      - key: escalated
        valueString:
          match: "true"
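The evaluation order of the and and or groups in this example can be sketched in Python. The helper names are hypothetical, only the match and greaterThan operators are handled for brevity, and converting tag values to numbers with float is an assumption about how valueNum treats string tag values.

```python
from fnmatch import fnmatchcase

# The tag filter from the example above, as a parsed structure.
TAG_FILTER = {
    "and": [
        {"key": "status", "valueString": {"match": "active"}},
        {"key": "priority", "valueNum": {"greaterThan": 3}},
    ],
    "or": [
        {"key": "urgent", "valueString": {"match": "true"}},
        {"key": "escalated", "valueString": {"match": "true"}},
    ],
}


def condition_holds(tags, condition):
    """Evaluate one match specification against a dict of tag values."""
    value = tags.get(condition["key"])
    if value is None:
        return False
    if "valueString" in condition:
        return fnmatchcase(value, condition["valueString"]["match"])
    try:
        number = float(value)
    except ValueError:
        return False
    return number > condition["valueNum"]["greaterThan"]


def tags_match(tags, tag_filter):
    """Both groups must hold: every 'and' condition and at least one 'or'."""
    and_ok = all(condition_holds(tags, c) for c in tag_filter.get("and", []))
    or_conditions = tag_filter.get("or", [])
    or_ok = not or_conditions or any(condition_holds(tags, c) for c in or_conditions)
    return and_ok and or_ok
```

An object tagged status=active, priority=4, urgent=true satisfies both groups. Dropping the urgent tag without adding escalated=true makes the or group fail, so the object is excluded.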
The userMetadata filter uses the same syntax as tags.
The following example includes objects where the x-amz-meta-processed-by metadata matches the pattern worker-* and x-amz-meta-version is greater than 2.
filters:
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-.*"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2
Complete configuration example
The following example demonstrates all available configuration options. Comments indicate mutually exclusive options and other important notes.
# Required: API version must be "v1"
apiVersion: v1
# Required: Unique identifier for this job
id: comprehensive-inventory-example
# Required: Destination configuration
destination:
  # Required: Destination bucket name
  bucket: inventory-reports-bucket
  # Optional: Prefix for report files
  prefix: comprehensive-reports/
  # Optional: Output format (csv, json, or parquet)
  # Default: csv
  format: parquet
  # Optional: Compression setting ("on" or "off")
  # Default: "on"
  # Quoted because some YAML parsers read a bare on as a boolean
  compression: "on"
# Optional: Job schedule (once, hourly, daily, weekly, monthly, yearly)
# Default: once
schedule: weekly
# Optional: Operational mode (fast or strict)
# Default: fast
mode: strict
# Optional: Version handling (all or current)
# Default: all
versions: all
# Optional: Additional metadata fields to include
# See output field list for all available fields
includeFields:
  - ETag
  - StorageClass
  - IsMultipart
  - EncryptionStatus
  - Tags
  - UserMetadata
  - ReplicationStatus
  - ObjectLockMode
  - Tier
# Optional: Filter rules to select objects
filters:
  # Prefix filter: Include objects with these prefixes
  prefix:
    - "documents/"
    - "images/"
    - "videos/"
  # Age filter: Filter by last modified date
  lastModified:
    # Relative time filters (use duration strings)
    olderThan: 7d
    newerThan: 365d
    # Absolute time filters (use RFC3339 timestamps)
    # Can be combined with relative filters
    after: "2024-01-01T00:00:00Z"
    # before: "2024-12-31T23:59:59Z"
  # Size filter: Filter by object size
  size:
    greaterThan: 1MiB
    lessThan: 100GiB
    # equalTo: 50MiB # Mutually exclusive with range filters
  # Version count filter: Filter by number of versions
  versionsCount:
    greaterThan: 1
    # lessThan: 10
    # equalTo: 5 # Mutually exclusive with range filters
  # Name filter: Filter by object key
  # NOTE: match, contains, and regex are mutually exclusive
  name:
    match: "*.mp4"
    # contains: "archive" # Mutually exclusive with match/regex
    # regex: "^data-\\d{4}\\.json$" # Mutually exclusive with match/contains
  # Tag filter: Filter by object tags
  tags:
    # AND: All conditions must be true
    and:
      - key: project
        valueString:
          match: "ares-*"
      - key: status
        valueString:
          contains: "active"
      - key: priority
        valueNum:
          greaterThan: 5
    # OR: At least one condition must be true
    # NOTE: Can specify both 'and' and 'or' in same filter
    # or:
    #   - key: urgent
    #     valueString:
    #       match: "true"
  # User metadata filter: Filter by user-defined metadata
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-\\d+"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2