Configuration Reference

Overview

This reference provides complete documentation for inventory job configuration. Inventory jobs use YAML configuration files that define what objects to include, what metadata to report, how often to run, and where to store generated reports.

The configuration file contains top-level fields that control job behavior, destination settings for output location and format, optional fields for including additional metadata, and filters for selecting specific objects. Every configuration file must specify apiVersion, id, and destination; within destination, bucket is also required.

Top-level fields

The following table describes all available fields in the inventory configuration:

Field	Description
apiVersion	Required. The version of the inventory configuration API. Must be set to `"v1"`.
id	Required. A unique identifier for this inventory job. The ID appears in the output folder path and status queries. Use descriptive names that indicate the job purpose.
destination	Required. An object that specifies where to store generated inventory reports. See destination fields below for required and optional subfields.
destination.bucket	Required. The name of the destination bucket for inventory reports. The bucket must exist and you must have write permissions.
destination.prefix	Optional. A prefix for the report file keys in the destination bucket. Use prefixes to organize reports from multiple inventory jobs. The prefix is prepended to the standard output path structure.
destination.format	Optional. The output format for the report files. Valid values are `csv`, `json`, or `parquet`. Defaults to `csv`. CSV format includes field names in the first row. JSON format uses JSON Lines with one object per line. Parquet format uses columnar storage optimized for analytics.
destination.compression	Optional. Specifies whether to compress the report files. Valid values are `"on"` or `"off"`; quote the value, because many YAML parsers read an unquoted `on` or `off` as a boolean. Defaults to `"on"`. Compression uses the ZSTD algorithm, and compressed files have the `.zst` extension.
schedule	Optional. Defines how often the inventory job runs. Valid values are `once`, `hourly`, `daily`, `weekly`, `monthly`, or `yearly`. Defaults to `once`, which runs a one-time job. The scheduler determines the next run time based on the completion time of the previous run, not fixed calendar intervals.
mode	Optional. The operational mode for the job. Valid values are `fast` or `strict`. Defaults to `fast`. Fast mode prioritizes speed over consistency. Strict mode prioritizes consistency over speed. In both modes, objects modified during job execution may not be included in the report.
versions	Optional. Specifies which object versions to include in the report. Valid values are `all` or `current`. Defaults to `all`. The `all` setting includes all versions of each object. The `current` setting includes only the most recent version.
includeFields	Optional. An array of additional object metadata fields to include in the report. See the output field list section for available field names. The report always includes default fields regardless of this setting.
filters	Optional. An object containing rules to filter which objects the report includes. You can combine multiple filter types. Included objects must match all specified filters. See the filters section for detailed filter syntax.
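
As a minimal sketch, a configuration that sets only the required fields might look like the following. The id and bucket values are hypothetical placeholders; the omitted optional fields fall back to the defaults listed in the table above.

```yaml
# Minimal job: only the required fields are set.
apiVersion: v1
id: nightly-audit-minimal          # hypothetical job ID
destination:
  bucket: example-reports-bucket   # hypothetical bucket; must already exist
# Omitted optional fields use their documented defaults:
#   destination.format: csv, destination.compression: "on",
#   schedule: once, mode: fast, versions: all
```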

Output field list

Inventory reports include a set of default fields for every object. You can optionally include additional metadata fields using the includeFields configuration.

Default fields

The inventory report always includes the following fields:

Field	Description
Bucket	The name of the source bucket
Key	The object key (name)
SequenceNumber	The object sequence number
Size	Object size in bytes
LastModifiedDate	Last modified timestamp in RFC3339 format

Version-specific fields

When versions is set to all, the report includes the following additional fields:

Field	Description
VersionID	The version ID of the object
IsDeleteMarker	Boolean indicating if the object is a delete marker
IsLatest	Boolean indicating if this is the latest version

Optional fields

Include any of the following fields by adding them to the includeFields array:

Field	Description
ETag	The object’s ETag value
StorageClass	The object’s storage class
IsMultipart	Boolean indicating if the object was uploaded as multipart
EncryptionStatus	The server-side encryption status
IsBucketKeyEnabled	Boolean indicating if bucket key encryption is enabled
KmsKeyArn	The KMS key ARN used for server-side encryption
ChecksumAlgorithm	The checksum algorithm used for the object
Tags	The object’s tags, formatted as a query string (for example, `key1=value1&key2=value2`)
UserMetadata	User-defined metadata, formatted as a query string
AccessTime	The object’s last access time
ReplicationStatus	The object’s replication status
ObjectLockRetainUntilDate	The object lock retention date
ObjectLockMode	The object lock mode (for example, `GOVERNANCE`, `COMPLIANCE`)
ObjectLockLegalHoldStatus	The object lock legal hold status (`on` or `off`)
Tier	The object’s storage tier
TieringStatus	The status of the object’s tiering

Example configuration:

includeFields:
  - ETag
  - StorageClass
  - Tags
  - UserMetadata

Filters

Filters select a subset of objects to include in the inventory report. You can specify multiple filter types in the same configuration. The report only includes objects that match all specified filters. Each filter type has specific syntax and operators appropriate for the data being filtered.
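
To illustrate how filter types combine, the following sketch pairs a prefix filter with a size filter. The prefix and threshold are illustrative values only; an object must satisfy both filters to appear in the report.

```yaml
filters:
  # Both filter types must match for an object to be included.
  prefix:
    - "logs/"            # illustrative prefix
  size:
    greaterThan: 1MiB    # illustrative threshold
```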

Prefix filters

Prefix filters include only objects whose keys start with one of the specified prefixes. This is the most efficient filter type for selecting subsets of a bucket.

Configuration uses an array of prefix strings:

filters:
  prefix:
    - "documents/"
    - "images/"
    - "videos/archive/"

Objects match if their key starts with any of the specified prefixes. Prefix filters are case-sensitive and must match exactly from the beginning of the key.

Age filters

Age filters select objects based on their last modified timestamp. You can specify relative time ranges using durations or absolute time ranges using timestamps.

Relative time

Relative time filters use duration strings with the following format:

d for days (for example, 30d)
h for hours (for example, 12h)
m for minutes (for example, 45m)
Combined (for example, 1d6h for 1 day and 6 hours)

Available operators:

olderThan - Objects last modified more than the specified duration ago
newerThan - Objects last modified less than the specified duration ago

The following example includes objects last modified more than 30 days ago.

filters:
  lastModified:
    olderThan: 30d

The following example includes objects last modified within the past 7 days.

filters:
  lastModified:
    newerThan: 7d
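
The combined duration form works with the same operators. As an illustrative sketch, the following selects objects last modified more than 1 day and 6 hours ago; the threshold itself is an arbitrary example.

```yaml
filters:
  lastModified:
    olderThan: 1d6h   # 1 day + 6 hours = 30 hours
```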

Absolute time

Absolute time filters use RFC3339 format timestamps (for example, 2023-10-27T10:00:00Z).

Available operators:

before - Objects last modified before the specified timestamp
after - Objects last modified after the specified timestamp

The following example includes objects last modified during the year 2023.

filters:
  lastModified:
    before: "2023-12-31T23:59:59Z"
    after: "2023-01-01T00:00:00Z"
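
A single absolute operator gives an open-ended range. The following sketch, using an illustrative timestamp, includes objects last modified before July 1, 2024 (UTC).

```yaml
filters:
  lastModified:
    before: "2024-07-01T00:00:00Z"   # illustrative cutoff
```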

Combining age filters

You can combine multiple age filter operators to specify precise time ranges. An object must satisfy all specified conditions to match.

The following example includes objects last modified between 30 and 90 days ago.

filters:
  lastModified:
    olderThan: 30d
    newerThan: 90d

You can also combine relative and absolute time filters. The following example includes objects modified after January 1, 2024, but not within the past 7 days.

filters:
  lastModified:
    after: "2024-01-01T00:00:00Z"
    olderThan: 7d

Numeric filters

Numeric filters select objects based on numeric properties such as size or version count. All numeric filters support three comparison operators.

Comparison operators

Available operators for all numeric filters:

lessThan - Value is strictly less than the specified number
greaterThan - Value is strictly greater than the specified number
equalTo - Value equals the specified number

The following example includes objects that have more than 5 versions (that is, 6 or more).

filters:
  versionsCount:
    greaterThan: 5
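
The equalTo operator follows the same shape. As a sketch with an illustrative value, the following would select objects that have exactly one version.

```yaml
filters:
  versionsCount:
    equalTo: 1   # illustrative value
```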

Size units

The size filter accepts human-readable units for byte sizes:

B - Bytes
KiB - Kibibytes (1024 bytes)
MiB - Mebibytes (1024 KiB)
GiB - Gibibytes (1024 MiB)

The following example includes objects larger than 100 mebibytes.

filters:
  size:
    greaterThan: 100MiB
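
Smaller units follow the same pattern. As a sketch with an illustrative threshold, the following would select objects smaller than 512 kibibytes (512 × 1024 = 524,288 bytes).

```yaml
filters:
  size:
    lessThan: 512KiB   # illustrative threshold
```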

Combining numeric filters

You can combine multiple operators to specify ranges. An object must satisfy all specified conditions to match.

The following example includes objects between 10 MiB and 1 GiB in size.

filters:
  size:
    greaterThan: 10MiB
    lessThan: 1GiB

Name filters

Name filters select objects based on their key (name) using pattern matching. Three matching methods are available, and they are mutually exclusive within a single name filter configuration.

Glob patterns

The match operator uses glob-style pattern matching with wildcards:

* matches any sequence of characters
? matches any single character
[abc] matches any character in the set

The following example includes objects with keys ending in .jpg.

filters:
  name:
    match: "*.jpg"

The following example includes CSV files starting with report- followed by a year in the 2020s decade.

filters:
  name:
    match: "report-202?-*.csv"
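
The character-set wildcard can be sketched the same way. Assuming a hypothetical key layout, the following pattern matches keys such as shard-a.dat, shard-b.dat, and shard-c.dat, but not shard-d.dat.

```yaml
filters:
  name:
    # [abc] matches exactly one character from the set a, b, c.
    match: "shard-[abc].dat"   # hypothetical key naming scheme
```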

Important

The match, contains, and regex filters are mutually exclusive. You can specify only one in a single name filter configuration.

Substring search

The contains operator performs simple substring matching. Objects match if their key contains the specified string anywhere.

The following example includes objects with archive anywhere in the key.

filters:
  name:
    contains: "archive"

Regular expressions

The regex operator uses regular expression pattern matching. The regular expression syntax follows the RE2 specification.

The following example includes objects matching the pattern data-YYYY-MM-DD.json.

filters:
  name:
    regex: "^data-\\d{4}-\\d{2}-\\d{2}\\.json$"

Backslashes in double-quoted YAML strings must be escaped, so \d becomes \\d.
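
As an alternative sketch, standard YAML does not process backslash escapes inside single-quoted strings, so the same pattern can be written with single backslashes. This relies on general YAML quoting rules rather than anything specific to inventory configuration.

```yaml
filters:
  name:
    # Single quotes: backslashes are passed through literally.
    regex: '^data-\d{4}-\d{2}-\d{2}\.json$'
```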

Key-value filters

Key-value filters select objects based on their tags or user metadata. These filters support complex logical combinations.

Logical operators

Key-value filters use arrays of match specifications combined with logical operators:

and - All conditions in the array must be true
or - At least one condition in the array must be true

Each match specification includes a key field and either valueString or valueNum for the value comparison.

The following example includes objects that have both an environment tag equal to production and a department tag containing engineering.

filters:
  tags:
    and:
      - key: environment
        valueString:
          match: "production"
      - key: department
        valueString:
          contains: "engineering"
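
The or operator follows the same structure. As a sketch with illustrative tag values, the following would include objects whose environment tag is either staging or development.

```yaml
filters:
  tags:
    or:
      # At least one of these conditions must be true.
      - key: environment
        valueString:
          match: "staging"
      - key: environment
        valueString:
          match: "development"
```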

String value matching

The valueString operator matches tag or metadata values using the same methods as name filters:

match - Glob-style pattern matching
contains - Substring search
regex - Regular expression matching

The following example includes objects where the project tag starts with ares-.

filters:
  tags:
    and:
      - key: project
        valueString:
          match: "ares-*"

Numeric value matching

The valueNum operator matches numeric tag or metadata values using the same comparison operators as numeric filters:

lessThan - Value is less than the specified number
greaterThan - Value is greater than the specified number
equalTo - Value equals the specified number

The following example includes objects where the x-amz-meta-priority user metadata value is greater than 5.

filters:
  userMetadata:
    and:
      - key: x-amz-meta-priority
        valueNum:
          greaterThan: 5
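
The same operators apply to numeric tag values. As a sketch, assuming a hypothetical retention-days tag that holds a number, the following would include objects whose tag value is less than 30.

```yaml
filters:
  tags:
    and:
      - key: retention-days   # hypothetical tag key
        valueNum:
          lessThan: 30
```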

Complex combinations

You can combine multiple conditions and specify both logical operators in the same filter to create more selective filters.

The following example includes objects that meet both of these conditions:

Status tag is active AND priority tag is greater than 3
Urgent tag is true OR escalated tag is true

filters:
  tags:
    and:
      - key: status
        valueString:
          match: "active"
      - key: priority
        valueNum:
          greaterThan: 3
    or:
      - key: urgent
        valueString:
          match: "true"
      - key: escalated
        valueString:
          match: "true"

The userMetadata filter uses the same syntax as tags. The following example includes objects where the x-amz-meta-processed-by metadata matches the regular expression worker-.* and x-amz-meta-version is greater than 2.

filters:
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-.*"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2

Complete configuration example

The following example demonstrates all available configuration options. Comments indicate mutually exclusive options and other important notes.

# Required: API version must be "v1"
apiVersion: v1

# Required: Unique identifier for this job
id: comprehensive-inventory-example

# Required: Destination configuration
destination:
  # Required: Destination bucket name
  bucket: inventory-reports-bucket

  # Optional: Prefix for report files
  prefix: comprehensive-reports/

  # Optional: Output format (csv, json, or parquet)
  # Default: csv
  format: parquet

  # Optional: Compression setting ("on" or "off")
  # Default: "on"
  compression: "on"

# Optional: Job schedule (once, hourly, daily, weekly, monthly, yearly)
# Default: once
schedule: weekly

# Optional: Operational mode (fast or strict)
# Default: fast
mode: strict

# Optional: Version handling (all or current)
# Default: all
versions: all

# Optional: Additional metadata fields to include
# See output field list for all available fields
includeFields:
  - ETag
  - StorageClass
  - IsMultipart
  - EncryptionStatus
  - Tags
  - UserMetadata
  - ReplicationStatus
  - ObjectLockMode
  - Tier

# Optional: Filter rules to select objects
filters:
  # Prefix filter: Include objects with these prefixes
  prefix:
    - "documents/"
    - "images/"
    - "videos/"

  # Age filter: Filter by last modified date
  lastModified:
    # Relative time filters (use duration strings)
    olderThan: 7d
    newerThan: 365d

    # Absolute time filters (use RFC3339 timestamps)
    # Can be combined with relative filters
    after: "2024-01-01T00:00:00Z"
    # before: "2024-12-31T23:59:59Z"

  # Size filter: Filter by object size
  size:
    greaterThan: 1MiB
    lessThan: 100GiB
    # equalTo: 50MiB  # Mutually exclusive with range filters

  # Version count filter: Filter by number of versions
  versionsCount:
    greaterThan: 1
    # lessThan: 10
    # equalTo: 5  # Mutually exclusive with range filters

  # Name filter: Filter by object key
  # NOTE: match, contains, and regex are mutually exclusive
  name:
    match: "*.mp4"
    # contains: "archive"  # Mutually exclusive with match/regex
    # regex: "^data-\\d{4}\\.json$"  # Mutually exclusive with match/contains

  # Tag filter: Filter by object tags
  tags:
    # AND: All conditions must be true
    and:
      - key: project
        valueString:
          match: "ares-*"
      - key: status
        valueString:
          contains: "active"
      - key: priority
        valueNum:
          greaterThan: 5

    # OR: At least one condition must be true
    # NOTE: Can specify both 'and' and 'or' in same filter
    # or:
    #   - key: urgent
    #     valueString:
    #       match: "true"

  # User metadata filter: Filter by user-defined metadata
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-\\d+"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2