Configuration Reference

Overview

This reference provides complete documentation for inventory job configuration. Inventory jobs use YAML configuration files that define what objects to include, what metadata to report, how often to run, and where to store generated reports.

The configuration file contains top-level fields that control job behavior, destination settings for output location and format, optional fields for including additional metadata, and filters for selecting specific objects. Every configuration file must specify the required fields apiVersion, id, and destination (including destination.bucket).

Top-level fields

The following table describes all available fields in the inventory configuration:

Field Description
apiVersion Required.

The version of the inventory configuration API.

Must be set to "v1".
id Required.

A unique identifier for this inventory job.

The ID appears in the output folder path and status queries. Use descriptive names that indicate the job purpose.
destination Required.

An object that specifies where to store generated inventory reports.

See destination fields below for required and optional subfields.
destination.bucket Required.

The name of the destination bucket for inventory reports.

The bucket must exist and you must have write permissions.
destination.prefix Optional.

A prefix for the report file keys in the destination bucket.

Use prefixes to organize reports from multiple inventory jobs. The prefix is prepended to the standard output path structure.
destination.format Optional.

The output format for the report files.

Valid values are csv, json, or parquet. Defaults to csv.

CSV format includes field names in the first row. JSON format uses JSON Lines with one object per line. Parquet format uses columnar storage optimized for analytics.
destination.compression Optional.

Specifies whether to compress the report files.

Valid values are "on" or "off". Defaults to "on".

Compression uses the Zstandard (zstd) algorithm. Compressed files have the .zst extension.
schedule Optional.

Defines how often the inventory job runs.

Valid values are once, hourly, daily, weekly, monthly, or yearly.

Defaults to once, which runs the job a single time.

The scheduler determines the next run time based on the completion time of the previous run, not fixed calendar intervals.
mode Optional.

The operational mode for the job.

Valid values are fast or strict. Defaults to fast.

Fast mode prioritizes speed over consistency. Strict mode prioritizes consistency over speed. In both modes, objects modified during job execution may not be included in the report.
versions Optional.

Specifies which object versions to include in the report.

Valid values are all or current. Defaults to all.

The all setting includes all versions of each object. The current setting includes only the most recent version.
includeFields Optional.

An array of additional object metadata fields to include in the report.

See the output field list section for available field names. The report always includes default fields regardless of this setting.
filters Optional.

An object containing rules to filter which objects the report includes.

You can combine multiple filter types. Included objects must match all specified filters. See the filters section for detailed filter syntax.
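
As a minimal sketch, a configuration that sets only the required fields might look like the following. The job ID and bucket name are placeholders, not values defined by this reference. Every optional field falls back to the default listed in the table above.

```yaml
# Required fields only; all optional fields use their documented defaults
# (csv format, compression on, schedule once, mode fast, versions all).
apiVersion: v1
id: nightly-audit                    # hypothetical job ID
destination:
  bucket: inventory-reports-bucket   # placeholder; must already exist and be writable
```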

Output field list

Inventory reports include a set of default fields for every object. You can optionally include additional metadata fields using the includeFields configuration.

Default fields

The inventory report always includes the following fields:

Field             Description
Bucket            The name of the source bucket
Key               The object key (name)
SequenceNumber    The object sequence number
Size              Object size in bytes
LastModifiedDate  Last modified timestamp in RFC3339 format

Version-specific fields

When versions is set to all, the report includes the following additional fields:

Field           Description
VersionID       The version ID of the object
IsDeleteMarker  Boolean indicating if the object is a delete marker
IsLatest        Boolean indicating if this is the latest version

Optional fields

Include any of the following fields by adding them to the includeFields array:

Field                      Description
ETag                       The object’s ETag value
StorageClass               The object’s storage class
IsMultipart                Boolean indicating if the object was uploaded as multipart
EncryptionStatus           The server-side encryption status
IsBucketKeyEnabled         Boolean indicating if bucket key encryption is enabled
KmsKeyArn                  The KMS key ARN used for server-side encryption
ChecksumAlgorithm          The checksum algorithm used for the object
Tags                       The object’s tags, formatted as a query string (for example, key1=value1&key2=value2)
UserMetadata               User-defined metadata, formatted as a query string
AccessTime                 The object’s last access time
ReplicationStatus          The object’s replication status
ObjectLockRetainUntilDate  The object lock retention date
ObjectLockMode             The object lock mode (for example, GOVERNANCE, COMPLIANCE)
ObjectLockLegalHoldStatus  The object lock legal hold status (on or off)
Tier                       The object’s storage tier
TieringStatus              The status of the object’s tiering

Example configuration:

includeFields:
  - ETag
  - StorageClass
  - Tags
  - UserMetadata

Filters

Filters select a subset of objects to include in the inventory report. You can specify multiple filter types in the same configuration. The report only includes objects that match all specified filters. Each filter type has specific syntax and operators appropriate for the data being filtered.
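
For illustration, the sketch below combines two of the filter types described in the sections that follow. The prefix and size threshold are arbitrary placeholder values. Because an object must match all specified filters, the report would include only objects under logs/ that are also larger than 1 MiB.

```yaml
filters:
  # Both filter types must match for an object to be included.
  prefix:
    - "logs/"
  size:
    greaterThan: 1MiB
```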

Prefix filters

Prefix filters include only objects whose keys start with one of the specified prefixes. This is the most efficient filter type for selecting subsets of a bucket.

Configuration uses an array of prefix strings:

filters:
  prefix:
    - "documents/"
    - "images/"
    - "videos/archive/"

Objects match if their key starts with any of the specified prefixes. Prefix filters are case-sensitive and must match exactly from the beginning of the key.

Age filters

Age filters select objects based on their last modified timestamp. You can specify relative time ranges using durations or absolute time ranges using timestamps.

Relative time

Relative time filters use duration strings with the following format:

  • d for days (for example, 30d)
  • h for hours (for example, 12h)
  • m for minutes (for example, 45m)
  • Combined (for example, 1d6h for 1 day and 6 hours)

Available operators:

  • olderThan - Objects last modified more than the specified duration ago
  • newerThan - Objects last modified less than the specified duration ago

The following example includes objects last modified more than 30 days ago.

filters:
  lastModified:
    olderThan: 30d

The following example includes objects last modified within the past 7 days.

filters:
  lastModified:
    newerThan: 7d
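
Duration units can also be combined in a single string, as described in the format list above. The following sketch, with an arbitrarily chosen window, would include objects last modified within the past 1 day and 6 hours.

```yaml
filters:
  lastModified:
    newerThan: 1d6h   # 1 day + 6 hours = 30 hours
```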

Absolute time

Absolute time filters use RFC3339 format timestamps (for example, 2023-10-27T10:00:00Z).

Available operators:

  • before - Objects last modified before the specified timestamp
  • after - Objects last modified after the specified timestamp

The following example includes objects last modified during the year 2023.

filters:
  lastModified:
    before: "2023-12-31T23:59:59Z"
    after: "2023-01-01T00:00:00Z"

Combining age filters

You can combine multiple age filter operators to specify precise time ranges. An object must satisfy all specified conditions to match.

The following example includes objects last modified between 30 and 90 days ago.

filters:
  lastModified:
    olderThan: 30d
    newerThan: 90d

You can also combine relative and absolute time filters. The following example includes objects modified after January 1, 2024, but not within the past 7 days.

filters:
  lastModified:
    after: "2024-01-01T00:00:00Z"
    olderThan: 7d

Numeric filters

Numeric filters select objects based on numeric properties such as size or version count. All numeric filters support three comparison operators.

Comparison operators

Available operators for all numeric filters:

  • lessThan - Value is less than and not equal to the specified number
  • greaterThan - Value is greater than and not equal to the specified number
  • equalTo - Value equals the specified number

The following example includes objects that have more than 5 versions (that is, 6 or more).

filters:
  versionsCount:
    greaterThan: 5
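
The equalTo operator follows the same shape. As a small sketch with an arbitrary value, the following would include only objects that have exactly one version.

```yaml
filters:
  versionsCount:
    equalTo: 1
```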

Size units

The size filter accepts human-readable units for byte sizes:

  • B - Bytes
  • KiB - Kibibytes (1024 bytes)
  • MiB - Mebibytes (1024 KiB)
  • GiB - Gibibytes (1024 MiB)

The following example includes objects larger than 100 mebibytes.

filters:
  size:
    greaterThan: 100MiB

Combining numeric filters

You can combine multiple operators to specify ranges. An object must satisfy all specified conditions to match.

The following example includes objects between 10 MiB and 1 GiB in size.

filters:
  size:
    greaterThan: 10MiB
    lessThan: 1GiB

Name filters

Name filters select objects based on their key (name) using pattern matching. Three matching methods are available, and they are mutually exclusive within a single name filter configuration.

Glob patterns

The match operator uses glob-style pattern matching with wildcards:

  • * matches any sequence of characters
  • ? matches any single character
  • [abc] matches any character in the set

The following example includes objects with keys ending in .jpg.

filters:
  name:
    match: "*.jpg"

The following example includes CSV files whose keys start with report-202, followed by any single character (for example, the last digit of a year in the 2020s), a hyphen, and any further characters.

filters:
  name:
    match: "report-202?-*.csv"

Important
The match, contains, and regex filters are mutually exclusive. You can specify only one in a single name filter configuration.
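
The character-set wildcard can be sketched the same way. The key layout here (a shard letter followed by a file name) is hypothetical and chosen only to show the [abc] syntax. Assuming the pattern is matched against the whole key, it would match keys such as shard-a-001.log or shard-c-final.log, but not shard-d-001.log.

```yaml
filters:
  name:
    # [abc] matches exactly one of the characters a, b, or c.
    match: "shard-[abc]-*.log"
```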

Substring matching

The contains operator performs simple substring matching. Objects match if their key contains the specified string anywhere.

The following example includes objects with archive anywhere in the key.

filters:
  name:
    contains: "archive"

Regular expressions

The regex operator uses regular expression pattern matching. The regular expression syntax follows the RE2 specification.

The following example includes objects matching the pattern data-YYYY-MM-DD.json.

filters:
  name:
    regex: "^data-\\d{4}-\\d{2}-\\d{2}\\.json$"

Note that backslashes in double-quoted YAML strings must be escaped, so \d becomes \\d.
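
As an alternative sketch, and assuming the configuration is parsed as standard YAML, a single-quoted string treats backslashes literally, so the same pattern can be written without doubling them.

```yaml
filters:
  name:
    # Single-quoted YAML scalars do not process backslash escapes.
    regex: '^data-\d{4}-\d{2}-\d{2}\.json$'
```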

Key-value filters

Key-value filters select objects based on their tags or user metadata. These filters support complex logical combinations.

Logical operators

Key-value filters use arrays of match specifications combined with logical operators:

  • and - All conditions in the array must be true
  • or - At least one condition in the array must be true

Each match specification includes a key field and either valueString or valueNum for the value comparison.

The following example includes objects that have both an environment tag equal to production and a department tag containing engineering.

filters:
  tags:
    and:
      - key: environment
        valueString:
          match: "production"
      - key: department
        valueString:
          contains: "engineering"
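
The or operator uses the same structure. As a sketch with hypothetical tag keys and values, the following would include objects that have a review tag equal to pending, a hold tag equal to true, or both.

```yaml
filters:
  tags:
    # At least one of the conditions below must be true.
    or:
      - key: review
        valueString:
          match: "pending"
      - key: hold
        valueString:
          match: "true"
```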

String value matching

The valueString operator matches tag or metadata values using the same methods as name filters:

  • match - Glob-style pattern matching
  • contains - Substring search
  • regex - Regular expression matching

The following example includes objects where the project tag starts with ares-.

filters:
  tags:
    and:
      - key: project
        valueString:
          match: "ares-*"

Numeric value matching

The valueNum operator matches numeric tag or metadata values using the same comparison operators as numeric filters:

  • lessThan - Value is less than the specified number
  • greaterThan - Value is greater than the specified number
  • equalTo - Value equals the specified number

The following example includes objects where the x-amz-meta-priority user metadata value is greater than 5.

filters:
  userMetadata:
    and:
      - key: x-amz-meta-priority
        valueNum:
          greaterThan: 5
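
The same operators apply to tags. As a sketch with a hypothetical tag key, the following would include objects whose retention-days tag equals 30.

```yaml
filters:
  tags:
    and:
      - key: retention-days   # hypothetical tag key
        valueNum:
          equalTo: 30
```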

Complex combinations

You can combine multiple conditions and nest logical operators to create sophisticated filters.

The following example includes objects that meet both of these conditions:

  • Status tag is active AND priority tag is greater than 3
  • Urgent tag is true OR escalated tag is true

filters:
  tags:
    and:
      - key: status
        valueString:
          match: "active"
      - key: priority
        valueNum:
          greaterThan: 3
    or:
      - key: urgent
        valueString:
          match: "true"
      - key: escalated
        valueString:
          match: "true"

The userMetadata filter uses the same syntax as tags. The following example includes objects where the x-amz-meta-processed-by metadata matches the regular expression worker-.* and x-amz-meta-version is greater than 2.

filters:
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-.*"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2

Complete configuration example

The following example demonstrates all available configuration options. Comments indicate mutually exclusive options and other important notes.

# Required: API version must be "v1"
apiVersion: v1

# Required: Unique identifier for this job
id: comprehensive-inventory-example

# Required: Destination configuration
destination:
  # Required: Destination bucket name
  bucket: inventory-reports-bucket

  # Optional: Prefix for report files
  prefix: comprehensive-reports/

  # Optional: Output format (csv, json, or parquet)
  # Default: csv
  format: parquet

  # Optional: Compression setting (on or off)
  # Default: on
  compression: "on"

# Optional: Job schedule (once, hourly, daily, weekly, monthly, yearly)
# Default: once
schedule: weekly

# Optional: Operational mode (fast or strict)
# Default: fast
mode: strict

# Optional: Version handling (all or current)
# Default: all
versions: all

# Optional: Additional metadata fields to include
# See output field list for all available fields
includeFields:
  - ETag
  - StorageClass
  - IsMultipart
  - EncryptionStatus
  - Tags
  - UserMetadata
  - ReplicationStatus
  - ObjectLockMode
  - Tier

# Optional: Filter rules to select objects
filters:
  # Prefix filter: Include objects with these prefixes
  prefix:
    - "documents/"
    - "images/"
    - "videos/"

  # Age filter: Filter by last modified date
  lastModified:
    # Relative time filters (use duration strings)
    olderThan: 7d
    newerThan: 365d

    # Absolute time filters (use RFC3339 timestamps)
    # Can be combined with relative filters
    after: "2024-01-01T00:00:00Z"
    # before: "2024-12-31T23:59:59Z"

  # Size filter: Filter by object size
  size:
    greaterThan: 1MiB
    lessThan: 100GiB
    # equalTo: 50MiB  # Mutually exclusive with range filters

  # Version count filter: Filter by number of versions
  versionsCount:
    greaterThan: 1
    # lessThan: 10
    # equalTo: 5  # Mutually exclusive with range filters

  # Name filter: Filter by object key
  # NOTE: match, contains, and regex are mutually exclusive
  name:
    match: "*.mp4"
    # contains: "archive"  # Mutually exclusive with match/regex
    # regex: "^data-\\d{4}\\.json$"  # Mutually exclusive with match/contains

  # Tag filter: Filter by object tags
  tags:
    # AND: All conditions must be true
    and:
      - key: project
        valueString:
          match: "ares-*"
      - key: status
        valueString:
          contains: "active"
      - key: priority
        valueNum:
          greaterThan: 5

    # OR: At least one condition must be true
    # NOTE: Can specify both 'and' and 'or' in same filter
    # or:
    #   - key: urgent
    #     valueString:
    #       match: "true"

  # User metadata filter: Filter by user-defined metadata
  userMetadata:
    and:
      - key: x-amz-meta-processed-by
        valueString:
          regex: "worker-\\d+"
      - key: x-amz-meta-version
        valueNum:
          greaterThan: 2