Terraform Cross-Object Reference Limitations

Someone asked me for help with a provider I had never used before, which led to all sorts of recommendations and an issue that I will be opening. While writing up some examples of what could be done if the provider were improved, I ran into the next step in my unending dissatisfaction with any code.

Cross-object referencing in variable validation was introduced in Terraform 1.9 and provides significant value when a variable's validity depends on some other data. There are still some limitations that would be fantastic to resolve in a future release, but they seem to be part of the core behavior of working with expressions in Terraform.

Previous Behavior

Before the feature was released, validation of variables was tricky and tedious. If a string had a set of possible valid values, those values had to be hardcoded twice: once in the condition and again in the error message.

variable "disks" {
  default = {}
  description = "Map of disks with the key as the interface ID."
  type = map(object({
    backup   = optional(bool)
    iso      = optional(string)
    passthru = optional(bool)
    storage  = optional(string)
    type     = optional(string, "disk")
  }))

  validation {
    condition     = alltrue([
      for interface in keys(var.disks) :
      contains(
        ["ide", "sata", "scsi", "virtio"],
        trim(interface, "0123456789")
      )
    ])
    error_message = "Disk interfaces must be one of 'ide', 'sata', 'scsi', or 'virtio'."
  }
}

In this situation, the keys should be ide[0-3], sata[0-5], scsi[0-30], or virtio[0-15]. If an interface type is added or removed, both the condition and the error message must be updated, which is tedious and error-prone. Cross-object referencing helps to alleviate this problem.
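To make the shape of the input concrete, here is a sketch of a value for this variable. The storage name, ISO identifier, and `cdrom` type are hypothetical placeholders, not values taken from any real provider. It also shows that the validation above only checks the prefix that remains after `trim` strips the digits, so an out-of-range key still passes.

```hcl
# terraform.tfvars (hypothetical values, for illustration only)
disks = {
  # trim("scsi0", "0123456789") yields "scsi", which is in the allowed list.
  scsi0 = {
    backup  = true
    storage = "local-lvm" # assumed storage name
  }

  # "ide9" is outside the intended ide[0-3] range, but the prefix check
  # alone still accepts it because only "ide" is compared.
  ide9 = {
    iso  = "local:iso/installer.iso" # assumed ISO identifier
    type = "cdrom"                   # assumed type value
  }
}
```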

Current Behavior

With the introduction of cross-object referencing, the above example can be simplified to:

locals {
  valid_disk_interfaces = [
    "ide",
    "sata",
    "scsi",
    "virtio",
  ]
}

variable "disks" {
  default = {}
  description = "Map of disks with the key as the interface ID."
  type = map(object({
    backup   = optional(bool)
    iso      = optional(string)
    passthru = optional(bool)
    storage  = optional(string)
    type     = optional(string, "disk")
  }))

  validation {
    condition     = alltrue([
      for interface in keys(var.disks) :
      contains(local.valid_disk_interfaces, trim(interface, "0123456789"))
    ])
    error_message = format(
      "Disk interfaces must be one of '%s'.",
      join("', '", local.valid_disk_interfaces)
    )
  }
}

Now the list only needs to be updated once, or it could reference an external data source (via the HTTP provider, perhaps) that supplies a dynamic value.
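A minimal sketch of the data source idea, assuming a hypothetical endpoint that returns a JSON array of interface names. The URL and the data source name are made up for illustration, and this has not been run against a live endpoint. The `variable "disks"` block would stay exactly as written above.

```hcl
# Hypothetical endpoint assumed to return a JSON array such as
# ["ide", "sata", "scsi", "virtio"].
data "http" "disk_interfaces" {
  url = "https://example.com/disk-interfaces.json"
}

locals {
  # response_body holds the raw body returned by the http data source;
  # jsondecode turns the JSON array into a Terraform list.
  valid_disk_interfaces = jsondecode(data.http.disk_interfaces.response_body)
}
```

The trade-off is that validation then depends on the endpoint being reachable at plan time, which may or may not be acceptable.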

I was working through some recommended changes for a provider to allow for better patterns in code when I ran into a new limitation.

Limitations

When working with collections or structural data types, validation requires iterating through the values, as in the previous examples. Now, if I want to validate the numeric suffixes, I have to create a separate validation block for each interface type:

  # ...
  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0, 3,
        tonumber(trimprefix(interface, "ide"))
      )
      if startswith(interface, "ide")
    ])
    error_message = "Interfaces for 'ide' disks must be between 0 and 3, inclusively."
  }

  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0, 5,
        tonumber(trimprefix(interface, "sata"))
      )
      if startswith(interface, "sata")
    ])
    error_message = "Interfaces for 'sata' disks must be between 0 and 5, inclusively."
  }

  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0, 30,
        tonumber(trimprefix(interface, "scsi"))
      )
      if startswith(interface, "scsi")
    ])
    error_message = "Interfaces for 'scsi' disks must be between 0 and 30, inclusively."
  }

  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0, 15,
        tonumber(trimprefix(interface, "virtio"))
      )
      if startswith(interface, "virtio")
    ])
    error_message = "Interfaces for 'virtio' disks must be between 0 and 15, inclusively."
  }
  # ...

That’s rather tedious as well, and I would love to create a single validation block for that purpose. I can get close to what I want, but it requires some broader logic in the error message, when what I really want is to build the error message dynamically, based on the failure:

locals {
  valid_disk_interface_constraints = {
    ide    = 3
    sata   = 5
    scsi   = 30
    virtio = 15
  }
}

variable "disks" {
  # ...
  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0,
        local.valid_disk_interface_constraints[trim(interface, "0123456789")],
        tonumber(regex("^(?P<type>ide|sata|scsi|virtio)(?P<id>\\d{1,2})$", interface).id)
      )
    ])
    error_message = format(
      "Interfaces for '%s' disks must be between 0 and %d, inclusively.",
      trim(interface, "0123456789"),
      local.valid_disk_interface_constraints[trim(interface, "0123456789")]
    )
  }
  # ...
}

This reintroduces the “list” in the regular expression. It also just doesn’t work: `interface` only exists inside the `for` expression in the `condition`, so the `error_message` has no way to reference the key that failed.

So, I had two thoughts on how to resolve this: separating the logic within the `condition` and the `error_message`, or using a `dynamic` block to generate multiple `validation` blocks on the fly.

Separate Logic

In this situation, the error message is generated from our locals, which only need to be maintained in one location, but it covers every interface type rather than only the one that failed:

locals {
  valid_disk_interface_constraints = {
    ide    = 3
    sata   = 5
    scsi   = 30
    virtio = 15
  }
}

variable "disks" {
  # ...
  validation {
    condition = alltrue([
      for interface in keys(var.disks) :
      provider::assert::between(
        0,
        local.valid_disk_interface_constraints[trim(interface, "0123456789")],
        tonumber(
          regex(
            format(
              "^(?P<type>%s)(?P<id>\\d{1,2})$",
              join("|", keys(local.valid_disk_interface_constraints))
            ),
            interface
          ).id
        )
      )
    ])
    error_message = format(
      "Interfaces must adhere to the following IDs:\n%s",
      join(
        "\n",
        [
          for k, v in local.valid_disk_interface_constraints :
          format("  %s => [0-%d]", k, v)
        ]
      )
    )
  }
  # ...
}

This has the potential to scare many people because it includes a regular expression (which is enough for most people) that is built with a `format` and `join` combination. But it does work, and it is a single validation block.
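One possible variation, offered as an untested sketch rather than a recommendation: since the `error_message` can also iterate over `var.disks`, it could repeat the check in negated form and name the keys that failed. This assumes `try` catches the lookup and `tonumber` errors as expected, and it swaps the regular expression for a `trimprefix`/`trim` combination to extract the numeric suffix. The cost is that the check is written twice.

```hcl
variable "disks" {
  # ...
  validation {
    # Pass only if every key falls inside the range for its interface type.
    condition = alltrue([
      for interface in keys(var.disks) :
      try(
        provider::assert::between(
          0,
          local.valid_disk_interface_constraints[trim(interface, "0123456789")],
          tonumber(trimprefix(interface, trim(interface, "0123456789")))
        ),
        false # unknown prefix or non-numeric suffix counts as invalid
      )
    ])

    # Repeat the same check, negated, to list the keys that failed.
    error_message = format(
      "Invalid disk interfaces: %s.",
      join(", ", [
        for interface in keys(var.disks) : interface
        if !try(
          provider::assert::between(
            0,
            local.valid_disk_interface_constraints[trim(interface, "0123456789")],
            tonumber(trimprefix(interface, trim(interface, "0123456789")))
          ),
          false
        )
      ])
    )
  }
  # ...
}
```

This names the offending keys but still cannot say which range each one violated without more formatting logic, so it only gets part of the way to a per-failure message.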

Dynamic Blocks

Writing it as a solution with a dynamic block would look like this:

locals {
  valid_disk_interface_constraints = {
    ide    = 3
    sata   = 5
    scsi   = 30
    virtio = 15
  }
}

variable "disks" {
  # ...
  dynamic "validation" {
    for_each = local.valid_disk_interface_constraints

    content {
      condition = alltrue([
        for interface in keys(var.disks) :
        provider::assert::between(
          0, validation.value,
          tonumber(trimprefix(interface, validation.key))
        )
        if startswith(interface, validation.key)
      ])
      error_message = format(
        "Interfaces for '%s' disks must be between 0 and %d, inclusively.",
        validation.key, validation.value
      )
    }
  }
  # ...
}

While this isn’t the most concise code, it is a single block that is generated dynamically. It effectively creates the four separate blocks from before, driven entirely by the constraints in the locals.

Unfortunately, dynamic blocks aren’t supported within a variable block. However, I think this would be an ideal solution.
