Terraform Best Practices: Variables

I’ve previously discussed the concept of not hardcoding values in your Terraform configuration files. This is a best practice that is generally agreed upon by the Terraform community. However, there are some nuances to this best practice that are worth discussing. In this post, I’ll cover some best practices around fully utilizing variables in your Terraform configuration files.

Type Definitions

The minimum requirement for declaring a variable in Terraform is an empty variable block with the variable name:

variable "my_variable" {}

Now, an input can be supplied and used within the configuration. However, this leaves opportunity for errors and confusion. Terraform has some aspects of a dynamically-typed language in this regard: the type of an untyped variable is inferred from the value it receives. However, once a type is set, it cannot be changed. We can also set a data type statically for our variables. Without setting a type, as in our previous example, the variable assumes the pseudo-type of any.
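As a minimal sketch, the same declaration with a static type constraint might look like the following. The choice of string is only for illustration:

```hcl
# Same hypothetical variable as above, now constrained to a string.
# A value that cannot be converted to a string (a list, for example)
# is rejected before any resources are planned.
variable "my_variable" {
  type = string
}
```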

What are data types?

I am not comfortable with superficial knowledge, but many introductory resources for learning programming languages don’t cover data types much beyond explaining, at a very basic level, that some types exist and hold certain kinds of information. Data types are really an established set of constraints for storing data within memory. This helps with data retrieval and with ensuring that the operations we’re attempting to perform match the type of data that we’re working with.

Primitive data types

Terraform has a very simplified set of data types. Systems begin with some “primitives” that are the building blocks for more complex data types. Primitives are often also referred to as “atomic” or “scalar” types. These are the most basic types of data that can be stored in a system and can be used to build more complex data types.

Terraform and HCL2 implement types through the go-cty package:

bool: represents true or false values and is directly mapped to the Go bool type. The zero value is false.
number: represents what are called “JSON numbers”. Numbers are stored as the *big.Float type with 512 bits of precision and can be integers or floating point numbers. It has constructors for mapping different types of numbers onto the various numeric data types within Go. With 512 bits, it can easily represent any int64, uint64, or float64. The zero value is 0.
string: represents a sequence of UTF-8 (Unicode) characters. Strings are stored as the Go string type. The zero value is “”.

These primitives can be coerced or converted into each other with varying degrees of success. For instance, the number 0 would be stored as a numeric zero with 512 bits of precision, but a string of “0” is stored as the UTF-8 value of 48, as digits begin with 48 for 0 and end with 57 for 9. Capital letters begin with 65 for A and end with 90 for Z. Lowercase letters begin with 97 for a and end with 122 for z. These are really the fundamental mechanisms behind data types. In addition, the data is stored with metadata about which data type it is.
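To make the conversions concrete, here is a small sketch using Terraform’s built-in conversion functions. The local names are hypothetical and exist only for this example:

```hcl
locals {
  # Explicit conversions between primitives.
  number_from_string = tonumber("42")  # 42
  string_from_number = tostring(42)    # "42"
  bool_from_string   = tobool("true")  # true

  # can() returns false instead of raising an error when a conversion fails.
  is_numeric = can(tonumber("forty-two")) # false
}
```

Terraform also performs some of these conversions implicitly, for example when a number is supplied to a variable declared as a string.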

Terraform has the pseudo-type of any which we discussed earlier. It will accept any data type, but once a value is assigned, the data type is inferred from the value and it will remain that data type throughout the execution of Terraform. This moves it slightly outside of the realm of dynamically-typed languages, which generally allow the data type to be changed later. This is implemented with the “go-cty” Dynamic type. This is extremely useful when constructing the complex data types that we’ll discuss next, as the values are initialized as the Dynamic type until the type is inferred.

Collection data types

Beyond the primitives, Terraform has collection data types. Collections are a way to store multiple values and each value must be the same data type, or what we call type-homogenous:

list: this is an ordered sequence of elements of the same data type. They’re accessed via an index that begins at zero for the first element. They are stored as a slice, which is a dynamically sized array, in Go.
map: this is an unordered collection of key-value pairs. The keys are strings and the values can be any data type, but all the same. The keys must be unique, which technically makes the keys a set, which we’ll discuss next. The key is the index for the value. It is eventually stored as a map in Go.
set: by definition, a set is a collection of elements that are either in the set or out of the set, like a mathematical set. We could think of a set as a list of all possible unique values that are each either true (in the set) or false (not in the set). However, this would be horribly bulky with respect to memory. Instead, a set is constructed much like a list: both accept a Go slice that is type-homogenous. Unlike a list, a set discards duplicate values, has no guaranteed order, and cannot be accessed via an index in Terraform; it must first be converted with tolist().
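The three collection types could be declared as in the sketch below. All variable names and default values are hypothetical and only illustrate the syntax:

```hcl
variable "availability_zones" {
  type    = list(string)
  default = ["us-east-1a", "us-east-1b"]
}

variable "instance_tags" {
  type = map(string)
  default = {
    environment = "dev"
    owner       = "platform"
  }
}

variable "allowed_ports" {
  type    = set(number)
  default = [443, 80, 443] # the duplicate 443 is discarded
}

locals {
  first_zone  = var.availability_zones[0]        # lists are indexed from zero
  environment = var.instance_tags["environment"] # maps are indexed by key
  port_count  = length(var.allowed_ports)        # 2
  allows_tls  = contains(var.allowed_ports, 443) # sets are tested for membership
}
```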

Structural data types

Terraform also has structural types. These types share many characteristics with the collection data types, however they are not type-homogenous. Each attribute or element can have a different type, but that type is fixed by the type definition:

object: this is a defined set of key-value pairs. The keys are strings and the values can be any data type. The keys must be unique. This is similar to a map, but the keys and the type of each value are defined in the object type. It is stored as a set of parallel Go maps, a map each for the keys, the values, and the data types. Because maps are type-homogenous, we often see the element type of a map set to an object, as in map(object({...})). Every element then has the same object type, but each object can hold attributes of different types, which effectively allows for a schema to be created.
tuple: this is an ordered sequence of elements of different data types. It is accessed via an index that begins at zero for the first element. It is stored as a set of parallel slices in Go, a slice each for the values and the data types.
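A brief sketch of the structural types, including the map-of-objects pattern described above. The variable and attribute names are assumptions made up for this example:

```hcl
# An object: each attribute has its own declared type.
variable "database" {
  type = object({
    name    = string
    port    = number
    enabled = bool
  })
}

# A tuple: a fixed number of elements, each with its own type.
variable "endpoint" {
  type = tuple([string, number])
}

# A map of objects: every element shares one schema.
variable "subnets" {
  type = map(object({
    cidr_block = string
    public     = bool
  }))
}
```

Values are then referenced as var.database.port, var.endpoint[1], or var.subnets["web"].cidr_block, where "web" stands in for whatever key the caller supplies.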

By establishing data types, we’re constraining the data that can be stored in a variable and reducing the likelihood of errors. This means we don’t have to check the value to ensure that it is the correct data type before performing operations on it.

Descriptions

Using descriptions on variables is a best practice that is often overlooked. Descriptions are a way to document the purpose of a variable and provide context for the user. They are effectively a comment with additional functionality. These can be used with a tool like terraform-docs to generate documentation for your Terraform configuration files.
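For instance, a description could be added like this (the variable itself is hypothetical):

```hcl
variable "instance_count" {
  description = "Number of instances to create in each availability zone."
  type        = number
}
```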

Default Values

Default values offer a way to simplify the use of our code. We can set a default value to the most common or best practice value. If a default is provided, the variable becomes optional and the user of our code can simply skip supplying a value if they are happy with the default. This is a great way to make our code more user-friendly.
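A sketch of a variable with a default. The name and the value "t3.micro" are placeholders chosen for illustration:

```hcl
variable "instance_type" {
  description = "Instance size to use when the caller does not specify one."
  type        = string
  default     = "t3.micro"
}
```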

Sensitive Values

Marking a variable as sensitive means that the value will be masked in any form of output (standard output, logs, etc.) where it would otherwise be displayed. This is useful in preventing the values from being leaked. It does nothing more to protect the data. The data will still be stored in state in plain text. This is a best practice for any sensitive data, such as passwords, API keys, etc. This was introduced with Terraform v0.14.
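A minimal sketch of a sensitive variable, with hypothetical names. Note that an output that exposes the value has to be marked sensitive as well, otherwise Terraform reports an error:

```hcl
variable "db_password" {
  description = "Password for the database administrator account."
  type        = string
  sensitive   = true
}

# Outputs derived from a sensitive value must also be marked sensitive.
output "db_password" {
  value     = var.db_password
  sensitive = true
}
```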

Nullable

By default, all variables are nullable, meaning they can be assigned a value of null. This is another useful way to simplify the use of our code. If a variable is not required, we can set its default to null, and the user can simply skip supplying a value if they don’t need it. This allows for conditional logic to be used in the configuration. We can set the variable to not allow null values by setting nullable = false. This was introduced with Terraform v1.1.
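The two cases might look like the following sketch. The variable names and the default region are assumptions for illustration:

```hcl
# Optional input: null means "not supplied".
variable "kms_key_id" {
  type    = string
  default = null
}

# This input can never be null inside the module. If the caller
# passes null, Terraform falls back to the default instead.
variable "region" {
  type     = string
  default  = "us-east-1"
  nullable = false
}

locals {
  # Conditional logic driven by whether the optional value was set.
  encryption_enabled = var.kms_key_id != null
}
```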

Validation

Validation is a nested block and can be defined numerous times per variable declaration. It takes two arguments:

condition: some expression that evaluates to true or false. A value of true means that the validation passes and a value of false means that the validation fails.
error_message: a string that is displayed if the validation fails.

This is a great way to ensure that the values supplied to the variable are within the expected range. This was introduced with Terraform v0.13.
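Put together, a variable with two validation blocks could look like this sketch. The variable name and the allowed values are hypothetical:

```hcl
variable "environment" {
  type = string

  validation {
    condition     = length(var.environment) > 0
    error_message = "The environment name must not be empty."
  }

  validation {
    condition     = contains(["dev", "test", "prod"], var.environment)
    error_message = "The environment must be one of: dev, test, prod."
  }
}
```

Each block is evaluated independently, so the user sees an error message for every rule that fails.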

String validation

The expression can use various functions to validate some properties of a string:

Length: length(var.my_string_variable) > 0
Contains a substring: strcontains(var.my_string_variable, "substring")
Starts with a substring: startswith(var.my_string_variable, "substring")
Ends with a substring: endswith(var.my_string_variable, "substring")
Has a certain number of parts: length(split(",", var.my_string_variable)) > 1
Is one of a set of values: contains(["value1", "value2"], var.my_string_variable)
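As a sketch of these expressions inside a declaration, assuming a hypothetical bucket_name variable and an invented "acme-" naming convention. To my knowledge, startswith and endswith require Terraform v1.3 or later and strcontains requires v1.5 or later, so check your version before relying on them:

```hcl
variable "bucket_name" {
  type = string

  validation {
    condition     = length(var.bucket_name) >= 3 && length(var.bucket_name) <= 63
    error_message = "The bucket name must be between 3 and 63 characters long."
  }

  validation {
    condition     = startswith(var.bucket_name, "acme-")
    error_message = "The bucket name must start with the prefix acme-."
  }
}
```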

Number validation

The expression can use various functions to validate some properties of a number:

Greater than: var.my_number_variable > 0
Greater than or equal to: var.my_number_variable >= 0
Less than: var.my_number_variable < 0
Less than or equal to: var.my_number_variable <= 0
Within a range: var.my_number_variable >= 0 && var.my_number_variable <= 100
Equal to: var.my_number_variable == 0
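A range check in context might look like this sketch. The variable name and the limits are made up for the example:

```hcl
variable "disk_size_gb" {
  type = number

  validation {
    condition     = var.disk_size_gb >= 10 && var.disk_size_gb <= 1000
    error_message = "The disk size must be between 10 and 1000 GB."
  }

  # Terraform numbers can be fractional, so check for a whole number explicitly.
  validation {
    condition     = floor(var.disk_size_gb) == var.disk_size_gb
    error_message = "The disk size must be a whole number of GB."
  }
}
```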

List validation

The expression can use a for expression to validate all or some of the elements of a list:

All values are greater than: alltrue([for i in var.my_list_variable : i > 0])
At least one value is greater than: anytrue([for i in var.my_list_variable : i > 0])
All elements are one of a set of values: alltrue([for i in var.my_list_variable: contains(["value1", "value2"], i)])
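Here is a sketch of list validation in a full declaration, using a hypothetical list of port numbers:

```hcl
variable "ingress_ports" {
  type = list(number)

  # Every element must be a valid TCP/UDP port number.
  validation {
    condition     = alltrue([for p in var.ingress_ports : p > 0 && p <= 65535])
    error_message = "Every port must be between 1 and 65535."
  }

  # The list must not contain duplicates.
  validation {
    condition     = length(var.ingress_ports) == length(distinct(var.ingress_ports))
    error_message = "The list of ports must not contain duplicates."
  }
}
```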

Map validation

The expression can use a for expression to validate all or some of the elements of a map:

All cidr_block properties are valid CIDR blocks: alltrue([for k, v in var.my_map_variable: can(cidrhost(v.cidr_block, 0))])
All ddos_protection_plan_id properties are set when ddos_protection_enabled is true: alltrue([for k, v in var.my_map_variable : v.ddos_protection_plan_id != null && v.ddos_protection_plan_id != "" if v.ddos_protection_enabled ]). I would prefer checking that the plan ID isn’t null rather than having two properties, but this pattern can be useful in a more complex scenario.
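Both map checks could be combined in one declaration, as in the sketch below. The variable name and object schema are assumptions based on the property names above, and the optional() modifier assumes Terraform v1.3 or later:

```hcl
variable "virtual_networks" {
  type = map(object({
    cidr_block              = string
    ddos_protection_enabled = bool
    ddos_protection_plan_id = optional(string) # null when omitted
  }))

  # Host number 0 is used so that single-address blocks such as /32 also pass.
  validation {
    condition     = alltrue([for k, v in var.virtual_networks : can(cidrhost(v.cidr_block, 0))])
    error_message = "Every cidr_block must be a valid CIDR block."
  }

  # Only entries with DDoS protection enabled are checked for a plan ID.
  validation {
    condition = alltrue([
      for k, v in var.virtual_networks :
      v.ddos_protection_plan_id != null && v.ddos_protection_plan_id != ""
      if v.ddos_protection_enabled
    ])
    error_message = "A ddos_protection_plan_id is required when ddos_protection_enabled is true."
  }
}
```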

Summary

By fully utilizing the capabilities of variable declarations, we’re able to sanitize our input values. This not only reduces the likelihood of errors, but also avoids writing additional code to check for many types of errors in situ. This makes our code more readable and maintainable. This is a best practice that is often overlooked, but can greatly improve the quality of our code.