Why Do We Write Terraform Modules?

I am often in a privileged position to watch people develop their skills based on the roles I am in. With respect to Terraform, I see folks learn the basics where they’re hard coding values into all of the resource properties, then they move on to using input variables, and so on. Another interesting inflection point is when folks learn about modules. I see these folks write their first module, then their second module, and soon they’re writing modules all day long; every resource that they write is wrapped in a module. This is an extremely common pattern that I see. Folks that continue to mature their skills realize that there is a balance to be had with respect to modules.

Here, I want to explore the fundamental reasons why we should write modules. Using the “5 Whys” is a good approach, but we’ll just pretend like we have jumped through most of them already and focus on the root challenges that we want to solve for by creating modules.

What a Module Really Is

Bottom line up front: a module (with respect to Terraform) is a way for those of us writing code in HCL to create our own idea of a resource, but written in HCL rather than relying on Golang, the language that Terraform and its providers are written. Full stop.

This is a normal evolution of higher level langauges. First, a high level language is created with some basic functions built-in and then some standard libraries that offer most of the core functionality that we desire. If we need more features, then we need more libraries created and we have to write those libraries in the same lower level language that the higher level language is written in. This evolution happened with PowerShell. It was written in C# with the .Net Framework. PowerShell plugins (and later modules) were already written in C#. Eventually, we were able to write PowerShell modules in PowerShell. Writing a module in PowerShell is writing a set of related functions and we’re able to do essentially everything that we could do in C# but without having to learn it.

This is fundamentally what we’re able to do with modules in Terraform. We find some resource(s) that we want to deploy, but we want to standardize some capability, or add some sense of opinion to it. This is something else that we want to strike a proper balance with. If we want our modules to be reusable, we want to limit the opinionation of those modules to as little as possible.

A frequent example of this is deploying “networks” with Terraform. The pattern is the same for the various cloud providers, there is a network with some number of subnets that have address spaces carved out of the overall network address space. We can add in some route tables and establish peering. The cloud providers also build in some type of regional resiliency in terms of [availability] zones.

The providers in Terraform tend to allow us to treat subnets as separate resources or simply properties of the associated network. What experience has taught me is that using the separate resources for subnets allows for simpler code. However, we might want to offer up a way to cleanly create the network with all of its subnets in one effort. We could do this in many ways, but creating a solid and readable data structure that limits errors of ommission and transposition is probably the best.

If we look at the Microsoft written terraform-azure-vnet module we can see some ways to perform this task. For instances, it expects a list of subnet_names and a corresponding list of subnet_prefixes. What is troublesome about this is that we have very little control to validate the relationship between theses two lists. Each element, in order, corresponds to the element in the other list. First instance:

subnet_names    = ["first", "second", "third"]
subnet_prefixes = ["203.0.113.0/26", "203.0.113.64/26", "203.0.113.128/26"]

The subnet named “first” is associated with the prefix of “203.0.113.0/26”, while the subnet named “second” is associated with the prefix of “203.0.113.64/26”. If we change the order of either list, the address space that we expect to be associated with the name will not align. Also, we will not receive any sort of error if the lists are a different length until we get further into our process (at the plan stage). Ideally, we would be able to validate these before we do anything else, but the validation features in Terraform do not presently allow us to compare one variable to another in its conditions.

In addition, this becomes more complex as we consider other properties that we want to work with. For instance, do we want to establish some service endpoints? Now we have to create a facility to evaluate which (perhaps multiple or none) service endpoints that we want to create on each subnet.

Alternatively, we could create a better data structure to fix these issues, which ultimately helps us to have our module become more like a resource. For instance, if we have a map of an object and many properties included, we can do quite a bit. First, it solves the challenge created by associating elements of a list with another list. We create a direct relationship between the “key” of the map and its value and properties.

subnets = {
  "first" = {
    prefix = ["203.0.113.0/26"]
  }
  "second" = {
    prefix = ["203.0.113.64/26"]
  }
  "third" = {
    prefix = ["203.0.113.128/26"]
  }
}

Now, there is a parent-child relationship that is easily read. Further, we have the ability to provide more than on prefix to a subnet, which is a capability that the azurerm_subnet resource natively has; we cannot do this with the terraform-azure-vnet module.

With Terraform 1.3.0, we also had the “Optional attributes for object type constraints” feature officially released. This gives us capabilities that further align what we can do in a module with what can be done in a resource because the more complex data structures previously could have a default value assigned that represented the entire set of properties, or none. Let’s say that we want to optionally support service endpoints and do so we might set policies for private link and list of the service endpoints that we want to include with a subnet. We could define a default:

variable "subnets" {
  type = map(object({
    address_prefixes                               = list(string)
    enforce_private_link_endpoint_network_policies = bool
    enforce_private_link_service_network_policies  = bool
    service_endpoints                              = list(string)
  }))
}

If we supply no “default” then we are requiring the values be provided. This is good except we’re not able to simplify things as much. For every subnet, we must supply enforce_private_link_endpoint_network_policies, enforce_private_link_service_network_policies, and service_endpoints. A resource does not require such a thing because we can specify default values for first level properties for further level properties just the same.

With Terraform 1.3.0, we can adjust the variable definition:

variable "subnets" {
  type = map(object({
    address_prefixes                               = list(string)
    enforce_private_link_endpoint_network_policies = optional(bool, false)
    enforce_private_link_service_network_policies  = optional(bool, false)
    service_endpoints                              = optional(list(string))
  }))
}

Now, our original input is still valid and the default behavior will not require any values for the policies for service_endpoints. This gives a feel very similar to creating our own resource.

This is the first reason that we write modules, to effectively create our own resources.

Clean Code

Clean Code gives us a lot of insight into how we can craft our code. We can apply these principles to Terraform, just like we could to plenty of other languages. Let’s review some of the principles.

DRY (Don’t Repeat Yourself)

Ideally, we don’t want to write our code multiple times. We want to have it written once. These benefits that this provides are numerous. There is less code. Updating our code is less error prone because we don’t have to update each instance of that code, just the original. It is also easier to read.

We can do this by using meta arguments, like count and for_each (tip: just use for_each) for resources and modules.

resource "azurerm_subnet" subnet {
  for_each = var.subnets

  name             = each.key
  address_prefixes = each.value.address_prefixes
  ...
}

Small

We want to keep our code as small as we can. Writing massive monolithic modules creates all of the challenges of other monolithic architectures. If I want to be able to create multiple networks, I shouldn’t write a module that creates N networks, I should have a module that creates one network. Then, I can use a for_each to call the module if I need to define more than one network. This keeps the network module smaller and simpler and useable in more situations.

module "vnet" {
  for_each = var.networks

  name                = each.key
  resource_group_name = var.resource_group_name
  location            = var.location
  subnets             = each.value.subnets
}

This is also rather nice in Terraform because we can rarely nest our meta argument loops and must otherwise often use for loops and other items. Instead of nesting our loops, we can loop over the module, and we can loop within the module.

Prefer Fewer Arguments

Variables within modules automatically become “arguments”. I do find it better to default to defining a variable with a default value rather than hard coding values, almost exclusively. By offering default values, we’re “honoring” this principle the best we can in Terraform.

Always Try to Explain Yourself in Code

Try to avoid comments. Keeping a module smaller and DRY makes it easier read because there is less code to read. Using clear and complete names lends toward readability. It also helps to clearly explain what your code is doing. If at all possible, avoid comments by writing unambiguously clearly understandable code. If there is some reason things were done and you want to embed within the code the lessons learned, providing that with comments is fine.

Hide Internal Structure

If we have to do something nasty with our inputs, move that work to a locals definition. Make the inputs clearly understandable and make their usage with the resources clearly understandable. Put the spaghetti in the locals locker.

When Should We Write Modules

Many of the reasons have been discussed, but we’ll review for some clarity.

Creating Value

Any code that we write should create value. If we’re writing a module, it should create value. Needless wrapper modules don’t create value. Do not write wrapper modules for the sake of writing wrapper modules. This simply creates complexity. If we want to know what the code is doing, we have to peek into the module unnecessarily. For the same reason, aside from local development and for tests (using Terraform’s experimental testing functionality), local path modules probably should be avoided. If you want to create value with your module, it should be reusable… if it is nested in your structure and accessed through a local path (a subdirectory, for instance) then it isn’t very reusable.

Injecting Opinion

Again, we want to strike a balance. The more opinion that we inject the less flexible our module becomes. Sometimes this is preferred when we’re looking to standardize the way things are done. This is better suited for internal modules than public modules.

Learning

There is a lot of pressure on folks to reuse others’ code. Terraform, like many other IaC tools, tend to have this direct correlation where more complex things provide less value. If you reuse some other code that is horribly complex and it doesn’t work, you’ll have a heck of a time untangling the spaghetti. Instead of increasing velocity, it does the opposite. Find simple modules that you can completely understand. If there is an issue that arises, your understanding will help you arrive at a solution and you can contribute that solution in reciprocity! You’ll also learn more about Terraform in the process.

Summary

This post is a codification of principles that seem to work well when writing Terraform. They aren’t rules set in stone, but guidelines that help to understand code, its value, and improve the process.