Terraform Best Practices: Defining Modules

Modularity in programming is a crucial capability for creating extensible and reusable code. It reduces the volume of code, which promotes maintainability. It takes many forms, from functions, subroutines, and methods to redistributable libraries, packages, or modules. Terraform has two forms of modularity: providers (written in Go) and modules (written in HCL). Watching many Terraform learners, I've noticed a pattern: they create their first module and then go crazy writing modules for everything.

What factors should be considered when scoping a module?

Single Resource Modules

Creating a module for an Azure Resource Group is one extreme. Such a module would create absolutely minimal value; in fact, it may create a deficit because extra code is more code to maintain.

The resource has only four arguments:

  • name (required)
  • location (required)
  • managed_by
  • tags

This particular resource is what initially sold me on Terraform: an Azure Resource Manager template written in JSON required 40 lines just to deploy it, whereas in Terraform only the name and location must be set, making for a four-line resource definition. A module call would require at least four lines of code as well, even if we established a default location and weren't doing something silly like randomizing the name within the module.
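To make the comparison concrete, here is a sketch of the four-line resource definition alongside a call to a hypothetical wrapper module (the names and module path are illustrative, not from any real codebase):

```hcl
# The resource itself: four lines of HCL.
resource "azurerm_resource_group" "example" {
  name     = "rg-demo"
  location = "westeurope"
}

# A hypothetical wrapper module call is no shorter,
# so wrapping this resource in a module saves nothing.
module "resource_group" {
  source   = "./modules/resource-group" # hypothetical path
  name     = "rg-demo"
  location = "westeurope"
}
```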

So the question should be considered: do single-resource modules ever make sense?

There are many cases where they do. Some resources are highly complex, and establishing patterns that simplify deployment is highly valuable. Various HTTP proxying resources come to mind, including load balancers, application gateways, and API gateways. I've created modules for them in the past that ran to more than a thousand lines of code for a single resource, and they greatly simplified deployment.

However, there are more resources that shouldn’t be created as a single resource module. We often want to look for groups of resources that should be deployed together.

Modules for Complete Deployments

These should often be avoided as well. Such a pattern goes overboard on opinionation: the more opinionated a module is, the less flexible it is. One example is deploying a Palo Alto NVA in Azure from the Marketplace. Deploying it for a customer, we found that it deployed a Virtual Network, and if we wanted to deploy two NVAs, the Marketplace template would deploy a Virtual Network for each. In reality, it shouldn't deploy any Virtual Network at all, because we may want to deploy into an existing one.

Dependency inversion/injection should be used, instead.
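As a sketch of what injection looks like here, a hypothetical NVA module would accept the ID of an existing subnet rather than creating a network itself (the variable names, and `var.location`/`var.resource_group_name`, are assumed inputs, not from any real module):

```hcl
# Hypothetical NVA module: the network is injected, not created here.
variable "trust_subnet_id" {
  description = "ID of an existing subnet to attach the firewall NIC to."
  type        = string
}

resource "azurerm_network_interface" "trust" {
  name                = "nic-trust"
  location            = var.location
  resource_group_name = var.resource_group_name

  ip_configuration {
    name                          = "trust"
    subnet_id                     = var.trust_subnet_id # injected dependency
    private_ip_address_allocation = "Dynamic"
  }
}
```

The caller decides whether that subnet comes from a new Virtual Network or an existing one; the module no longer cares.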

I have also created and watched others create a module that deploys an entire hub-and-spoke architecture.

These sorts of modules do too much. Not only do they make the code less flexible, they also inflate the number of resources managed by the state. While deployment can be impressive, a failure can be equally impressive.

Keep in mind that we're focusing on highly reusable modules. The story changes if we're considering the new No-Code Provisioning Modules (which should have been named something more like Self-Service Provisioning Modules), where a full deployment would be appropriate because the code would be a composition of a number of highly reusable modules.

The Goldilocks Zone

When designing a module, the first consideration should be the module's principal resource. If that is a Virtual Network, then everything else within the module is architected around it. In such a situation, the module should deploy exactly one Virtual Network and not even offer the option to deploy several through count or for_each. If multiple networks need to be deployed, count or for_each can be used on the module call to deploy multiple instances.
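Iterating on the module call, rather than inside the module, might look like this (the module path and `var.networks` shape are hypothetical):

```hcl
# Deploy multiple networks by iterating the module call,
# not by adding count/for_each inside the module itself.
module "network" {
  source   = "./modules/network" # hypothetical path
  for_each = var.networks        # assumed: map of network definitions

  name          = each.key
  address_space = each.value.address_space
}
```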

We should then consider resources that are tightly coupled. Resources that cannot exist standalone, such as subnets, should certainly be included; the dependency runs both ways, because while a Virtual Network can be deployed without a subnet, it cannot do anything without one. Since we may need numerous subnets, we would use count or for_each on the subnet resource.
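Inside the hypothetical network module, that shape is one fixed Virtual Network with iterated subnets (variable names are illustrative):

```hcl
# Exactly one Virtual Network: the module's principal resource.
resource "azurerm_virtual_network" "this" {
  name                = var.name
  location            = var.location
  resource_group_name = var.resource_group_name
  address_space       = var.address_space
}

# Tightly coupled subnets, iterated because we may need several.
resource "azurerm_subnet" "this" {
  for_each = var.subnets # assumed: map of subnet name => address prefixes

  name                 = each.key
  resource_group_name  = var.resource_group_name
  virtual_network_name = azurerm_virtual_network.this.name
  address_prefixes     = each.value
}
```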

It can be tempting to add route tables, security groups, DDoS protection plans, and so on. However, these are not tightly coupled. Conditionally deploying them by passing a boolean that resolves to 0 or 1 for count is also tempting, but that creates additional code that may be better served by its own module and lifecycle. While a case could be made for including these, a separate module means that other code could call a dedicated module for these sorts of resources in a highly decoupled pattern, perhaps using Consul-Terraform-Sync. So if there is a separate module, we shouldn't include similar code in the network module; we should just architect the two modules to work well together.

Data-Only Modules

Data-only modules deploy no resources. They could include data sources, or consist only of variables and outputs with some manipulation in between. These can be useful for establishing naming conventions, for instance. Instead of maintaining a naming convention within each of your other modules (similar code that would need to be maintained across multiple repositories), it can be pulled out into its own repository. Including a naming convention within a module that deploys resources also increases that module's opinionation.
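A naming module of this sort can be nothing but variables, a local, and outputs. This is a hypothetical sketch; the convention and variable names are illustrative:

```hcl
# Hypothetical data-only naming module: inputs in, names out, no resources.
variable "workload" {
  type = string
}

variable "environment" {
  type = string
}

variable "location_short" {
  type = string
}

locals {
  suffix = "${var.workload}-${var.environment}-${var.location_short}"
}

output "resource_group_name" {
  value = "rg-${local.suffix}"
}

output "virtual_network_name" {
  value = "vnet-${local.suffix}"
}
```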

Another example would be using data sources to gather the resources that make up a hub. In Azure, if we need to identify a hub Virtual Network, we need to know its subscription and resource group. This gives us the ID that can be used to establish peering. Standardizing this approach can be extremely convenient.
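A minimal sketch of that lookup, assuming the provider is configured against the hub subscription and the variable names are our own invention:

```hcl
# Hypothetical data-only module that locates an existing hub network.
data "azurerm_virtual_network" "hub" {
  name                = var.hub_virtual_network_name
  resource_group_name = var.hub_resource_group_name
}

output "hub_virtual_network_id" {
  # The ID a spoke needs to establish peering with the hub.
  value = data.azurerm_virtual_network.hub.id
}
```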

Variable Naming

This is actually the topic that motivated me to write this post. I have been critical of Microsoft’s Terraform modules in the past, but I decided to look at the new Azure Verified Modules. I have been very impressed with the reasoning of Matt White. I am not sure how much of the existing set of modules has been his responsibility. However, I noticed one module where the naming was questionable, terraform-avm-res-network-virtualnetwork. The address space variable for the virtual network is named “virtual_network_address_space”.

Variables for a module's principal resource should not be prefixed with the resource's name. The module follows the pattern appropriately for the name itself, since that variable is simply called "name". Calling it "virtual_network_name" would be redundant, because the virtual network is the entire purpose of the module. Likewise, any variable that belongs to the principal resource, or is used universally within the module, should have a direct name such as "address_space" or "name".
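In a variables.tf for such a module, the direct naming looks like this (descriptions are illustrative):

```hcl
# Principal-resource variables carry direct names, with no redundant prefix.
variable "name" {
  description = "Name of the Virtual Network."
  type        = string
}

variable "address_space" { # not "virtual_network_address_space"
  description = "Address space of the Virtual Network."
  type        = list(string)
}
```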

Files

In the Files post, I discussed which files should be included. We use the same pattern here:

  • main.tf
  • outputs.tf
  • variables.tf

There is no need for any additional Terraform files within the module. By being thoughtful about the number of resources we include, we can ensure that our files will not become too voluminous (unless a single resource is massive on its own, and that cannot be split across files anyhow). I have seen many people create a separate file for each type of resource. This forces you to bounce between too many files to view your code.

Additionally, a “README.md” file should also be included and it can be generated using Terraform Docs.

With the General Availability of testing in Terraform 1.6, tests should also be created for different use cases. For instance, one use case is calling the module with all of its default values, so a "defaults" test should be created for it. Other use cases should have their own tests as well.
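A "defaults" test might look like the following sketch, assuming the network module from earlier; the file name, inputs, and resource address are hypothetical:

```hcl
# tests/defaults.tftest.hcl (hypothetical): exercises the module's defaults.
run "defaults" {
  command = plan

  variables {
    # Only the required inputs; everything else takes its default.
    name     = "vnet-test"
    location = "westeurope"
  }

  assert {
    condition     = azurerm_virtual_network.this.name == "vnet-test"
    error_message = "Virtual Network name did not match the input."
  }
}
```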

An "example.tfvars" file can also be generated by Terraform Docs and included. HashiCorp recommends "terraform.tfvars.example" as the filename, but then your linter won't recognize it as a Terraform file. Because only terraform.tfvars and *.auto.tfvars are loaded automatically, a file named "example.tfvars" won't be picked up automatically, and the linter will still identify it as Terraform.

Using a ".gitignore" file to keep state files and other local artifacts from being committed is necessary.
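A typical set of entries for a module repository (adjust to taste; these are common community conventions, not a prescribed list):

```
# State and local working files should never be committed.
.terraform/
*.tfstate
*.tfstate.backup
crash.log
terraform.tfvars
*.auto.tfvars
```

Note that "example.tfvars" deliberately does not match these patterns, since it is meant to be committed.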

Repository

Each module should have its own repository. I've seen this discussion happen numerous times. People get concerned by the number of repositories they have; this shouldn't be a concern, as repositories are free and can be automated. A module should have its own lifecycle and thus its own repository. This also allows versioning to be handled appropriately. The repositories should be named in the format "terraform-<provider>-<name>", where <provider> is the name of the principal provider used in the module (this is easy if there is only one), and <name> is the name that you want to give the module.
