Structuring HashiCorp Terraform Configuration for Production
Update 2022: This blog post now includes links to best practices guides from the Terraform section of developer.hashicorp.com. The article's text remains for historical purposes; the linked guides are kept up-to-date and reflect our current recommended patterns.
When you start learning to use HashiCorp Terraform, you might start with one configuration file containing all of your infrastructure as code. As you learn more, you start to share and collaborate on those configuration files with peers or teams. Eventually, multiple team members start creating, sharing, and collaborating on the same configurations. How do you scale your Terraform configuration as your team grows? In this post, we discuss approaches to structuring your Terraform configuration for improved testing, reusability, and scalability.
We begin by breaking up Terraform configuration into modules to reduce dependencies between components. Next, we’ll discuss the use of modules to facilitate the reuse and testing of infrastructure components across multiple environments. Finally, we examine how the Terraform Registry enables the standardization and reuse of modules.
» Break Down Monolithic Configuration
Current Guide: How to Refactor Monolithic Terraform Configuration
Let’s say you have a large Terraform configuration to deploy some functions that add to and read from a message queue, all contained within a virtual network. You define all of the infrastructure components in one file, main.tf.
resource "aws_iam_role" "document_translate" {}
resource "aws_lambda_function" "document_translate" {}
resource "aws_sqs_queue" "document_translate" {}
resource "aws_subnet" "document_translate" {}
resource "aws_vpc" "document_translate" {}
As more team members collaborate on functions outside of the document translation functionality, they require their own message queues and separate network configuration. To enable reuse, you break each type of infrastructure component into its own directory so other teams can reference the configuration and customize it for their own purposes. As you construct the modules of Terraform configuration, you parameterize specific settings, such as FIFO queue options or component naming.
> tree my-company-functions
├── document-metadata
│   └── main.tf
├── document-translate
│   └── main.tf
└── modules
    ├── function
    │   ├── main.tf // contains aws_iam_role, aws_lambda_function
    │   ├── outputs.tf
    │   └── variables.tf
    ├── queue
    │   ├── main.tf // contains aws_sqs_queue
    │   ├── outputs.tf
    │   └── variables.tf
    └── vnet
        ├── main.tf // contains aws_vpc, aws_subnet
        ├── outputs.tf
        └── variables.tf
This file structure takes advantage of Terraform modules. With the configuration organized into module subdirectories, you can test each piece individually and reuse it for new functions.
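To make the module interfaces concrete, here is a minimal sketch of what the queue module’s files might contain. The variable subset and the resource name this are assumptions for illustration, not taken from the original configuration.

# Hypothetical modules/queue/variables.tf: the parameterized inputs
# that consumers customize when reusing the module.
variable "name" {
  description = "Name of the SQS queue"
  type        = string
}

variable "delay_seconds" {
  description = "Seconds to delay delivery of new messages"
  type        = number
  default     = 0
}

# Hypothetical modules/queue/main.tf: the resource the module wraps.
resource "aws_sqs_queue" "this" {
  name          = var.name
  delay_seconds = var.delay_seconds
}

Each module exposes its interface through variables.tf and publishes identifiers for other components through outputs.tf.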
Reference the modules for the example by creating a main.tf file for the document-translate function. Set each module source to the locally defined modules in the modules directory.
module "vnet" {
source = "../modules/vnet"
cidr_block = "10.0.0.0/16"
}
module "queue" {
source = "../modules/queue"
name = "terraform-example-queue"
delay_seconds = 90
max_message_size = 2048
message_retention_seconds = 86400
receive_wait_time_seconds = 10
}
module "function" {
source = "../modules/function"
filename = "lambda_function_payload.zip"
function_name = "lambda_function_name"
role = aws_iam_role.iam_for_lambda.arn
handler = "exports.test"
source_code_hash = filebase64sha256("lambda_function_payload.zip")
runtime = "nodejs8.10"
}
The main.tf file represents the infrastructure configuration for the document-translate function, complete with module imports and required variables for each of the modules. Before applying the configuration, run terraform init to retrieve the modules at the local file paths.
» Considerations for Defining Modules
Current Guide: Module Creation - Recommended Pattern
When breaking down Terraform configuration, you can divide it into modules based on blast radius, rate of change, scope of responsibility, and ease of management. In the previous example, you create a function module with the aws_lambda_function and the aws_iam_role associated with the function. You include the IAM role definition as part of the function module because its scope of responsibility is limited to the function. A function can change frequently, especially as you update and redeploy code. As a result, you want to separate it from the network configuration, which may be fairly static. Similarly, changes to the Lambda function have a smaller blast radius than changes to the network configuration. A change to the network configuration may destroy and recreate the network resource entirely, which affects any resources hosted on the network.
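To make that boundary concrete, here is a minimal sketch of what modules/function/main.tf might look like. The resource names, the assume-role policy, and the variables (assumed to be declared in the module’s variables.tf) are illustrations, not taken from the original post.

# Hypothetical modules/function/main.tf: the IAM role lives alongside
# the Lambda function because its scope is limited to the function.
data "aws_iam_policy_document" "assume" {
  statement {
    actions = ["sts:AssumeRole"]
    principals {
      type        = "Service"
      identifiers = ["lambda.amazonaws.com"]
    }
  }
}

resource "aws_iam_role" "this" {
  name               = "${var.function_name}-role"
  assume_role_policy = data.aws_iam_policy_document.assume.json
}

resource "aws_lambda_function" "this" {
  filename         = var.filename
  function_name    = var.function_name
  role             = aws_iam_role.this.arn
  handler          = var.handler
  source_code_hash = var.source_code_hash
  runtime          = var.runtime
}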
In addition to module resources, determine which input parameters will be necessary for module configuration. In the example, you decide that the queue can only be standard and never FIFO, so you do not include an input to enable a FIFO queue. Use module inputs to pass information from other modules, such as subnet and security group identifiers.
module "vnet" {
source = "../modules/vnet"
cidr_block = "10.0.0.0/16"
}
module "queue" {
source = "../modules/queue"
name = "terraform-example-queue"
// omitted for clarity
vpc_config {
subnet_ids = module.vnet.subnet_ids
security_group_ids = module.vnet.security_group_ids
}
}
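For those references to resolve, the vnet module must declare matching outputs. A minimal sketch of modules/vnet/outputs.tf follows; the output values, and the security group itself, are assumptions, since the original module only lists a VPC and subnet.

# Hypothetical modules/vnet/outputs.tf: publish the identifiers
# that other modules consume as inputs.
output "subnet_ids" {
  description = "IDs of the subnets created by this module"
  value       = [aws_subnet.document_translate.id]
}

output "security_group_ids" {
  description = "IDs of the security groups created by this module"
  # assumes the module also defines an aws_security_group resource
  value       = [aws_security_group.this.id]
}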
Alternatively, you can use data sources to dynamically discover infrastructure components within a module. For example, you decide to transfer ownership of the vnet module to the networking team so they can manage IP address spacing for your function. After the networking team creates the virtual network, you dynamically reference the VPC they’ve created by leveraging the aws_vpcs data source and searching for the tags related to your function.
data "aws_vpcs" "foo" {
tags = {
function = "document-translate"
}
}
The data source retrieves a list of VPCs based on the tag identifying the purpose of the VPC, in this case, for the document-translate function. For more details regarding the additional patterns to break down monolithic Terraform projects, refer to this article.
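The data source exposes the matching VPC IDs through its ids attribute. As a minimal sketch of consuming the result (the aws_subnet_ids lookup and the subnet_ids module input are assumptions for illustration):

# Look up the subnets of the first VPC that matched the tag filter.
data "aws_subnet_ids" "selected" {
  vpc_id = tolist(data.aws_vpcs.foo.ids)[0]
}

module "function" {
  source     = "../modules/function"
  # hypothetical input wiring the discovered subnets into the module
  subnet_ids = data.aws_subnet_ids.selected.ids
  # remaining inputs omitted for clarity
}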
» Separate Configurations for Environments
In addition to modularizing common Terraform configuration for reuse, you often manage multiple environments such as staging or production. A good practice for doing this is to separate each environment into its own folder or even an entirely separate repository. Refer to the Terraform Recommended Practices documentation for additional information.
> tree my-company-functions
├── modules
├── prod
│   ├── document-metadata
│   │   └── main.tf
│   └── document-translate
│       └── main.tf
└── staging
    ├── document-metadata
    │   └── main.tf
    └── document-translate
        └── main.tf
While you have some duplication with a folder for each environment, you gain a few benefits for scalability and availability. First, each environment maintains a separate state: you go into each environment’s directory and run terraform init separately. (The Terraform CLI’s terraform workspace command offers an alternative way to manage multiple states from a single configuration.) This separation provides important isolation to protect a production environment from any kind of experimentation done in staging.
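For example, initializing and planning the staging copy of the document-translate configuration looks like this, following the tree above:

> cd staging/document-translate
> terraform init
> terraform plan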
Second, you can selectively import modules for each environment. For example, some environments may require their own queue, while others may use shared queues to reduce infrastructure costs during testing, as sketched below. This allows granular control over each environment and encourages repeatability of environment configuration as you add environments in the future.
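As a minimal sketch of that selective reuse, a staging configuration might look up a pre-existing shared queue (the name staging-shared-queue is an assumption) instead of instantiating the queue module:

# Hypothetical staging/document-translate/main.tf: reuse a shared
# queue rather than creating a dedicated one for this function.
data "aws_sqs_queue" "shared" {
  name = "staging-shared-queue"
}

module "function" {
  source = "../../modules/function"
  # remaining inputs omitted; the function reads from the shared queue
}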
» Sharing Modules
In addition to defining module sources with local paths, you can also use remote sources such as version control systems or object storage endpoints. This approach enables you to version the modules.
Recall that in the example, the networking team takes ownership of the virtual network configuration. They decide to host it in GitLab, enabling separate functional testing and security hardening of the network configuration. The latest version they’ve released that passes functional and security testing is tagged with the git reference v3.0.0.
In order to consume their network module in GitLab, update the main.tf for production to point to the new module source and the version v3.0.0.
module "vnet" {
source = "git::https://git.mycompany.com/vnet.git?ref=v3.0.0"
cidr_block = "10.0.0.0/16"
}
After refactoring to use the networking team’s virtual network module, run terraform init to retrieve the remote modules. You must run terraform init or terraform get to install remote modules.
By managing, hosting, and versioning modules separately at a remote endpoint, teams can share, test, and harden their Terraform modules before releasing them for general use across an organization. Backwards-incompatible changes can be communicated before consuming teams update their infrastructure configuration with a new version of the module.
» Scaling Module Installation & Maintenance
You refactor your infrastructure configuration to use modules and begin to consume modules created by various teams. However, after hundreds of module references, you realize that it takes time not only to retrieve and install the modules but also to track each module’s updates, available versions, inputs, and outputs. As a first workaround, you can retrieve all of the modules as git submodules and refactor your Terraform configuration to reference them by local path. This caches all of the modules locally for use while keeping each module maintained in its own repository.
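For instance, using the networking team’s GitLab repository from earlier, the submodule workaround might look like this sketch, with the submodule replacing the locally maintained copy of the module:

> git submodule add https://git.mycompany.com/vnet.git modules/vnet
> git submodule update --init

The consuming configuration can then reference the submodule with a local path such as source = "../modules/vnet", as with any local module.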
To solve both the slow installation and the additional maintenance burden of Terraform modules, use the publicly available Terraform Registry or the organization-scoped Terraform Cloud Private Module Registry to store and manage your modules. Modules import quickly from either registry, and you can utilize publicly available, community-developed modules or share your own modules within your organization using Terraform Cloud.
Refactor the main.tf to import the verified AWS VPC module from the Terraform Registry. Include the required module inputs listed in its documentation.
module "vnet" {
source = "terraform-aws-modules/vpc/aws"
version = "2.31.0"
# insert the 12 required variables here
}
Pin the version of the module, as its inputs may change and may not be backwards compatible. In this example, the networking team may want you to use their vnet module instead of the publicly available Terraform modules. Similar to the publicly available modules, the networking team publishes the vnet module to Terraform Cloud and versions it for your consumption. Reference the module source and version in your main.tf with the details pointing to the Terraform Cloud organization, in this case, my-company.
module "vnet" {
source = "app.terraform.io/my-company/vnet/aws"
version = "3.0.0"
cidr_block = "10.0.0.0/16"
}
» Conclusion
By organizing our Terraform configuration into modules and separating our environments into distinct folders, we achieve state isolation between environments and reuse our modules where they’re needed. In addition, we can easily consume and share modules with the Terraform Registry or Terraform Cloud Private Module Registry. As a result, our Terraform modules define and promote a common architecture for infrastructure across the organization.
To get started with Terraform Cloud, sign up at app.terraform.io. For additional details on modules, check out the modules track on HashiCorp Learn.
For questions about the content covered in this post, refer to our community forum.