Controlling Costs with Terraform Enterprise
Terraform is a tool for provisioning and managing infrastructure, particularly useful for multi-cloud infrastructure deployments. Many organizations adopting a cloud operating model use Terraform to provision infrastructure. One approach organizations take is a “lift and shift” of existing on-premises deployments into the cloud. However, this strategy doesn’t take into account how organizations can leverage the unique cost-saving benefits of the cloud, such as ephemeral workloads or auto-scaling machines that adapt to demand. Terraform has features including modules, Sentinel policies, and automated policy enforcement that organizations can use to take advantage of these benefits and optimize cost savings and management.
» Codification Through Modules
To begin controlling costs, organizations can codify infrastructure components into modules. Modules are packaged Terraform configurations that can be quickly reused and customized through variables. Modules facilitate cost savings by allowing organizations to build operational best practices into reusable templates. This allows operators provisioning infrastructure to work quickly without sacrificing production quality. Through these codified best practices, organizations control cloud spend and prevent costly and unwieldy non-standardized solutions.
A look at modules in the Terraform Enterprise UI.
» Cost Sensitive Policy as Code
Sentinel is a policy as code framework that allows users to codify guardrails. In the context of Terraform Enterprise, Sentinel enables organizations to codify guardrails around infrastructure provisioning. You might think of software applications like factories and infrastructure costs like utility costs in those factories. As the factories get more complex, it becomes difficult to manage and understand the utility costs, so Sentinel helps organizations attack that challenge in a programmatic, easy-to-administer fashion. By writing cost-sensitive Sentinel policies, organizations are able to better manage costs even as applications become more complex. There are many cost-sensitive axes one could write Sentinel policies for, but the main ones we’ve identified are machine size and machine lifespan.
» Limiting Machine Size
Businesses adopting the cloud frequently run into “cloud waste,” or unused cloud resources. Writing Sentinel policies that control machine size is a common approach to preventing cloud waste and optimizing costs. To write these policies, you create a list of allowed machine types and write a policy that prevents machine types outside of that list from being provisioned. Here is an example limiting machine size in Google Cloud:
import "tfplan"

# Machine types that operators are allowed to provision.
allowed_machine_types = [
  "n1-standard-1",
  "n1-standard-2",
  "n1-standard-4",
  "n1-standard-8",
]

# Every planned google_compute_instance must use an allowed machine type.
main = rule {
  all tfplan.resources.google_compute_instance as _, instances {
    all instances as _, r {
      r.applied.machine_type in allowed_machine_types
    }
  }
}
You can see that only the four listed machine types (n1-standard-1 through n1-standard-8) can be provisioned. Anything larger, such as n1-standard-16, or any machine type missing from the list fails the policy.
» Environment Right-Sizing
Terraform and Sentinel can also help save costs by controlling what infrastructure is used for development and test environments. Development and test environments often use the same infrastructure as production ones while bearing much lighter or more infrequent loads. There are a few ways to right-size them. One possibility is tagging resources as “prod”, “dev”, or “test” and then creating a Sentinel policy like the one above that limits machine size based on those tags. Another option is to create a “promotion” variable in your Terraform configuration that corresponds to the ideal set of infrastructure for that environment. For example, setting that variable to “prod” would correspond to a map of large, high-performing compute and storage, while setting it to “dev” would correspond to smaller, lower-performing ones.
import "tfplan"

main = rule {
  all tfplan.resources.aws_instance as _, instances {
    all instances as _, r {
      (length(r.applied.tags) else 0) > 0
    }
  }
}
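This policy requires every AWS instance (aws_instance resource) in the plan to carry at least one tag, which is a prerequisite for any tag-based sizing rule. Note that it checks only aws_instance resources, not other AWS resource types.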
» Limiting Machine Lifespan
Another cost-saving strategy is limiting the lifespan of machines by ensuring that unused machines are shut down. A basic method for achieving that result is to set a TTL (time to live) variable for your Terraform configurations and then set up a mechanism to check those TTLs and queue destroys of configurations that have expired. We have two resources that go into more depth on this strategy. The first is an open source reaper bot, which is available here. The second is this guide, which walks through how to get the same reaper bot functionality using AWS Lambda functions.
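To make the TTL check concrete, here is a minimal Python sketch of the expiry logic only. It is not the reaper bot's actual code. The workspace dictionaries and their keys (name, created_at, ttl_hours) are hypothetical stand-ins for whatever your own tooling records, and queuing the destroy itself is left out.

```python
from datetime import datetime, timedelta, timezone


def is_expired(created_at, ttl_hours, now):
    """Return True once created_at + TTL has passed."""
    return now >= created_at + timedelta(hours=ttl_hours)


def workspaces_to_destroy(workspaces, now=None):
    """Return the names of workspaces whose TTL has expired.

    Each workspace is a dict with hypothetical keys:
      "name"       - workspace name
      "created_at" - timezone-aware datetime of creation
      "ttl_hours"  - TTL in hours, or None when no TTL is set
    """
    now = now or datetime.now(timezone.utc)
    expired = []
    for ws in workspaces:
        ttl = ws.get("ttl_hours")
        if ttl is None:
            continue  # no TTL set: never reap automatically
        if is_expired(ws["created_at"], ttl, now):
            expired.append(ws["name"])
    return expired


if __name__ == "__main__":
    example = [
        {
            "name": "dev-sandbox",
            "created_at": datetime(2024, 1, 1, tzinfo=timezone.utc),
            "ttl_hours": 24,
        },
        {
            "name": "prod",
            "created_at": datetime(2023, 6, 1, tzinfo=timezone.utc),
            "ttl_hours": None,
        },
    ]
    print(workspaces_to_destroy(example))
```

Skipping workspaces without a TTL is a deliberate choice in this sketch: long-lived environments such as production are never reaped unless someone opts them in.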
» Automated Workflows
On top of codified infrastructure and enforcement of cost-sensitive policies, Terraform provides for automation that removes manual effort and reduces risk. Terraform Enterprise can be called via API within a deployment pipeline, and the guardrails of Sentinel policy as code are automatically applied to applicable infrastructure. This automation means fewer operators focused on infrastructure and more operators working on an organization's core business.
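As a rough illustration of calling Terraform Enterprise from a pipeline, the Python sketch below builds a request that queues a run through the Runs API (POST /api/v2/runs). The hostname, token, and workspace ID are placeholders, and build_run_request is a hypothetical helper name. Check the payload shape against the API documentation for your Terraform Enterprise version before relying on it.

```python
import json
import urllib.request


def build_run_request(base_url, token, workspace_id, message, is_destroy=False):
    """Build (but do not send) a request that queues a run in a workspace.

    Policies attached to the workspace are evaluated as part of the run,
    so the pipeline does not need to invoke Sentinel separately.
    """
    payload = {
        "data": {
            "type": "runs",
            "attributes": {
                "message": message,
                "is-destroy": is_destroy,  # True queues a destroy run
            },
            "relationships": {
                "workspace": {
                    "data": {"type": "workspaces", "id": workspace_id}
                }
            },
        }
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v2/runs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/vnd.api+json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values: substitute your own hostname, token, and workspace ID.
    req = build_run_request(
        "https://tfe.example.com", "REPLACE_ME", "ws-EXAMPLE", "Queued from CI"
    )
    print(req.get_method(), req.full_url)
    # To actually queue the run: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch easy to test in a pipeline without network access. The same helper with is_destroy=True could serve a TTL-based cleanup job.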
» Conclusion
Terraform Enterprise users can realize cost savings by ensuring provisioned machines are right-sized for their loads, creating maps of infrastructure sizes fit for their respective environments, and destroying unused resources. For more help getting started with Sentinel, review our guide, docs, and example policies.