Nomad Bench: Load testing and benchmarking for Nomad
Nomad Bench provides reusable infrastructure tooling, so you can load test and experiment with Nomad clusters running at scale.
HashiCorp Nomad is simple to deploy and highly scalable, but exactly how scalable? Production clusters reach 10,000 clients and beyond, but reproducing bugs or testing load characteristics at this scale is challenging and costly. The one million and two million container challenges provide an impressive baseline for scheduling performance, but they were not intended to create a realistic scenario for constant experimentation.
The nomad-bench project set out to create reusable infrastructure automation that runs test scenarios and collects metrics and data from Nomad clusters running at scale. The core goal of this effort is to create reproducible, large-scale test scenarios so users can better understand how Nomad scales and uncover problems that surface only in large cluster deployments.
» nomad-bench components
The nomad-bench infrastructure consists of two main components. The core cluster is a long-lived, production-ready core Nomad cluster used to run base services and drive test cases. One important service running in the core cluster is an InfluxDB instance that collects real-time data from test runs.
The test cluster is a short-lived, ephemeral cluster running Nomad servers on Amazon EC2 instances and Nomad clients using nomad-nodesim, allowing clusters to scale to tens of thousands of nodes. Each test cluster can have a different number of servers, EC2 instance type, disk performance, and operating system. Test clusters may also be configured with a custom Nomad binary to easily test and compare code changes.
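To give a feel for how these knobs might be expressed, the sketch below uses Terraform-style HCL variables. The variable names here are hypothetical illustrations of the shape of a test cluster definition, not the repository's actual interface.

variable "server_count" {
  description = "Number of Nomad servers in the test cluster. (Hypothetical name.)"
  type        = number
  default     = 3
}

variable "server_instance_type" {
  description = "EC2 instance type used for the Nomad servers. (Hypothetical name.)"
  type        = string
  default     = "t3.medium"
}

variable "nomad_binary_path" {
  description = "Optional path to a custom Nomad binary, for comparing code changes. (Hypothetical name.)"
  type        = string
  default     = ""
}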
» Data collection
To collect and analyze data from tests, each cluster has an associated InfluxDB bucket to isolate its data. InfluxDB also allows for real-time data analysis to monitor test progress via dashboards. Data is collected using Telegraf daemons deployed on all test cluster servers. In addition to Nomad metrics and logs, these daemons collect system metrics, such as CPU, memory, and disk IO.
We chose InfluxDB over Prometheus, Grafana, and other tools due to its ability to easily load existing data via the plain-text line protocol format, allow data isolation in buckets, and deploy as a single binary.
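For reference, the line protocol encodes one data point per line as a measurement, optional tags, fields, and a nanosecond timestamp, which is what makes bulk-loading previously captured data straightforward. The metric and tag names below are illustrative, not taken from the project:

# measurement,tag=value field=value unix-nanosecond-timestamp
nomad.client.update_status,host=server-1,cluster=test-01 count=42 1718035200000000000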
» nomad-nodesim
nomad-nodesim is a lightweight, virtualized Nomad client wrapper that can simulate and run hundreds of processes per application instance. This makes it possible to simulate and register tens of thousands of Nomad clients in a single test cluster without having to stand up tens of thousands of real hosts. The clients can have different configurations, such as being partitioned into different datacenters or node pools, or holding different metadata values.
These nomad-nodesim processes are deployed to the core Nomad cluster so they run within the same private network. Each test cluster has its own nomad-nodesim job that can be customized for the scenario being tested, as sketched below.
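As a minimal sketch, such a job might run nomad-nodesim under Nomad's exec driver. The server address, flag names, and counts below are hypothetical placeholders to show the overall shape, not the project's actual flags:

job "nodesim-test-01" {
  datacenters = ["core"]

  group "nodesim" {
    # Each allocation hosts many simulated Nomad clients.
    count = 10

    task "nodesim" {
      driver = "exec"

      config {
        command = "nomad-nodesim"
        # Hypothetical flags: point the simulated clients at the test
        # cluster's servers and place them in a dedicated node pool.
        args = ["-server-addr=10.0.0.10:4647", "-node-num=1000", "-node-pool=bench"]
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}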
The application simplifies deployment and eases the financial headache of running Nomad clients at large scale. The calculations below roughly estimate the cost of running 3 Nomad servers and 10,000 Nomad clients for a month.
- Without nomad-nodesim: 3 x t3.medium and 10,000 x t3.nano = $25,617 (EC2 only)
- With nomad-nodesim: 13 x t3.medium = $268 (EC2 only)
» Validation
To have confidence in running Nomad server load and stress tests using nomad-nodesim, the team needed to validate that nomad-nodesim clients closely mimic “real” clients. To do this, we ran two experiment variations: one with real Nomad clients running on dedicated hosts, the other with nodesim clients. Both ran three dedicated hosts for the Nomad server process and five clients.
Each experiment ran through a simple set of steps:
- Register a job using the mock driver with a single task group whose count is 100
- Update the task group resources to force a destructive update
- Deregister the job
The job specification that was initially registered is detailed below. It used an HCL variable to control the task memory resource assignment, which made scripting the updates easier, as no manipulation of the specification was required.
variable "resource_memory" { default = 10 }
job "mock" {
update {
max_parallel = 25
}
group "mock" {
count = 100
task "mock" {
driver = "mock_driver"
config {
run_for = "24h"
}
resources {
cpu = 1
memory = var.resource_memory
}
}
}
The script below is used to run the experiment in a controlled and repeatable manner, pausing after each step to allow system stabilization.
#!/usr/bin/env bash
set -e -x

# Allow the cluster to settle before starting the experiment.
sleep 90

# Register the initial job with 100 allocations.
nomad run mock.nomad.hcl
sleep 90

# Bump the memory resource to force a destructive update of every allocation.
nomad run -var='resource_memory=11' mock.nomad.hcl
sleep 90

# Deregister the job, then garbage collect to clean up state.
nomad stop mock
sleep 90
nomad system gc
Here are the resulting count(nomad.client.update_status) charts:

And the resulting count(nomad.client.update_alloc) charts:

And the count(nomad.client.get_client_allocs) chart:
» nomad-bench results and next steps
With this validation experiment, we confirmed that the nomad-nodesim application performs similarly to real Nomad clients. It notifies Nomad servers of allocation updates slightly faster because it does not have a task-runner or driver implementation. In some cases, this may also cause updates to be batched slightly more efficiently than for real clients.
To account for this minor difference, and to allow for more flexible testing, we added configuration functionality within PR #24 to allow running nomad-nodesim with either a simulated or a real allocation runner implementation.
» Try it yourself
The Nomad benchmarking infrastructure and the nomad-nodesim application provide an excellent base for running repeatable, large-scale tests. They allow engineers and users to test Nomad at scale and iterate on changes to identify throughput improvements. The Nomad engineering team uses this setup to run a persistent cluster for soak testing and short-lived clusters to test code changes.
If you want to check out and run the nomad-bench infrastructure suite, you can do so using the publicly available repository. Instructions on how to get started are included, and all feedback is welcome.