HashiCorp Nomad Meets the 2 Million Container Challenge
HashiCorp Nomad scheduled 2,000,000 Docker containers on 6,100 hosts in 10 AWS regions in 22 minutes.
Today, HashiCorp Nomad hit a major milestone by announcing the general availability of version 1.0. In parallel with this significant release, the team is also announcing that Nomad completed the 2 Million Container Challenge.
» Why the 2 Million Container Challenge
Airplane wings are built to be flexible so they can withstand the worst weather and impromptu bouts of turbulence. Do you know how much they can bend before they break? Boeing performed a wing bending test of the 787 Dreamliner a few years ago. During the test, the wings were flexed upward approximately 25 feet, which equates to 150 percent of the most extreme forces the airplane is ever expected to encounter. In the real world, no passenger is likely to run into such conditions, but aircraft makers need these tests to demonstrate the safety margin of the design and certify that the airplane can withstand extreme forces.
The same goes for container deployment and orchestrators. Nomad, as a simple and flexible orchestrator, is built for ease of deployment and consistent performance at any scale. For most customers, 2 million containers may seem like an excessive number. Regardless of a customer's current scale, however, we want to test and certify that Nomad can robustly handle a hundred or a thousand times the expected load as customers grow their business. Nomad 1.0 signifies product maturity and stability, and this new benchmark demonstrates the scalability that lets customers confidently scale up their Nomad deployments, even under the most extreme requirements.
» The Results
The test is designed to measure Nomad’s scheduling throughput under extremely high pressure with relatively few schedulers (known as Nomad Servers in product terminology). In partnership with the AWS Spot team, we used 3 Nomad schedulers to deploy 2 million Docker containers in 22 minutes across 10 AWS regions globally, at an average rate of nearly 1,500 containers per second. The above graph demonstrates that Nomad’s scheduling performance is nearly linear: the number of containers already placed does not negatively affect the placement of future containers.
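As a rough sanity check, the headline averages can be recomputed from the figures quoted in this post. The snippet below is only a back-of-the-envelope sketch, not part of the benchmark tooling. The variable names are illustrative, the 22-minute duration is a rounded figure, and the per-host density assumes containers were spread evenly over all 6,100 hosts, which the post does not state.

```python
# Figures quoted in this post. The duration is rounded to whole minutes,
# so the computed rate is only an approximation of the measured one.
containers = 2_000_000
hosts = 6_100
minutes = 22

seconds = minutes * 60

# Average scheduling rate over the whole run.
rate_per_second = containers / seconds

# Average density, ASSUMING an even spread over every host (illustrative only).
containers_per_host = containers / hosts

print(f"average rate: {rate_per_second:.0f} containers/second")
print(f"average density: {containers_per_host:.0f} containers/host")
```

With the rounded 22-minute duration, the rate comes out at about 1,515 containers per second, in line with the roughly 1,500 per second reported above; the small gap is consistent with the actual run lasting slightly longer than 22 minutes exactly.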
» From 1 Million to 2 Million
The 2 Million Container Challenge was inspired by our previous container scheduling benchmark, the Million Container Challenge, which we ran using Nomad 0.3.1 back in 2016. Over the past few years, container adoption has grown rapidly in the enterprise. Today, organizations serving customers globally or regionally are building multi-cluster, multi-region, and multi-cloud architectures to keep their applications available, responsive, and high-performing no matter where the end user lives. We evolved our benchmark by scaling both the load and the geographical span. This lets us focus on Nomad’s global scalability as well as its raw scheduling throughput, which together make the results more relevant to how infrastructure is deployed today.
The diagram below illustrates the deployment scale of the test: the 3 Nomad servers run in the us-east-1 region in Northern Virginia, and more than 6,000 Nomad clients are distributed across the globe, together forming a single cluster.
In addition, all of the containers ran on AWS Spot instances. Total costs for this run were reduced by 68% in aggregate using popular instance types and sizes in a mix of day and night locales.
Learn more about the detailed test setup and performance analysis here.
» Conclusion
The 2 Million Container Challenge is a public showcase of our approach and commitment to creating software designed to scale. We will continue to collaborate with innovative technology partners on cutting-edge research to push the performance of Nomad further.