Announcing HashiCorp Vault Resource Quotas
A common request from HashiCorp Vault users is a better way to protect against distributed denial of service (DDoS) attacks. With Vault 1.5, we added a new feature called Resource Quotas, which lets you protect your Vault environment's stability and keep resource consumption predictable in the face of runaway applications through the use of request rate limiting and counters.
Vault operators can now use Resource Quotas to control how applications request resources from Vault, through the use of:
- Rate Limit Quotas (All versions): Allows operators to specify request-per-second quotas. Rate limit quotas apply to every node in the Vault cluster, meaning each node maintains separate counters to enforce rate limits. If the rate limit quota is hit on any of the nodes in the Vault cluster, additional requests will be rejected for all clients with an HTTP status code of “429 Too Many Requests”.
- Lease Count Quotas (Enterprise Only): Allows operators to specify lease count quotas. If the number of leases in the cluster hits the configured quota limits, additional lease creations will be forbidden for all clients until a lease has been revoked or has expired.
Resource Quotas allow you to protect your Vault environment from misbehaving applications that might inadvertently saturate resources in the Vault cluster through high request rates. By rejecting requests over a set rate, we can maintain the overall health of Vault.
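To make the per-node behavior described above more concrete, here is a minimal sketch of a token-bucket limiter in Python. It is an illustration only and is not Vault's internal implementation: the `TokenBucket` class and the `simulate` helper are hypothetical names, and the sketch assumes requests are spread evenly across nodes.

```python
import time


class TokenBucket:
    """Illustrative per-node limiter (hypothetical; not Vault's implementation)."""

    def __init__(self, rate, now=time.monotonic):
        self.rate = float(rate)    # tokens added per second, also the burst size
        self.tokens = float(rate)  # start with a full bucket
        self.now = now             # injectable clock, useful for testing
        self.last = now()

    def allow(self):
        """Return True if the request may proceed, False for a 429-style rejection."""
        current = self.now()
        # Refill in proportion to elapsed time, capped at the bucket size.
        self.tokens = min(self.rate, self.tokens + (current - self.last) * self.rate)
        self.last = current
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def simulate(rate, requests, nodes=1):
    """Send `requests` back-to-back requests round-robin across `nodes`
    independent limiters (clock frozen) and return how many were accepted."""
    frozen = lambda: 0.0
    buckets = [TokenBucket(rate, now=frozen) for _ in range(nodes)]
    return sum(buckets[i % nodes].allow() for i in range(requests))
```

Because each node in this sketch keeps its own counter, two nodes with a rate of 500 would together admit up to 1,000 requests in the same instant. That is worth keeping in mind when choosing a rate for a multi-node cluster.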
» How Resource Quotas Work
To see how this works, let's look at an example of setting a global rate limit quota via the sys/quotas/rate-limit/<name> endpoint. We can write the desired request rate using the following command:
$ vault write sys/quotas/rate-limit/global-rate rate=500
With the rate set to 500, a client may make requests at up to 500 per second. To learn more about the available options, please see our documentation.
To verify that things are working as expected, let’s read back the “global-rate” quota with the following command:
$ vault read sys/quotas/rate-limit/global-rate
Key     Value
---     -----
name    global-rate
path    n/a
rate    500
type    rate-limit
Now, let's say you have a web application that fetches an API key from Vault every so often (not even close to our rate limit). However, the application runs into an error, gets into a strange state, and starts requesting the secret at a rate above our rate limit. Vault rejects the requests that exceed the quota. Ultimately, this protects Vault and ensures that we have a healthy cluster for everyone else, even though this application is misbehaving.
On the application side, you will see an error message that looks something like the following, where we return an HTTP status code of “429 Too Many Requests” when the rate limit is hit.
Error writing data to kv/webapp/apikey: Error making API request.
URL: PUT http://127.0.0.1:8200/v1/kv/webapp/apikey
Code: 429. Errors:
request path "kv/webapp/apikey": rate limit quota exceeded
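A well-behaved client can treat a 429 response as a signal to slow down rather than retry immediately. The following is a minimal sketch of exponential backoff with jitter in Python. The `RateLimited` exception and the `call_with_backoff` helper are hypothetical names introduced for illustration; in a real application, `request` would wrap whatever Vault client call you use and raise `RateLimited` when it sees status code 429.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical exception representing an HTTP 429 response from Vault."""


def call_with_backoff(request, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep):
    """Call `request()` and retry with exponential backoff plus jitter
    whenever it raises RateLimited. Re-raises after `max_attempts` tries."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Double the delay ceiling on each attempt, capped at max_delay,
            # then sleep for a random duration below that ceiling.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
```

The random jitter keeps many instances of the same misbehaving application from retrying in lockstep, which would otherwise hit the quota again at the same moment.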
We have many more example use cases, spanning both Open-Source and Enterprise, in our detailed Learn guide, as well as our documentation.
» Monitoring Resource Quotas
Requests that are rejected due to rate limit quota rule violations can be surfaced in a few different places. A client that makes a request and bumps up against the quota will receive an error message as demonstrated above. However, operations staff likely also want to know a client request was rejected due to rate limiting, as this might lead to service interruptions and further debugging of the situation.
Operators have several options when it comes to monitoring Resource Quotas. If audit logging of requests is enabled, you can detect when requests were rejected due to rate limit quota rule violations. Please note that requests rejected due to rate limit violations are not logged by default when audit logging is turned off. The following is an example of what an audit logged event looks like on the server side.
{
  "time": "2020-07-17T05:40:54.733026Z",
  "type": "request",
  "auth": {
    "token_type": "default"
  },
  "request": {
    "id": "f15a1a00-c4cb-d479-ed74-f91a2ec233ac",
    "operation": "update",
    "namespace": {
      "id": "root"
    },
    "path": "kv/webapp/apikey",
    "data": {
      "data": {
        "pasword": "hmac-sha256:35a720a99d99595899663838b5d2d6d9039f78ea7d7bbef2a2cfd11717c083cc"
      },
      "options": {}
    },
    "remote_address": "127.0.0.1"
  },
  "error": "request path \"kv/webapp/apikey\": rate limit quota exceeded"
}
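Entries like the one above can be summarized with a small script. The sketch below assumes the audit log is available as one JSON object per line and that rejected requests carry the "rate limit quota exceeded" text in the top-level `error` field, as in the example. The `rate_limited_paths` function is a hypothetical helper, not part of Vault.

```python
import json


def rate_limited_paths(lines):
    """Count, per request path, the audit entries whose error mentions
    a rate limit quota violation. `lines` is any iterable of strings."""
    counts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        if "rate limit quota exceeded" in (entry.get("error") or ""):
            path = (entry.get("request") or {}).get("path", "<unknown>")
            counts[path] = counts.get(path, 0) + 1
    return counts
```

For example, calling `rate_limited_paths(open(path_to_audit_log))` with the location of your own audit log file would return a dictionary mapping each throttled path to the number of rejections, which makes it easier to spot which application is misbehaving.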
Another option is to use our enhanced Resource Quota telemetry metrics to monitor, visualize, and potentially alert on these types of events. The table below outlines the newly added telemetry metrics that can be useful for monitoring.
Metric | Description | Unit | Type |
quota.rate_limit.violation | Total number of rate limit quota violations | quota | counter |
quota.lease_count.violation | Total number of lease count quota violations | quota | counter |
quota.lease_count.max | Total maximum amount of leases allowed by the lease count quota | lease | gauge |
quota.lease_count.counter | Total current amount of leases generated by the lease count quota | lease | gauge |
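As a rough illustration of how the metrics in the table could feed an alert, the sketch below derives two signals from sampled values: how full the lease count quota is, and how many new violations occurred between two samples. All function names and the 0.8 threshold are hypothetical choices for this example, and how you collect the samples depends on your telemetry setup.

```python
def lease_quota_utilization(counter, maximum):
    """Fraction of the lease count quota in use, from the
    quota.lease_count.counter and quota.lease_count.max gauges."""
    if maximum <= 0:
        return 0.0  # no usable quota value; avoid dividing by zero
    return counter / maximum


def new_violations(previous_total, current_total):
    """Violations since the previous sample of a violation counter.
    If the counter went backwards (for example after a restart),
    treat the current value as the number of new violations."""
    if current_total < previous_total:
        return current_total
    return current_total - previous_total


def should_alert(counter, maximum, previous_total, current_total, threshold=0.8):
    """Alert when the lease quota is nearly full or new violations appeared."""
    return (lease_quota_utilization(counter, maximum) >= threshold
            or new_violations(previous_total, current_total) > 0)
```

Alerting on utilization before the quota is reached gives operators time to react before clients start seeing rejected lease creations.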
For more information on Telemetry, please see our documentation.
» Next Steps
For more information on Resource Quotas, see our Learn Guide or the documentation. Also, if you enjoy playing around with this type of stuff, maybe you’d be interested in working at HashiCorp too since we’re hiring!