Secret remediation best practices

Finding insecure secrets in your environment before they lead to downtime or breach is critical, but so is establishing best practices for remediating the problem.

Aug 27 2024, David Mills

Everyone has secrets, whether they know it or not. It’s all too common these days for organizations to discover they have secrets or sensitive data living in insecure locations. Secrets might be exposed in code repositories, collaboration platforms, or even messages. Because of this reality, removing hard-coded secrets from application code and other insecure locations should be at the top of your list of rules and policies.

This blog post provides a standard framework and best practices to help quickly and securely remediate issues with insecure secrets and sensitive data. It also includes a detailed example of remediating an insecure secret with HCP Vault Radar using these best practices.

»Common challenges and solutions in secrets management

The common challenges in secrets management stem from the fact that SecOps teams often have little control when it comes to secret usage. Developers are the ones using secrets in code to connect across systems, and it is often platform teams that manage the systems that generate those secrets. But developers are typically judged on how quickly they can code new features, and managing secrets securely may not be fast or easy. For the sake of expediency, secrets and sensitive data aren’t always managed according to established rules and policies.

Modern cloud security practices encourage organizations to closely audit code and require developers to manage high-value secrets using secret management vaults, rather than simply embedding them in code. That lets applications access secrets in separate files that can be isolated from the application code. They are then accessed only when needed during runtime and left out of code builds.

Finally, to avoid inadvertent leaks during builds, development teams should establish policies and processes that require developers to inform those who write and maintain configuration files about the location of any secrets. That way, when appropriate, they can add those files to a list that will exclude them from build packages.

»Secret discovery and remediation

In larger organizations, SecOps teams are responsible for establishing and enforcing these policies. The challenge is enabling and encouraging the affected developer teams to successfully follow these policies and processes. In order for developer teams to successfully meet security requirements, they need the right tools and guidelines.

»Discovery and notification

Before they choose the right tool, teams need a framework for their desired approach to secret discovery and remediation. A standard, best-practice approach starts with a discovery and notification workflow. The process can start with a manual or automated environment scan at regular intervals:

Scan (Reactive discovery): The SecOps team scans an organization’s environments for unsecured secrets and sensitive data. They find a critical or high-severity event in the scan results.
- Investigate: When triaging this event, the SecOps team member sees information such as the person who committed the secret, where it is located, and the severity of the issue. This contextual information will include steps required to appropriately remediate the risk.
- Assign: Using this information, the event can be assigned to the appropriate owner, such as the developer or repository owner, for review and remediation.

Alternatively or additionally, teams can set up targeted scans to happen during certain events in various areas of activity:

Monitor (Proactive discovery): The team implements automated scanning that detects secrets as they are introduced into version control repositories. The developer team receives notifications from these scanning tools so they can prevent secrets and sensitive data from being introduced into unprotected locations. For example:
- Prevent: Guardrail systems can stop version control merges or provisioning before an unsecured secret is introduced.
- Alert: A high-severity event triggers a PagerDuty notification to immediately alert the team if a secret is introduced into an unprotected location.
- Track: A Jira issue is automatically created to track, prioritize, and triage the event.
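
As a rough illustration of the Prevent guardrail, the sketch below is a shell function that a pre-commit or CI hook could call to reject text containing something shaped like an AWS access key ID. The function names `contains_aws_key_id` and `check_change` and the single pattern are assumptions for illustration; a dedicated scanner such as HCP Vault Radar covers far more secret types and contexts.

```shell
# Hypothetical guardrail: succeed (exit 0) if stdin contains text shaped like
# an AWS access key ID ("AKIA" followed by 16 uppercase letters or digits).
contains_aws_key_id() {
  grep -Eq 'AKIA[0-9A-Z]{16}'
}

# Example hook logic: block the change when a key-shaped string is present.
# $1 is the text of the change (for instance, a diff).
check_change() {
  if printf '%s\n' "$1" | contains_aws_key_id; then
    echo "blocked: possible AWS access key ID found"
    return 1
  fi
  echo "ok"
}
```

A hook could pass the staged diff to `check_change` and stop the merge when it returns a non-zero status. Pattern matching like this only catches one well-known key shape, so treat it as a first line of defense rather than a complete scan.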

»Remediation workflows

The main goals of a remediation workflow are rapid response and containment. Consistently applying best practices to remediation can reduce mean time to detect, mean time to remediate, and the number of incidents per month. When working to properly contain secrets and sensitive data, keep these five key tasks in mind:

Assess: Determine the impact of changing the secret before you act (see step 4, "Revoke the secret", in the example workflow later in this post).
Revoke: Exposed secrets should be immediately revoked. The secret must be deactivated as quickly as possible, then systems need to be in place to monitor the status of the revoked secret.
Rotate: A new secret must be quickly created and implemented. This task is best enabled by an automated process in your workflow for quick turnaround, low rate of implementation errors, and least-privilege access.
Delete: Revoked or rotated secrets must be immediately removed from the exposed system, application code, logs, or other unprotected locations. Be aware that secrets in code could have a commit history, so avoid breaking links to other commits with history rewrites. Also, you need a process for removing secrets in logs while maintaining log integrity.
Track: Incident response teams need access to information about the lifecycle of a secret to aid in containment and remediation via log files, including who had access, when they used it, and the last rotation or update.
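
To make a metric such as mean time to remediate concrete, here is a small sketch that averages the gap between detection and remediation. The input format (one incident per line, with detection and remediation times as epoch seconds) and the function name `mean_time_to_remediate` are assumptions for illustration, not the output of any particular tool.

```shell
# Hypothetical input on stdin: one incident per line,
# "<detected_epoch_seconds> <remediated_epoch_seconds>".
# Prints the mean time to remediate, truncated to whole minutes.
mean_time_to_remediate() {
  awk '
    NF == 2 { total += $2 - $1; count++ }
    END {
      if (count == 0) { print "no incidents"; exit }
      printf "%d\n", total / count / 60
    }
  '
}

# Example: three incidents remediated after 30, 60, and 90 minutes.
printf '%s\n' "1000 2800" "5000 8600" "9000 14400" | mean_time_to_remediate
```

Tracking this number over time is one way to check whether the remediation workflow is actually getting faster.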

There are two main categories of remediation: proactive and reactive.

»Proactive remediation

Proactive remediation workflows typically involve automated scans that detect unsecured secrets or sensitive data before they get pushed into environments. An example of this would be implementing automated scanning for tip-of-branch and pull-request merge operations to stop secrets from being pushed into builds. In this case, contextual information would be provided in GitHub if that’s where the scan is running.

»Reactive remediation

Reactive remediation workflows, on the other hand, are required upon discovery of secrets that have already been released into insecure environments. An example would be revoking the secret from all locations, either manually or with a product workflow.

»Example workflow

Appropriate remediation is dependent on many factors, such as the type of secret, where it was found, the intended user, the criticality of the secret, and having context-specific guidance from your tools. A clear system of ownership for secrets remediation is key to a smooth workflow.

This is a good, general example of a secret remediation process for a developer team:

There’s a clear system of ownership and coordination. The owning developer receives timely notification of a secret discovered in their repo or environment, either automatically by tool integration or via a hand-over from SecOps after triage.
The notification contains contextual information about the secret, such as severity, location, time of exposure, and other details. This information can include the exact location of that secret, who introduced it, whether it’s still active, and the length of time between when it was exposed and when it was discovered.
Based on the details of the event, the developer receives customized steps for appropriately remediating the issue. In many cases, the remediation solution is to immediately remove the exposed secret and then rotate and store the secret in an approved secure secrets management system, such as HashiCorp Vault.

It’s important to note that remediation is not “one size fits all”, so having all the details prior to remediation is critical. For example, revoking secrets without the appropriate visibility can lead to broken systems and disruption of service. The team that uses a secret may not be the one that has access to the system that generates it, so coordination is required for successful remediation of unmanaged secrets.

»Choosing the right tool

HCP Vault Radar is one of the few products that has the right mix of features to support the full best-practices workflow for secret discovery and remediation outlined above. Its native integration with HashiCorp Vault, one of the most popular secrets managers, makes it ideal for delivering secret remediation and moving those secrets into a proven solution for secrets management. The example below uses HCP Vault Radar to enact a reactive secret remediation workflow.

»Example reactive remediation workflow with HCP Vault Radar

Remediation steps and workflows are dependent on many variables, including your organization's security policies as well as audit and reporting requirements. In order to determine the appropriate remediation workflow, you’ll need to gather information about an unsecured secret when it’s discovered. When HCP Vault Radar finds a secret in plaintext, it gathers the following information about the secret:

Type of secret (for example, a Stripe API key or an AWS credential)
Time of commit, if the secret was committed
Location of the secret (public or private repo, line of code, main branch or historic)
Author of the commit
Whether the secret is still active

To put this information in context, here is an example of detailed remediation guidance and steps following remediation best practices.

»An AWS secret is discovered in a code server

In this example, HCP Vault Radar provides step-by-step remediation recommendations following the most secure workflows available in Vault. Because of this, best practices for remediation with Vault are already built in. In this case, the workflow is following best practices for remediating a dynamic secret using Vault.

»1) Create an incident

An incident event is automatically generated by HCP Vault Radar upon discovering an unsecured secret in your code server. Since a secret is leaked in code and event details indicate it is still active, this can lead to unauthorized access of your services. To prevent malicious activity, the following actions are recommended:

Rotate and store the secret in a secrets manager.
Follow your organization’s guidelines for emergency rotation of a secret.
Contact your AWS account owner.

»2) Store secret in Vault’s AWS secrets engine

To keep secrets secure, store them in a secrets manager such as Vault.

Note: This example uses Vault’s AWS secrets engine and IAM auth method. If your application is already configured to access secrets in Vault, use your existing secrets engines and auth methods.

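1. Enable the AWS secrets engine in Vault and, if your application will authenticate to Vault with IAM, the AWS auth method:

$ vault secrets enable aws
$ vault auth enable aws

2. Configure the AWS credentials that Vault uses, first for the secrets engine and then for the auth method client:

$ vault write aws/config/root access_key={AWS_ACCESS_KEY} secret_key={AWS_SECRET_KEY}
$ vault write auth/aws/config/client access_key={AWS_ACCESS_KEY} secret_key={AWS_SECRET_KEY}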

Note: Secrets created by Vault using the AWS secrets engine have a default lease of 60 minutes. For more information on how to change the lease, visit our AWS secrets engine documentation.

3. Create a policy:

$ vault policy write my-policy - << EOF
path "my-secret" {
  capabilities = [ "read" ]
}
EOF

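4. Create a Vault role that generates IAM users with an IAM policy attached. This IAM policy needs to match the permissions required on AWS. Here my_policy.json stands for a local file containing the IAM policy document in JSON (the file name is illustrative); the @ prefix tells the Vault CLI to read the value from that file:

$ vault write aws/roles/my-role \
    credential_type=iam_user \
    policy_document=@my_policy.json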

Note: For more information, see the AWS secrets engine documentation.

»3) Remove the secret from code

You can now reference the secret from Vault in your code.

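Note: This example uses an environment variable to hold the secret read from Vault.

1. Populate the environment variable MY_SECRET with the credentials that Vault generates for the role, passing a Vault token in the request header. Vault returns a JSON response, and the generated credentials are in its data field:

$ export MY_SECRET=$(curl \
    --header "X-Vault-Token: ..." \
    http://127.0.0.1:8200/v1/aws/creds/my-role)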

2. Remove the secret from source and add a reference to the secret stored at the environment variable MY_SECRET:

mySecret = getenv("MY_SECRET")

3. Start your local server and test.

Note: Dynamic secrets need to be refreshed. If you are using an environment variable to read the secret, your application needs a way to refresh the secret value when it expires. Here are some ways to keep the dynamic secret value updated within your running application:

Create secrets with the Vault secrets operator for Kubernetes.
Use Vault Enterprise secret sync for services like Vercel or Heroku.
Read and reload secrets in Spring.
If you do not use environment variables, refer to the Vault API documentation.
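
One simple way to decide when to re-read a dynamic secret is to track when it was fetched and how long its lease lasts, then refresh before the lease runs out. The sketch below assumes the application records the fetch time and the lease duration in seconds (Vault returns a `lease_duration` field with leased secrets). The function name `needs_refresh` and the two-thirds threshold are illustrative choices, not a Vault recommendation.

```shell
# Hypothetical helper: succeed (exit 0) when a secret fetched at $1 (epoch
# seconds) with a lease of $2 seconds has used at least two thirds of its
# lease as of time $3 (epoch seconds, defaults to the current time).
needs_refresh() {
  local fetched_at=$1
  local lease_seconds=$2
  local now=${3:-$(date +%s)}
  local age=$(( now - fetched_at ))
  # Integer math: age/lease >= 2/3 is equivalent to 3*age >= 2*lease.
  [ $(( age * 3 )) -ge $(( lease_seconds * 2 )) ]
}

# Example with a 60-minute (3600 s) lease fetched at t=0:
needs_refresh 0 3600 1800 && echo "refresh" || echo "still fresh"   # 30 minutes in
needs_refresh 0 3600 2400 && echo "refresh" || echo "still fresh"   # 40 minutes in
```

Refreshing well before expiry leaves room for a retry if Vault is briefly unreachable. The options listed above handle this for you, so a hand-rolled check like this is mainly useful for small scripts.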

»4) Revoke the secret

Revoke the unsecured secret to complete the remediation process.

Test that the code changes work.
- Validate the environment variable is populated from Vault.
- Deploy and test the application.
Work with the AWS service owner to revoke the previous secret value.
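
For the validation step, a deployment script could fail fast when the variable is missing or empty, before the application starts. This is a minimal sketch: the function name `require_env` is hypothetical, and `MY_SECRET` is the variable name used in this example.

```shell
# Hypothetical check: succeed only if the named environment variable is
# exported and non-empty. The value itself is never printed, to avoid
# leaking it into logs.
require_env() {
  local name=$1
  # printenv prints the value of an exported variable, or nothing if unset.
  if [ -z "$(printenv "$name")" ]; then
    echo "error: $name is not set or is empty" >&2
    return 1
  fi
  echo "$name is populated"
}
```

Calling `require_env MY_SECRET` at the top of a startup script confirms the variable was populated before the old secret is revoked, which helps avoid the service disruption described earlier.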