Secure Access Management with Boundary and Consul-Terraform-Sync
This talk covers delegating Boundary authentication to an external OpenID Connect (OIDC) provider, automating target discovery with Consul and Terraform (via Consul-Terraform-Sync), granularly controlling session routing with worker filters, and accessing Boundary targets with Boundary Desktop and the CLI.
Boundary enables a role-based access control (RBAC) model for any networked resource. Learn how Boundary, in conjunction with the rest of HashiCorp's projects, can help manage the access lifecycle for ephemeral resources to create secure, just-in-time access.
Speakers: Pete Pacent, Charles Zaffery
» Transcript
Pete Pacent:
Hi, HashiConf. It's hard to believe it's been 6 months since Boundary was released. Boundary launched in October of last year as an open source, secure access management offering focused on enabling identity-based access controls for human users of dynamic infrastructure.
Today, we'd like to talk about the enhancements that Boundary has made since launch, where we're headed, and how Roblox has started their Boundary journey to enable secure access to their global infrastructure.
Before going further, let's do a couple of quick introductions. My name's Pete Pacent and I'm part of the Boundary team, where I'm responsible for product management. With me today is senior Roblox site reliability engineer Charles Zaffery, who's one of the lead staff ensuring the availability and security of one of the world's largest online experience platforms.
Charles, take it away.
Charles Zaffery:
Thanks, Pete. My name is Charles and I'm part of the orchestration team here at Roblox.
By orchestration, we mean orchestrating everything above the OS and below the application itself. Today we at Roblox accomplish that with a combination of Nomad, Consul, and Vault. I think you'll hear it popularly referred to as the Hashi stack.
We also utilize everything else that HashiCorp offers, including Packer, Vagrant, Terraform, and so on. So we're really looking forward to how we can implement Boundary within our infrastructure.
Roblox is an experience-focused platform for the metaverse. This means our players can create almost anything, from clothing and body parts for your digital avatar to a diverse set of experiences, from action-packed, first-person experiences to amazing concerts to outlandish fashion shows. Roblox has 200 million monthly active users, with over 3 billion monthly hours of engagement.
We do all of this with an SRE team of about 6. Today, that SRE team manages Nomad, Consul, and Vault for over 20,000 nodes across over 20 clusters, serving more than 600 internal developers.
» Ephemeral Access for Vendors
For Boundary, one of our problems is that out-of-country folks need to access private infrastructure. This includes everything I've just talked about, from Nomad's control plane and Consul's control plane to Vault for both reading and writing secrets, as well as various databases and caches.
One of our big challenges is, How do we give access to Redis? Another challenge we have is that we need multiple access levels, depending on who is accessing what.
Today we do it primarily through VPNs. They aren't completely static, but they're a little difficult to automate, so we're looking at bringing ourselves into today's solutions with Boundary.
» Boundary's Role
Let's talk about where Boundary fits into the picture. HashiCorp builds identity-based tools to automate the various workflows in the application delivery lifecycle.
One of the workflows in that lifecycle is that, ultimately, humans, whether developers or DevOps engineers, are going to need access to their app's infrastructure. That's where Boundary comes into the picture.
Today, Roblox relies heavily on Vault and Consul for discovery and security. However, these solutions are primarily focused on usage within an operated and owned network.
These tools are almost entirely used by machines. Humans can access them, but there's still a pathway of tooling they need to have access to, and they have to already be authenticated before they can even get to them in the first place.
Talking about the future, as Roblox expands our footprint and brings on individuals from all over the planet, we want our developers and operators to have total security, with the ability to react to new situations and develop new software as fast as possible.
Something that gets me really excited about Boundary is that it's a new open source entry into the dynamic network security world. Not just open source, but very simple to understand at an infrastructure level.
It's just a normal web application, with a client and a server backed by a Postgres database, so that makes it easy to run. It's also elegant to use. Not just elegant to use, but an elegant solution to a hard problem.
Another thing that really gets me excited is the integration with all of our existing tooling. I mentioned Terraform, but on the roadmap is also Consul for service discovery and transient secrets for backend services through Vault.
The local development story is also really good. It's the same dev flag that all the other HashiCorp tools have, such as Nomad, Consul, and Vault. It just makes things like developing local Terraform an absolute snap before going to production.
Some of the key benefits that we get, which are also key requirements, start with syncing with our existing identity provider (IdP). That's already there today. Another is that we can configure everything with infrastructure as code, that being Terraform for us, and ideally have dynamic targets through Consul, which you can do today with Consul-Terraform-Sync.
Another thing that I like is that it has a really nice GUI, and it's very programmatically friendly. To give an example, one thing that I do for a lot of local development is pipe a `curl` into Ruby and parse it as JSON. If I can't do that, I don't want it.
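As a rough illustration of that kind of workflow, written in Python rather than Ruby: the payload below is a made-up stand-in for a JSON API response. Its field names and ids are assumptions for illustration, not Boundary's actual schema.

```python
import json
from typing import List

# Hypothetical response body, standing in for what a `curl` call might return.
# The field names and ids are illustrative assumptions, not a real API schema.
raw = '{"items": [{"id": "am_1", "type": "password"}, {"id": "am_2", "type": "oidc"}]}'


def ids_of_type(body: str, wanted: str) -> List[str]:
    """Parse a JSON document and return the ids of items with the given type."""
    doc = json.loads(body)
    return [item["id"] for item in doc.get("items", []) if item.get("type") == wanted]


print(ids_of_type(raw, "oidc"))
```

The point is only that a response which parses cleanly as JSON can be filtered in a few lines by whatever scripting language is at hand.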
It needs to be so easy that you can't not do it to access resources behind firewalls. Pete's going to talk a little about that.
Pete Pacent:
Thanks, Charles. In talking through the scenario, it became clear that this closely aligns with our vision for Boundary as a whole. That vision is an ephemeral model of access in which users can authenticate to Boundary using their identity provider of choice and then have granular, role-based access controls that authorize what actions they can perform on those targets.
One of the key ideas with this is that when users go to access those targets, the configuration of address and connection information is automated by Boundary. That way, users aren't stuck having to manage brittle information like IPs and ports and credentials that they need to securely connect to those targets.
In having Boundary automate this configuration, that access information can be managed on a just-in-time basis. So even if that connection information was compromised by a malicious entity, there are time limits in terms of how long those are valid and exploitable.
And this leads me to why we think Roblox's use case is one of the canonical examples for the value of Boundary. Roblox can use Boundary's role-based access control to establish identity-based access for any infrastructure target. They're able to delegate to Roblox's preferred identity provider via Boundary's OpenID Connect (OIDC) authentication method.
In this example, we'll use Azure Active Directory, but Boundary supports other common IdPs like Keycloak, Okta, Auth0, and others. On top of this, Charles will use Consul and Terraform to automate target configuration info. This is a powerful paradigm where Boundary can lean on external mechanisms to manage the networking of target discovery and environment configuration.
Lastly, we'll see how Boundary's newest features can also be used to improve the story, like Boundary's new desktop app for macOS, which allows low- or no-code users to view what systems they can access and connect to them through a GUI.
We'll also see worker tags and filters, which allow users to choose which Boundary proxy workers should manage a target's connections. This is really critical for meeting user security and latency requirements for private access, particularly if administrators need to ensure connections route through a worker that is both peered to the target private network and local to that region.
That way, as a Boundary user, I don't need to route traffic through a completely different region or risk having a connection fail due to it being assigned to a worker on an unpeered network.
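The routing idea can be modeled in a few lines. This is a conceptual sketch only: Boundary expresses worker filters in its own filter syntax, and the worker names, tag keys, and tag values below are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Worker:
    name: str
    # Tag key -> set of values, e.g. {"region": {"us-east"}}. All values are made up.
    tags: Dict[str, Set[str]] = field(default_factory=dict)


def eligible_workers(workers: List[Worker], required: Dict[str, str]) -> List[str]:
    """Keep only workers whose tags contain every required key/value pair."""
    return [
        w.name
        for w in workers
        if all(value in w.tags.get(key, set()) for key, value in required.items())
    ]


workers = [
    Worker("worker-a", {"region": {"us-east"}, "network": {"db-private"}}),
    Worker("worker-b", {"region": {"eu-west"}, "network": {"db-private"}}),
    Worker("worker-c", {"region": {"us-east"}, "network": {"public"}}),
]

# A target that must be reached from the same region and a peered network.
print(eligible_workers(workers, {"region": "us-east", "network": "db-private"}))
```

Requiring both tags rules out the worker in the wrong region and the worker on the unpeered network, which are the two failure cases described above.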
» The Demo
With that, I'll pass it over to Charles to walk through the Roblox solution and give a demo.
Charles Zaffery:
We're going to bring up a full local Boundary environment that's more "production-ready" than a local Boundary dev, as well as being a lot more dynamic. We're going to do this using Docker Compose.
This set of containers consists of a Postgres database, a Boundary cluster that uses that Postgres database, a container of MySQL, a container of Redis, a container of Consul, as well as Consul registrator to register all the services.
What's happening, as this starts up, is that the Postgres database is being bootstrapped for use with Boundary, because Boundary just uses normal everyday database migrations.
Now that everything is up, we're going to apply some Terraform to configure Boundary.
This is going to create local users, roles, accounts, some new auth methods, a handful of scopes, as well as OIDC users and all the configuration for it.
Now this is finished. We've applied all of our static configuration, and so it's time to be a little dynamic.
If we look at Consul, you can see that we have MySQL and Redis registered in here now. But if we were to look at everything in Boundary, we would see that Redis and MySQL aren't there yet.
Let's take care of that.
With the demo up here, you can see that it forgot the path. Going to the right path helps a lot here.
We're running Consul-Terraform-Sync. It reached out to Consul, it discovered Redis and MySQL, and then it ran a set of dynamic Terraform to introduce them as targets and hostsets into Boundary.
This will all happen dynamically. If I go in and I restart this Redis container, this is going to start to fail because Redis isn't there anymore. It doesn't exist.
When this finishes, you can see that Redis came right back up and you can reach out to it. Because we're using Docker with a local setup, this isn't changing hosts. But if this were to move to a different host and get registered differently in Consul as the Redis service, this would have dynamically migrated all of this behind the scenes with no input from anyone.
That's one of the cool things about Consul-Terraform-Sync. We're now fully configured, so let's authenticate. Let's get into Boundary a bit.
We're a new user, and we don't know what all the auth methods are, so we want to take a look. We can see that we have a couple of password methods, but we also have this OIDC one, which is the one we really care about.
The way that we're going to do that is, we're going to authenticate on the command line, through OIDC. That's going to bring us to Azure Active Directory. We're logging in with this account.
Just like that, we're authenticated.
Today we also want to use a bit of Redis. We've been told that it's not open to the public, and we have to go through Boundary, so let's walk through what that looks like. Why don't we do a little more discovery along the way?
We want to say, "Hey, Boundary, tell me all the host catalogs." And here we have this host catalog set in the database project. We can use this and say, "Here are all the different hosts."
This set was created on the initial Terraform `apply`. And these 2 hostsets were dynamically created by Consul-Terraform-Sync.
Again, we care about Redis. So we're going to say, "What about this hostset? Who's in it?"
In this case it's just the one host. Again, we're going to say, "Boundary, what is this host?" You can see it's 172.29.0.2. If we ask Docker, "What is this IP and port?" you can see that they match up.
Now that we've confirmed that Redis is where we expect it to be, we want to connect to it. To do that, we want to get all the targets, and we want to say, "Here's the Redis target, and we're going to connect to it."
What we're going to do right now is to straight up exec right into Redis without ever having to know what the port is. In this case, it's 60962. It doesn't actually matter.
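The local port differs from session to session, which is why it doesn't matter. One common way for a local proxy to obtain such a port is to bind to port 0 and let the operating system pick a free one. The sketch below shows only that OS mechanism; that Boundary's client gets its port this way is an assumption, and the function name is hypothetical.

```python
import socket


def open_local_listener() -> socket.socket:
    """Bind a TCP listener on loopback and let the OS choose a free port."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0 asks the OS for any free port
    listener.listen(1)
    return listener


listener = open_local_listener()
host, port = listener.getsockname()
print(f"a local proxy would listen on {host}:{port}")
listener.close()
```

A client tool can then be pointed at whatever port was assigned, so the user never has to choose or remember one.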
This is a brand-new Redis server, so we don't have anything in it. We're going to set the foo key to bar.
We're all done with Redis and we want to get out of here, but just to confirm, we do have Redis listening on the host, 6379. And if you look at the keys, you can see that key that we just set.
Again, if we go back in through Boundary, same thing: you look at the keys, there's our key. We get the same value. This is accessing Redis, or really anything, dynamically directly through Boundary.
» Life Is Easier with CTS
Consul-Terraform-Sync is pretty cool. We have a basic config file here, which is just your general stuff. You have some log information. You need to tell it where Consul's at, and then you tell the providers what you want and how to configure them.
This uses the Terraform driver, so this has its own set of configuration, where you say, "Here's where all my syncs are going to be at." They call them "sync tasks."
This is the backend. We're using Consul, because Roblox is using Consul with everything right now.
This is the task itself. We're just calling it "Boundary." It's a local thing that we've written into this demo, and the providers that it's using are HTTP and Boundary, as well as the services that it wants to look at, which are Redis and MySQL, as well as a buffer period.
The buffer period is really important. We at Roblox run a very large cluster. We have something like 70,000 health checks at any given point in time. As you can imagine, they move around a lot. If this wasn't set and we just let it go full bore, it would pretty much go crazy on our cluster. So we needed at least a 5-second buffer here.
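A buffer like this is a form of debouncing: a burst of changes is collapsed into a single run. Here is a minimal sketch of the idea, assuming a minimum quiet period plus a maximum cap on how long a batch may be held. The cap, the function name, and the parameter names are assumptions for illustration, and this is not Consul-Terraform-Sync's actual implementation.

```python
from typing import List


def batch_changes(event_times: List[float], min_wait: float, max_wait: float) -> List[float]:
    """Return the times at which a sync run would fire.

    A run fires once no new change has arrived for `min_wait` seconds, or
    `max_wait` seconds after the first change in the batch, whichever is sooner.
    """
    runs = []
    batch_start = None
    deadline = None
    for t in sorted(event_times):
        if batch_start is not None and t >= deadline:
            runs.append(deadline)  # the pending batch fired before this change arrived
            batch_start = None
        if batch_start is None:
            batch_start = t  # this change opens a new batch
        # Each change pushes the deadline out, capped at batch_start + max_wait.
        deadline = min(t + min_wait, batch_start + max_wait)
    if batch_start is not None:
        runs.append(deadline)  # flush the final batch
    return runs


# Five changes within two seconds, then one isolated change much later.
events = [0.0, 0.5, 1.0, 1.5, 2.0, 30.0]
print(batch_changes(events, min_wait=5.0, max_wait=20.0))
```

With these inputs the six changes produce two runs instead of six, which is the kind of reduction that matters when tens of thousands of health checks are moving around.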
Let's look in Terraform itself. It generates some Terraform for us. This is all generated by Consul-Terraform-Sync. It's got some basic information here and some backend stuff. It's just normal Terraform.
If we look one level deeper, there is a special variables file. This is on HashiCorp's Learn website. If you go there, you will be able to find this. It's just copy and paste, and you can omit a lot of this.
If we look at the Terraform we've written for this, this is just a very normal Terraform, with the secret sauce being that services variable.
For this demo, we've just mapped the services into a structure that we could consume.
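The same kind of reshaping can be sketched in Python. The input shape below (one entry per registered service instance, with a name, address, and port) and all of the ids and addresses are assumptions for illustration; the demo itself does this step in Terraform.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical input: one entry per registered service instance.
services = {
    "redis-1": {"name": "redis", "address": "10.0.0.5", "port": 6379},
    "mysql-1": {"name": "mysql", "address": "10.0.0.6", "port": 3306},
    "mysql-2": {"name": "mysql", "address": "10.0.0.7", "port": 3306},
}


def group_by_service(instances: Dict[str, dict]) -> Dict[str, List[str]]:
    """Group instance addresses under their service name, one group per host set."""
    grouped = defaultdict(list)
    for instance_id in sorted(instances):  # sorted for a stable address order
        inst = instances[instance_id]
        grouped[inst["name"]].append(inst["address"])
    return dict(grouped)


print(group_by_service(services))
```

Each resulting key would correspond to one host set and target, and each address to one host inside it.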
To get all the information out of Boundary, to pass to the resources, there aren't data resources today for the Boundary provider, but there will be in the future. For now, there is a lot of, we'll say, creative use of the HTTP data provider to populate all this information.
This happens totally dynamically. We can go back in and we can start it right back up, and this shouldn't make any changes because it's already in sync.
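A restart produces no changes because the desired state and the actual state already agree. The toy comparison below illustrates that property with plain sets. It is not how Terraform computes a plan, and the names in it are hypothetical.

```python
from typing import Dict, List, Set


def plan(desired: Set[str], actual: Set[str]) -> Dict[str, List[str]]:
    """Compare desired and actual state and list what would have to change."""
    return {
        "create": sorted(desired - actual),
        "delete": sorted(actual - desired),
    }


desired = {"redis", "mysql"}  # what the service catalog reports
actual: Set[str] = set()      # what exists before the first sync

first = plan(desired, actual)
print(first)

# Pretend the first plan was applied, then plan again.
actual = (actual | set(first["create"])) - set(first["delete"])
second = plan(desired, actual)
print(second)
```

The second plan is empty, which is the "already in sync" case from the demo.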
And just like that, we now have a fully dynamic Boundary thing with Consul-Terraform-Sync, that's hooked up to Consul, which is using Terraform and Boundary. So we have a bunch of HashiCorp products, all talking together and working very well.
As a bit of a bonus, if you don't like the command line, you can access all of that right through OIDC, as I showed earlier.
And you can do all the same stuff that we just did. You can connect to Redis. This gives you the information. We're on 61057, and there's that key that we had earlier.
And this isn't the only thing you can do with this. You can connect over SSH. We have Postgres, Redis, MySQL. But I only have redis-cli installed on my laptop right now. That's why I showed this.
You can put anything in here. You can do web applications, all kinds of stuff.
I hope that this demo has shown a little bit of the power of Boundary and the dynamicity that you get with Consul, Terraform, and Consul-Terraform-Sync.
Back to you, Pete.
» The Boundary Roadmap
Pete Pacent:
Thanks, Charles.
To close, let's talk a little bit about where Boundary's headed from here.
First off, while Boundary already supports integrations with external identity providers via OpenID Connect, we think we can take this a step further by syncing group membership claims from IdPs to Boundary groups. This way, the claims managed at the identity provider level can be used by Boundary without additional configuration.
Second, we'll continue to enhance Boundary's target discovery story by enabling administrators to define dynamic host catalogs, even outside of Terraform and Consul-Terraform-Sync, so that services can be discovered based on predefined rules or tags. These catalogs will be extensible via plugin frameworks, so the community can create their own plugins for their preferred infrastructure platforms or service discovery tools.
Lastly, in an upcoming release, we'll also focus on integration with Vault or your preferred credential management solution, so that you can use ephemeral credentials minted by Vault to authenticate to Boundary targets. This would enable a single sign-on flow so that those credentials are obscured from end users.
And in closing, as always, we'd love to invite you to try out Boundary for yourselves. Boundary's available at boundaryproject.io, and we have a whole set of onboarding docs and tutorials to make it easier to onboard.
Good luck trying out Boundary.