Nomad and Vault in a Post-Kubernetes World
Learn why Kubernetes was not flexible enough for Cosmonic's use case and how HashiCorp Nomad fit the bill.
Containers are amazing, but will they be the preferred mode of packaging and deploying applications for much longer? Several startups are making big bets on WebAssembly (Wasm)-based applications becoming the next major revolution in application architecture.
One of those companies (Fermyon) shared their story a few months back explaining why Kubernetes wouldn't be the right fit for these applications with its main focus being on container orchestration, so they built their Wasm PaaS product on HashiCorp Nomad instead. Cosmonic, another Wasm PaaS, has also chosen Nomad for this use case.
» Why Nomad?
The main advantage of Nomad over Kubernetes is that it has more flexibility in the workloads it can manage. Not only can it manage containers based on Docker and other options, it also supports VMs, Java JARs, QEMU, raw and isolated executables, Firecracker microVMs, and even Wasm. If engineers want to orchestrate something else on Nomad, they can build that support with Nomad task drivers.
» Cosmonic's Nomad Use Case for a Wasm PaaS
Cosmonic's PaaS is built with WebAssembly and runs on top of Nomad. The services on the PaaS are all composable actors that are stateless and reactive. Being written in Wasm, the services are lightweight and fast, and no connection strings or clients are required. Cosmonic also uses HashiCorp Consul for networking and service mesh, along with HashiCorp Vault for secrets management.
Cosmonic extended Vault in several interesting ways for their use case. For greater security, they also run customer hosts on Firecracker microVMs, which are very tightly and very securely placed all next to each other on the same actual servers.
The fact that Nomad could flexibly handle Firecracker and Wasm mixed environments in one job scheduler was the deciding factor for Cosmonic. Cosmonic did give Kubernetes a chance, but from their initial experiences trying to bring Wasm to Kubernetes, they found that Kubernetes does not like things that are not containers. It took a lot of pain and thousands of lines of code to build something that started to work. Many Wasm platforms have given up on Krustlet (Wasm-on-Kubernetes).
Learn more in this session and transcript below.
» Transcript
Well, everyone, we've made it to the very last session. It's my job to keep you all entertained for the last 45 minutes. Welcome to Distributed Flexibility: Nomad and Vault in a Post-Kubernetes World.
If you've been paying attention to the schedule, I am not Dan. I am Taylor. Dan had a family emergency, so I filled in for him because I worked on a lot of this with him. I am a recovering infra engineer. I think all of us probably know the feeling in this room.
» Agenda
I'm going to talk a bit about all the different things we've used with Nomad and Vault. Let's talk about the agenda here. To really talk about what we're doing with Nomad, we have to level-set and talk about what WebAssembly is and what exactly we're doing. Without that, this whole talk doesn't make a lot of sense.
We're going to start with the WebAssembly (Wasm) part, and talk about why we chose Nomad in the first place. Then we're going to get deep and talk about building a task driver — in this case — one based on the Firecracker VM.
We're also going to talk about how we extend the Vault system with a key-value provider, which I'll mention lots of details about. Then we'll finish up at the very end with what we see as the future of Nomad and WebAssembly.
» Cosmonic
Let's dive into what we do as a company and what WebAssembly is. Cosmonic is the company I work for. I am an engineering director there, so I still do a lot of engineering work. We are a Platform-as-a-Service (PaaS), and we run WebAssembly on the backend.
Contrary to the name, which we'll talk about in a second, it has nothing to do with just being a web technology. We run services using WebAssembly. Cosmonic is what we jokingly call the enterprise goo that sits around wasmCloud, which is an open source project that's owned by the CNCF — we're just one of the principal maintainers. That's what Cosmonic does.
» WebAssembly (Wasm)
But underneath it all is a technology called WebAssembly. If you talk to anybody involved in the WebAssembly community right now, you'll notice that we say it's neither web nor assembly, and that's for good reason.
It is neither of those things. It was built to be exactly what this definition is here up on screen. That's the textbook definition. But the idea is, why couldn't we create something that could be cross-platform and run everywhere? And that's why it was originally pulled onto the browser — we want to be able to run any language in the browser rather than JavaScript.
But if you think about why it first became popular with the browser, you'll start noticing that there are some similarities to why we would want to use it on the backend. That gets us to this question — why do we use WebAssembly on the backend at all?
» Why Wasm?
The first thing is this whole idea of having it be cross-platform, cross-architecture. It can run anywhere, and that literally means anywhere. If you look at pretty much any WebAssembly project that's doing stuff on the backend right now, whether it's something we do or something any of our friends out in the community do, they've probably compiled that WebAssembly module on their Linux desktop at home or a Mac, and then people are running it on IoT devices and servers and all sorts of things.
It is completely cross-platform and cross-architecture. You're also able to code in any language, so it's truly polyglot. Rather than having to buy into a specific toolset, you say no; I just want to code in whatever language using the same toolchain I've always used.
It's also secure by design because it really gained that traction. The browser is a sandbox model for obvious reasons because you're running untrusted code, so the sandbox means you have to grant what they call capabilities in order to do it.
Think about what happens on your phone when you boot up an application for the first time. Most phones, Android or iOS, ask: do you want to grant access to the mic? Do you want to grant access to the phone? Do you want to grant access to this? That's done in WebAssembly as well. You have to explicitly grant it permissions.
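To make the deny-by-default idea concrete, here is a minimal Python sketch. It is not how any Wasm runtime implements capabilities. The `Host` class, its `grant` and `call` methods, and the module and capability names are all hypothetical, chosen only to show that nothing is reachable until it has been explicitly granted.

```python
class CapabilityDenied(Exception):
    """Raised when a module uses a capability it was never granted."""


class Host:
    """Toy host (hypothetical) that tracks which capabilities each module may use."""

    def __init__(self):
        self._grants = {}  # module name -> set of capability names

    def grant(self, module, capability):
        self._grants.setdefault(module, set()).add(capability)

    def call(self, module, capability, action):
        # Deny by default: nothing is reachable unless explicitly granted.
        if capability not in self._grants.get(module, set()):
            raise CapabilityDenied(f"{module} was not granted {capability!r}")
        return action()


host = Host()
host.grant("cat-blog", "keyvalue")

# Granted capability: the call goes through.
print(host.call("cat-blog", "keyvalue", lambda: "stored"))

# Capability that was never granted: the host refuses.
try:
    host.call("cat-blog", "microphone", lambda: "recording")
except CapabilityDenied as err:
    print("denied:", err)
```

The useful property is that the check lives in the host, not in the module, so the module cannot opt out of it.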
But then we get to this idea of size and efficiency. We've done some comparisons. One we've done uses something most people are familiar with: the Java Spring Boot microservices example, the pet clinic. If you're not familiar with it, it's a simple REST API, with each part of the API being handled by a different service. That thing is generally pretty big, especially in Java.
In WebAssembly, it's tiny. We're talking no more than a couple of MBs most of the time, and it runs very light. That leads to increased speed of execution because this is near-native.
WebAssembly is a compiled target. It's not a specific binary or language. It is meant to be a compiled target that everything can go to. That's the high level of WebAssembly. I'm throwing the fire hose at you to keep you all up, I guess, at this time of day. But that's the basics of WebAssembly.
» wasmCloud
wasmCloud is the open source project that we base everything at Cosmonic on. It was designed to make it super quick and easy to build applications using WebAssembly — and do it in a real production way.
As some background for this, the entire Cosmonic platform is built with WebAssembly and then runs on top of Nomad. This is an actual production-quality thing you can build real things with.
It's important to note it's a CNCF Sandbox project. If you're not familiar with the CNCF, the Cloud Native Computing Foundation, it is an open foundation — the actual copyright for the code is owned by them. It's not like we're running this on our own. This is a completely open standard. You can run this core technology by yourself if you want. We use this wasmCloud stuff on top of Nomad. But let's talk about the architecture of wasmCloud, so you understand what we're running.
» wasmCloud Application Runtime
Underneath anything Wasm is the actual Wasm runtime. But on top of that — where a lot of us in the space are innovating — is what we call application runtimes. It allows you to run your application, and it leverages WebAssembly to do so.
We provide secure access to specific capabilities, which I'll go into some more detail in a second. This is running using Elixir and OTP. It's some extreme scalability that's been tested for decades at this point.
Last, it's horizontally and vertically scalable. We have stateless actors, and the code can be scaled on the same machine or across multiple machines with ease because, once again, it's completely cross-platform.
» Capabilities
These capabilities I mentioned, what does all of this mean? Let's think about what it takes to write an application. People who are using HashiCorp products are probably used to hosting developers’ applications.
Right now, if you want to do something — let's say you're trying to add some simple business logic. I always pretend in my mind that it's a cat blog; I'm adding new cat pictures or dog pictures. It's probably something like figuring out an interest rate if you're at a bank or how to ship something.
But when you think about it, that little bit of code is probably 100, 200 lines. But what do you have to do? You have to go find the blessed template from your company, pull that out, and you have to say — well I have this project set up. Let me write my code, and then I can compile it.
But then I have to choose a Docker file to use. I have my Docker file, and which one am I using? Is it the right one? And then I have to install the updates, and then I'm going to run in Kubernetes. It keeps going, and going, and going, and going — you have all this code that doesn't matter.
Most of us here are also probably familiar with Go. How many times have all of us here copy pasted that same 10-20 lines of Go code to start an HTTP server? I'm seeing some people react in the audience. We've all done that. That is what we're trying to avoid.
Those are non-functional requirements. They're code you have to write that has nothing to do with what you're doing. With these capabilities, we maintain and update this code centrally, and you're coding against something that we call Contract Driven Development.
Instead of saying I am going to write some data to a key-value store. Which key-value store do I use, Redis? — I need to find a Redis client. You say I'd like to write data to a key-value store, and you just write against a contract that says I'm going to write it to a key-value store.
You don't care what's on the other end of it. Same thing with an HTTP server. You can say I want to be an HTTP server. I don't care how that server is spun up, where it's coming from, or what's connected. I want to act as a server.
These capabilities allow us to abstract away that code they don't need to manage anymore. It gets rid of that boilerplate, making this choice a runtime decision — which people running platforms like many of us then get to handle. I'll show some practical examples of this a little bit later in the talk.
» Composable Actors
On top of these capabilities, you have composable actors. These composable actors are the actual business logic I'm referring to here. They're all stateless and reactive, and they're extremely easy to develop and do not have boilerplate. There are none of these things like connection strings and client setup — or which port I'm listening on. It’s just very low effort for developers to actually do.
Because they're written in WebAssembly, they have a very tiny footprint, and they're very portable and scalable across anything. When I say anything, we jokingly say that you can run WebAssembly anywhere from a light bulb to a supercomputer — and that's really not that much of a stretch. You actually can do that as long as you're using the right runtimes.
What this all ends up being — and this is the only snapshot I'm going to show you of our product — is this diagram right behind me. You'll see on the left that there's this purple box that's the KVCounter. In this case, it's a simple bit of logic that increments a counter when it gets an HTTP request inside a key-value store.
But what we're able to do — and we show this in the product — these lines and things are all there for you. It looks like a napkin sketch. That's purposeful. We want to make it as quick from sketch to scale as possible. You'll see that it's connected to an HTTP server that's called Wormhole. That's the way we expose an ingress point into our platform.
We were able to switch out the HTTP server without changing any of the code. Same thing with the key-value store. In this case, it's Redis. Doesn't matter which one it is, and that's going to come into play later. That was a big fire hose. Hopefully, you're not too confused about it, because we're now going to talk about Nomad.
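The swap described above can be sketched in a few lines of Python. This is an illustration of the contract idea only, not the wasmCloud API: the `KeyValue` protocol, both store classes, and the `kv_counter` function are hypothetical names, and the two providers are in-process stand-ins for real backends.

```python
from typing import Protocol


class KeyValue(Protocol):
    """The contract: the only thing the actor knows about storage."""

    def increment(self, key: str, amount: int) -> int: ...


class InMemoryStore:
    """Stand-in for one provider, for example a store used in local testing."""

    def __init__(self):
        self._data = {}

    def increment(self, key, amount):
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]


class AuditedStore:
    """Stand-in for a second provider with different internals."""

    def __init__(self):
        self._data = {}
        self.log = []  # record of every write, as a stricter backend might keep

    def increment(self, key, amount):
        self._data[key] = self._data.get(key, 0) + amount
        self.log.append((key, self._data[key]))
        return self._data[key]


def kv_counter(path: str, store: KeyValue) -> str:
    """Business logic: count requests per path. No client, no connection string."""
    count = store.increment(f"counter:{path}", 1)
    return f"{path} has been requested {count} time(s)"


# The same actor code runs against either provider; the choice is made at runtime.
for provider in (InMemoryStore(), AuditedStore()):
    kv_counter("/cats", provider)
    print(type(provider).__name__, "->", kv_counter("/cats", provider))
```

Nothing in `kv_counter` names a backend, which is why the provider can be replaced without touching it.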
» Where Does Nomad Fit In?
All of us working in this industry know that eventually, there's a server somewhere. We always snicker when we hear the term serverless, because we're like — yeah, right! That's the whole idea of this here — there's a server somewhere. Let's talk about what we're doing with our servers. That is where Nomad came in extremely helpful.
Like I mentioned, we're a Hosting Platform as a Service, and we're running these wasmCloud hosts — these application runtimes — for our customers. Because of this, we need to run other software too that supports it. There are things like NATS underneath the hood. There are other bits and components necessary for the production system that don't compile to WebAssembly, nor do we want to make them compile to WebAssembly — and we need to run those things as well.
Nomad is the perfect scheduler for this use case because it allows for mixed environments and can scale to any structure or thing you want to set it up with. We chose Nomad, to be clear here, as a job scheduler, not as the application runtime. Some people use it to run their applications, which is a great use case for it. We chose it as the job scheduler — the thing that schedules all the work we do inside our system.
Let's take a look at the architecture diagram for that: At the top, you'll see this is the control plane, the orchestrator layer, and we have Nomad, Consul and Vault running. It's like the holy trinity of HashiCorp. Don't quote me on that.
Those things are all running, and they're up there. We have two different parts of the system. We have one part that is a Firecracker VM — I'm going to dig deep into what all that is. The Firecracker VM serves as an isolation layer for multi-tenancy for the wasmCloud host. It's an additional layer of security on top that makes it a lot safer for those of us who are running shared code and shared environments.
But also, we're running other things in the normal container stuff, which is what you see on the right — like NATS, Vector, Node Exporter, OpenTelemetry things. All the stuff you'd expect inside an enterprise system.
We have to run those things — and those things already have containers and other things that make it easy to run. That's the high-level infrastructure view of how we set up. This thing right here is not possible with things like Kubernetes.
» Why Nomad?
It's simpler to run than Kubernetes. You'll hear that simplicity thing a lot. I was on a stream earlier today with one of the Nomad developers — Derek, who is, I think, here in the audience.
We talked about how simplicity is something that's very hard to talk about in software. Because if you think about old cars like Model T Fords, you had to turn the crank, and that crank could break your arm. That was simple, but the crank could break your arm.
Nowadays, most new cars that I've seen, if they're anything above the super cheap entry-level car, have a brake pedal that you press on, and then you press the power button, and it goes on. That's simple. Underneath, there are microprocessors and computers that are handling the logic to start the car and all sorts of things going on.
It's complex underneath, but it is simple on top. That's what Nomad does. Nomad swallows a lot of this complexity and allows the setup to be very simple, and you're able to easily scale up in the future.
That's that constrained set of choices that I refer to in the bullet on the slide: you only have to pick the specific things you need instead of trying to swallow the world like you do with Kubernetes.
It also has that tie-in to the HashiCorp ecosystem. We needed the tight integration with Vault for the secrets management. We needed Consul for service discovery. We also needed it for some leader election of some of the underlying pieces of software we're running.
Like I mentioned before, we're not limited to just containers, and this is focused on WebAssembly, but there are many things you can run — Java, QEMU, and the Fork/Exec task driver. All of these things can all be run inside of Nomad. You do not need to be stuck.
I helped write Krustlet, and if you have been following WebAssembly, you've probably heard of it. If you haven't, you probably haven't heard of it. But Krustlet is a project that was one of the first attempts to bring Wasm into Kubernetes.
And let me tell you, Kubernetes does not like things that are not containers. It was a pain in the butt, and thousands and thousands of lines of code to get that working. You're not limited to that inside of Nomad.
» Applications Are Not Always Containers
A couple other things with this. It's the most extensible scheduler out there. Once again, I'm going to drive this point home several times in case it's not a theme you notice: Applications are not always containers.
If you're thinking that they're containers and the future's going to be that, I would invite you to start looking at WebAssembly — and, I think, many other trends in the industry. Applications are not always containers, and Kubernetes forces you into that. Nomad does not. The task driver framework is also easy to build your own. We were able to build a Firecracker task driver that's very much suited for our environment.
On top of that, you also have the support for community standards — think the container networking interface and container storage interface. These good abstractions that allow you to connect different things you use in any environment, no matter what scheduler thing you're doing. Nomad supports all those.
» Nomad Task Driver
Let's talk about the task driver itself. The task driver, if you haven't heard of what a task driver is in Nomad, it's the runtime component that executes your workloads. That's the stuff we mentioned before, including things like Docker, Podman, QEMU — it depends on your use case. Nomad has this built-in framework for authoring these task drivers and then plugging them straight in.
» Why Did We Need a Custom Driver in the First Place?
The problem for us was that we were building this multi-tenant PaaS — we are running these wasmCloud hosts, as I mentioned. Wasm itself is sandboxed. It has to communicate with the outside world only with specifically granted capabilities. Those capabilities — those capability providers that we talked about earlier — are untrusted Rust binaries. It's because Wasm is still new. There are still edges of this where you're going to have to run things that aren't WebAssembly yet.
Eventually, there will be, but even in a perfect world, we still want extra isolation. We said how do we make it so these untrusted Rust binaries can run anything they want to, as long as they're satisfying what we've specified as an interface? How do we do that?
That's where the Firecracker task driver came in. We intend to open source this. Right now, it's very specific to us — that stage when everything's happy and you've made it very much for your system. But we want to open source this.
» Firecracker
Firecracker, if you've not heard of it (I was surprised that a lot of people had not heard of it as I've talked to people at the conference), is a virtual machine framework that was built to run virtual machines on top of KVM, which is part of Linux. It was built by AWS, and it's the same thing they're using to run all those Lambda functions and Fargate, and, I think, a couple of other things.
It's a very minimal implementation, and it's geared towards the idea that we want to have a secure environment or be running a multi-tenant environment. That sounds exactly like what we need for our use case.
They have on the main Firecracker site this wonderful diagram of everything that's going on. I'll call out a few details here. These are what we like to call a micro VM. They're extremely tiny. When you think VM, this isn't like the VMs of 10 years ago. These are very lightweight and spin up very quickly — and they have an isolation layer that provides all the different networking things you need. But there's also a REST API. Most virtual machine systems do not have a REST API. You can do that with Firecracker and interact with an API for it.
There are rate-limiting things. We don't use them yet, but we will very much use them in the future. You can limit things like network bandwidth, IO calls. All that's built into the actual VM system.
There's also this little thing you'll note that says the jailer. The jailer is an additional component to make it even more secure. This allows us to run customer hosts very tightly and securely, all next to each other on the same servers. This is a powerful technology. It's an interesting technology, and it's very useful for us.
» The Firecracker Task Driver
There is an open source Firecracker task driver — I heard there might be a couple. But it's languished a little bit, and it's not fully up to date with all the different features we needed. We needed the latest version of Firecracker to pick those up.
We needed support for the jailer component I mentioned. We needed the ability to run VMs from snapshots. Because we're spinning things up for customers, we don't want them to wait for a couple of minutes to start up a VM and to get a file system. We need that to be quick, so we start the VM from these snapshots — we needed that functionality. We also needed to be able to pass configuration to the virtual machine from our Nomad templates.
We had this cool thing called memory ballooning. Memory ballooning is something that could be its own talk. But a little story about what happened is as we were running these things — and running them very small — we would get to the issue where it would get memory pressure, and it would get OOM killed, which is everybody's favorite thing to have happen, I know.
When that would happen, we were like, dang it, we can't have things dying out from underneath customers. So, we said, what can we do with that? Firecracker has this ballooning support. You inflate a balloon of memory, and this memory can be eaten into by this guest until it hits a certain point. So, when you start up something, sometimes it takes a little bit of memory as it's loading things or doing stuff — and it can spike up and then come back down. This ballooning support allowed us to add that in there.
Someone would start something, and instead of getting killed, it would say, here's a little bit of memory. It would deflate the balloon and then reinflate it to reclaim the memory. That's a really cool feature, and that's the feature you can glue into Nomad if you need something like this using open source task driver-type things. We decided to write one for our own use case because of these features we needed.
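As a rough mental model of the ballooning behavior described above, here is a small Python simulation. It does not call Firecracker and is not the task driver's logic. The `BalloonedGuest` class, its fields, and the numbers are hypothetical. The one assumption it encodes is that memory held by the balloon is unavailable to the guest, so deflating the balloon hands memory back and reinflating reclaims it.

```python
class BalloonedGuest:
    """Toy model (hypothetical) of a guest whose spare memory is held in a balloon."""

    def __init__(self, total_mib, target_balloon_mib, min_balloon_mib=0):
        self.total_mib = total_mib
        self.target_balloon_mib = target_balloon_mib  # size the balloon returns to
        self.min_balloon_mib = min_balloon_mib        # floor: how far it may deflate
        self.balloon_mib = target_balloon_mib         # current balloon size
        self.used_mib = 0

    @property
    def available_mib(self):
        # Memory held by the balloon is not usable by the guest.
        return self.total_mib - self.balloon_mib

    def set_usage(self, used_mib):
        """Apply a new memory demand; return False if it cannot be satisfied."""
        # Even a balloon deflated to its floor cannot cover this demand.
        if used_mib > self.total_mib - self.min_balloon_mib:
            return False
        self.used_mib = used_mib
        # The balloon takes whatever the guest is not using, capped at the target.
        # A spike shrinks it (deflate); a drop lets it grow back (reinflate).
        free_mib = self.total_mib - used_mib
        self.balloon_mib = min(self.target_balloon_mib, free_mib)
        return True


guest = BalloonedGuest(total_mib=512, target_balloon_mib=256, min_balloon_mib=64)
for demand in (100, 300, 100, 480):
    ok = guest.set_usage(demand)
    print(f"demand={demand} MiB ok={ok} balloon={guest.balloon_mib} MiB")
```

In this model a startup spike to 300 MiB is absorbed by shrinking the balloon instead of killing the guest, while a demand past the floor is still refused.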
» Job Templates and Firecracker
Let's talk about one of the problems we faced. This is one example of something we had to do, and we're going to dive deep into it. But there are many other things we could talk about here.
The first thing is how do you mount these templates in a virtual machine? You have your /alloc, your /secrets directory and /local — what do you do to get all of that mounted? The other problem was Firecracker requires these mounts to be on block devices. We had to say what do we need to do to set it up?
We started using the MMDS service — the microVM metadata service. If you looked carefully in the other diagram with Firecracker, you saw that. We use this to do the same thing that you do with the EC2 metadata service or the Azure metadata service — you're able to write data to it and get data from it as you need it.
We decided to write out template blocks to the VM's file system at boot using MMDS. We did that with something very similar to cloud-init. It's a very tiny binary in the VM that we have that, when it starts, pulls down the data it needs and then puts it in the right places. It pulls it from that MMDS service and writes it out to disk.
» Sample MMDS PUT Request (MMDS v1)
It's surprisingly easy to use this. Remember how I mentioned the whole REST API and having clean APIs? This isn't the REST API itself, it's MMDS. But this is a simple request — a complete sample one. You can see it's curling the Unix Socket that this is running on and putting some JSON data in there.
If we compare that to what we did, it's not much more complex. We pass Nomad-specific data with Base64-encoded config files and a certificate that it needed to run. That's extremely simple and easy to use, which is cool when we have this all working in action.
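The round trip described here, Base64-encoded files packed into JSON on the host and unpacked by a small agent in the guest, can be sketched in Python. The payload layout, key names, file paths, and function names are all assumptions made for illustration. The transcript does not show the real schema, and sending the JSON to the metadata service over the Unix socket is left out.

```python
import base64
import json
import tempfile
from pathlib import Path


def build_payload(job_name, files):
    """Host side: pack config files into a JSON document for the metadata store."""
    return json.dumps({
        "nomad": {"job": job_name},  # hypothetical key layout
        "files": {
            path: base64.b64encode(content).decode("ascii")
            for path, content in files.items()
        },
    })


def write_files(payload, root):
    """Guest side: decode each file and write it beneath root."""
    written = []
    for rel_path, encoded in json.loads(payload)["files"].items():
        target = Path(root) / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(base64.b64decode(encoded))
        written.append(target)
    return written


# Placeholder contents; a real payload would carry rendered templates and a cert.
payload = build_payload("wasmcloud-host", {
    "local/host.toml": b"log_level = 'info'\n",
    "secrets/cert.pem": b"placeholder certificate bytes\n",
})

with tempfile.TemporaryDirectory() as root:
    for path in write_files(payload, root):
        print(path.relative_to(root), path.stat().st_size, "bytes")
```

Base64 matters because the metadata document is JSON, and certificates or binary config would not survive as raw strings.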
» Nomad Configuration for a Task
I dove deep; I'm going to zoom back out a little bit and say what does it take to get this integrated into Nomad? This is the part where I always get a little bit super nerding out and excited — this is it.
All of that crazy stuff I talked about, memory ballooning and file systems — that's all compiled down to five lines of configuration, and you're done. That's all you have to do to use it in a task. You say, here's this kernel image I'm using, and here's this boot disk. That boot disk is a ZFS snapshot that we had to figure out how to make work.
Guess what? They don't have to worry about it. They say this is the one I need to use. And I'm going to use Cilium. We use Cilium underneath for eBPF-based filtering, once again for the multi-tenancy stuff that's going on.
» Nomad Agent Configuration for the Task Driver
We use Cilium, which is a CNI configuration thing. Then even easier if you look at what it takes to configure Nomad to use this custom task driver; it's even less. In this case, it's four lines of code. You say this is the task driver I'm exposing. Here's the config, the pool I want you to use, and then the memory ballooning that I want you to give to it — and you're able to launch it.
I really want to drive home this point: that is incredibly easy. Going back to that story I told about Krustlet, that was a nightmare. There were many different configuration items and bootstrapping things and all sorts of stuff, and all I had to do here was write a binary, put it on the machine, and then say here are my configuration options. Then, in the task, I said I want you to run with these options and the task driver that I told you to run in. That's incredibly powerful.
» What We Learned
First off, writing a driver is surprisingly easy. There is a lot of boilerplate, and the Nomad team has even said that to me. They know that. But that's why they also have the skeleton driver project that has most of that boilerplate in place for you, and it's a great reference point of where to start.
If you're interested in writing your own driver, take a look at the Skeleton Driver project that's out there. You also have these built-in drivers that are good examples. Take a look at QEMU or the Docker drivers. Since I'm guessing most people have at least touched Go, you'll be able to read exactly how that's built.
The other thing is we've learned that Nomad being written in Go makes integrating with other technology fairly straightforward. I mean, you can have your feelings and thoughts on Go. I know what my personal opinion is of it. But it's everywhere, especially in the infrastructure field, and most of HashiCorp's tools are written in Go. So, these tools can all integrate with each other easily.
There were some other lessons we learned as well. When you're writing your own task driver, you have to be careful managing any of your task state. We really wanted to emphasize that specific point because you have to know exactly how you recover from an error. How do you start things? How do you stop things? What data do you need to recover from specific situations or tasks? If you don't know, you end up in a completely invalid state.
So, just be careful to know what state you need to manage. An example of this was that managing our ZFS snapshots was a little bit difficult. We needed to know how to keep track of them and when to clean them up. Because if you don't clean them up, you keep growing your disk space until you fill up (not that that ever happens). We ended up having to figure out how to keep track of this state so we could get rid of each snapshot at the right time.
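One way to picture the state-tracking problem is a small registry that records which snapshot belongs to which task and persists that record so it survives a driver restart. The Python below is a sketch under those assumptions only. It is not how the Cosmonic driver or Nomad's own task recovery works, and `SnapshotRegistry`, its methods, the JSON state file, and the snapshot names are hypothetical.

```python
import json
import tempfile
from pathlib import Path


class SnapshotRegistry:
    """Tracks which snapshot belongs to which task, persisted across restarts."""

    def __init__(self, state_file):
        self.state_file = Path(state_file)
        # Recover whatever a previous run recorded.
        if self.state_file.exists():
            self.snapshots = json.loads(self.state_file.read_text())
        else:
            self.snapshots = {}

    def _save(self):
        self.state_file.write_text(json.dumps(self.snapshots))

    def task_started(self, task_id, snapshot_name):
        self.snapshots[task_id] = snapshot_name
        self._save()

    def task_stopped(self, task_id):
        """Forget the task and return the snapshot that should now be destroyed."""
        snapshot = self.snapshots.pop(task_id, None)
        self._save()
        return snapshot

    def orphans(self, running_task_ids):
        """Snapshots whose task is no longer running: candidates for cleanup."""
        return sorted(
            snap for task, snap in self.snapshots.items()
            if task not in running_task_ids
        )


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "state.json"
    registry = SnapshotRegistry(state)
    registry.task_started("task-a", "pool/vm@task-a")
    registry.task_started("task-b", "pool/vm@task-b")

    # Simulate a driver restart: a fresh registry reloads the same state.
    recovered = SnapshotRegistry(state)
    print("orphaned:", recovered.orphans(running_task_ids={"task-a"}))
    print("destroy on stop:", recovered.task_stopped("task-a"))
```

The point of persisting on every change is that a crash between starting a task and stopping it still leaves enough information to find and remove the snapshot later.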
Once again, I'm going to drive it home for a third, fourth, or fifth time. Working with VMs is very different from containers — so the don't assume containers thing again.
How do you pass environment variables into a virtual machine? We had to solve that by saying we're going to pass them in at runtime. We did that with templates. There are a couple of different things. As you go away from the normal thing of doing something as simple as a container, you might have to take this into account.
The other thing is Cilium. We run Cilium, but we had to put in a lot of work to get it running with Nomad. The work I had helped out Dan with originally was getting this done. We're going to open source that as well.
There was something called a Cilium controller — but it was literally a Kubernetes controller. We tried to fork it and pull out the Kubernetes bits, and that was like removing the skeleton and expecting the body to still work. We had to write it completely from scratch.
Once again, I'm going to sound like a broken record, and I'm sorry, but less Kubernetes and more flexibility. Everyone assumes they're going to run it in Kubernetes. That's not the case. I would also have that as a recommendation for Nomad. Don't assume people are going to run it in Nomad. You want to make sure that it is as flexible as possible.
Let's talk about one last set of items that we learned here. The Wasm ecosystem is going to need schedulers. We adopted Nomad because it makes sense for what we're doing with wasmCloud. But other Wasm projects already have and probably will need to benefit from Nomad's flexibility.
That's a key point that we'll discuss some more in a little bit. But being able to easily extend the scheduler is a hugely compelling argument for anyone trying to look at a system they should use. If you're going to need to extend in any way, Nomad is one of the easiest to do that with.
Then for the hopefully final time, don't assume that containers are the smallest unit of compute. They are not. There are many different things out there that you can use that are not containers, which Nomad makes easy. Remember that that is a lesson we learned and want to pass on.
» The Vault Key-Value Provider
I want to go off into a little bit about Vault. Vault is like a lot of our friends — it keeps our secrets safe. We have this Vault Key-Value provider. The reminder from earlier here is that a capability provider in wasmCloud implements a contract.
We have a community supported —, and it's a first-party-supported contract — called the Key-Value capability contract. Our idea was, what if we give access to Vault secrets inside of wasmCloud? Once again, I'm going to show you the same diagram I showed you before.
You have the key-value counter that I explained, the HGB server, and a Redis key-value store. wasmCloud and Cosmonic allows you to take this Redis key-value store, remove that and swap it in for another key-value store. So, we created an implementation of that contract with a key-value that talks to Vault, and we swapped it out so that it looks like this.
Did you notice that there was no change in the code? If we had done this in the current paradigm, you would've had to get the person who wrote the code to grab the Vault client. They would've had to figure out how to use the Vault client, and you would've had to provision a token for them to use it.
In this case, we had written a bunch of code that stored secrets. For our first draft of doing it, we did this in Redis and a couple of other places. But then we're like no, we need to store these somewhere secure. So, without changing our code, we swapped it out, and it talks to a key-value provider that now is Vault and stores everything in Vault.
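To make the swap concrete, here is a minimal sketch of the pattern in Python. It is not wasmCloud's actual API: the `KeyValue` protocol, both provider classes, and `increment_counter` are hypothetical names, and both providers are in-memory stand-ins that never talk to a real Redis or Vault. The point is only that the business logic depends on the contract, so the backend can change without touching it.

```python
from typing import Dict, Optional, Protocol


class KeyValue(Protocol):
    """Hypothetical stand-in for a key-value capability contract."""

    def get(self, key: str) -> Optional[str]: ...

    def set(self, key: str, value: str) -> None: ...


class RedisLikeProvider:
    """In-memory stand-in for a Redis-backed provider (no real Redis calls)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class VaultLikeProvider:
    """In-memory stand-in for a Vault-backed provider (no real Vault calls).

    The mount prefix is an assumption used to show that storage details
    live inside the provider rather than in the calling code.
    """

    def __init__(self, mount: str = "secret") -> None:
        self._mount = mount
        self._data: Dict[str, str] = {}

    def _path(self, key: str) -> str:
        return f"{self._mount}/{key}"

    def get(self, key: str) -> Optional[str]:
        return self._data.get(self._path(key))

    def set(self, key: str, value: str) -> None:
        self._data[self._path(key)] = value


def increment_counter(store: KeyValue, name: str) -> int:
    """Business logic: it only knows the contract, never the backend."""
    current = int(store.get(name) or "0")
    store.set(name, str(current + 1))
    return current + 1


if __name__ == "__main__":
    # The same function runs unchanged against either provider.
    for store in (RedisLikeProvider(), VaultLikeProvider()):
        increment_counter(store, "visits")
        print(type(store).__name__, store.get("visits"))
```

In a real deployment the provider, not the caller, would hold the Vault token and client, which is what removes that burden from the person writing the business logic.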
» The Benefits
Developers don't care about those things: the token rotation, the Vault policies, all of that stuff. They just don't. They do it because they have to, not because they want to. If you can take that away and make it easier, you get to that second idea, which is that the SRE team or the security team can handle those details.
So, you've separated this out. If you're the one who's spinning up the provider for them to use, you can say we're using this Vault policy, these things expire and have to rotate at this given time, and here's where things are stored. You can handle all of that. You don't have to get the devs to use something specific.
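As a rough illustration of that split, the sketch below keeps expiry and storage location inside the provider, configured by whoever operates it. Everything here is hypothetical and in-memory: `ProviderPolicy`, `ExpiringSecretStore`, and `save_api_key` are made-up names, and the expiry logic is a simplified stand-in, not how Vault leases or policies actually work.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class ProviderPolicy:
    """Hypothetical settings owned by the SRE or security team."""

    mount: str          # where secrets are stored
    ttl_seconds: float  # how long a stored value stays valid


class ExpiringSecretStore:
    """In-memory stand-in for a provider that enforces the operator's policy."""

    def __init__(
        self,
        policy: ProviderPolicy,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._policy = policy
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._data: Dict[str, Tuple[str, float]] = {}

    def _path(self, key: str) -> str:
        return f"{self._policy.mount}/{key}"

    def set(self, key: str, value: str) -> None:
        expires_at = self._clock() + self._policy.ttl_seconds
        self._data[self._path(key)] = (value, expires_at)

    def get(self, key: str) -> Optional[str]:
        path = self._path(key)
        entry = self._data.get(path)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop the value so callers must store a fresh one.
            del self._data[path]
            return None
        return value


def save_api_key(store: ExpiringSecretStore, api_key: str) -> None:
    """Developer-side code: no mount paths, TTLs, or tokens in sight."""
    store.set("api-key", api_key)


if __name__ == "__main__":
    store = ExpiringSecretStore(ProviderPolicy(mount="team-a", ttl_seconds=60.0))
    save_api_key(store, "example-value")
    print(store.get("api-key"))
```

Changing the mount or the TTL only means constructing the provider with a different `ProviderPolicy`; `save_api_key` stays the same.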
The developers can leverage the abstraction. They don't need to spin up Vault locally if they're testing this. They can spin it up against Redis; it doesn't matter because it'll behave the same. Then we can use Vault in production.
After that, we can also think of this as a way to increase the adoption of a technology. Say you were trying to bring Vault into somewhere that hadn't used Vault before. This goes for any technology, so substitute whatever technology has been hard for your company to adopt.
You're going to have to go through this whole process of spinning it up and convincing people it's good. You have to do a proof of concept with one team, that one bleeding-edge team that will always try something with you, and then you have to slowly spread it out. It takes forever; it takes years to get people on.
With this system, literally, all you do is swap out the thing. The developers had no idea, and now you're all using Vault. That's a super easy way to adopt critical technology or things that you consider important for security, compliance, or whatever it might be, without even impacting your developers.
All of this part is open source. That is the QR code and the link, in case you're curious and want to check that out. You'll see how simple it was to connect it in from a code standpoint, if you're familiar with Rust, which is what we wrote this in. Then you're able to connect to it from the outside. That's a little bit about Vault and how we're extending it.
» The Future of Nomad and Wasm
First off, I'm going to start at the lowest level, and that's for us at Cosmonic and the maintainers of wasmCloud. First things first, we need to improve the communication for the task driver. Part of this is on us. Part of it is that we hope to improve the information that can get set in the task driver. But we want to be able to communicate with things running inside of our Firecracker VM. That's vsock support, controlling specific processes, those kinds of things.
We also want to be able to pull ZFS snapshots on demand. That involves another technology. It's called Bindle, in case you're curious and want to go down a rabbit hole that we're working on. That's a possibility for how we can pull these things on demand with the specific version exactly how we want them. Those are all things we're doing personally.
» Nomad and the Wasm Community
But then we get to this bigger picture of Nomad and the whole WebAssembly community. Other WebAssembly projects and companies are already using Nomad. A lot of us are. I think that's a clear trend, and it says a lot about what we think of Nomad and why we're all using it.
Keep in mind that that's there, and the interest is there. I bet that anyone else in the WebAssembly space, or anyone running other types of things that are not containers, will likely look at Nomad for the same reasons.
There is lots of potential here for direct Wasm integrations. There have been a couple of experimental ones. But it would be nice to have either an official one or a well-maintained community version of something like the WebAssembly runtime as a task driver.
Just having a direct thing where you can run a WebAssembly module would be incredibly useful. The other thing is non-Linux deployment environments. I did not know that Nomad runs on Windows, but it would be nice to have this run anywhere — and on any architecture.
There have been some cool updates that I learned about today from the Nomad Roadmap talk, where the edge was mentioned, along with the idea that if something disconnects, Nomad shouldn't reschedule it. That's a very good step in that direction.
But we want to go even a little bit further. The idea is that we want to go all the way out to the edge. We plan on bringing our technology, what we're doing with Cosmonic and wasmCloud, to the edge, and we want Nomad to follow us all the way there.
That's talking IoT and edge computing. The tiny datacenters that they call edge. There are three or four definitions of edge. We all know it's a buzzword, but we want to go out to that very end of everything in tiny devices, big devices, and everything in between. We want Nomad to follow us there because it's so good at doing this core job of scheduling what we're doing.
» Summing Up
To finish up, I want to call out a couple of links here that could be useful for people. One of them, the top one, is to the wasmCloud documentation if you want to give that a try. There is also the link to the Firecracker documentation and website. That is, like I said, a whole rabbit hole in and of itself. Highly recommend going to visit that.
Then we also have the WebAssembly site. That's the home of the actual WebAssembly specification and everything else, at webassembly.org. Then we have the links to the task drivers in case you want to check out what's required to do that.
To summarize all this, Nomad is an amazing scheduler. We've absolutely loved using it, and we're looking forward to continuing to use it and showing how it can enable these kinds of use cases.
This is a completely different use case. Hopefully, I didn't completely blow people's minds with the whole WebAssembly thing and distract you from the rest of the talk. But there are many different use cases like this.
We were able to take that and use it in multiple different ways, both with Vault integration and pulling in other HashiCorp tools. It can be any tool you want to pull in using this Contract Driven Development, and we were able to take these lessons and share them for when other people run into the same problems.
I want to say thank you for all this, and if you want to talk, I'll be down here at the base of the stage afterward, and I can chat with you. I also have stickers because everybody loves stickers. Thank you for coming, and I hope you all had a great HashiConf and have a great rest of your time and trip home. Thank you, everyone.