Case Study

Roche's transition from tools to platforms with Terraform and Vault

If you don't control tool sprawl, it will negatively impact security and it's very expensive to fix.

»Transcript

Good afternoon, everyone. Before I start, thank you, Sebastian, for inviting us here to present our journey. My name is Harsha, and I'm a product manager at Roche Informatics. Today, I'm excited to share our journey from tools to platforms and the role of HashiCorp in that journey. If you have been listening to the talks today, we're all moving towards a centralized platform. Today I want to talk about why we are moving there and how we are approaching that.

My background is in software engineering. I did my PhD in software engineering and artificial intelligence. I worked in different areas, starting with DevSecOps, edge computing, artificial intelligence, and now the new kid on the block—platform engineering as well. I also worked in various industries, starting in academia, research centers, and then financial institutions. Now, I work in the pharmaceutical industry.

»State of developer experience—2024

Before starting the actual topic today, I would like to ask, how many of you feel you are using far too many tools for developing software? I know—you are not alone. We are on the same journey.

When I look at it, I feel we are using far too many tools. I started looking into existing research. I stumbled upon some interesting numbers from a 2024 developer experience survey done by a company called Harness.

They surveyed 500 different organizations, asking their development teams what problems they encounter in day-to-day life. The numbers are very interesting. They found that, on average, developers manage 14 different tools to develop their software. They context switch between the tools every day because they use different tools from different vendors to develop their software.

Another thing is that in large organizations, onboarding a new developer takes a hundred days of IT spending. It means configuring tools and then connecting different environments, patching them, and configuring the IDEs.

97% of developers complain that they context switch because the tools are from multiple vendors. Because they don't work out of the box, you need to enable integrations. And it's not one central pane of glass, you need to switch interfaces. To summarize, they have to juggle different tools, which negatively impacts developer experience. 

These metrics show us that too many tools negatively impact developer experience, which in turn slows development. Today, I want to talk about how we approach this problem and how we are trying to solve it.

»A brief overview of Roche 

Maybe some of you know it, and some don't. For over 127 years, we have been developing diagnostics and medicines for a wide range of chronic and life-threatening conditions. We are a leader in healthcare R&D investment. In 2023 alone, we invested CHF13.2 billion into R&D. We have 103,000 employees, just internal employees, working in over 150 countries. I think externals are more than 150,000 employees.

In 2023 alone, our diagnostics instruments conducted 29 billion tests. If I have to say what we do in one sentence, it's that we advance science so that we all have time with the people we love.

»Introducing the Roche developer platform

One might wonder how software fits into all of this. For example, at a super high level, we develop software for our diagnostic and diabetes instruments and our digital health tests—like software—giving health tests to lab technicians and doctors as well.

We have a large on-prem footprint, for example, developing in-house platforms for proprietary LLMs and for our research scientists to conduct their experiments. We also have labs running on edge devices, and how do we connect to them? Software is involved in all of this.

The digital health insights market is growing very large lately, and we want to be a leader in the digital health insights business—to transform into this digital health market.

We try to offer a developer platform for the entire Roche software engineering community so that they can produce software faster. Then we want to build security and compliance into it so they don't need to worry about security. 

Security is a very important piece of developing software. But when you ask software developers, is your software secure, or did you apply scans? Most of them say no. They're not happy to do that because many of the tools give false positives and report a lot of vulnerabilities. That's one reason many software developers don't want to enable scans.

»The Roche developer platform—in numbers

To put in some numbers, at the moment, we have more than 1 billion lines of code in 100,000 Git repositories, and our 20,000 software developer community has run more than 33 million CI/CD jobs.

We have a large software community and try to support all of them with our platform. We support the entire software engineering community with source code and artifact management platforms. And we offer the automation toolchain (the CI/CD toolchain) that they need to accelerate their software development and delivery. We also offer them AppSec tools for delivering secure software.

»Why are we consolidating the tools across the organization?

»Cost savings

A common assumption is that we are consolidating to save cost. Yes, that is true. By consolidating multiple tools, we can save human effort and reduce costs. We save costs for the organization. 

»Security 

We often miss the security impact of the tool problem. Being in medical software and medical device development, secure software is a must. I don't know if any of you are in medical device development, but the FDA is very particular about developing software. They even ask for an SBOM (Software Bill of Materials) when we release a product and need to get approval from the FDA.

»Developer experience 

Tool sprawl can negatively impact developer experience and, in turn, could lead to lower operating margins for the company. Now, I want to show you some numbers on how tool sprawl impacts security and developer experience.

»Tool sprawl can lead to a potential data breach

When we talk about tool sprawl or shadow IT, we often see multiple instances of tools running on a developer laptop or a VM that no one knows about. We see there are a lot of attack surfaces because those tools are not developed as per organization guidelines. No one is scanning them, and no one knows what they're running, which means for a hacker, it's a large attack surface. 

In a 2020 report, Gartner estimated that one-third of all successful cyberattacks come from shadow IT infrastructure. These cyberattacks are very expensive and very common. For example, a report from Statista shows that over 6.41 million data records were leaked in Q1 2023. They're very expensive to fix. Another 2023 figure, again from Statista, says that the average cost of fixing a data breach is $4.45 million. To summarize, if you don't control tool sprawl, it will negatively impact security and it's very expensive to fix.

»Why is it important to empower developers?

A report from McKinsey from 2020 states that improving developer experience or enabling developers in the organization can lead to better business results.

For example, they say companies that empower their software developers achieve 55% higher innovation, 20% higher operating margins, and overall 60% higher shareholder returns. This shows companies must empower developers to get higher returns.

»How are we approaching the consolidation? 

We think tool sprawl is a free-for-all approach. It offers too much freedom for development teams to choose whatever tool they like. I want to compare it to a restaurant analogy.

It's like you go to a restaurant, and when you want to order, you are given a blank sheet of paper and asked to write down what you want. If you compare that to a developer, it's like you have a lot of options. You can choose whatever you want, and it's too much chaos.

But time and again, we noticed development teams want guidance, like having a fixed menu. We want to restrict what developers can choose from and also give them options so that they have a paved path they can use to develop and deliver their software.

»4 tool sprawl scenarios 

  1. We have multiple contracts, so we try to align commercially. We also get a volume discount when working with the vendor because of multiple instances. 

  2. In many cases, we have the same tool from the same vendor running on more than ten different sites. We try to merge them to a central location and then use a platform team to manage that.

  3. We often have tools with the same capabilities. We try to work with our business development teams and get into a binary option so that it's either one or none. 

  4. We acknowledge that there are 20% of cases where we cannot fit all of the development teams. But we want to cover the 80% of our software development community with more centralized platforms so that they can focus more on the development rather than configuring or switching between the tools.

»Our approach to consolidation 

How do we do that? If any of you are working in IT, it's very challenging to convince the business development teams to accept change. They don't want to move from one tool to another because they're used to working with it.

We want to make any other option seem like a lot more work. We want to bake security in and add easy onboarding—and give them autonomy to experiment or innovate. 

We also want to enhance the developer experience. As mentioned earlier, we are also focusing on removing interfaces for them. They get a fixed menu and we offer an internal developer portal to give them a paved path to develop and deliver their software.

»What is HashiCorp's role? 

This was our secret management landscape a few years ago.

Each project used to manage its own secret management tool. And in some cases, there was no secret management at all. We had leaked secrets in Git repositories. We still do, but we reduced it a lot. 

As a regulated industry, security is very important for us and we need to rotate secrets continuously. It was a big challenge because of static secrets. Also, as teams were using different tools, it was very difficult to standardize how often you have to change secrets or follow best practices because teams were not even talking to each other. These are some of the problems that we had a few years ago.

»Managing secrets across the organization 

What have we done to solve this? We started looking at the existing market situation with respect to secret management tools. We want to offer a central platform for managing secrets, but we have diverse customer needs. We do edge and cloud. We have on-prem cloud as well and a lot of different use cases. We cannot offer one platform for each of them. It would become a tool sprawl problem again.

We want to use one solution that can fit most of our needs, and we have done this using Vault Enterprise. The namespace or the multi-tenancy nature of Vault Enterprise solves a lot of our use cases. 
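As a rough illustration of what namespace-based multi-tenancy looks like from a client's point of view, here is a minimal Python sketch. It only builds the URL and headers for reading a KV version 2 secret over Vault's HTTP API, using the `X-Vault-Namespace` header to select a tenant. The address, namespace names, mount, secret path, and token below are hypothetical placeholders, not Roche's actual setup, and the helper name `build_kv_read_request` is made up for this example.

```python
from urllib.parse import quote


def build_kv_read_request(vault_addr, namespace, mount, secret_path, token):
    """Return (url, headers) for reading a KV v2 secret inside one namespace.

    Nothing is sent over the network; this only shows how a single Vault
    cluster can serve several tenants that differ by namespace header.
    """
    # KV v2 reads go through /v1/<mount>/data/<path>.
    url = f"{vault_addr.rstrip('/')}/v1/{quote(mount)}/data/{quote(secret_path)}"
    headers = {
        "X-Vault-Token": token,          # hypothetical token value
        "X-Vault-Namespace": namespace,  # selects the tenant
    }
    return url, headers


# Two hypothetical teams use the same cluster address and the same path,
# but each one is isolated in its own namespace.
for team in ("diagnostics", "digital-health"):
    url, headers = build_kv_read_request(
        "https://vault.example.com", team, "secret", "ci/deploy-key", "example-token"
    )
    print(headers["X-Vault-Namespace"], url)
```

The point of the sketch is that the endpoint stays the same for every team; only the namespace changes, which is what lets one central platform serve many tenants.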

Now, we use Vault across different areas. We use it in on-prem, enterprise, cloud, and edge. One platform for managing secrets across our entire organization. And now, I want to show you how we did it. 

»How our development teams use Vault 

We have Vault Enterprise, and we use it for integrating with all developer platforms. The source code management platform, CI/CD platforms, our observability platforms all use Vault to retrieve secrets or talk to other systems.

All the in-house development solutions, like supporting researchers and machine learning solutions, use Vault for fetching and updating their secrets. Even enterprise software like ERP and SAP systems, and also API systems, store their secrets in Vault so that they can be updated frequently. Earlier, if an API had to be updated, we needed to have downtime. But now we try to dynamically get the secret from Vault so that we reduce the downtime as well.
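One way to picture "dynamically getting the secret" is a small wrapper that re-reads the value after a time-to-live instead of loading it once at startup. The Python sketch below is an assumption about how such a client could look, not a description of Roche's implementation. `RefreshingSecret` and `fake_vault_read` are hypothetical names, and the fetch function stands in for a real Vault read.

```python
import time
from typing import Callable, Optional


class RefreshingSecret:
    """Cache a secret and re-fetch it once the TTL has elapsed."""

    def __init__(self, fetch: Callable[[], str], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch          # stand-in for a call to the secret store
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the demo needs no sleeping
        self._value: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Pick up a rotated value without restarting the application.
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


# Demo with a fake store whose value changes on every read.
reads = []

def fake_vault_read() -> str:
    reads.append(1)
    return f"api-key-v{len(reads)}"

fake_now = [0.0]
secret = RefreshingSecret(fake_vault_read, ttl_seconds=60, clock=lambda: fake_now[0])
print(secret.get())   # first read fetches from the fake store
fake_now[0] = 30.0
print(secret.get())   # still inside the TTL, served from cache
fake_now[0] = 61.0
print(secret.get())   # TTL elapsed, fetched again
```

Because callers always go through `get()`, a rotated secret is picked up on the next refresh rather than through a redeploy, which is the downtime saving described above.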

We also deploy solutions to multi-cloud. Our digital solutions run in multi-cloud. We want to have one central place where we store secrets—even there, we use Vault for managing secrets or for solutions.

Finally, at our edge locations. For example, we have a lot of field service units who want to fix our devices. They usually get a mobile device where they can identify the instrument and find out how to log into it. We even use Vault there. They get access to a web page where they go and see what device it is. That's how they log into the device. Even for our instruments running at the lab, we manage those instruments' secrets using Vault. This is the story of Vault.

»Automation and infrastructure management challenges

Similar to secrets management, our infrastructure management was quite heterogeneous a few years ago. Each project team used to follow different practices. There were never common standards. In some cases, there was no automation at all. This all leads to frustration among the development teams. If you have to wait like one month to request infrastructure, it's so frustrating for development teams. Again, it reduces developer experience.

Also, we often see that when companies start migrating to cloud, they just do lift and shift. They don't even think about the TCO of running an application in an on-premises cloud. It was very difficult for us to control our cloud costs. This was an interesting problem for us to solve as well. The other question was: how do we give teams the infrastructure they need so they have the same chance to experiment and innovate as quickly as possible?

»How Terraform solved these challenges

These were our challenges with infrastructure management a few years ago. We solved this problem by starting to use Terraform quite heavily across the organization. Now we use Terraform for provisioning on-prem, cloud, and even edge resources. 

Terraform is a foundational layer for our secure cloud platform, where we run all of our digital solutions. We use the ephemeral sandbox from Terraform Cloud (now called HCP Terraform) as well so that we can quickly spin up and destroy the cloud environments. 

Using automation and Terraform heavily for onboarding our customers in various parts of their journey helped us to accelerate software delivery and also made them happy because now they don't need to wait for months to get access to infrastructure. And finally, I think cloud cost and security have improved a lot because of these things.

We use it across on-prem. For example, we have on-prem cloud from different sources, like VM providers or virtualization providers. We use Terraform there to manage on-prem infrastructure and multi-cloud. We even use Terraform for configuring our infrastructure at edge locations.

»Measuring Terraform’s value 

You might ask, are we able to quantify the impact? We have over 200 development teams across the globe using HashiCorp Vault. Without one central platform, we estimate that we would have at least ten open-source Vault instances with different teams managing them in different locations, because we run in over 150 countries and we have IT operations in more than ten countries. We estimate that we save around $1.5 million annually in terms of development and maintenance efforts by using one managed Vault from a central location.

It increased the security posture of our organization by 30% because earlier, we used to have secrets shared over Slack or even Gmail. Even now, many of the development teams store the secrets in a notepad or something. Now it has changed a lot. The majority of the teams have moved to Vault, and we rotate the secrets regularly. We enforce organization-wide guidelines for rotating secrets, and it is growing every day.
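To make the idea of an organization-wide rotation guideline concrete, here is a small Python sketch that flags secrets whose last rotation is older than a maximum age. The 90-day limit, the secret names, and the dates are all hypothetical; the talk does not state the actual policy, and in practice the timestamps would come from the secret store's metadata rather than a hand-written dictionary.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical guideline: every secret must be rotated at least every 90 days.
MAX_AGE = timedelta(days=90)


def overdue_secrets(last_rotated, now, max_age=MAX_AGE):
    """Return the sorted names of secrets rotated longer ago than max_age."""
    return sorted(
        name for name, rotated_at in last_rotated.items()
        if now - rotated_at > max_age
    )


# Made-up inventory for illustration only.
inventory = {
    "ci/deploy-key": datetime(2024, 5, 1, tzinfo=timezone.utc),
    "erp/service-account": datetime(2024, 1, 1, tzinfo=timezone.utc),
}
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(overdue_secrets(inventory, now))
```

A check like this is only possible once secrets live in one place; with each team running its own tool, there is no single inventory to audit.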

How is it going with Terraform? Every day, we deploy thousands of resources across multiple clouds. We have a multi-cloud strategy; we deploy resources across cloud and on-prem, and we estimate that given the amount of automation we put in Terraform, each month, we save around 15-20% of our development team effort.

»What's next for us? 

We want to focus on consolidating heavily in the next two years. We have consolidated with respect to the HashiCorp stack because the multi-tenancy nature was good for us, but that's not true for other platforms.

Now, we are trying to consolidate the landscape and then focus heavily on building our internal developer portal. We want to utilize Terraform heavily there as well to automate the paved path for the developers. That's all I want to say.

To summarize, a developer's time is very precious. If they have too many tools, a lot of their precious time goes to context switching. Try to reduce the number of tools and move towards common platforms or a centralized platform to improve developer experience and also enhance productivity for the organization. Thank you.
