Roadmap Preview: What’s Next for Consul Service Mesh
Consul provides a multi-cloud service networking layer to connect, configure, and secure services. For dynamic applications and infrastructure, it provides a distributed service mesh to securely connect services across any runtime platform and cloud.
What sets Consul apart from other service meshes is that, regardless of where, or on what you are running your workload, Consul allows you to link together all of your applications in a single dedicated network layer. Consul is also multi-datacenter capable; it gives you the ability to link regions, clouds, and hybrid environments.
This post is part of a multi-part blog series introducing the core use cases and capabilities provided by Consul. Throughout the series, you will learn about the benefits of the service mesh, whether you should adopt it for your projects, and how you can apply it.
The previous blog post published at initial release talked about the security side of the service mesh problem and how it gives users a simple way to manage service identity through automatically configured mutual TLS and to manage service level segmentation with intention-based security policies.
This blog previews a set of exciting new features which expand the service mesh capabilities of Consul:
- Observability. Configurable metrics and tracing for sidecar proxies.
- ACL Auth Methods. Enables applications to use trusted third-party identities like Kubernetes Service Accounts to automatically obtain ACL tokens with the correct permissions.
- Centralized Configuration for Proxies. Centrally manage proxy configuration like observability, routing and traffic shifting.
- HTTP Path-Based Routing. Advanced path-based routing for sidecar proxies. Traffic can be routed to different upstreams based on the HTTP request path.
- Traffic Shifting. Weighted service selection for sidecar proxies. This feature enables Canary Testing and other weighted load balancing patterns.
- Gateways to Bridge Multiple Clusters. Gateways are a new feature allowing simple and secure connectivity between applications in different datacenters.
- Advanced Failover Options. When the services in a local datacenter are not healthy, Advanced Failover Options allow failover to healthy services in different datacenters.
- UI Enhancements. New service and sidecar proxy views, automatic refreshing of service state and external dashboard links.
- Proxy Ecosystem Support. New plugin interface which opens support for more proxy vendors such as HAProxy, and community-contributed proxies.
Our first release will include Observability and the ACL Auth Method for Kubernetes. Other features mentioned in this post will be included in subsequent releases. We will discuss each key feature in detail in the rest of the series.
» Observability
When using a service mesh for service-to-service communication, all traffic in and out of a service flows through a sidecar proxy. The proxies form a data plane which transports requests, while Consul Connect acts as a control plane that ensures all proxies are correctly configured and respond to dynamic changes in your workloads and network. This approach also allows the externalization of reliability patterns; the data plane can manage retries, timeouts, circuit breaking, etc. This improves the observability of your application because metrics related to networking and reliability, which were previously emitted by many different applications, are now emitted in a consistent way by the data plane.
Metrics such as the number of connections, bytes transferred, retries, timeouts, open circuits, and request rates, broken down by HTTP or gRPC response codes, are transparently emitted to a time series database like Prometheus, or software as a service platforms such as DataDog. In addition to metrics, Consul Connect allows you to configure distributed tracing which allows developers to visualize call flows through the services in the application; this helps you to pinpoint failures and identify latency.
From version 1.5, Consul will support configuration of metrics and distributed tracing at layer 4, and at layer 7 for HTTP and gRPC.
» ACL Auth Methods
In an unsecured system, if an attacker is able to bypass your perimeter firewall and has the ability to communicate with Consul, they would be able to register fake services and reconfigure intentions. Access Control List (ACL) tokens restrict access to only the Consul data and APIs which a client needs. While ACLs are incredibly important for securing Consul data, they are essential when using Consul Connect. Consul Connect ensures the identity of your service, and this allows you to restrict network access between services. It is crucial to ensure that a service identity can only be obtained by the intended service. In the event of a breach, ACL tokens limit the blast radius by restricting the identity a service can obtain and the operations an attacker can execute on your Consul cluster.
The main complexity users face when setting up such a system is how to create and distribute a valid ACL token for a service, one which has the correct access to register itself with the service catalog and to obtain the mTLS certificates used to secure traffic. To simplify this problem, Consul is introducing ACL auth methods. ACL auth methods enable applications to use a trusted third-party identity like Kubernetes Service Accounts to obtain an ACL token with the correct permissions automatically.
The first auth method will be for Kubernetes which allows a service to obtain a valid ACL token using Kubernetes Service Accounts. ACL Auth Methods enhance the excellent user experience of running Consul Connect on Kubernetes and vastly simplify the operational approach to securing services.
» Centralized Configuration for Proxies
With the existing Consul data model, configuration for a service is managed by the local Consul agent. This tightly couples configuration with the deployment of an application. For services which only register with Consul’s service catalog, this does not pose too much of a problem; however, with advanced service mesh configuration there needs to be independence between service configuration and application deployment.
For example, using the current data model, should you need to change the traffic shifting settings, then the application would need to be re-deployed with the updated configuration. Another challenge is globally configuring observability settings for your services. With centralized configuration, you can manage this from a central location and the agents listen for config changes in the Consul server. When the configuration is updated, all proxies referencing this configuration automatically reload their settings without needing to be redeployed.
The following example shows a snippet of Centralized Configuration which configures a DogStatsD collector for observability metrics.
proxy.hcl
kind = "proxy-defaults"
name = "global"
config {
  envoy_dogstatsd_url = "udp://127.0.0.1:9125"
}
This configuration can be written to the Consul server using the command consul config write proxy.hcl. Whenever a new Connect service is registered, the configuration is automatically appended to any service-specific configuration.
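As a rough illustration of how global defaults and service-specific settings could combine, the sketch below merges two plain dictionaries. It is not Consul's implementation: the function name `effective_proxy_config` is hypothetical, and the rule that a service-specific key overrides a global one is an assumption made for the example.

```python
from typing import Any, Dict


def effective_proxy_config(
    global_defaults: Dict[str, Any], service_config: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine global proxy defaults with one service's own settings.

    Assumption for this sketch: when both define the same key,
    the service-specific value wins.
    """
    merged = dict(global_defaults)  # copy so the shared defaults are not mutated
    merged.update(service_config)   # service-specific keys override defaults
    return merged


# Hypothetical inputs: the global entry from proxy.hcl plus a per-service setting.
defaults = {"envoy_dogstatsd_url": "udp://127.0.0.1:9125"}
web_config = effective_proxy_config(defaults, {"protocol": "http"})
```

Here `web_config` carries both the globally defined collector URL and the service's own `protocol` key, while `defaults` is left unchanged.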
» HTTP Path-Based Routing
Currently, the upstream configuration in Consul Connect only allows basic layer 4 networking; the data plane exposes upstream services to your application via local ports. A new feature of Consul introduces advanced layer 7 routing; traffic can be routed to different upstream services based on the request path or an HTTP header.
While a 1-1 relationship between a local port and upstream service is fine for many purposes, the capability for HTTP routing makes it easy to build aggregated services such as API gateways or to replace centralized load balancers with a smarter decentralized approach.
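To make the idea of path-based routing concrete, here is a minimal sketch of longest-prefix matching on the request path. It is an illustration only, not how Consul or Envoy implement routing; the route table, the service names, and the function `pick_upstream` are all hypothetical.

```python
from typing import Dict, Optional

# Hypothetical route table: path prefix -> upstream service name.
ROUTES: Dict[str, str] = {
    "/api/payments": "payments",
    "/api": "api",
    "/": "web",
}


def pick_upstream(path: str, routes: Dict[str, str] = ROUTES) -> Optional[str]:
    """Return the upstream whose prefix is the longest match for the path."""
    matches = [prefix for prefix in routes if path.startswith(prefix)]
    if not matches:
        return None  # no route applies to this path
    return routes[max(matches, key=len)]
```

With this table, a request for `/api/payments/42` goes to the `payments` upstream, `/api/users` goes to `api`, and anything else falls through to `web`, which is the kind of aggregation an API gateway performs.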
» Traffic Shifting
Traffic Shifting allows you to route a percentage of traffic to tagged instances of a service or to completely different services. Traffic shifting makes Canary Testing straightforward: rather than a big bang release, you can send a percentage of all traffic to the new version of the application and monitor it to ensure acceptable performance. As confidence grows in the new version, the percentage of traffic is increased until it reaches 100%, at which point the old version can be retired.
Consul Connect will enable full control over the configuration of Traffic Shifting. You will also be able to combine HTTP Path-Based Routing and Traffic Shifting to create advanced routing models.
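The weighted selection behind a canary rollout can be sketched in a few lines. This is a simplified model rather than Consul's or Envoy's implementation; the version names `v1` and `v2`, the percentages, and the function `pick_version` are hypothetical.

```python
import random
from typing import Dict


def pick_version(weights: Dict[str, int], rng: random.Random) -> str:
    """Pick a target for one request according to percentage weights.

    `weights` maps a version name to its share of traffic; the shares
    must add up to 100.
    """
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    names = list(weights)
    # random.choices performs a weighted draw; k=1 returns a one-item list.
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


# Hypothetical canary split: 90% of requests to v1, 10% to v2.
rng = random.Random(42)
sample = [pick_version({"v1": 90, "v2": 10}, rng) for _ in range(10_000)]
canary_share = sample.count("v2") / len(sample)
```

Over many requests `canary_share` settles close to 0.10. Raising the `v2` weight step by step until it reaches 100 models the gradual rollout described above.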
» Gateways to Bridge Multiple Clusters
There can be a great deal of complexity when an application spans multiple platforms or environments and you need to link these different services together. Spanning different clouds, cloud to on-premise and connections between a modern scheduler and virtual machines require careful network and security configuration. This configuration ensures that subnets between the two environments do not overlap, and network segmentation is enforced.
Traditionally the approach would be to connect the different environments using a point to point VPN connection; however, this adds complexity in network planning and correctly configuring and operating the VPN tunnels.
Gateways are a new feature of Consul Connect which simplify this problem by allowing secure connectivity between platforms and environments. When Service A needs to communicate with Service B which is located in a different physical datacenter, the traffic is proxied securely through the gateway which then forwards it to the destination service.
Because traffic between the datacenters is proxied by the Connect gateway, the local service does not need a direct connection to the destination service. In addition, system administrators do not need to worry about overlapping subnets or network address translation.
For improved security and performance, Gateways do not have access to the destination service's certificate or keys. Gateways take advantage of Server Name Indication (SNI), which is an extension to the TLS protocol. With SNI, the client sends the destination hostname as part of the TLS handshake. The Gateway peeks at the first message of the handshake and reads the SNI extension; this allows it to determine the destination without having to decrypt and then re-encrypt the traffic. It then transparently proxies the encrypted traffic to the destination at L4. This approach allows the client to have a real end-to-end TLS connection; even if the Gateway host is compromised, the traffic flowing over it cannot be decrypted.
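The SNI peek can be illustrated with a small parser that reads the server name out of the first TLS record without touching any key material. This is a sketch under stated assumptions, not the gateway's actual code: the function name `sni_from_client_hello` is hypothetical, and it assumes the whole ClientHello arrives in a single TLS record.

```python
import struct
from typing import Optional


def sni_from_client_hello(record: bytes) -> Optional[str]:
    """Return the SNI hostname from a TLS ClientHello, or None if absent."""
    try:
        # Record header: content type (1), version (2), length (2).
        # Content type 0x16 is "handshake".
        if len(record) < 9 or record[0] != 0x16:
            return None
        # Handshake header: message type (1), length (3). Type 1 is ClientHello.
        if record[5] != 0x01:
            return None
        pos = 9
        pos += 2 + 32                 # client version and random
        pos += 1 + record[pos]        # session id (1-byte length prefix)
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len         # cipher suites (2-byte length prefix)
        pos += 1 + record[pos]        # compression methods (1-byte length prefix)
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:         # extension 0 is server_name
                # Layout: list length (2), name type (1), name length (2), name.
                name_type = record[pos + 2]
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                name = record[pos + 5 : pos + 5 + name_len]
                if name_type != 0 or len(name) != name_len:
                    return None
                return name.decode("ascii")
            pos += ext_len            # skip any other extension
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                   # truncated or malformed input
    return None
```

A proxy built along these lines would look the returned hostname up in its own table of destinations and then copy the remaining bytes through unchanged, so the TLS session still terminates at the destination service.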
Connect gateways dramatically simplify multi-datacenter network communication, service discovery, and security, reducing the need for complex VPN and network configurations.
» Advanced Failover Options
Service discovery and health checking allow you to maintain a catalog of healthy services to which upstream traffic can be routed. Related services are usually located in the same datacenter; however, it is not uncommon for large environments to have services spread across multiple datacenters and even multiple clouds. If no instances of a service are healthy in the local datacenter, its traffic needs to be routed to a different datacenter.
Advanced failover options allow you to configure multi-datacenter failover without the need for writing and configuring prepared queries.
Advanced Failover dramatically simplifies the configuration of high availability for services in multi-datacenter environments, enabling regional, and even cloud failover.
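The failover decision itself is simple to model: prefer the local datacenter and fall back through an ordered list when it has no healthy instances. The sketch below is illustrative only; the datacenter names, the health counts, and the function `pick_datacenter` are hypothetical and do not reflect Consul's configuration format.

```python
from typing import Dict, List, Optional


def pick_datacenter(
    local: str, failover_order: List[str], healthy_instances: Dict[str, int]
) -> Optional[str]:
    """Return the first datacenter, local first, with a healthy instance."""
    for dc in [local, *failover_order]:
        if healthy_instances.get(dc, 0) > 0:
            return dc
    return None  # nothing healthy anywhere


# Hypothetical state: the local datacenter dc1 has no healthy instances.
target = pick_datacenter("dc1", ["dc2", "dc3"], {"dc1": 0, "dc2": 0, "dc3": 4})
```

In this example `target` is `dc3`, because both `dc1` and `dc2` report no healthy instances. Once `dc1` recovers, the same call returns `dc1` again.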
» UI Enhancements
We are introducing many enhancements to the Consul UI, improving the way service instances and their associated Consul Connect proxies are displayed, adding configurable links to external dashboards, and automatic refreshing.
Selecting a service now shows all the instances in a list view, and you can drill down into an individual service instance to view its checks and tags. When a service has an associated sidecar proxy, this is now directly navigable. Sidecar proxies also have a brand new view which allows you to inspect their health and see the configured upstreams.
» Proxy Ecosystem Support
At present, Consul Connect supports two proxies: the built-in L4 proxy and Envoy. Proxy ecosystem support intends to open up this capability to other proxies. We have defined a plugin interface which allows more proxy vendors such as HAProxy, as well as community-contributed proxies, to integrate with Consul Connect. Proxy plugins will be maintained in a separate codebase, similar to how Terraform plugins work. They will not be baked into the main consul binary, but they will be simple to download and configure.
To see how HAProxy integrated with Consul Connect, check out their talk from HashiConf 2018: https://www.youtube.com/watch?v=GeU-7sgukRU
» Summary
We hope you find these new features useful; they will be rolled out in the next few releases, starting with Observability, Centralized Configuration for Proxies, UI enhancements, and the ACL Auth Method for Kubernetes in Consul 1.5. HTTP Routing, Traffic Shifting, and Advanced Failover options are coming in subsequent releases in early summer.
In the next article in the series, we will look at how to monitor networking performance and improve reliability through Observability.