Traefik Proxy 2.6 and HTTP: Trust, Verification, and the Future
As wild and crazy as the internet is, the underlying infrastructure and protocols that make it work evolve at a snail’s pace. Or maybe instead of a snail, whose motion is slow and steady, they evolve at the pace of a sailing stone. If you watch one of these stones over a period of years, you’ll never see it move, despite there being a long trail of movement behind it. Then one year, when conditions are right, the rock can slide for tens of meters over several days.
Curiously, stability doesn’t mean that something always works. If you look at the console of the web browser on your computer, you’ll find that every website throws a myriad of errors, and it’s HTTP’s resilience that weathers the issues and (usually) renders usable content.
HTTP/2, you old dog
The HTTP/2 standard was created to solve shortcomings in HTTP/1.1, and the primary benefits are in speed. It multiplexes many requests over a single connection, transfers data in binary, and uses header compression to reduce overhead. Since its publication as RFC 7540 in 2015, it has powered the massive growth in internet consumption that we saw in the second half of the last decade.
After years out in the wild, we might expect HTTP/2 to be stable, but as one of our users discovered, not everyone’s implementation of HTTP/2 works properly. In their case the problem was with an AWS Network Load Balancer (NLB) that sat between Traefik Proxy and the backend servers. Traefik Proxy communicated through the NLB, and when our user rotated out the backend servers, the NLB failed to terminate connections that Traefik Proxy held open. The result was that Traefik Proxy kept trying to communicate with servers that had moved on, and this showed up as timeouts, errors, and other ugliness.
The Kubernetes project had already encountered this problem in a different scenario. As a result of their work, Go’s http2 library now supports two health check timeouts for communication with HTTP/2 backends. One is a read idle timeout: if no frame arrives on a connection for that long, the library sends a health check ping. The other is a ping timeout: if the ping goes unanswered for that long, the connection is considered dead. Traefik Proxy 2.6 taps into these checks for HTTP/2 connections, and when one of them fails, it tears down the connection and rebuilds it on the next request.
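To make the interplay of the two timeouts concrete, here is a minimal Go model of the decision logic. It is a sketch under stated assumptions, not the code inside Go's http2 package or Traefik Proxy: the `conn` type, the `check` function, and the timeout values are all hypothetical. In the real library the corresponding knobs are, as far as I know, the `ReadIdleTimeout` and `PingTimeout` fields on `http2.Transport` in golang.org/x/net/http2.

```go
package main

import (
	"fmt"
	"time"
)

// Illustrative values only; these are not Traefik Proxy defaults.
const (
	readIdle    = 10 * time.Second // silence after which a ping is sent
	pingTimeout = 2 * time.Second  // how long to wait for the ping reply
)

// action is what the health check decides to do with a connection.
type action string

const (
	keepOpen  action = "keep"
	sendPing  action = "ping"
	closeConn action = "close"
)

// conn is a hypothetical snapshot of one HTTP/2 connection.
type conn struct {
	idleFor     time.Duration // time since any frame was last read
	pingPending bool          // a health check ping is awaiting its reply
	pingAge     time.Duration // time since that ping was sent
}

// check applies the two timeouts in order: an unanswered ping closes the
// connection, and a silent connection with no ping in flight gets one.
func check(c conn) action {
	switch {
	case c.pingPending && c.pingAge >= pingTimeout:
		return closeConn
	case c.pingPending:
		return keepOpen // still waiting for the reply
	case c.idleFor >= readIdle:
		return sendPing
	default:
		return keepOpen
	}
}

func main() {
	fmt.Println(check(conn{idleFor: 5 * time.Second}))
	fmt.Println(check(conn{idleFor: 10 * time.Second}))
	fmt.Println(check(conn{idleFor: 12 * time.Second, pingPending: true, pingAge: 2 * time.Second}))
}
```

The point of the second timeout is that a silent connection is not necessarily a dead one. Only a ping that goes unanswered tells the client that something in the path, such as a load balancer, has stopped forwarding traffic.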
Six years is a lifetime in internet time, yet problems like this still exist in how vendors implement the standard. (There’s irony in that statement, that a standard could be subject to interpretation.) Regardless, the issues that HTTP/2 solves aren’t the issues that are most prevalent today, so people are already turning their eyes toward the next big thing: HTTP/3.
HTTP/3 saves the day
Although still an internet draft, HTTP/3 is already supported by 73% of internet browsers and 24% of the top 10 million websites. HTTP/3 runs on top of QUIC, a transport-layer protocol built on UDP that provides HTTP/2-style multiplexed streams and aims for better performance than TCP. (QUIC was published as RFC 9000 in May 2021.)
QUIC does smart things to optimize for speed and performance. For example, it recognizes that most websites now use TLS encryption and bakes in TLS support as part of the initial handshake. This eliminates the overhead of negotiating a security protocol after setting up the connection, since the client has already received everything it needs to set up an encrypted connection.
QUIC uses UDP instead of TCP and shifts error handling and flow control into userspace instead of relying on the kernel to handle it as part of the communication stream. QUIC also includes a connection identifier in every packet, so if a client moves between networks (such as moving from a home wireless network to a cellular network), communication with the server will continue without connections timing out and being recreated.
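The connection identifier idea can be illustrated with a small Go sketch. This is a deliberately simplified model, not a QUIC implementation: the `server` and `session` types and the `receive` method are hypothetical, and real QUIC also validates the new network path before migrating a connection. The sketch only shows why keying sessions by connection ID, rather than by source address, lets a session survive a change of network.

```go
package main

import "fmt"

// session is a hypothetical record of one client's state on the server.
type session struct {
	user       string
	remoteAddr string // last address the client was seen at
}

// server keys its sessions by connection ID instead of by source address.
type server struct {
	byConnID map[string]*session
}

// receive finds the session for an incoming packet using only the
// connection ID, then records the address the packet came from.
func (s *server) receive(connID, fromAddr string) (*session, bool) {
	sess, ok := s.byConnID[connID]
	if !ok {
		return nil, false
	}
	sess.remoteAddr = fromAddr
	return sess, true
}

func main() {
	srv := &server{byConnID: map[string]*session{
		"c0ffee": {user: "alice", remoteAddr: "192.0.2.10:50000"},
	}}

	// The client switches from home Wi-Fi to a cellular network, so its
	// source address changes, but the connection ID stays the same.
	sess, ok := srv.receive("c0ffee", "198.51.100.7:41000")
	fmt.Println(ok, sess.user, sess.remoteAddr)
}
```

With TCP, a connection is identified by its source and destination addresses and ports, so the same move would look like a brand-new client to the server.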
Traefik Proxy shipped with experimental support for HTTP/3 in 2.5, and it worked out of the box for every orchestrator except Kubernetes. Kubernetes users had to jump through some extra hoops, and before I explain how Traefik Proxy 2.6 makes that process less painful, let me first explain how HTTP/3 negotiation works and where the problems were.
HTTP/3 is an opt-in protocol. When the browser makes its first connection, it does so over TCP, using HTTP/2. The server responds, and if it supports HTTP/3, it includes a header that tells the browser the port to use to continue with HTTP/3. If the browser wants to continue with HTTP/3, then it makes subsequent connections to the advertised port.
Note: I’m phrasing this with consideration of the browser’s desires because communicating over HTTP/3 depends on a number of factors that only the browser can consider, like cached content and previous server communication status. If your browser refuses to use HTTP/3, try restarting it and see if it behaves better.
The response header from Traefik Proxy looks like this:
alt-svc: h3=":8443"; ma=2592000,h3-29=":8443"; ma=2592000
The port is usually the same port as the HTTP/2 TLS port (443), but it doesn’t have to be. Traefik Proxy in Kubernetes listens on 8000 for HTTP and 8443 for HTTPS, and Traefik Proxy 2.5 announced 8443 as the HTTP/3 port. We did this because some external load balancers don’t allow the same port to have two protocols, so all you would have to do is add a UDP listener on your load balancer, on port 8443.
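If you want to inspect what a server advertises, the header is easy to pick apart. The following Go sketch parses the header value shown above into protocol, port, and max-age. It is a simplified parser written for this post, with hypothetical names (`altService`, `parseAltSvc`); it does not handle the special `clear` value or every corner of the Alt-Svc grammar.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// altService is one entry of an Alt-Svc header value.
type altService struct {
	Protocol string // for example "h3" or "h3-29"
	Host     string // empty means the same host as the origin
	Port     int
	MaxAge   int // seconds, from the "ma" parameter; 0 if absent
}

// parseAltSvc splits the header on commas, then each entry on semicolons.
// The first part is protocol="host:port"; the rest are parameters.
func parseAltSvc(header string) ([]altService, error) {
	var out []altService
	for _, entry := range strings.Split(header, ",") {
		parts := strings.Split(entry, ";")
		proto, authority, ok := strings.Cut(strings.TrimSpace(parts[0]), "=")
		if !ok {
			return nil, fmt.Errorf("malformed entry %q", entry)
		}
		authority = strings.Trim(authority, `"`)
		i := strings.LastIndex(authority, ":")
		if i < 0 {
			return nil, fmt.Errorf("no port in %q", authority)
		}
		port, err := strconv.Atoi(authority[i+1:])
		if err != nil {
			return nil, fmt.Errorf("bad port in %q: %w", authority, err)
		}
		svc := altService{Protocol: proto, Host: authority[:i], Port: port}
		for _, p := range parts[1:] {
			key, val, _ := strings.Cut(strings.TrimSpace(p), "=")
			if key == "ma" {
				if svc.MaxAge, err = strconv.Atoi(val); err != nil {
					return nil, fmt.Errorf("bad ma in %q: %w", entry, err)
				}
			}
		}
		out = append(out, svc)
	}
	return out, nil
}

func main() {
	svcs, err := parseAltSvc(`h3=":8443"; ma=2592000,h3-29=":8443"; ma=2592000`)
	if err != nil {
		panic(err)
	}
	for _, s := range svcs {
		fmt.Printf("%s on UDP port %d, cache for %d seconds\n", s.Protocol, s.Port, s.MaxAge)
	}
}
```

The `ma` value of 2592000 seconds works out to 30 days, which is how long the browser may remember the alternative service before asking again.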
This was enough for everything but Kubernetes. Kubernetes is special.
Usually, a reverse proxy is exposed via a LoadBalancer Service, and Kubernetes doesn’t allow a LoadBalancer Service to use two different protocols, even on different ports. An issue for this was created back in 2016, and the actual change is being tracked in a Kubernetes Enhancement Proposal (KEP). It was scheduled to go into every release starting with 1.18, but for one reason or another, it never made it in. Now it’s targeting 1.24, due out in April 2022.
Who ever said the world of cloud native technology wasn’t rich with drama?
Until the PR for this KEP gets merged, if you want to use HTTP/3 with Kubernetes, you’ll have to find a way to get UDP traffic from your external load balancer back to your hosts and into Traefik Proxy. There are a few ways that you can do this, depending on where you’re running Kubernetes and how you have Traefik Proxy deployed.
Cloud environments
If you convert the Traefik Service from a LoadBalancer Service to a NodePort Service, you’ll lose the cloud load balancer that Kubernetes deployed. You can then manually deploy a cloud load balancer that allows UDP and TCP traffic on the same port (such as a Network Load Balancer in AWS), or if your cloud provider doesn’t offer a solution, you can configure a different UDP port for the HTTP/3 traffic.
Kubernetes allows NodePort and ClusterIP Services to have both TCP and UDP ports, so you would just add the configuration for the HTTP/3 port when making the change. We have a detailed guide on this planned for next week, so make sure to keep an eye on the Traefik Labs Blog.
On-prem environments
If you’re on-prem, consider using a load balancer solution like MetalLB or a Kubernetes distribution like K3s, which contains its own service load balancer called klipper-lb. Both MetalLB and klipper-lb can run multiple LoadBalancer Services behind a single IP, as long as the ports and/or protocols are different.
You’d create two LoadBalancer Services, one for the TCP traffic and the other for the UDP traffic, both pointed at Traefik Proxy. The ports would all listen on the same external IP and correctly route back to Traefik Proxy.
Traefik Proxy 2.6 is flexible
We thought that the KEP would be merged in 1.22, but when we discovered that it had been pushed out, we enabled a configuration option to advertise the port where HTTP/3 is listening:
--experimental.http3=true
--entrypoints.websecure.http3
--entrypoints.websecure.http3.advertisedPort=443
Now you can have Traefik Proxy announce any HTTP/3 port you can configure on your external LoadBalancer, and when you look in the browser console, you’ll see that your site is served over HTTP/3.
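The idea behind an advertised port is that the port announced in the response header is decoupled from the port the proxy actually listens on. The Go sketch below shows that idea with a small middleware. It is an illustration only and not how Traefik Proxy implements the option: `altSvcValue` and `advertiseHTTP3` are hypothetical names, and the `ma` value is copied from the header shown earlier.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// altSvcValue builds the Alt-Svc value for the port clients should use,
// which may differ from the port the server itself listens on.
func altSvcValue(advertisedPort int) string {
	return fmt.Sprintf(`h3=":%d"; ma=2592000`, advertisedPort)
}

// advertiseHTTP3 wraps a handler so every response carries the header.
func advertiseHTTP3(advertisedPort int, next http.Handler) http.Handler {
	value := altSvcValue(advertisedPort)
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Alt-Svc", value)
		next.ServeHTTP(w, r)
	})
}

func main() {
	hello := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "hello")
	})

	// Simulate one request: the proxy might listen on 8443 inside the
	// cluster while the external load balancer exposes UDP port 443.
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "https://example.com/", nil)
	advertiseHTTP3(443, hello).ServeHTTP(rec, req)

	fmt.Println(rec.Header().Get("Alt-Svc"))
}
```

Whatever port you advertise, the external load balancer still has to forward UDP traffic on that port back to the port where the proxy is listening.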
Trust, but verify
Both of these situations highlight the danger of blind trust and what I’m calling the blindfold of expectation.
As humans, we expect things to be a certain way, and we trust that others who hold the same beliefs will act upon them in the same way. If I trust you, and you and I believe the same things, then it’s reasonable to presume that you will act a certain way. This is one of the shortcuts that our brains use to model the world around us.
Unfortunately, that trust and those expectations can lead us to a place where we might not notice that reality is quite different.
In our case, we had an expectation that Amazon’s NLB would correctly shut down TCP connections when the backends were decommissioned because that’s the correct (and obvious) thing to do. It doesn’t, but now we’ll detect that behavior in NLBs and any other device connected to Traefik Proxy and close those connections ourselves.
We also had an expectation that Kubernetes 1.22 would include support for mixed protocol LoadBalancer Services, and it didn’t. We can’t control the timeline of that project, but we can make Traefik Proxy announce any port that you can configure on your edge. Until Kubernetes catches up, that will solve the issue for any HTTP/3 environment where Traefik Proxy is running.
It’s good to trust the world in which you live. If you didn’t, life would be a lot more challenging than it already is. Sometimes, though, it’s important to peek out from under the blindfold and make sure you’re walking where you think you are. If you find that things don’t look the way they should, correct your course and continue on your way.
Ready to try out Traefik Proxy 2.6? Learn how you can easily get started today and join our upcoming webinar where we'll explore everything new in Traefik Proxy 2.6.