Version: 2.21

Telepresence and VPNs

Telepresence creates a virtual network interface (VIF) when it connects. This VIF is configured to route the cluster's service and pod subnets so that the user can access resources in the cluster. It's not uncommon that the workstation where Telepresence runs already has network interfaces that route subnets that will overlap. Such conflicts must be resolved deterministically.

Unless configured otherwise, Telepresence will resolve subnet conflicts by moving the cluster's subnet out of the way using network address translation. For a majority of use-cases, this will be enough, but there are some caveats to be aware of.

For more info, see the section on avoiding the conflict below.

VPN Configuration

Let's begin by reviewing what a VPN does and imagining a sample configuration that might conflict with Telepresence. Usually, a VPN client adds two kinds of routes to your machine when you connect. The first overrides your default route; in other words, it makes sure that packets you send to the public internet go through the private tunnel instead of your ethernet or wifi adapter. We'll call this a public VPN route. The second kind is a private VPN route. These are the routes that allow your machine to access hosts inside the VPN that are not reachable from the public internet. Generally speaking, these routes are narrower and connect your machine only to hosts on the private network, such as your Kubernetes API server.

This diagram represents what happens when you connect to a VPN, supposing that your private network spans the CIDR range 10.0.0.0/8.

VPN routing
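
On a Linux workstation you can inspect these routes yourself with the ip command. The output below is purely illustrative; the device names (tun0 for the VPN tunnel, wlp2s0 for wifi) and addresses depend entirely on your VPN client and network:

Terminal
$ ip route show
default dev tun0 scope link
10.0.0.0/8 dev tun0 scope link
192.168.1.0/24 dev wlp2s0 proto kernel scope link src 192.168.1.42

Here the first line is the public VPN route, the second is the private VPN route for 10.0.0.0/8, and the third is the ordinary local network route.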

Kubernetes configuration

One of the things a Kubernetes cluster does for you is assign IP addresses to pods and services. This is one of the key elements of Kubernetes networking, as it allows applications on the cluster to reach each other. When Telepresence connects you to the cluster, it will try to connect you to the IP addresses that your cluster assigns to services and pods. Cluster administrators can configure, on cluster creation, the CIDR ranges that the Kubernetes cluster will place resources in. Let's imagine your cluster is configured to place services in 10.130.0.0/16 and pods in 10.132.0.0/16:

VPN Kubernetes config
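
If you want to look these ranges up yourself, one best-effort check on self-managed clusters is to search the API server and controller-manager flags for the configured ranges. On managed clusters (GKE, EKS, AKS, and similar) these flags are usually not visible, and the ranges are found in the provider's console instead:

Terminal
$ kubectl cluster-info dump | grep -oE '(service-cluster-ip-range|cluster-cidr)=[0-9./,]+' | sort -u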

Telepresence conflicts

When you run telepresence connect to connect to a cluster, it talks to the API server to figure out what pod and service CIDRs it needs to map in your machine. If it detects that these CIDR ranges are already mapped by a VPN's private route, it will produce an error and inform you of the conflicting subnets:

Terminal
$ telepresence connect
telepresence connect: error: connector.Connect: failed to connect to root daemon: rpc error: code = Unknown desc = subnet 10.43.0.0/16 overlaps with existing route "10.0.0.0/8 via 10.0.0.0 dev utun4, gw 10.0.0.1"
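
If you want to see which local interface currently owns the conflicting route before deciding how to resolve it, you can ask the operating system for the route of any IP inside the reported subnet (Linux shown below; on macOS, route -n get serves the same purpose). The same technique is shown in more detail in the helpful hints section at the end of this page:

Terminal
$ ip route get 10.43.0.1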

Telepresence offers three different ways to resolve this:

Avoiding the conflict

Telepresence can perform Virtual Network Address Translation (henceforth referred to as VNAT) of the cluster's subnets when routing them from the workstation, thus moving those subnets so that conflicts are avoided. Unless configured not to, Telepresence will use VNAT by default when it detects conflicts.

VNAT is enabled by passing a --vnat flag (introduced in Telepresence 2.21) to telepresence connect. When using this flag, Telepresence will take the following actions:

  • The local DNS-server will translate any IP contained in a VNAT subnet to a virtual IP.
  • All access to a virtual IP will be translated back to its original when routed to the cluster.
  • The container environment retrieved when using ingest or intercept will be mangled, so that all IPs contained in VNAT subnets are replaced with corresponding virtual IPs.

The --vnat flag can be repeated to make Telepresence translate more than one subnet.

Terminal
$ telepresence connect --vnat CIDR
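
For example, to translate both of the illustrative subnets from the Kubernetes configuration above:

Terminal
$ telepresence connect --vnat 10.130.0.0/16 --vnat 10.132.0.0/16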

The CIDR can also be a symbolic name that identifies a well-known subnet or list of subnets:

Symbol    Meaning
also      All subnets added with --also-proxy
service   The cluster's service subnet
pods      The cluster's pod subnets
all       All of the above
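
For instance, to translate only the service and pod subnets while leaving any --also-proxy subnets untouched:

Terminal
$ telepresence connect --vnat service --vnat pods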

Virtual Subnet Configuration

Telepresence will use a special subnet when it generates the virtual IPs that are used locally. On a Linux or macOS workstation, this subnet will be a class E subnet (not normally used for any other purpose). On Windows, class E subnets are not routed, so Telepresence will instead default to 211.55.48.0/20.

The default subnet used can be overridden in the client configuration.

In config.yml on the workstation:

routing:
  virtualSubnet: 100.10.20.0/24

Or as a Helm chart value to be applied on all clients:

client:
  routing:
    virtualSubnet: 100.10.20.0/24
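
Once connected with a custom virtual subnet such as the one above, you can verify that an IP from that subnet is handled by Telepresence rather than the VPN (Linux shown; the Telepresence interface is typically named tel0, but the name can differ):

Terminal
$ ip route get 100.10.20.1

The output should name the Telepresence network interface rather than your VPN device.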

Example

Let's assume that we have a conflict between the cluster's subnets, all covered by the CIDR 10.124.0.0/9, and a VPN using 10.0.0.0/9. We avoid the conflict using:

Terminal
$ telepresence connect --vnat all

The cluster's subnets are now hidden behind a virtual subnet, and the resulting configuration will look like this:

VPN Telepresence

Proxying via a specific workload

Telepresence is capable of routing all traffic destined for a VNAT subnet via a specific workload. This is particularly useful when the cluster's DNS is configured with domains that resolve to loopback addresses. This is sometimes the case when the cluster uses a service mesh that listens on a loopback address and reroutes traffic from there.

The --proxy-via flag (introduced in Telepresence 2.19) is similar to --vnat, but the argument must be in the form CIDR=WORKLOAD. When using this flag, all traffic to the given CIDR will be routed via the given workload.

The WORKLOAD is the deployment, replicaset, statefulset, or argo-rollout in the cluster whose traffic-agent will be used for targeting the routed subnets.
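
For example, to route a single subnet via a hypothetical workload named echo-server (the CIDR here is the illustrative service subnet used earlier on this page):

Terminal
$ telepresence connect --proxy-via 10.130.0.0/16=echo-server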

Example

Let's assume that we have a conflict between the cluster's subnets, all covered by the CIDR 10.124.0.0/9, and a VPN using 10.0.0.0/9. We avoid the conflict using:

Terminal
$ telepresence connect --proxy-via all=echo

The cluster's subnets are now hidden behind a virtual subnet, and all traffic is routed to the echo workload.

Caveats when using VNAT

In certain scenarios, Telepresence may not accurately detect cluster-side IP addresses used by services running locally on the workstation. This limitation arises when local services obtain IP addresses from remote sources such as databases or configmaps, or when IP addresses are sent to them in API calls.

Disabling default VNAT

The default behavior of using VNAT to resolve conflicts can be disabled by adding the following to the client config.

In config.yml on the workstation:

routing:
  autoResolveConflicts: false

Or as a Helm chart value to be applied on all clients:

client:
  routing:
    autoResolveConflicts: false

Explicitly allowing all conflicts will also effectively prevent the default VNAT behavior.

Allowing the conflict

A conflict can be resolved by carefully considering what your network layout looks like, and then allowing Telepresence to override the conflicting subnets. Telepresence refuses to map them by default because mapping them could render certain hosts inside the VPN completely unreachable. However, you (or your network admin) know better than anyone how hosts are spread out inside your VPN.

Even if the private route routes ALL of 10.0.0.0/8, it's possible that hosts are only being spun up in one of the sub-blocks of the /8 space. Let's say, for example, that you happen to know that all your hosts in the VPN are bunched up in the first half of the space -- 10.0.0.0/9 (and that you know that any new hosts will only be assigned IP addresses from the /9 block). In this case you can configure Telepresence to override the other half of this CIDR block, which is where the services and pods happen to be.

To do this, configure the client.routing.allowConflictingSubnets value, either in the client configuration or in the Telepresence Helm chart. You can also set it directly via telepresence helm upgrade:

In config.yml on the workstation:

routing:
  allowConflictingSubnets: 10.128.0.0/9

Or as a Helm chart configuration value to be applied on all clients:

client:
  routing:
    allowConflictingSubnets: 10.128.0.0/9

Or pass the Helm chart configuration using the --set flag:

Terminal
$ telepresence helm upgrade --set client.routing.allowConflictingSubnets="{10.128.0.0/9}"

The end result of this (assuming an allowlist of /9) will be a configuration like this:

VPN Telepresence

Using docker

Use telepresence connect --docker to run the Telepresence daemon in a container. The containerized daemon has its own network configuration and therefore cannot conflict with a VPN on the host. Read more about docker here.
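
For example, to start the containerized daemon and connect:

Terminal
$ telepresence connect --docker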

Some helpful hints when dealing with conflicts

When resolving a conflict by allowing it, you might want to validate that the routing is correct while Telepresence is connected. One way of doing this is to retrieve the route for an IP in a conflicting subnet.

This example assumes that Telepresence detected a conflict with a VPN using the subnet 100.124.0.0/16, and that we then decided to allow the conflict in a small portion of it using allowConflictingSubnets: 100.124.150.0/24. Before connecting Telepresence, we check the route for the IP 100.124.150.45 and discover that it's routed through a Tailscale device.

Terminal
$ ip route get 100.124.150.45
100.64.2.3 dev tailscale0 table 52 src 100.111.250.89 uid 0

Now, run the same command with Telepresence connected. The output should differ and instead show that the same IP is routed via the Telepresence virtual network interface. This should always be the case for an allowed conflict.
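
A hypothetical result (the interface name tel0 and the addresses are assumptions and will differ on your system) might look like this:

Terminal
$ ip route get 100.124.150.45
100.124.150.45 dev tel0 src 100.124.150.1 uid 0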

note

If you instead choose to avoid the conflict using VNAT, then the IP will be unaffected and still get routed via Tailscale. The cluster resource using that IP will be available to you from another subnet, using another IP.