Telepresence Release Notes
Version 2.21.1 (December 17)
Allow ingest of serverless deployments without specifying an inject-container-ports annotation
The ability to intercept a workload without a service is built around the telepresence.getambassador.io/inject-container-ports annotation, and it was also required in order to ingest such a workload. This was counterintuitive, since an ingest doesn't use a port, so the requirement was removed.
Upgrade module dependencies to get rid of critical vulnerability.
Upgrade module dependencies to latest available stable. This includes upgrading golang.org/x/crypto, which had critical issues, from 0.30.0 to 0.31.0 where those issues are resolved.
Version 2.21.0 (December 13)
Automatic VPN conflict avoidance
Telepresence not only detects subnet conflicts between the cluster and workstation VPNs but also resolves them by performing network address translation to move conflicting subnets out of the way.
Virtual Network Address Translation (VNAT)
It is now possible to use a virtual subnet without routing the affected IPs to a specific workload. A new telepresence connect --vnat CIDR flag was added that will perform virtual network address translation of cluster IPs. This flag is very similar to the --proxy-via CIDR=WORKLOAD flag introduced in 2.19, but without the need to specify a workload.
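For example, to have cluster IPs in a conflicting subnet translated into a locally routed virtual subnet (the CIDR below is only an illustration):
telepresence connect --vnat 10.96.0.0/12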
Intercepts targeting a specific container
In certain scenarios, the container owning the intercepted port differs from the container the intercept targets. This port owner's sole purpose is to route traffic from the service to the intended container, often using a direct localhost connection.
This update introduces a --container <name> option to the intercept command. While this option doesn't influence the port selection, it guarantees that the environment variables and mounts propagated to the client originate from the specified container. Additionally, if the --replace option is used, it ensures that this container is replaced.
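A hypothetical invocation, where the service port is owned by a sidecar but the environment and mounts should come from the application container (the workload, port, and container names are placeholders):
telepresence intercept my-service --port 8080 --container app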
New telepresence ingest command
The new telepresence ingest command, similar to telepresence intercept, provides local access to the volume mounts and environment variables of a targeted container. However, unlike telepresence intercept, telepresence ingest does not redirect traffic to the container and ensures that the mounted volumes are read-only.
An ingest requires a traffic-agent to be installed in the pods of the targeted workload. Beyond that, it's a client-side operation. This allows developers to have multiple simultaneous ingests on the same container.
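A minimal sketch of an ingest (the workload and container names are placeholders, and the --container flag is assumed to work as it does for intercepts):
telepresence ingest my-deployment --container app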
New telepresence curl command
The new telepresence curl command runs curl from within a container. The command requires that a connection has been established using telepresence connect --docker, and the container that runs curl will share the same network as the containerized telepresence daemon.
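For example, assuming a docker mode connection is active, a hypothetical in-cluster request could look like:
telepresence connect --docker
telepresence curl http://echo-server.default:8080/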
New telepresence docker-run command
The new telepresence docker-run <flags and arguments> command requires that a connection has been established using telepresence connect --docker. It will perform a docker run <flags and arguments> and add the flags necessary to ensure that the started container shares the same network as the containerized telepresence daemon.
Mount everything read-only during intercept
It is now possible to append ":ro" to the intercept --mount flag value. This ensures that all remote volumes that the intercept mounts are read-only.
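For instance, a sketch that mounts the remote volumes read-only under a local directory (the workload name and mount path are placeholders):
telepresence intercept my-service --port 8080 --mount /tmp/telepresence-mounts:ro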
Unify client configuration
Previously, client configuration was divided between the config.yml file and a Kubernetes extension. DNS and routing settings were initially found only in the extension. However, the Helm client structure allowed entries from both.
To simplify this, we've now aligned the config.yml and Kubernetes extension with the Helm client structure. This means DNS and routing settings are now included in both. The Kubernetes extension takes precedence over the config.yml and Helm client object.
While the old-style Kubernetes extension is still supported for compatibility, it cannot be used with the new style.
Use WebSockets for port-forward instead of the now deprecated SPDY.
Telepresence will now use WebSockets instead of SPDY when creating port-forwards to the Kubernetes cluster, and will fall back to SPDY when connecting to clusters that don't support WebSockets. Use of the deprecated SPDY can be forced by setting cluster.forceSPDY=true in the config.yml.
See Streaming Transitions from SPDY to WebSockets for more information about this transition.
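A sketch of the corresponding config.yml entry:
cluster:
  forceSPDY: true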
Make usage data collection configurable using an extension point, and default to no-ops
The OSS code-base will no longer report usage data to the proprietary collector at Ambassador Labs. The actual calls to the collector remain, but will be no-ops unless a proper collector client is installed using an extension point.
Add deployments, statefulSets, replicaSets to workloads Helm chart value
The Helm chart value workloads now supports the kinds deployments.enabled, statefulSets.enabled, replicaSets.enabled, and rollouts.enabled. All except rollouts are enabled by default. If enabled is set to false for a kind, the traffic-manager will ignore workloads of that kind and Telepresence will not be able to intercept them.
Improved command auto-completion
Auto-completion of namespaces, services, and containers has been added where appropriate, and the default file auto-completion has been removed from most commands.
Docker run flags --publish, --expose, and --network now work with docker mode connections
After establishing a connection to a cluster using telepresence connect --docker, you can run new containers that share the same network as the containerized daemon that maintains the connection. This enables seamless communication between your local development environment and the remote services.
Normally, Docker has a limitation that prevents combining a shared network configuration with custom networks and exposed ports. However, Telepresence now elegantly circumvents this limitation so that a container started with telepresence docker-run, telepresence intercept --docker-run, or telepresence ingest --docker-run can use flags like --network, --publish, or --expose.
To achieve this, Telepresence temporarily adds the necessary network to the containerized daemon. This allows the new container to join the same network. Additionally, Telepresence starts extra socat containers to handle port mapping, ensuring that the desired ports are exposed to the local environment.
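A hypothetical session combining an intercept running in Docker with a published port and a custom network (the image, workload, and network names are placeholders):
telepresence connect --docker
telepresence intercept my-service --port 8080 --docker-run -- \
  --publish 8080:8080 --network my-compose-network my-dev-image:latest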
Prevent recursion in the Telepresence Virtual Network Interface (VIF)
Network problems may arise when running Kubernetes locally (e.g., Docker Desktop, Kind, Minikube, k3s), because the VIF on the host is also accessible from the cluster's nodes. A request that isn't handled by a cluster resource might be routed back into the VIF and cause a recursion.
These recursions can now be prevented by setting the client configuration property routing.recursionBlockDuration, so that new connection attempts are temporarily blocked for a specific IP:PORT pair immediately after an initial attempt, thereby effectively ending the recursion.
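A sketch of the client setting in config.yml (the duration value is only an example):
routing:
  recursionBlockDuration: 2s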
Allow Helm chart to be included as a sub-chart
The Helm chart previously had the unnecessary restriction that the .Release.Name under which telepresence is installed is literally called "traffic-manager". This restriction was preventing telepresence from being included as a sub-chart in a parent chart called anything but "traffic-manager". This restriction has been lifted.
Add Windows arm64 client build
The Telepresence client is now available for Windows ARM64. The release workflow files in GitHub Actions were updated to build and publish the Windows ARM64 client.
The --agents flag to telepresence uninstall is now the default.
The telepresence uninstall command was once capable of uninstalling the traffic-manager as well as traffic-agents. This behavior has been deprecated for some time, and in this release the command is solely about uninstalling agents. The --agents flag was therefore made redundant, and any arguments given to the command must be names of workloads that have an agent installed, unless --all-agents is used, in which case no arguments are allowed.
Performance improvement for the telepresence list command
The telepresence list command will now retrieve its data from the traffic-manager, which significantly improves its performance when used on namespaces that have a lot of workloads.
During an intercept, the local port defaults to the targeted port of the intercepted container instead of 8080.
Telepresence mimics the environment of a target container during an intercept, so it's only natural that the default for the local port is determined by the targeted container port rather than always defaulting to 8080.
A default can still be explicitly defined using the config.intercept.defaultPort setting.
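A sketch of the config.yml entry for keeping the old behavior of always defaulting to 8080:
intercept:
  defaultPort: 8080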
Move the telepresence-intercept-env configmap data into traffic-manager configmap.
There's no need for two configmaps that store configuration data for the traffic manager. The traffic-manager configmap is also watched, so consolidating the configuration there saves some k8s API calls.
Tracing was removed.
The ability to collect traces has been removed along with the telepresence gather-traces and telepresence upload-traces commands. The underlying code was complex and has not been well maintained since its inception in 2022. We have received no feedback on it and seen no indication that it has ever been used.
Remove obsolete code checking the Docker Bridge for DNS
The DNS resolver checked the Docker bridge for messages on Linux. This code was obsolete and caused problems when running in Codespaces.
Fix telepresence connect confusion caused by /.dockerenv file
A /.dockerenv file will be present when running in a GitHub Codespaces environment. That doesn't mean that telepresence cannot use docker, or that the root daemon shouldn't start.
Cap timeouts.connectivityCheck at 5 seconds.
The timeout value of timeouts.connectivityCheck is used when checking if a cluster is already reachable without Telepresence setting up an additional network route. If it is, this timeout should be high enough to cover the delay when establishing a connection. If the delay is higher than a second, chances are very low that the cluster is already reachable, and even if it is, all accesses to it will be very slow. In such cases, Telepresence will create its own network interface and perform its own tunneling.
The default timeout for the check remains at 500 milliseconds, which is more than sufficient for the majority of cases.
Prevent the traffic-manager from injecting a traffic-agent into itself.
The traffic-manager can never be the subject of an intercept, ingest, or proxy-via, because that would mean injecting the traffic-agent into itself, and it is not designed to do that. A user attempting this will now see a meaningful error message.
Don't include pods in the kube-system namespace when computing pod-subnets from pod IPs
A user would normally never access pods in the kube-system namespace directly, and automatically including those pods when computing the subnets will often lead to problems when running the cluster locally. This namespace is therefore now excluded when the pod subnets are computed from the IPs of pods. Services in this namespace will still be available through the service subnet.
If a user should require the pod subnet to be mapped, it can be added to the client.routing.alsoProxy list in the Helm chart.
Let routes belonging to an allowed conflict be added as a static route on Linux.
The allowConflicting setting didn't always work on Linux, because the conflicting subnet was just added as a link to the TUN device and therefore wasn't subjected to the routing rule used to assign priority to the given subnet.
Version 2.20.3 (November 18)
Ensure that Telepresence works with GitHub Codespaces
GitHub Codespaces runs in a container, but not as root. Telepresence didn't handle this situation correctly and only started the user daemon. The root daemon was never started.
Mounts not working correctly when connected with --proxy-via
A mount would try to connect to the sftp/ftp server using the original (cluster side) IP although that IP was translated into a virtual IP when using --proxy-via.
Version 2.20.2 (October 21)
Crash in traffic-manager configured with agentInjector.enabled=false
A traffic-manager that was installed with the Helm value agentInjector.enabled=false crashed when a client used the commands telepresence version or telepresence status. Those commands would call a method on the traffic-manager that panicked if no traffic-agent was present. This method will now instead return the standard Unavailable error code, which is expected by the caller.
Version 2.20.1 (October 10)
Some workloads missing in the telepresence list output (typically replicasets owned by rollouts).
Version 2.20.0 introduced a regression in the telepresence list command, resulting in the omission of all workloads that were owned by another workload. The correct behavior is to omit only those workloads that are owned by the supported workload kinds Deployment, ReplicaSet, StatefulSet, and Rollout. Furthermore, the Rollout kind must only be considered supported when the Argo Rollouts feature is enabled in the traffic-manager.
Allow comma separated list of daemons for the gather-logs command.
The name of the telepresence gather-logs flag --daemons suggests that the argument can contain more than one daemon, but prior to this fix, it couldn't. It is now possible to use a comma separated list, e.g. telepresence gather-logs --daemons root,user.
Version 2.20.0 (October 3)
Add timestamp to telepresence_logs.zip filename.
The telepresence_logs.zip file created by telepresence gather-logs now includes a timestamp in its filename, making it easy to find the logs gathered at a particular time.
Enable intercepts of workloads that have no service.
Telepresence is now capable of intercepting workloads that have no associated service. The intercept will then target a container port instead of a service port. The new behavior is enabled by adding a telepresence.getambassador.io/inject-container-ports annotation where the value is a comma separated list of port identifiers, each consisting of either the name or the number of a container port, optionally suffixed with /TCP or /UDP.
Publish the OSS version of the telepresence Helm chart
The OSS version of the telepresence Helm chart is now available at ghcr.io/telepresenceio/telepresence-oss, and can be installed using the command:
helm install traffic-manager oci://ghcr.io/telepresenceio/telepresence-oss --namespace ambassador --version 2.20.0
The chart documentation is published at ArtifactHUB.
Control the syntax of the environment file created with the intercept flag --env-file
A new --env-syntax <syntax> flag was introduced to allow control over the syntax of the file created when using the intercept flag --env-file <file>. Valid syntaxes are "docker", "compose", "sh", "csh", "cmd", and "ps", where "sh", "csh", and "ps" can be suffixed with ":export".
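For example, writing the intercepted environment as a PowerShell script that also exports the variables (the workload, port, and file name are placeholders):
telepresence intercept my-service --port 8080 --env-file env.ps1 --env-syntax ps:export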
Add support for Argo Rollout workloads.
Telepresence now has opt-in support for Argo Rollout workloads. The behavior is controlled by the workloads.argoRollouts.enabled Helm chart value. It is recommended to set the annotation telepresence.getambassador.io/inject-traffic-agent: enabled to avoid creation of unwanted revisions.
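A sketch of how this might be wired up, combining the Helm value with the recommended pod-template annotation:
# Helm value for the traffic-manager
workloads:
  argoRollouts:
    enabled: true
# Annotation on the Rollout's pod template
annotations:
  telepresence.getambassador.io/inject-traffic-agent: enabled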
Enable intercepts of containers that bind to podIP
In previous versions, the traffic-agent would route traffic to localhost during periods when an intercept wasn't active. This made it impossible for an application to bind to the pod's IP, and it also meant that service meshes binding to the podIP would get bypassed, both during and after an intercept had been made. This has now changed, so that the traffic-agent instead forwards non-intercepted requests to the pod's IP, thereby enabling the application to bind either to localhost or to that IP.
Use ghcr.io/telepresenceio instead of docker.io/datawire for OSS images and the telemount Docker volume plugin.
All OSS telepresence images and the telemount Docker plugin are now published at the public registry ghcr.io/telepresenceio, and all references from the client and traffic-manager have been updated to use this registry instead of the one at docker.io/datawire.
Use nftables instead of iptables-legacy
Some time ago, we introduced iptables-legacy because users had problems using Telepresence with Fly.io, where nftables wasn't supported by the kernel. Fly.io has since fixed this, so Telepresence will now use nftables again. This, in turn, ensures that modern systems that lack support for iptables-legacy will work.
Root daemon wouldn't start when sudo timeout was zero.
The root daemon refused to start when sudo was configured with timestamp_timeout=0. This was due to logic that first requested root privileges using a sudo call and then relied on those privileges being cached, so that a subsequent call using --non-interactive was guaranteed to succeed. This logic will now instead do one single sudo call, relying solely on sudo to print an informative prompt and start the daemon in the background.
Detect minikube network when connecting with --docker
A telepresence connect --docker failed when attempting to connect to a minikube that uses a docker driver, because the containerized daemon did not have access to the minikube docker network. Telepresence will now detect an attempt to connect to that network and attach it to the daemon container as needed.
Version 2.19.1 (July 12)
Add brew support for the OSS version of Telepresence.
The Open-Source Software version of Telepresence can now be installed using the brew formula via brew install telepresenceio/telepresence/telepresence-oss.
Add --create-namespace flag to the telepresence helm install command.
A --create-namespace flag (default true) was added to the telepresence helm install command. No attempt will be made to create a namespace for the traffic-manager if it is explicitly set to false. The command will then fail if the namespace is missing.
Introduce DNS fallback on Windows.
A network.defaultDNSWithFallback config option has been introduced on Windows. It will cause the DNS-resolver to fall back to the resolver that was first in the list prior to when Telepresence establishes a connection. The option defaults to true, since it is believed to give the best experience, but it can be set to false to restore the old behavior.
Brew now supports MacOS (amd64/arm64) / Linux (amd64)
The brew formula can now dynamically support MacOS (amd64/arm64) / Linux (amd64) in a single formula
Add ability to provide an externally-provisioned webhook secret
Added supplied as a new option for agentInjector.certificate.method. This fully disables the generation of the Mutating Webhook's secret, allowing the chart to use the values of a pre-existing secret named by agentInjector.secret.name. Previously, the install would fail when it attempted to create or update the externally-managed secret.
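A sketch of Helm values for an externally-provisioned secret (the secret name is a placeholder):
agentInjector:
  certificate:
    method: supplied
  secret:
    name: my-webhook-secret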
Let PTR query for DNS server return the cluster domain.
The nslookup program on Windows uses a PTR query to retrieve its displayed "Server" property. The Telepresence DNS resolver will now return the cluster domain on such a query.
Add scheduler name to PODs templates.
A new Helm chart value schedulerName has been added. With this feature, particular Kubernetes schedulers can be defined to apply different strategies when allocating telepresence resources, including the Traffic Manager and hooks pods.
Race in traffic-agent injector when using inject annotation
Applying multiple deployments that used the telepresence.getambassador.io/inject-traffic-agent: enabled annotation would cause a race condition, resulting in a large number of created pods that eventually had to be deleted, or sometimes in pods that didn't contain a traffic agent.
Fix configuring custom agent security context
The traffic-manager Helm chart will now correctly use a custom agent security context if one is provided.
Version 2.19.0 (June 15)
Warn when an Open Source Client connects to an Enterprise Traffic Manager.
The difference between the OSS and the Enterprise offering is not well understood, and OSS users often install a traffic-manager using the Helm chart published at getambassador.io. This Helm chart installs an enterprise traffic-manager, which is probably not what the user would expect. Telepresence will now warn when an OSS client connects to an enterprise traffic-manager and suggest either switching to an enterprise client or using telepresence helm install to install an OSS traffic-manager.
Add scheduler name to PODs templates.
A new Helm chart value schedulerName has been added. With this feature, particular Kubernetes schedulers can be defined to apply different strategies when allocating telepresence resources, including the Traffic Manager and hooks pods.
Improve traffic-manager performance in very large clusters.
The traffic-manager will now use a shared-informer when keeping track of deployments. This significantly reduces the load on the kubelet in large clusters and therefore lessens the risk of the traffic-manager being throttled, which can lead to other problems.
Kubeconfig exec authentication failure when connecting with --docker from a WSL linux host
Clusters like Amazon EKS often use a special authentication binary that is declared in the kubeconfig using an exec authentication strategy. This binary is normally not available inside a container. Consequently, a modified kubeconfig is used when telepresence connect --docker executes, appointing a kubeauth binary which instead retrieves the authentication from a port on the Docker host that communicates with another process outside of Docker. This process then executes the original exec command to retrieve the necessary credentials.
This setup was problematic when using WSL, because even though telepresence connect --docker was executed on a Linux host, the Docker host available from host.docker.internal that the kubeauth binary connected to was the Windows host running Docker Desktop. The fix was to use the local IP of the default route instead of host.docker.internal when running under WSL.
Fix bug in workload cache, causing endless recursion when a workload uses the same name as its owner.
The workload cache was keyed by name and namespace, but not by kind, so a workload named the same as its owner workload would be found using the same key. This led to the workload finding itself when looking up its owner, which in turn resulted in an endless recursion when searching for the topmost owner.
FailedScheduling events mentioning node availability considered fatal when waiting for agent to arrive.
The traffic-manager considers some events fatal when waiting for a traffic-agent to arrive after an injection has been initiated. This logic would trigger on events like "Warning FailedScheduling 0/63 nodes are available" and kill the wait, although such events indicate a recoverable condition. This is now fixed so that the events are logged but the wait continues.
Improve how the traffic-manager resolves DNS when no agent is installed.
The traffic-manager is typically installed into a namespace different from the one that clients are connected to. It's therefore important that the traffic-manager adds the client's namespace when resolving single label names in situations where there are no agents to dispatch the DNS query to.
Removal of the ability to import legacy artifacts into Helm.
A helm install would make attempts to find manually installed artifacts and make them managed by Helm by adding the necessary labels and annotations. This was important when the Helm chart was first introduced but is far less so today, and this legacy import was therefore removed.
Docker aliases deprecation caused failure to detect Kind cluster.
The logic for detecting if a cluster is a local Kind cluster, and therefore needs some special attention when using telepresence connect --docker, relied on the presence of Aliases in the Docker network that a Kind cluster sets up. In Docker versions 26 and up, this value is no longer used, but the corresponding info can instead be found in the new DNSNames field.
Include svc as a top-level domain in the DNS resolver.
It's not uncommon that use-cases involving Kafka or other middleware use FQNs that end with "svc". The core-DNS resolver in Kubernetes can resolve such names. With this bugfix, the Telepresence DNS resolver will also be able to resolve them, and thereby remove the need to add ".svc" to the include-suffix list.
Add ability to enable/disable the mutating webhook.
A new Helm chart boolean value agentInjector.enabled has been added that controls the agent-injector service and its associated mutating webhook. If set to false, the service, the webhook, and the secrets and certificates associated with it will no longer be installed.
Add ability to mount a webhook secret.
A new Helm chart value agentInjector.certificate.accessMethod has been added, which can be set to watch (the default) or mount. The mount setting is intended for clusters with policies that prevent containers from doing a get, list, or watch of a Secret, but where a latency of up to 90 seconds is acceptable between the time the secret is regenerated and the agent-injector picks it up.
Make it possible to specify ignored volume mounts using path prefix.
Volume mounts like /var/run/secrets/kubernetes.io are not declared in the workload. Instead, they are injected during pod-creation and their names are generated. It is now possible to ignore such mounts using a matching path prefix.
Make the telemount Docker Volume plugin configurable
A telemount object was added to the intercept object in config.yml (or the Helm value client.intercept), so that the automatic download and installation of this plugin can be fully customised.
Add option to load the kubeconfig yaml from stdin during connect.
This allows another process with a kubeconfig already loaded in memory to pass it directly to telepresence connect without needing a separate file. Simply use a dash "-" as the filename for the --kubeconfig flag.
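For example, a kubeconfig held by another process can be piped straight into the connect command:
kubectl config view --flatten --minify | telepresence connect --kubeconfig -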
Add ability to specify agent security context.
A new Helm chart value agent.securityContext allows configuring the security context of the injected traffic agent. The value can be set to a valid Kubernetes securityContext object, or to an empty value to ensure the agent has no defined security context. If no value is specified, the traffic manager will set the agent's security context to the same as that of the first container of the workload being injected into.
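A sketch of the Helm value (the securityContext fields shown are merely illustrative):
agent:
  securityContext:
    runAsNonRoot: true
    readOnlyRootFilesystem: true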
Tracing is no longer enabled by default.
Tracing must now be enabled explicitly in order to use the telepresence gather-traces command.
Removal of timeouts that are no longer in use
The config.yml values timeouts.agentInstall and timeouts.apply haven't been in use since versions prior to 2.6.0, when the client was responsible for installing the traffic-agent. These timeouts are now removed from the code-base, and a warning will be printed when attempts are made to use them.
Search all private subnets to find one open for dnsServerSubnet
This resolves a bug where not all subnets in a private range were tested, sometimes resulting in the warning, "DNS doesn't seem to work properly."
Creation of individual pods was blocked by the agent-injector webhook.
An attempt to create a pod was blocked unless it was provided by a workload. Hence, commands like kubectl run -i busybox --rm --image=curlimages/curl --restart=Never -- curl echo-easy.default would be blocked from executing.
Fix panic due to root daemon not running.
If a telepresence connect was made at a time when the root daemon was not running (an abnormal condition) and a subsequent intercept was then made, a panic would occur when the port-forward to the agent was set up. This is now fixed so that the initial telepresence connect is refused unless the root daemon is running.
Get rid of telemount plugin stickiness
The datawire/telemount plugin that is automatically downloaded and installed would never be updated once the installation was made. Telepresence will now check for the latest release of the plugin and cache the result of that check for 24 hours. If a new version arrives, it will be installed and used.
Use route instead of address for CIDRs with masks that don't allow "via"
A CIDR with a mask that leaves less than two bits (/31 or /32 for IPv4) cannot be added as an address to the VIF, because such addresses must have bits allowing a "via" IP.
The logic was modified to allow such CIDRs to become static routes, using the VIF base address as their "via", rather than being VIF addresses in their own right.
Containerized daemon created cache files owned by root
When using telepresence connect --docker to create a containerized daemon, that daemon would sometimes create files in the cache that were owned by root, which then caused problems when connecting without the --docker flag.
Remove large number of requests when traffic-manager is used in large clusters.
The traffic-manager would make a very large number of API requests during cluster start-up or when many services were changed for other reasons. The logic that did this was refactored and the number of queries was significantly reduced.
Don't patch probes on replaced containers.
A container that is being replaced by a telepresence intercept --replace invocation will have no liveness, readiness, nor startup probes. Telepresence didn't take this into consideration when injecting the traffic-agent, but it will now refrain from patching symbolic port names of those probes.
Don't rely on context name when deciding if a kind cluster is used.
The code that auto-patches the kubeconfig when connecting to a kind cluster from within a docker container relied on the context name starting with "kind-", but although all contexts created by kind have that name, the user is still free to rename it or to create other contexts using the same connection properties. The logic was therefore changed to instead look for a loopback service address.
Version 2.18.0 (February 9)
Include the image for the traffic-agent in the output of the version and status commands.
The version and status commands will now output the image that the traffic-agent will be using when injected by the agent-injector.
Custom DNS using the client DNS resolver.
A new telepresence connect --proxy-via CIDR=WORKLOAD flag was introduced, allowing Telepresence to translate DNS responses matching specific subnets into virtual IPs that are used locally. Those virtual IPs are then routed (with reverse translation) via the pods of a given workload. This makes it possible to handle custom DNS servers that resolve domains into loopback IPs. The flag may also be used in cases where the cluster's subnets are in conflict with the workstation's VPN.
The CIDR can also be a symbolic name that identifies a subnet or list of subnets:
also - All subnets added with --also-proxy
service - The cluster's service subnet
pods - The cluster's pod subnets
all - All of the above
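For example, routing a conflicting subnet, or all pod subnets, via the pods of a given workload (the CIDR and workload name are placeholders):
telepresence connect --proxy-via 10.0.0.0/16=my-workload
telepresence connect --proxy-via pods=my-workload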
Ensure that agent.appProtocolStrategy is propagated correctly.
The agent.appProtocolStrategy setting was inadvertently dropped when moving license related code from the OSS repository to the repository for the Enterprise version of Telepresence. It has now been restored.
Include non-default zero values in output of telepresence config view.
The telepresence config view command will now print zero values in the output when the default for the value is non-zero.
Restore ability to run the telepresence CLI in a docker container.
The improvements made to be able to run the telepresence daemon in docker using telepresence connect --docker made it impossible to run both the CLI and the daemon in docker. This commit fixes that and also ensures that the user- and root-daemons are merged in this scenario when the container runs as root.
Remote mounts when intercepting with the --replace flag.
A telepresence intercept --replace did not correctly mount all volumes, because when the intercepted container was removed, its mounts were no longer visible to the agent-injector when it was subjected to a second invocation. The container is now kept in place, but with an image that just sleeps infinitely.
Intercepting with the --replace flag will no longer require all subsequent intercepts to use --replace.
A telepresence intercept --replace will no longer switch the mode of the intercepted workload, forcing all subsequent intercepts on that workload to use --replace until the agent is uninstalled. Instead, --replace can be used interchangeably just like any other intercept flag.
Kubeconfig exec authentication with context names containing colon didn't work on Windows
The logic added to allow the root daemon to connect directly to the cluster, using the user daemon as a proxy for exec type authentication in the kube-config, didn't take into account that a context name sometimes contains the colon ":" character. That character cannot be used in filenames on Windows because it is the drive letter separator.
Provide agent name and tag as separate values in Helm chart
The AGENT_IMAGE env was a concatenation of the agent's name and tag. This has changed so that the env instead contains an AGENT_IMAGE_NAME and an AGENT_IMAGE_TAG. The AGENT_IMAGE is removed. Also, a new env REGISTRY is added, where the registry of the traffic-manager image is provided. The AGENT_REGISTRY is no longer required and will default to REGISTRY if not set.
Environment interpolation expressions were prefixed twice.
Telepresence would sometimes prefix environment interpolation expressions in the traffic-agent twice, so that an expression that looked like $(SOME_NAME) in the app-container ended up as $(_TEL_APP_A__TEL_APP_A_SOME_NAME) in the corresponding expression in the traffic-agent.
Panic in root-daemon on darwin workstations with full access to cluster network.
A darwin machine with full access to the cluster's subnets will never create a TUN-device, and a check was missing for whether the device actually existed, which caused a panic in the root daemon.
Show allow-conflicting-subnets in telepresence status and telepresence config view.
The telepresence status and telepresence config view commands didn't show the allowConflictingSubnets CIDRs because the value wasn't propagated correctly to the CLI.
It is now possible to use a host-based connection and containerized connections simultaneously.
Only one host-based connection can exist, because that connection will alter the DNS to reflect the namespace of the connection, but it's now possible to create additional connections using --docker while retaining the host-based connection.
Ability to set the hostname of a containerized daemon.
The hostname of a containerized daemon defaults to the container's ID in Docker. You can now override the hostname using telepresence connect --docker --hostname <a name>.
New --multi-daemon flag to enforce a consistent structure for the status command output.
The output of telepresence status when using --output json or --output yaml will either show an object where the user_daemon and root_daemon are top level elements, or, when multiple connections are used, an object where a connections list contains objects with those daemons. The flag --multi-daemon will enforce the latter structure even when only one daemon is connected, so that the output can be parsed consistently. The reason for keeping the former structure is to retain backward compatibility with existing parsers.
Make output from telepresence quit more consistent.
A quit (without -s) just disconnects the host user and root daemons, but will quit a container-based daemon. The message printed was simplified to remove some have/has and is/are errors caused by this difference.
Fix "tls: bad certificate" errors when refreshing the mutator-webhook secret
The agent-injector service will now refresh the secret used by the mutator-webhook each time a new connection is established, thus preventing the certificates from going out of sync when the secret is regenerated.
Keep telepresence-agents configmap in sync with pod states.
An intercept attempt that resulted in a timeout due to failure of injecting the traffic-agent left the telepresence-agents configmap in a state that indicated that an agent had been added, which caused problems for subsequent intercepts after the problem causing the first failure had been fixed.
The telepresence status command will now report the status of all running daemons.
A telepresence status, issued when multiple containerized daemons were active, would error with "multiple daemons are running, please select one using the --use <match> flag". This is now fixed so that the command instead reports the status of all running daemons.
The telepresence version command will now report the version of all running daemons.
A telepresence version, issued when multiple containerized daemons were active, would error with "multiple daemons are running, please select one using the --use <match> flag". This is now fixed so that the command instead reports the version of all running daemons.
Multiple containerized daemons can now be disconnected using telepresence quit -s
A telepresence quit -s, issued when multiple containerized daemons were active, would error with "multiple daemons are running, please select one using the --use <match> flag". This is now fixed so that the command instead quits all daemons.
The DNS search path on Windows is now restored when Telepresence quits
The DNS search path that Telepresence uses to simulate the DNS lookup functionality in the connected cluster namespace was not removed by a telepresence quit, resulting in connectivity problems from the workstation. Telepresence will now remove the entries that it has added to the search list when it quits.
The user-daemon would sometimes get killed when used by multiple simultaneous CLI clients.
The user-daemon would die with a "fatal error: concurrent map writes" error in the connector.log, effectively killing the ongoing connection.
Multiple service ports using the same target port would not get intercepted correctly.
Intercepts didn't work when multiple service ports were using the same container port. Telepresence would think that one of the ports wasn't intercepted and therefore disable the intercept of the container port.
Root daemon refuses to disconnect.
The root daemon would sometimes hang forever when attempting to disconnect due to a deadlock in the VIF-device.
Fix panic in user daemon when traffic-manager was unreachable
The user daemon would panic if the traffic-manager was unreachable. It will now instead report a proper error to the client.
Removal of backward support for versions predating 2.6.0
The telepresence helm installer will no longer discover and convert workloads that were modified by versions prior to 2.6.0. The traffic manager also no longer supports the muxed tunnels used in versions prior to 2.5.0.
Version 2.17.0 (November 14)
Additional Prometheus metrics to track intercept/connect activity
This feature adds the following metrics to the Prometheus endpoint: connect_count, connect_active_status, intercept_count, and intercept_active_status. These are labeled by client/install_id. Additionally, the intercept_count metric has been renamed to active_intercept_count for clarity.
Make the Telepresence client docker image configurable.
The docker image used when running a Telepresence intercept in docker mode can now be configured using the setting images.clientImage. It will default first to the value of the environment variable TELEPRESENCE_CLIENT_IMAGE, and then to the value preset by the telepresence binary. This configuration setting is primarily intended for testing purposes.
Use traffic-agent port-forwards for outbound and intercepted traffic.
The telepresence TUN-device is now capable of establishing direct port-forwards to a traffic-agent in the connected namespace. That port-forward is then used for all outbound traffic to the device, and also for all traffic that arrives from intercepted workloads. Getting rid of the extra hop via the traffic-manager improves performance and reduces the load on the traffic-manager. The feature can only be used if the client has Kubernetes port-forward permissions to the connected namespace. It can be disabled by setting cluster.agentPortForward to false in config.yml.
Improve outbound traffic performance.
The root-daemon now communicates directly with the traffic-manager instead of routing all outbound traffic through the user-daemon. The root-daemon uses a patched kubeconfig where exec configurations to obtain credentials are dispatched to the user-daemon. This is to ensure that all authentication plugins will execute in user-space. The old behavior of routing everything through the user-daemon can be restored by setting cluster.connectFromRootDaemon to false in config.yml.
New networking CLI flag --allow-conflicting-subnets
telepresence connect (and other commands that kick off a connect) now accepts an --allow-conflicting-subnets CLI flag. This is equivalent to client.routing.allowConflictingSubnets in the helm chart, but can be specified at connect time. It will be appended to any configuration pushed from the traffic manager.
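For example (the CIDR is a placeholder for a subnet that conflicts with a local VPN):
telepresence connect --allow-conflicting-subnets 10.88.0.0/16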
Warn if large version mismatch between traffic manager and client.
Print a warning if the minor version diff between the client and the traffic manager is greater than three.
The authenticator binary was removed from the docker image.
The authenticator binary, used when serving proxied exec kubeconfig credential retrieval, has been removed. The functionality was instead added as a subcommand to the telepresence binary.
Version 2.16.1 (October 12)
Add --docker-debug flag to the telepresence intercept command.
This flag is similar to --docker-build, but will start the container with more relaxed security, using the docker run flags --security-opt apparmor=unconfined --cap-add SYS_PTRACE.
Add an --expose option to the telepresence connect command.
In some situations it is necessary to make some ports available to the host from a containerized telepresence daemon. This commit adds a repeatable --expose <docker port exposure> flag to the connect command.
Prevent agent-injector webhook from selecting from kube-xxx namespaces.
The kube-system and kube-node-lease namespaces should not be affected by a global agent-injector webhook by default. A default namespaceSelector was therefore added to the Helm chart value agentInjector.webhook that contains a NotIn preventing those namespaces from being selected.
Backward compatibility for pod template TLS annotations.
Users of Telepresence < 2.9.0 that make use of the pod template TLS annotations were unable to upgrade because the annotation names have changed (now prefixed by "telepresence."), and the environment expansion of the annotation values was dropped. This fix restores support for the old names (while retaining the new ones) and the environment expansion.
Built with go 1.21.3
Built Telepresence with go 1.21.3 to address CVEs.
Match service selector against pod template labels
When listing intercepts (typically by calling telepresence list), selectors of services are matched against workloads. Previously the match was made against the labels of the workload, but now they are matched against the labels of the workload's pod template. Since the service would actually be matched against pods, this is more correct. The most common case where this makes a difference is that statefulsets are now listed when they should be.
Version 2.16.0 (October 2)
The helm sub-commands will no longer start the user daemon.
The telepresence helm install/upgrade/uninstall commands will no longer start the telepresence user daemon, because there's no need to connect to the traffic-manager in order for them to execute.
Routing table race condition
A race condition would sometimes occur when a Telepresence TUN device was deleted and another created in rapid succession, causing the routing table to reference interfaces that no longer existed.
Stop lingering daemon container
When using telepresence connect --docker, a lingering container could be present, causing errors like "The container name NN is already in use by container XX ...". When this happens, the connect logic will now give the container some time to stop and then call docker stop NN to stop it before retrying to start it.
Add file locking to the Telepresence cache
Files in the Telepresence cache are accessed by multiple processes. The processes will now use advisory locks on the files to guarantee consistency.
Lock connection to namespace
The behavior changed so that a connected Telepresence client is bound to a namespace. The namespace can then not be changed unless the client disconnects and reconnects. A connection is also given a name. The default name is composed from <kube context name>-<namespace>, but a name can be given explicitly when connecting using --name. The connection can optionally be identified using the option --use <name match> (only needed when docker is used and more than one connection is active).
Deprecation of global --context and --docker flags.
The global flags --context and --docker will now be considered deprecated unless used with commands that accept the full set of Kubernetes flags (e.g. telepresence connect).
Deprecation of the --namespace flag for the intercept command.
The --namespace flag is now deprecated for the telepresence intercept command. The flag can instead be used with all commands that accept the full set of Kubernetes flags (e.g. telepresence connect).
Legacy code predating version 2.6.0 was removed.
The telepresence code-base still contained a lot of code that would modify workloads instead of relying on the mutating webhook installer when a traffic-manager version predating version 2.6.0 was discovered. This code has now been removed.
Add telepresence list-namespaces and telepresence list-contexts commands
These commands can be used to check accessible namespaces and for automation.
Implicit connect warning
A deprecation warning will be printed if a command other than telepresence connect causes an implicit connect to happen. Implicit connects will be removed in a future release.
Version 2.15.1 (September 6)
Rebuild with go 1.21.1
Rebuild Telepresence with go 1.21.1 to address CVEs.
Set security context for traffic agent
Openshift users reported that the traffic agent injection was failing due to a missing security context.
Version 2.15.0 (August 29)
Add ASLR to telepresence binaries
ASLR hardens binary security against fixed memory attacks.
Added client builds for arm64 architecture.
The release workflow files in GitHub Actions were updated to include building and publishing the client binaries for the arm64 architecture.
KUBECONFIG env var can now be used with the docker mode.
If provided, the KUBECONFIG environment variable was passed to the kubeauth-foreground service as a parameter. However, since it didn't exist, the CLI was throwing an error when using telepresence connect --docker.
Fix deadlock while watching workloads
The telepresence list --output json-stream command wasn't releasing the session's lock after being stopped, including with a telepresence quit. The user could be blocked as a result.
Change json output of telepresence list command
Replace deprecated info in the JSON output of the telepresence list command.
Version 2.14.4 (August 21)
Nil pointer exception when upgrading the traffic-manager.
Upgrading the traffic-manager using telepresence helm upgrade would sometimes result in a helm error message: executing "telepresence/templates/intercept-env-configmap.yaml" at <.Values.intercept.environment.excluded>: nil pointer evaluating interface .excluded
Version 2.14.2 (July 26)
Telepresence now uses the OSS agent in its latest version by default.
Previously, the traffic manager admin was forced to set it manually during the chart installation.
Version 2.14.1 (July 7)
Envoy's http idle timeout is now configurable.
A new agent.helm.httpIdleTimeout setting was added to the Helm chart that controls the proprietary Traffic agent's http idle timeout. The default of one hour, which in some situations would cause a lot of resource-consuming lingering connections, was changed to 70 seconds.
Add more gauges to the Traffic manager's Prometheus client.
Several gauges were added to the Prometheus client to make it easier to monitor what the Traffic manager spends resources on.
Agent Pull Policy
Add option to set traffic agent pull policy in helm chart.
Resource leak in the Traffic manager.
Fixes a resource leak in the Traffic manager caused by lingering tunnels between the clients and Traffic agents. The tunnels are now closed correctly when terminated from the side that created them.
Fixed problem setting traffic manager namespace using the kubeconfig extension.
Fixes a regression introduced in version 2.10.5, making it impossible to set the traffic-manager namespace using the telepresence.io kubeconfig extension.
Version 2.14.0 (June 12)
DNS configuration now supports excludes and mappings.
The DNS configuration now supports two new fields, excludes and mappings. The excludes field allows you to exclude a given list of hostnames from resolution, while the mappings field can be used to resolve a hostname with another.
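A sketch of what this could look like in the client DNS configuration; the hostnames are placeholders and the nested key names (name/aliasFor) are assumptions based on the description above:
dns:
  excludes:
    - unwanted.example.com
  mappings:
    - name: my-alias.example.com
      aliasFor: echo-server.default.svc.cluster.local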
Added the ability to exclude environment variables
Added a new config map that can take an array of environment variables that will then be excluded from an intercept that retrieves the environment of a pod.
Fixed traffic-agent backward incompatibility issue causing lack of remote mounts
A traffic-agent of version 2.13.3 (or 1.13.15) would not propagate the directories under /var/run/secrets when used with a traffic manager older than 2.13.3.
Fixed race condition causing segfaults on rare occasions when a tunnel stream timed out.
A context cancellation could sometimes be trapped in a stream reader, causing it to incorrectly return an undefined message, which in turn caused the parent reader to panic on a nil pointer reference.
Routing conflict reporting.
Telepresence will now attempt to detect and report routing conflicts with other running VPN software on client machines. There is a new configuration flag that can be tweaked to allow certain CIDRs to be overridden by Telepresence.
test-vpn command deprecated
Running telepresence test-vpn will now print a deprecation warning and exit. The command will be removed in a future release. Instead, please configure telepresence for your VPN's routes.
Version 2.13.3 (May 25)
Add imagePullSecrets to hooks
Added imagePullSecrets to the Helm chart hook values (e.g. .Values.hooks.curl.imagePullSecrets).
Change reinvocation policy to Never for the mutating webhook
The default setting of the reinvocationPolicy for the mutating webhook dealing with agent injections changed from IfNeeded to Never.
Fix mounting fail of IAM roles for service accounts web identity token
The eks.amazonaws.com/serviceaccount volume injected by EKS is now exported and remotely mounted during an intercept.
Correct namespace selector for cluster versions with non-numeric characters
The mutating webhook now correctly applies the namespace selector even if the cluster version contains non-numeric characters. For example, it can now handle versions such as Major:"1", Minor:"22+".
Enable IPv6 on the telepresence docker network
The "telepresence" Docker network will now propagate DNS AAAA queries to the Telepresence DNS resolver when it runs in a Docker container.
Fix the crash when intercepting with --local-only and --docker-run
Running telepresence intercept --local-only --docker-run no longer results in a panic.
Fix incorrect error message with local-only mounts
Running telepresence intercept --local-only --mount false no longer results in an incorrect error message saying "a local-only intercept cannot have mounts".
Specify port in hook URLs
The helm chart now correctly handles a custom agentInjector.webhook.port that was previously not being set in hook URLs.
Fix wrong default value for disableGlobal and agentArrival
Params .intercept.disableGlobal and .timeouts.agentArrival are now correctly honored.