An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.
AKS konnectivity-agent tunnel instability causing intermittent 503s on admin operations
Bernard Kowalski
0
Reputation points
Cluster aks-asclepius-prod (resource group rg-aks, East US 2, K8s 1.35.1, private cluster with KMS etcd encryption) experiences persistent konnectivity tunnel failures.
Symptoms:
- konnectivity-agent pods enter cannot connect once error loops, unable to re-establish tunnel to control plane proxy servers
- rpc error: code = Unavailable desc = connection error: desc = "error reading server preface: EOF"
- Intermittent 503 Service Unavailable on kubectl exec, port-forward, and logs
- Pod-to-pod application traffic is unaffected
Environment:
- Single node: Standard_D4ps_v6 (ARM64)
- KMS etcd encryption via public Key Vault (asclepius-infra-prod) — ruled out as cause (0 throttled requests, ~36ms avg latency)
- Dev cluster (same region, also private, no KMS) shows identical EOF churn but functions correctly
Troubleshooting performed:
- Rollout restart of konnectivity-agent — no improvement
- az aks update --yes control plane reconcile — temporarily restores tunnel, but issue recurs
Ask: Investigate why konnectivity-agent pods intermittently fail to reconnect to control plane proxy servers, requiring manual control plane reconciliation to restore admin operations.
Azure Kubernetes Service
Azure Kubernetes Service
Sign in to answer