Multi-cluster L4 load balancing with Fleet Manager relies on a few specific primitives and network assumptions that can be used to troubleshoot this scenario.
- Sample image

The documentation’s walkthrough uses a simple HTTP workload (`kuard`) only as an example to demonstrate that traffic is being distributed across pods in multiple member clusters. Any HTTP container image that exposes a TCP port can be used instead, as long as:
- The same `Service` name, namespace, and port are used on all member clusters.
- The `ServiceExport` and `MultiClusterService` objects reference that `Service`.
The docs don’t prescribe a Microsoft Container Registry (MCR) image for this scenario; they only require that the workload be reachable on the configured port so that the `curl <EXTERNAL-IP>:<port>` test described in the guide works.
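As a sketch of what “same name, namespace, and port everywhere” means in practice, the manifests below show an illustrative `Service` plus the `ServiceExport` that references it. The `web` name, `test-app` namespace, port, and the `networking.fleet.azure.com/v1alpha1` API version are assumptions to verify against the Fleet docs for your version; the point is that identical manifests are applied on every member cluster:

```yaml
# Illustrative only -- name, namespace, port, and API version are assumptions.
apiVersion: v1
kind: Service
metadata:
  name: web            # must be identical on every member cluster
  namespace: test-app  # must be identical on every member cluster
spec:
  selector:
    app: web
  ports:
    - protocol: TCP
      port: 8080
      targetPort: 8080
---
# The ServiceExport matches the Service by name and namespace, so it must
# live in the same namespace and carry the same name as the Service above.
apiVersion: networking.fleet.azure.com/v1alpha1
kind: ServiceExport
metadata:
  name: web
  namespace: test-app
```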
- Network requirements (private + public AKS in same VNet)

From the conceptual and how-to documentation:
- All member clusters that participate in L4 multi-cluster load balancing must:
- Use Azure CNI networking so that pod IPs are directly routable on the VNet.
- Be on the same virtual network or on peered VNets.
- Fleet’s L4 capability then configures each member cluster’s Azure Load Balancer to route traffic not only to local endpoints but also to endpoints of the same `Service` in other member clusters.
The documentation does not add extra NSG or route-table requirements beyond those implied by “same VNet / peered VNets with Azure CNI.” In particular:
- Pod IPs must be reachable over the VNet between clusters.
- Azure Load Balancer must be able to reach those pod IPs.
When mixing private and public AKS clusters, this means:
- Ensure that any NSGs applied to the subnets or NICs of the public cluster do not block inbound traffic from the private cluster’s subnet(s) to the pod CIDR ranges used by that public cluster.
- Ensure that any custom route tables do not blackhole traffic between the clusters’ subnets.
The Fleet documentation does not define additional, Fleet-specific NSG or UDR rules beyond these general Azure CNI/VNet requirements.
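When auditing NSG rules against the pod CIDR ranges mentioned above, it helps to confirm which CIDR a given pod IP actually falls in. The helper below is a hypothetical pure-bash sketch (the function name and all addresses are made up for illustration), not a Fleet or Azure CLI feature:

```shell
# Hypothetical helper: does an IPv4 address fall inside a CIDR block?
# Useful when checking that NSG allow rules cover a cluster's pod CIDR.
ip_in_cidr() {
  local ip=$1 cidr=$2 a b c d ip_n base_n mask
  local base=${cidr%/*} bits=${cidr#*/}
  IFS=. read -r a b c d <<<"$ip";   ip_n=$(((a<<24) | (b<<16) | (c<<8) | d))
  IFS=. read -r a b c d <<<"$base"; base_n=$(((a<<24) | (b<<16) | (c<<8) | d))
  mask=$(((0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF))
  [ $((ip_n & mask)) -eq $((base_n & mask)) ]
}

# Example with illustrative ranges: a pod IP from an Azure CNI cluster
# whose pod addresses are allocated from 10.244.0.0/16.
ip_in_cidr 10.244.3.17 10.244.0.0/16 && echo "in pod CIDR"
ip_in_cidr 10.250.3.17 10.244.0.0/16 || echo "outside pod CIDR"
```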
- Verifying that ServiceImport has aggregated endpoints from both clusters

The L4 flow is:
- `ServiceExport` is created on each member cluster.
- Fleet creates a corresponding `ServiceImport` on the hub cluster and other member clusters to build awareness of the service.
- A `MultiClusterService` is created to configure the Azure Load Balancer in each member cluster to distribute traffic across endpoints in multiple clusters.
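A minimal sketch of the last step, assuming an illustrative `web` service in a `test-app` namespace (the `networking.fleet.azure.com/v1alpha1` API version and `serviceImport` field should be checked against the Fleet docs for your version):

```yaml
# Illustrative only -- name, namespace, and API version are assumptions.
apiVersion: networking.fleet.azure.com/v1alpha1
kind: MultiClusterService
metadata:
  name: web
  namespace: test-app   # same namespace as the exported Service
spec:
  serviceImport:
    name: web           # the ServiceImport derived from the ServiceExports
```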
The how-to guide shows a basic validation step:
- On a member cluster, run `kubectl get multiclusterservice <name> -n <namespace>` and confirm `IS-VALID` is `True` and that an `EXTERNAL-IP` is assigned.
- Then repeatedly `curl <EXTERNAL-IP>:<port>` and observe that the pod IPs serving the request change and correspond to pods in multiple member clusters.
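To make the “pod IPs change across requests” check less eyeball-driven, a small helper can count distinct responders. The sketch below is hypothetical: it assumes you have already extracted one serving pod IP per probe (for example from kuard’s response page) and pipes those lines through the helper:

```shell
# Hypothetical helper: given one serving pod IP per line on stdin,
# report how many distinct backends answered. IPs drawn from more than
# one cluster's pod CIDR indicate cross-cluster distribution.
count_distinct_backends() {
  sort -u | wc -l
}

# Canned example input; in a real check the IPs would come from repeated
# `curl <EXTERNAL-IP>:<port>` probes against the MultiClusterService.
printf '10.244.1.12\n10.245.0.34\n10.244.1.12\n' | count_distinct_backends
```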
In addition, the conceptual documentation explains that `ServiceExport` creation “results in a ServiceImport being created on the fleet cluster, and all other member clusters to build the awareness of the service.” This means that, for troubleshooting aggregation:
- Confirm that a `ServiceImport` object exists for the service on the hub cluster.
- Confirm that `ServiceImport` objects exist on the member clusters that should be aware of the service.
The documentation does not provide a specific field-level example of the `ServiceImport` status, nor an explicit command to list per-cluster endpoint membership. However, the presence of `ServiceImport` on the hub and member clusters, combined with `IS-VALID: True` on the `MultiClusterService` and pod IPs from multiple clusters rotating when curling the external IP, is the documented way to validate that endpoints are being aggregated and used.