# Amazon VPC Lattice - Multi-cluster secure communication
This pattern showcases secure multi-cluster communication between two EKS clusters in different VPCs using VPC Lattice with IAM authorization. It illustrates service discovery and highlights how VPC Lattice facilitates communication between services in EKS clusters with overlapping CIDRs, eliminating the need for networking constructs like private NAT Gateways and Transit Gateways.
## Scenario
This pattern shows how to configure Amazon VPC Lattice using the AWS Gateway API Controller, which manages VPC Lattice resources through native Kubernetes Gateway API objects. It deploys two distinct VPCs, each with its own EKS cluster running an application used to demonstrate cross-cluster communication.
Cross-cluster communication is established through Amazon VPC Lattice, using a private Amazon Route 53 domain name protected by a TLS certificate issued by AWS Certificate Manager (ACM) and backed by an AWS Private Certificate Authority.
## Deploy
This pattern is composed of three Terraform stacks that must be deployed in order.
### 1. Create shared environment
First, we need to deploy the shared environment:
Deploy terraform environment stack:
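The exact commands depend on the repository layout; assuming the shared stack lives in a directory named `environment`, the deployment might look like:

```shell
# "environment" is an assumed directory name -- adjust to the pattern's actual layout
cd environment
terraform init
terraform apply -auto-approve
```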
### 2. Create EKS cluster 1
Now we deploy the two EKS clusters to match our architecture diagram.
Deploy terraform cluster1 stack:
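Assuming the stack lives in a `cluster1` directory (an assumption about the repository layout), this might look like:

```shell
# "cluster1" is an assumed directory name -- adjust to the pattern's actual layout
cd cluster1
terraform init
terraform apply -auto-approve
```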
Configure Kubectl:
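Assuming the cluster is named `eks-cluster1` (the kubectl context name used in the validation commands below) and is deployed in `eu-west-1` as in the sample outputs, the kubeconfig update might look like:

```shell
# Cluster name and region are taken from the sample outputs in this document;
# adjust them if your deployment differs
aws eks update-kubeconfig --name eks-cluster1 --region eu-west-1 --alias eks-cluster1
```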
### 3. Create EKS cluster 2
Deploy terraform cluster2 stack:
Configure Kubectl:
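As with cluster1, and under the same assumptions about directory layout, cluster name, and region:

```shell
# "cluster2" is an assumed directory name -- adjust to the pattern's actual layout
cd cluster2
terraform init
terraform apply -auto-approve
aws eks update-kubeconfig --name eks-cluster2 --region eu-west-1 --alias eks-cluster2
```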
## Validate
To validate the connectivity between our services, we can use the following commands:
- From cluster1 app1, call cluster2 app2 -> success
```shell
kubectl --context eks-cluster1 exec -ti -n apps deployments/demo-cluster1-v1 -c demo-cluster1-v1 -- curl demo-cluster2.example.com
```
- From cluster1 app1, call cluster1 app1 -> forbidden
We can see the response when we call a service without authorization from VPC Lattice:
```shell
kubectl --context eks-cluster1 exec -ti -n apps deployments/demo-cluster1-v1 -c demo-cluster1-v1 -- curl demo-cluster1.example.com
```

```
AccessDeniedException: User: arn:aws:sts::12345678910:assumed-role/vpc-lattice-sigv4-client/eks-eks-cluste-demo-clust-1b575f8d-fb77-486a-8a13-af5a2a0f78ae is not authorized to perform: vpc-lattice-svcs:Invoke on resource: arn:aws:vpc-lattice:eu-west-1:12345678910:service/svc-002349360ddc5a463/ because no service-based policy allows the vpc-lattice-svcs:Invoke action
```
This is because the VPC Lattice IAMAuthPolicy for application 1 only authorizes calls from the apps namespace of cluster2:
```shell
kubectl --context eks-cluster1 get IAMAuthPolicy -n apps demo-cluster1-iam-auth-policy -o json | jq ".spec.policy | fromjson"
```
```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::12345678910:root"
      },
      "Action": "vpc-lattice-svcs:Invoke",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/eks-cluster-name": "eks-cluster2",
          "aws:PrincipalTag/kubernetes-namespace": "apps"
        }
      }
    }
  ]
}
```
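As a quick local check that needs no cluster access, you can extract the principal tags the policy requires with jq. The snippet below embeds the policy document shown above in a temporary file (the file path is arbitrary):

```shell
# Save a local copy of the policy document shown above
cat > /tmp/iam-auth-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": "arn:aws:iam::12345678910:root" },
      "Action": "vpc-lattice-svcs:Invoke",
      "Resource": "*",
      "Condition": {
        "StringEquals": {
          "aws:PrincipalTag/eks-cluster-name": "eks-cluster2",
          "aws:PrincipalTag/kubernetes-namespace": "apps"
        }
      }
    }
  ]
}
EOF
# List the required principal tags as key=value pairs
jq -r '.Statement[0].Condition.StringEquals | to_entries[] | "\(.key)=\(.value)"' /tmp/iam-auth-policy.json
# -> aws:PrincipalTag/eks-cluster-name=eks-cluster2
# -> aws:PrincipalTag/kubernetes-namespace=apps
```

A caller is only authorized if its session tags carry both values, which is why in-cluster calls from cluster1 are denied.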
- From cluster2 app2, call cluster1 app1 -> success
We can also validate the reverse flow, from cluster2 to cluster1:
```shell
kubectl --context eks-cluster2 exec -ti -n apps deployments/demo-cluster2-v1 -c demo-cluster2-v1 -- curl demo-cluster1.example.com
```
- From cluster2 app2, call cluster2 app2 -> forbidden
```shell
kubectl --context eks-cluster2 exec -ti -n apps deployments/demo-cluster2-v1 -c demo-cluster2-v1 -- curl demo-cluster2.example.com
```

```
AccessDeniedException: User: arn:aws:sts::12345678910:assumed-role/vpc-lattice-sigv4-client/eks-eks-cluste-demo-clust-a5c2432b-b84a-492f-8cbc-16f1fa5053eb is not authorized to perform: vpc-lattice-svcs:Invoke on resource: arn:aws:vpc-lattice:eu-west-1:12345678910:service/svc-00b57f32ed0a7b7c3/ because no service-based policy allows the vpc-lattice-svcs:Invoke action
```
## Important
In this setup, we use a Kyverno rule to inject iptables rules and an Envoy sidecar proxy into our application pods:
- The iptables rules route traffic from our application to the Envoy proxy (the rules do not apply if the source process GID is 0, so we run the application with a different GID: `runAsGroup: 1000`).
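A minimal sketch of such a redirect rule is shown below. This is illustrative only: the actual rule injected by the Kyverno policy may differ, the Envoy listener port is an assumption, and running it requires root inside the pod's network namespace.

```shell
# Redirect outbound TCP from any process NOT running as GID 0 to the Envoy
# sidecar; traffic from Envoy itself (GID 0) bypasses the rule, avoiding a loop.
# Port 8080 is an assumed Envoy listener port, not taken from the pattern.
iptables -t nat -A OUTPUT -p tcp -m owner ! --gid-owner 0 -j REDIRECT --to-ports 8080
```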
- The Envoy proxy retrieves our private CA certificate on startup and installs it, via its startup script, so that it trusts our VPC Lattice service:
```shell
kubectl --context eks-cluster1 exec -it deploy/demo-cluster1-v1 -c envoy-sigv4 -n apps -- cat /usr/local/bin/launch_envoy.sh
```
Output:

```shell
#!/bin/sh
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: MIT-0
cat /etc/envoy/envoy.yaml.in | envsubst \$AWS_REGION,\$JWT_AUDIENCE,\$JWT_JWKS,\$JWT_ISSUER,\$JWKS_HOST,\$APP_DOMAIN > /etc/envoy/envoy.yaml
aws acm-pca get-certificate-authority-certificate --certificate-authority-arn $CA_ARN --region $AWS_REGION --output text > /etc/pki/ca-trust/source/anchors/internal.pem
update-ca-trust extract
cat /etc/envoy/envoy.yaml
/usr/local/bin/envoy --base-id 1 -l trace -c /etc/envoy/envoy.yaml
```
- Envoy then signs the request with SigV4 and proxies it over HTTPS to the target service.
The Dockerfiles used to build the Envoy proxy, iptables, and http-server images for this demo are available in the pattern's repository.
## Cleanup
### 1. Delete the cluster2 Terraform stack
We delete the stacks in this order so that the Kubernetes controllers can clean up external resources before the controllers and the Kubernetes nodes themselves are deleted.
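Assuming the same directory layout as during deployment, the destroy step might look like:

```shell
# "cluster2" is an assumed directory name -- adjust to the pattern's actual layout
cd cluster2
terraform destroy -auto-approve
```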
If the VPC fails to destroy, re-run the destroy command a second time.
### 2. Delete the cluster1 Terraform stack
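As before, assuming a `cluster1` directory:

```shell
# "cluster1" is an assumed directory name -- adjust to the pattern's actual layout
cd cluster1
terraform destroy -auto-approve
```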
If the VPC fails to destroy, re-run the destroy command a second time.
If the VPC Lattice service network still exists, you can remove it with the following commands:
```shell
SN=$(aws vpc-lattice list-service-networks --query 'items[?name==`lattice-gateway`].id' --output text)
if [ -n "$SN" ]; then
  aws vpc-lattice delete-service-network --service-network-identifier "$SN"
fi
```