Amazon Elastic Kubernetes Service (EKS) provides managed Kubernetes clusters on AWS. Terraform lets you plan, deploy, and manage EKS clusters just as it does any other AWS resource. With over 4.9k GitHub stars, terraform-aws-eks is a very popular third-party Terraform module for deploying EKS clusters. It offers many features and a great deal of configurability, but at the cost of an extra dependency and an extra layer of abstraction. If you would rather not use that module, or simply want to understand how to configure EKS resources yourself, I have published a minimal reference Terraform module: https://github.com/supracarol/terraform-eks-reference/.

Features

  • Creates an EKS cluster in an existing VPC
  • Configures necessary EKS add-ons for the cluster
  • Creates a managed node group from a custom launch template
  • Provides the ability to add extra security groups to your node group
  • Enables connecting to worker node instances through SSM (AWS Systems Manager Session Manager)
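
To use the module, the call looks roughly like this. This is a minimal sketch: node_group_extra_security_groups and cluster_public_access_cidrs are variables that appear later in this post, while the other input names are illustrative assumptions, so check the repository's variables.tf for the actual interface.

module "eks" {
  source = "github.com/supracarol/terraform-eks-reference"

  # Illustrative inputs; the real variable names may differ.
  cluster_name                     = "example"
  vpc_id                           = var.vpc_id
  subnet_ids                       = var.private_subnet_ids
  node_group_extra_security_groups = [aws_security_group.extra.id]
  cluster_public_access_cidrs      = ["203.0.113.10/32"]
}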

Add-Ons

The following plugins are included to enable basic functionality:

1. VPC CNI Plugin

The VPC CNI is installed on EKS clusters by default, but I manage it explicitly as an add-on here in order to enable prefix delegation, which significantly increases the number of pods you can run per node:

# Snippet from add-ons.tf
resource "aws_eks_addon" "vpc_cni" {
  cluster_name                = aws_eks_cluster.cluster.name
  addon_name                  = "vpc-cni"
  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
      WARM_PREFIX_TARGET       = "1"
    }
  })
  ...

You can verify prefix delegation is enabled after deploying with:

kubectl describe ds -n kube-system aws-node | grep ENABLE_PREFIX_DELEGATION: -A 3

You can also verify with:

kubectl get node <node> -o jsonpath='{.status.allocatable.pods}'

For a t3.medium instance, this command returned 110 allocatable pods.
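
For context, here is a rough sketch of where that figure comes from, assuming the standard EKS max-pods guidance (a t3.medium supports 3 ENIs with 6 IPv4 addresses each, and with prefix delegation each secondary address slot holds a /28 prefix of 16 addresses); treat the cap as an approximation and use the max-pods calculator in the amazon-eks-ami repository for your instance type:

Without prefix delegation: 3 ENIs x (6 - 1) IPs + 2           = 17 pods
With prefix delegation:    3 ENIs x (6 - 1) prefixes x 16 + 2 = 242 pods, capped at the commonly recommended 110 per node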

2. EBS CSI Driver

Provisions EBS volumes for pod storage. If you’re using a different storage backend, you can omit this add-on.

Both of these add-ons use IRSA (IAM Roles for Service Accounts) to obtain the AWS IAM permissions they need to function. See my post IRSA Made Simple for an explanation.
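
As a rough sketch of what the IRSA wiring looks like for the EBS CSI driver: the resource names, the aws_iam_openid_connect_provider reference, and the service account name below are assumptions for illustration, not necessarily how the module implements it.

# Sketch only: assumes an aws_iam_openid_connect_provider.cluster resource
# has been created for the cluster's OIDC issuer.
data "aws_iam_policy_document" "ebs_csi_assume_role" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [aws_iam_openid_connect_provider.cluster.arn]
    }

    # Only the EBS CSI controller's service account may assume this role.
    condition {
      test     = "StringEquals"
      variable = "${replace(aws_eks_cluster.cluster.identity[0].oidc[0].issuer, "https://", "")}:sub"
      values   = ["system:serviceaccount:kube-system:ebs-csi-controller-sa"]
    }
  }
}

resource "aws_iam_role" "ebs_csi" {
  name               = "ebs-csi-irsa"
  assume_role_policy = data.aws_iam_policy_document.ebs_csi_assume_role.json
}

resource "aws_iam_role_policy_attachment" "ebs_csi" {
  role       = aws_iam_role.ebs_csi.name
  policy_arn = "arn:aws:iam::aws:policy/service-role/AmazonEBSCSIDriverPolicy"
}

resource "aws_eks_addon" "ebs_csi" {
  cluster_name             = aws_eks_cluster.cluster.name
  addon_name               = "aws-ebs-csi-driver"
  service_account_role_arn = aws_iam_role.ebs_csi.arn
}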

Extra Security Groups

By default, node groups are assigned the cluster’s security group, and there is no easy way to attach extra security groups to your nodes. You cannot simply add extra security groups to your aws_eks_node_group resource as you might expect. To work around this limitation, and to gain more control over other aspects of the deployment, create a launch template for your node group and assign the security groups to the launch template:

resource "aws_launch_template" "node_group" {
  ...
  vpc_security_group_ids = concat(
    [aws_eks_cluster.cluster.vpc_config[0].cluster_security_group_id],
    var.node_group_extra_security_groups
  )

This is the approach I found to solve the issue described in this Reddit post, but without pulling in the terraform-aws-eks module.
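
For completeness, the launch template is then referenced from the managed node group. A minimal sketch, with placeholder names and sizes:

# Sketch: node group name and scaling sizes are placeholders.
resource "aws_eks_node_group" "node_group" {
  cluster_name    = aws_eks_cluster.cluster.name
  node_group_name = "default"
  node_role_arn   = aws_iam_role.eks_node.arn
  subnet_ids      = var.subnet_ids

  # Attach the custom launch template that carries the extra security groups.
  launch_template {
    id      = aws_launch_template.node_group.id
    version = aws_launch_template.node_group.latest_version
  }

  scaling_config {
    desired_size = 2
    max_size     = 3
    min_size     = 1
  }
}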

Disk Size

Since we’re using a custom launch template, set the disk size for the nodes there. (You can set disk_size directly on the node group definition if you’re not using a launch template.) Here it goes in the launch template’s block_device_mappings:

resource "aws_launch_template" "node_group" {
...
block_device_mappings {
    device_name = var.launch_template_device_name

    ebs {
      volume_size           = var.node_group_disk_size
      volume_type           = "gp3"
      delete_on_termination = true
      encrypted             = true
    }
  }

}

Enabling Worker Node Access to Cluster API

Add your VPC’s NAT gateway public IP (as a /32 CIDR) to var.cluster_public_access_cidrs. Worker nodes in private subnets reach the cluster’s public API endpoint through the NAT gateway, so its address must be allowed for the nodes to bootstrap and communicate with the control plane.

resource "aws_eks_cluster" "cluster" {
  ...
  vpc_config {
    public_access_cidrs = var.cluster_public_access_cidrs
  }
}
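
For example, if the NAT gateway is managed in the same configuration (a hypothetical aws_nat_gateway.this here), you can pass its public IP as a /32:

# Hypothetical: assumes aws_nat_gateway.this exists in this configuration.
cluster_public_access_cidrs = ["${aws_nat_gateway.this.public_ip}/32"]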

SSM Access

Attach this policy to your node group’s IAM role to allow SSM access:

resource "aws_iam_role_policy_attachment" "ssm" {
  role       = aws_iam_role.eks_node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore"
}

Now you can shell into your worker nodes over SSM. For example, using the AWS CLI directly:

aws ssm start-session --target <instance_id>

Allow Pulling Images from ECR

Attach this policy to the node group’s IAM role to allow your nodes to pull images from Amazon Elastic Container Registry (ECR):

resource "aws_iam_role_policy_attachment" "AmazonEC2ContainerRegistryReadOnly" {
  role       = aws_iam_role.eks_node.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}