Tailscale Subnet Router Improvements

Networking, AWS, Infrastructure | 5 min read

It's been a while since I've written, and I wanted to talk a little bit about what I've been up to. I've been pretty busy finding a job, so a lot of my time has gone into that. Over that time, though, I've been building a project that I won't elaborate on too much, but I wanted to share some improvements to my previous article on Tailscale subnet routers. It is a pretty short article, but if you don't feel like reading it, the summary is that I wrote some Terraform code to provision a Tailscale subnet router.

What is Tailscale and a subnet router

If you don't know what Tailscale is, it is basically a VPN service that allows you to securely connect to remote devices and networks. You can learn more about it here. I'm not affiliated with them, but I like their product and free tier. The subnet router is a networking component that lets me connect to an EC2 instance in my AWS VPC and, through it, reach the internal routes it advertises.

In my new project, I had the need to connect to a development database from a few applications. My development services are running in minikube for the time being since I don't want to pay to host them in EKS. Reopening my previous Tailscale project, I found myself feeling like there were many improvements I could make.

What I wanted to improve

  1. Remove the need for SSH (key pair and security group rules)
  2. Configure the VM to run the subnet router without any manual setup

Removing SSH

This was the easier of the two because EC2 supports Systems Manager connections as long as the SSM agent is running on the VM. This required two things: making sure the SSM agent was installed by my user-data script on startup, and creating an IAM instance profile for the VM to run as. To install the SSM agent I added the following lines to aws-user-data.sh.

# Install the SSM agent and have systemd start it on boot
mkdir -p /tmp/ssm
cd /tmp/ssm
wget https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/debian_amd64/amazon-ssm-agent.deb
sudo dpkg -i amazon-ssm-agent.deb
sudo systemctl enable amazon-ssm-agent

This downloads and installs the SSM agent, then enables it as a systemd service so it starts on boot.

Secondly, the VM needed the AmazonSSMManagedEC2InstanceDefaultPolicy policy so that SSM has permission to connect to the instance. In my infrastructure repository I added the following Terraform code, which attaches the policy to a role that EC2 can assume and wraps that role in an instance profile.

resource "aws_iam_role" "tailscale_iam_role" {
  name = "tailscale-iam-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": "sts:AssumeRole",
      "Principal": {
        "Service": "ec2.amazonaws.com"
      },
      "Effect": "Allow",
      "Sid": ""
    }
  ]
}
EOF

  managed_policy_arns = [
    "arn:aws:iam::aws:policy/AmazonSSMManagedEC2InstanceDefaultPolicy"
  ]

  tags = var.tags
}

resource "aws_iam_instance_profile" "tailscale_iam_profile" {
  name = "tailscale-iam-profile"
  role = aws_iam_role.tailscale_iam_role.name
  tags = var.tags
}
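
With the agent installed and the instance profile attached, a session can be opened without any SSH key or inbound security group rule. The helper below is a minimal sketch of what that looks like from a laptop. The function name open_session and the instance ID check are illustrative, the us-west-2 default is an assumption taken from the region used in the user-data script in this article, and aws ssm start-session needs the Session Manager plugin installed alongside the AWS CLI.

```shell
# Hypothetical helper: open a Session Manager shell on the subnet router.
# Assumes the AWS CLI and the Session Manager plugin are installed locally.
open_session() {
  instance_id="${1:-}"
  region="${2:-us-west-2}"  # assumed default, matching the region used in this article

  # Refuse anything that does not look like an EC2 instance ID.
  case "$instance_id" in
    i-?*) ;;
    *)
      echo "usage: open_session <instance-id> [region]" >&2
      return 1
      ;;
  esac

  aws ssm start-session --target "$instance_id" --region "$region"
}
```

Calling open_session with the router's instance ID should drop you into a shell on the VM, provided your own IAM user or role is allowed to start sessions.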

Automatically start the subnet router

This one was a little trickier. The plan I came up with was to take these steps.

  1. Store a Tailscale auth key in Secrets Manager
  2. Pull the secret in the user-data script
  3. Install the Tailscale package on the VM
  4. Have systemctl manage tailscale
  5. Pass variables into the script for the secret ARN and the internal routes

For 1 and 2, I created a simple secret in Secrets Manager and provisioned a customer managed KMS key for encrypting and decrypting it. Both were created in Terraform, and I used the outputs from those resources to pass their ARNs to the IAM changes that needed to be made. Since the VM needed to pull this secret, I changed the previously mentioned IAM setup to also include a new policy that looked like this.

data "aws_iam_policy_document" "tailscale_auth_secret_policy" {
  # Get access to specific secret
  statement {
    actions   = ["secretsmanager:GetSecretValue"]
    resources = [var.tailscale_auth_arn]
    effect    = "Allow"
  }

  statement {
    actions   = ["kms:Decrypt"]
    resources = [var.tailscale_secret_key]
    effect    = "Allow"
  }
}

resource "aws_iam_policy" "tailscale_iam_policy" {
  name        = "tailscale-policy"
  description = "A policy that allows tailscale to pull its auth key from secrets manager"
  policy      = data.aws_iam_policy_document.tailscale_auth_secret_policy.json
}

I also added the awscli to the user-data script by using the command apt install awscli -y.

With the IAM profile updated, I turned to the user-data script to tackle 3, 4, and 5. To download the necessary packages to run Tailscale I added these lines.

# Download Tailscale
curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/bionic.gpg | sudo apt-key add -
curl -fsSL https://pkgs.tailscale.com/stable/ubuntu/bionic.list | sudo tee /etc/apt/sources.list.d/tailscale.list
sudo apt-get update
sudo apt-get install tailscale -y
# Configure for IP forwarding
echo 'net.ipv4.ip_forward = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
echo 'net.ipv6.conf.all.forwarding = 1' | sudo tee -a /etc/sysctl.d/99-tailscale.conf
sudo sysctl -p /etc/sysctl.d/99-tailscale.conf

Installing the packages is fairly simple, and I followed Tailscale's documentation for the sysctl kernel settings that enable IP forwarding.

To start using the package, I had systemctl manage the service and called tailscale up with the subnets I wanted to advertise and the auth key.

# Tailscale up with auth key
sudo systemctl enable --now tailscaled
AUTH_KEY=$(aws secretsmanager get-secret-value --secret-id "{{ secret_arn }}" --query SecretString --output text --region us-west-2)
sudo tailscale up --advertise-routes="{{ advertised_routes }}" --authkey "$AUTH_KEY"
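
One thing the snippet above does not handle is a failed secret lookup: if the AWS CLI call fails or returns nothing, tailscale up still runs, just with an empty key. Below is a minimal sketch of a guard, assuming the same CLI call and region as above. The function names require_value, fetch_auth_key and start_router are made up for illustration and are not part of the original script.

```shell
# Hypothetical guard for the user-data script: stop early when the auth key
# could not be read, instead of calling tailscale up with an empty value.

# Fail with a message on stderr when the given value is empty.
require_value() {
  name="$1"
  value="${2:-}"
  if [ -z "$value" ]; then
    echo "error: $name is empty" >&2
    return 1
  fi
}

# Read the secret string for the given ARN (same CLI call as in the article).
fetch_auth_key() {
  aws secretsmanager get-secret-value \
    --secret-id "$1" \
    --query SecretString \
    --output text \
    --region us-west-2
}

# Bring the router up only when a key was actually returned.
start_router() {
  secret_arn="$1"
  routes="$2"
  auth_key="$(fetch_auth_key "$secret_arn")" || return 1
  require_value "auth key" "$auth_key" || return 1
  sudo tailscale up --advertise-routes="$routes" --authkey "$auth_key"
}
```

In the user-data script the call would then be start_router "{{ secret_arn }}" "{{ advertised_routes }}", with the placeholders filled in the same way as before.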

Now for the last part: variable replacement for the above code. I used the replace function in Terraform to put my desired values into the script.

user_data = replace(
  replace(
    file("aws-user-data.sh"),
    "{{ advertised_routes }}", local.tailscale_advertised_routes
  ),
  "{{ secret_arn }}", var.tailscale_auth_arn
)
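
Because the placeholders are plain text, the same substitution can be reproduced outside Terraform to preview the rendered script before applying. This is a sketch using sed and is not part of the original setup. render_user_data is a hypothetical helper, and it assumes the values contain no |, & or backslash characters, since those are special in a sed replacement.

```shell
# Hypothetical preview helper: apply the same two substitutions that the
# Terraform replace() calls perform, reading the template from stdin.
# Usage: render_user_data <routes> <secret-arn> < aws-user-data.sh
render_user_data() {
  routes="$1"
  secret_arn="$2"
  # "|" is the sed delimiter because CIDR blocks and ARNs can contain "/".
  sed \
    -e "s|{{ advertised_routes }}|${routes}|g" \
    -e "s|{{ secret_arn }}|${secret_arn}|g"
}
```

Any {{ ... }} left in the output points to a placeholder whose spelling does not match between the script and the Terraform code.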

The local.tailscale_advertised_routes value is built in a locals block: the splat operator pulls the cidr_block out of each object in my list of subnets, and join combines them with commas.

locals {
  tailscale_advertised_routes = join(",", var.rds_private_subnets[*].cidr_block)
}

This puts tailscale_advertised_routes in the form "10.0.0.0/24,10.0.1.0/24,10.0.2.0/24", which is what the tailscale up command expects.
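
Since a mismatch in the template would leave the literal placeholder in place, a quick format check on the routes string can catch that before tailscale up runs. This is a minimal sketch, assuming IPv4 routes only. is_cidr_list is a hypothetical name, and the pattern checks the shape of the string, not whether each octet or prefix length is in range.

```shell
# Hypothetical check: succeed only for a comma-separated list of IPv4 CIDR
# blocks such as "10.0.0.0/24,10.0.1.0/24". Shape only; ranges are not checked.
is_cidr_list() {
  cidr='([0-9]{1,3}\.){3}[0-9]{1,3}/[0-9]{1,2}'
  printf '%s\n' "${1:-}" | grep -Eq "^${cidr}(,${cidr})*\$"
}
```

In the user-data script this could run right before tailscale up, for example is_cidr_list "{{ advertised_routes }}" || exit 1.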

Summary

As a DevOps-focused engineer, I couldn't resist improving this Terraform code when I revisited it. It wasn't too much of a time commitment, and I hope this can help someone else with their project or work one day. If you enjoyed the article, please send it to a friend or share it on social media. Thanks for reading!