Static Hosting With Lambda@Edge

Hosting a static website on AWS used to be a major hassle. The service lacked many of the key features expected from managed static hosting, features that were readily available from companies like Netlify and Surge.

Setting up a static website on AWS meant hours of fighting with policies and configs just to get things up and running, and even then you ended up with limited redirect functionality, no pushState routing, and a wacky CloudFront setup.

Those days are now behind us with the witchcraft of Lambda@Edge.

Lambda@Edge is an extension of AWS Lambda, a compute service that lets you execute functions that customize the content delivered through CloudFront.

To demonstrate the power of Lambda@Edge, I’ll walk through how to implement an actual edge function that rewrites requests for pushState routing and clean URLs.

And the best part: it costs next to nothing and scales as far as CloudFront does. You only pay minimal AWS rates for file storage, data transfers, and the occasional function invocation. If your account is on the free tier, it will literally cost you nothing, unless you get really popular really fast.

How to Get Started

For the purpose of this tutorial, I’ll be using Create React App to generate a single page application and will define the infrastructure as code using Terraform. Feel free to use whatever front-end framework you want, as well as create everything in the AWS console if that’s your thing.

Dependencies

To get started, create and build a production-ready version of your app.

$ npx create-react-app my-app
$ cd my-app
$ npm run build

Writing the Edge Function

Lambda@Edge functions are very similar to regular Lambda functions. The exported handler is called, CloudFront data is passed to the event object, and you’re expected to invoke the callback function with data for a request or response — depending on whether you want to hit the origin or not.
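
To make that concrete, the event CloudFront hands you for an origin request is shaped roughly like this. This is abbreviated and the values are purely illustrative; the real payload carries more fields:

{
	Records: [
		{
			cf: {
				config: { eventType: "origin-request" },
				request: {
					clientIp: "203.0.113.178",
					method: "GET",
					uri: "/about",
					querystring: "",
					headers: { host: [{ key: "Host", value: "example.cloudfront.net" }] },
				},
			},
		},
	],
}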

With this basic understanding, it’s simple to modify the request before we pass it through to the origin. Go to the root of your project and create a file at cloud/rewrite.js to get started.

cloud/rewrite.js
exports.handler = (evt, ctx, cb) => {
	const { request } = evt.Records[0].cf
	request.uri = "/index.html"

	cb(null, request)
}

Simple enough. Now every request that goes through CloudFront will be rewritten to the index file before it reaches the origin. But do you see the problem?

We don’t want to rewrite everything. We still need to request assets from the origin, and only want to rewrite when using routes defined in our SPA. So instead of rewriting everything like we did above, let’s assume any URL without a file extension should be rewritten.

cloud/rewrite.js
const path = require("path")

exports.handler = (evt, ctx, cb) => {
	const { request } = evt.Records[0].cf
	if (!path.extname(request.uri)) {
		request.uri = "/index.html"
	}

	cb(null, request)
}

Cool. Now we can request our images and CSS files without a problem.

But what if someone goes to my-app.com/about.html? Obviously we want this to use the /about route—but with our current function it will try to request about.html from the origin since it has a file extension.

Let’s add a condition to redirect any requests that have the .html extension.

cloud/rewrite.js
const path = require("path")
const { STATUS_CODES } = require("http")

// Build a CloudFront response object that 301-redirects to the given URI
function redirect(to) {
	return {
		status: "301",
		statusDescription: STATUS_CODES["301"],
		headers: {
			location: [
				{ key: "Location", value: to },
			],
		},
	}
}

exports.handler = (evt, ctx, cb) => {
	const { request } = evt.Records[0].cf

	// Requests ending in .html or .htm get redirected to the extensionless route
	const htmlExtRegex = /(.*)\.html?$/
	if (htmlExtRegex.test(request.uri)) {
		const uri = request.uri.replace(htmlExtRegex, "$1")
		return cb(null, redirect(uri))
	}

	// Anything without a file extension is an SPA route, so serve the index file
	if (!path.extname(request.uri)) {
		request.uri = "/index.html"
	}

	cb(null, request)
}

Now when someone makes a request with the .html (or .htm) suffix, the function short-circuits and redirects them to the same route without the extension. This is a simple example of how to modify requests and return responses to CloudFront without ever hitting the origin.
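
If you want to sanity-check the handler before deploying it, you can call it locally with a mock CloudFront event. This test-rewrite.js file is just a throwaway sketch at the project root, not part of the deployment:

test-rewrite.js
// Throwaway local test: feed the handler fake CloudFront events and log the result
const { handler } = require("./cloud/rewrite")

function invoke(uri) {
	const event = { Records: [{ cf: { request: { uri, headers: {} } } }] }
	handler(event, {}, (err, result) => console.log(uri, "=>", JSON.stringify(result)))
}

invoke("/about")          // rewritten to /index.html
invoke("/about.html")     // 301 redirect to /about
invoke("/static/app.css") // passed through untouched

Run it from the project root with node test-rewrite.js and check that the three cases behave as commented.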

Setting up the Infrastructure

Now that we have our edge function code written, let’s create the AWS infrastructure for our site. We’ll be using S3 for file storage, CloudFront as our CDN, and Lambda to run the function at the edge.

To get started, go to your project root and create a file at infra/main.tf.

infra/main.tf
variable "app_name" {
	default = "my-app"
}

variable "profile" {
	default = "default"
}

provider "aws" {
	region = "us-east-1"
	profile = "${var.profile}"
}

You can change the name of your app, but I’ll go with my-app for now.

If you set up a named profile when you configured the AWS CLI, you’ll need to pass it to Terraform as a variable when applying the infrastructure, or change the default value above. There are several ways to authenticate the AWS provider if you don’t use profiles.
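
For example, if your profile were named my-profile (a placeholder, substitute your own), you’d apply with:

$ terraform apply -var 'profile=my-profile'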

The region always needs to be us-east-1—all Lambda@Edge functions are created in that region and automatically replicated to the other regions by CloudFront.

S3 and CloudFront

The backbone of any static hosting setup: object storage to hold your files and a CDN to serve them. It wouldn’t be the same without those services.

Create a file at infra/s3.tf and define your S3 bucket.

infra/s3.tf
resource "aws_s3_bucket" "main" {
	bucket_prefix = "${var.app_name}"
	acl = "private"
	force_destroy = true
	acceleration_status = "Enabled"
}

If you’ve used S3 hosting before, you’ll notice that we don’t define any special website hosting config on the bucket — it’s actually set to private.

This is intentional, because we don’t want public access on the bucket. We would rather force all visitors to view our content through CloudFront — otherwise our rewrites won’t work and this will all be for nothing.

We’ll also enable Transfer Acceleration, since we’re creating the bucket in us-east-1—you don’t want your deployments to take ages if you’re not in the United States. Technically things will still work if you create your bucket in a different region, but you may run into temporary redirect issues that will break things.

Since our bucket is not public, we’ll need to create a bucket policy that allows our CloudFront distribution to access our files. This can be done using an Origin Access Identity and a special bucket policy.

infra/s3.tf
resource "aws_cloudfront_origin_access_identity" "main" {
	comment = "Created for ${var.app_name}"
}

data "aws_iam_policy_document" "s3" {
	statement {
		actions = [
			"s3:ListBucket",
			"s3:GetObject"
		]
		resources = [
			"${aws_s3_bucket.main.arn}",
			"${aws_s3_bucket.main.arn}/*"
		]
		principals {
			type = "AWS"
			identifiers = ["${aws_cloudfront_origin_access_identity.main.iam_arn}"]
		}
	}
}

resource "aws_s3_bucket_policy" "main" {
	bucket = "${aws_s3_bucket.main.id}"
	policy = "${data.aws_iam_policy_document.s3.json}"
}

Now any CloudFront distributions associated with that identity will be allowed to download and cache our objects.

Let’s create our distribution in infra/cloudfront.tf. I’m not using a custom domain for now, but if you are, you’ll add an aliases field as well as your SSL certificate setup (there’s a sketch of that after the distribution below).

infra/cloudfront.tf
resource "aws_cloudfront_distribution" "main" {
	enabled = true
	http_version = "http2"
	price_class = "PriceClass_All"
	is_ipv6_enabled = true
	origin {
		origin_id = "s3-origin"
		domain_name = "${aws_s3_bucket.main.bucket_domain_name}"
		s3_origin_config {
			origin_access_identity = "${aws_cloudfront_origin_access_identity.main.cloudfront_access_identity_path}"
		}
	}
	default_cache_behavior {
		target_origin_id = "s3-origin"
		allowed_methods = ["GET", "HEAD"]
		cached_methods = ["GET", "HEAD"]
		viewer_protocol_policy = "redirect-to-https"
		min_ttl = 0
		default_ttl = 3600
		max_ttl = 86400
		compress = true
		forwarded_values {
			query_string = false
			cookies {
				forward = "none"
			}
		}
	}
	viewer_certificate {
		cloudfront_default_certificate = true
	}
	restrictions {
		geo_restriction {
			restriction_type = "none"
		}
	}
}
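
For reference, a custom-domain setup would look roughly like the snippet below. It assumes an ACM certificate for the domain already exists in us-east-1, and the domain name here is just a placeholder:

// Hypothetical additions for a custom domain
data "aws_acm_certificate" "main" {
	domain = "my-app.example.com"
	statuses = ["ISSUED"]
}

resource "aws_cloudfront_distribution" "main" {
	// ...
	aliases = ["my-app.example.com"]
	viewer_certificate {
		acm_certificate_arn = "${data.aws_acm_certificate.main.arn}"
		ssl_support_method = "sni-only"
	}
	// ...
}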

Lambda and IAM

If you were hosting a boring static site with no edge functions, this is where you’d stop. But since you’re cooler than that, go ahead and create the Lambda@Edge function in infra/lambda.tf using the code we wrote earlier.

Terraform is able to zip our JavaScript file and store it in a temporary location, which allows us to upload it directly to Lambda. Just make sure you exclude that temporary location (in this case infra/.zip/) from your version control.
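
Adding a line to the .gitignore at the project root takes care of that:

.gitignore
infra/.zip/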

infra/lambda.tf
data "archive_file" "rewrite" {
	type = "zip"
	output_path = "${path.module}/.zip/rewrite.zip"
	source {
		filename = "index.js"
		content = "${file("${path.module}/../cloud/rewrite.js")}"
	}
}

resource "aws_lambda_function" "rewrite" {
	function_name = "${var.app_name}-rewrite"
	filename = "${data.archive_file.rewrite.output_path}"
	source_code_hash = "${data.archive_file.rewrite.output_base64sha256}"
	role = "${aws_iam_role.main.arn}"
	runtime = "nodejs10.x"
	handler = "index.handler"
	memory_size = 128
	timeout = 3
	publish = true
}

And of course, we’ll need to create an IAM role in infra/iam.tf to associate with our Lambda function. AWS requires this even though the function doesn’t need any special permissions.

infra/iam.tf
data "aws_iam_policy_document" "lambda" {
	statement {
		actions = ["sts:AssumeRole"]
		principals {
			type = "Service"
			identifiers = [
				"lambda.amazonaws.com",
				"edgelambda.amazonaws.com"
			]
		}
	}
}

resource "aws_iam_role" "main" {
	name_prefix = "${var.app_name}"
	assume_role_policy = "${data.aws_iam_policy_document.lambda.json}"
}

resource "aws_iam_role_policy_attachment" "basic" {
	role = "${aws_iam_role.main.name}"
	policy_arn = "arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole"
}

Function Association

Now that we have our CloudFront distribution and our Lambda function created, the last step is to associate our function with a specific distribution trigger.

There are four different trigger types we can associate: viewer request, origin request, origin response, and viewer response. The only one we care about for this tutorial is origin request, which fires right before CloudFront forwards the request to our origin. Cached requests never reach this point, so our function won’t need to run very often.

To associate our function, replace the default_cache_behavior block defined in our CloudFront distribution with this:

default_cache_behavior {
	target_origin_id = "s3-origin"
	allowed_methods = ["GET", "HEAD"]
	cached_methods = ["GET", "HEAD"]
	viewer_protocol_policy = "redirect-to-https"
	min_ttl = 0
	default_ttl = 3600
	max_ttl = 86400
	compress = true
	lambda_function_association {
		event_type = "origin-request"
		lambda_arn = "${aws_lambda_function.rewrite.qualified_arn}"
	}
	forwarded_values {
		query_string = false
		cookies {
			forward = "none"
		}
	}
}

That’s it! CloudFront will now send all uncached requests through our function and we can enable pushState routing and redirects in our app!

To create the actual infrastructure in your AWS account, install Terraform and run these commands:

$ cd infra
$ terraform init
$ terraform apply

Deployments

You can deploy your site using any of the many S3 upload methods out there, but a simple approach is a shell script like the one below. It pulls output values from Terraform, syncs your build to S3, invalidates the CDN cache, and saves you from hard-coding AWS-specific values.

infra/main.tf
// ...

output "bucket_name" {
	value = "${aws_s3_bucket.main.id}"
}

output "cloudfront_id" {
	value = "${aws_cloudfront_distribution.main.id}"
}

output "cloudfront_domain" {
	value = "${aws_cloudfront_distribution.main.domain_name}"
}

Then create a script at deploy.sh. You can run it with sh ./deploy.sh AWS_PROFILE_NAME if you’re using a named AWS profile.

deploy.sh
#!/bin/bash

# Use the given AWS profile if one was passed in
if [ -n "$1" ]; then
  export AWS_DEFAULT_PROFILE=$1
fi

# Read the values we need from Terraform's outputs
cd infra
BUCKET_NAME=$(terraform output bucket_name)
CLOUDFRONT_ID=$(terraform output cloudfront_id)
CLOUDFRONT_DOMAIN=$(terraform output cloudfront_domain)

cd ../
npm run build

# Upload the build folder, removing any stale files
aws s3 sync ./build s3://$BUCKET_NAME \
	--delete

# Invalidate the CDN so the new build is served immediately
aws cloudfront create-invalidation \
	--paths "/*" \
	--distribution-id $CLOUDFRONT_ID

echo "=> Deployed to $CLOUDFRONT_DOMAIN"

Lambda@Edge opens a lot of doors for functionality that wasn’t previously possible with static hosting on AWS. I hope you found this tutorial useful, and I’m looking forward to seeing what else can be built with it.

You can see the full source code for this tutorial on GitHub. For a great new library that adds some cool static hosting features to Lambda@Edge, check out Lambit.
