If you follow me on social media, you might’ve noticed a few tweets talking about how this blog is an overly complicated, multi-cloud and entirely serverless blog.
There are many platforms that provide everything I’ve built by hand out of the box, such as Amplify, Netlify or Vercel, and I would definitely recommend using one of them if you’re building a personal website.
I wanted to dive deeper into how these platforms work and ended up building my own. In this article, I’ll walk through the overall architecture and dive into how some of the features work.
When starting with this project, I had a few requirements in mind:
- Multi-origin: the routing layer should be able to send traffic to various origins, across multiple locations and cloud providers.
- Entirely serverless: I didn’t want to update servers or take care of scaling them. That meant using object storage, functions, etc.
- Cheap: use free plans or services that bill per request as much as possible.
- Atomic deployments: users should only see files related to a single version at a time.
- Automated deployments: when I push to the main branch of the website’s repository, it should start a pipeline that will update the website.
- Previews: when I work on a new article or feature, I should have a preview version available online.
In the end, I settled on this architecture:
Users’ requests hit an originless CloudFlare worker that dispatches traffic to an origin close to them, which could be Amazon S3, Google Cloud Storage, or Azure Blob Storage. I’m also using Honeycomb for telemetry and GitHub Actions for continuous deployment.
For previews, there is another GitHub Actions workflow that deploys to a single origin whenever I create a new pull request.
I decided to pick CloudFlare as my CDN since it has a pretty comprehensive free tier and supports modifying the response body on the fly (which will come in handy for content security policy later on).
CloudFlare provides a library to help build a worker in Rust and in WASM, but it’s still missing a few key features that I need – such as a way to interact with the Cache API. As such, I wrote the worker in TypeScript instead.
I’m using eight different origins across three cloud providers for this website. To keep with the spirit of making things as serverless as possible, I’ve opted for each cloud provider’s object storage solution.
There are a few tricky things when working with object storage directly, such as routing / requests to /index.html, or each provider sending a bunch of headers that are not needed. Thankfully, I could transform that as needed in the worker.
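Both transformations can be sketched as small pure functions. This is only an illustration, not the worker's actual code: the function names are hypothetical, and the list of header prefixes to drop (`x-amz-`, `x-goog-`, `x-ms-`) is an assumption about which provider headers are unwanted.

```typescript
// Assumed list of provider-specific header prefixes to drop (S3, GCS, Azure).
const STRIPPED_PREFIXES = ["x-amz-", "x-goog-", "x-ms-"];

// Map a request path to an object path: directory-style paths get index.html.
function toObjectPath(pathname: string): string {
  return pathname.endsWith("/") ? `${pathname}index.html` : pathname;
}

// Return a copy of the headers (lower-cased names) without provider-specific ones.
function stripProviderHeaders(
  headers: Record<string, string>
): Record<string, string> {
  const cleaned: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    const lower = name.toLowerCase();
    const isProviderHeader = STRIPPED_PREFIXES.some((prefix) =>
      lower.startsWith(prefix)
    );
    if (!isProviderHeader) {
      cleaned[lower] = headers[name];
    }
  }
  return cleaned;
}
```

With this sketch, `toObjectPath("/blog/")` yields `/blog/index.html`, while a path such as `/css/main.css` is left untouched.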
Since I have eight different origins in total, I need to decide which origin to send traffic to. There are a few criteria I wanted to use:
- Geo-location: I divided origins into three main regions: America, Europe, and Asia. CloudFlare provides geo-coordinates for each request, which I could then use with the Haversine formula to find the closest origin to the user.
- Availability: One of the major points of having multiple origins is to handle the case where one of them goes down. CloudFlare Workers have two features that helped a lot here: CloudFlare Workers KV and Cron triggers, so I can periodically check whether every origin is still available. I’ve also configured it to fail open if all origins are down.
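For reference, the Haversine distance mentioned in the geo-location criterion can be computed as follows. This is a minimal sketch: the `Coordinates` and `GeoOrigin` types and the `closestOrigin` helper are hypothetical names for illustration, and the Earth is approximated as a sphere with a mean radius of 6371 km.

```typescript
interface Coordinates {
  latitude: number; // degrees
  longitude: number; // degrees
}

// Hypothetical shape of an origin for this example.
interface GeoOrigin {
  name: string;
  location: Coordinates;
}

const EARTH_RADIUS_KM = 6371; // mean Earth radius

function toRadians(degrees: number): number {
  return (degrees * Math.PI) / 180;
}

// Great-circle distance between two points using the Haversine formula.
function haversineKm(a: Coordinates, b: Coordinates): number {
  const dLat = toRadians(b.latitude - a.latitude);
  const dLon = toRadians(b.longitude - a.longitude);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRadians(a.latitude)) *
      Math.cos(toRadians(b.latitude)) *
      Math.sin(dLon / 2) ** 2;
  // Clamp to 1 to guard against floating-point overshoot before asin.
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(Math.min(1, h)));
}

// Return the origin with the smallest distance to the user, if any.
function closestOrigin(
  user: Coordinates,
  origins: GeoOrigin[]
): GeoOrigin | undefined {
  let best: GeoOrigin | undefined;
  let bestDistance = Infinity;
  for (const origin of origins) {
    const distance = haversineKm(user, origin.location);
    if (distance < bestDistance) {
      best = origin;
      bestDistance = distance;
    }
  }
  return best;
}
```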
In the end, the algorithm first removes origins that aren’t available, then filters for those in the same region as the end-user. If no origin is found after those two filters, I return all origins instead.
There are some flaws with this approach: if all origins in a region are down, I could return only the available origins in the other regions instead. However, as the probability of all origins in a given region being down is very low, I think this is acceptable.
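The two-step filter and its fallback could look roughly like this. The `Origin` shape and the `available` flag are assumptions made for the example; in the setup described above, availability would come from the periodic health checks rather than a hard-coded field.

```typescript
type Region = "america" | "europe" | "asia";

// Hypothetical shape of an origin for this example.
interface Origin {
  name: string;
  region: Region;
  available: boolean;
}

// Keep available origins in the user's region; fall back to every origin
// (including unavailable ones) when the two filters leave nothing.
function candidateOrigins(origins: Origin[], userRegion: Region): Origin[] {
  const candidates = origins
    .filter((origin) => origin.available)
    .filter((origin) => origin.region === userRegion);
  return candidates.length > 0 ? candidates : origins;
}
```

The fallback branch is where the flaw discussed above shows up: when every origin in a region is down, the function returns the full list rather than only the healthy origins elsewhere.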
From there, the worker randomly picks one of the origins, with a small twist. If I just picked an origin at random, each request could go to a different origin. As loading the website for the first time consists of a few different requests, I didn’t want them to hit different origins.
To solve this, I derive an anonymous, stateless and time-windowed identifier for the user, which I use to pick an origin. It’s based on the properties that CloudFlare exposes in the request, plus the request’s timestamp divided by a time interval.
It’s possible that a user will send requests right at the border of two time windows, which would cause them to get responses from two different origins. However, this is a fairly rare occurrence. For example, if it takes 1 second to send all requests and you use a 300-second time window, there is roughly a 0.33% (1 in 300) chance of that happening.
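A minimal sketch of such a time-windowed identifier follows. The article does not say which request properties or which hash function are used, so both are assumptions here: the properties are passed in as plain strings, and a simple non-cryptographic FNV-1a hash stands in for whatever the real worker uses.

```typescript
// 32-bit FNV-1a hash over UTF-16 code units (stand-in hash, not cryptographic).
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Index of the time window that a timestamp (in milliseconds) falls into.
function windowIndex(timestampMs: number, windowSeconds: number): number {
  return Math.floor(timestampMs / (windowSeconds * 1000));
}

// Pick an origin index that stays stable for a client within one time window.
function pickOriginIndex(
  clientProperties: string[], // hypothetical: whatever the request exposes
  timestampMs: number,
  windowSeconds: number,
  originCount: number
): number {
  if (originCount <= 0) {
    throw new Error("originCount must be positive");
  }
  const window = windowIndex(timestampMs, windowSeconds);
  const identifier = fnv1a(`${clientProperties.join("|")}|${window}`);
  return identifier % originCount;
}
```

Because the identifier is recomputed from the request each time, nothing needs to be stored about the user, and it changes on its own once the window rolls over.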
When uploading new files to an object storage solution, you might encounter some issues during the upload process:
- You’ll have a mix of old and new files.
- You could upload a file referring to another one that hasn’t been uploaded.
- You could be left with a lot of files that are no longer needed.
To address these, I decided to use prefixes in the object storage solutions. Whenever I push to the main branch of the repository, GitHub Actions takes the commit ID and uses it as the prefix for every uploaded file. That means that each new version lives in its own folder.
Within the worker, I then use CloudFlare Workers KV once again to store the latest prefix. During an update, it can take up to 60 seconds to propagate to all edge locations, but this is fine as multiple requests from the user should be handled by the same location.
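To make the idea concrete, here is a small sketch of how a request path could be mapped to a versioned object key. The `Deployment` type is a hypothetical stand-in for the value read from Workers KV, and the commit IDs in the usage note are made up.

```typescript
// Hypothetical stand-in for the latest prefix stored in Workers KV.
interface Deployment {
  latestPrefix: string; // e.g. the commit ID of the deployed version
}

// Build the object key for a request path under the current version's prefix.
function objectKeyFor(deployment: Deployment, pathname: string): string {
  const relativePath = pathname.replace(/^\/+/, ""); // drop leading slashes
  return `${deployment.latestPrefix}/${relativePath}`;
}
```

For example, with `latestPrefix` set to `3f2a9c1`, the path `/css/main.css` resolves to `3f2a9c1/css/main.css`. Switching the stored prefix to another commit ID moves every request to that version at once.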
After the update, I can then go and delete old prefixes. The exact way to do that differs a bit for each cloud provider. Object storage services typically use a flat storage structure, with no awareness of folders – but it’s sometimes possible to list prefixes nonetheless. This is one of the many papercuts of building a multi-cloud solution.
# List prefixes in an Amazon S3 bucket
# List prefixes in an Azure Storage container
# List prefixes in a Google Cloud Storage bucket
The AWS experts amongst you might notice that I’m using aws s3api instead of aws s3 ls. This is because aws s3 ls doesn’t provide machine-readable output. Also, please note that these queries might have a maximum number of results per API call, so you might need to paginate through them.
I haven’t opted to do so, but you could build a more complex system that will preserve a certain number of older prefixes in case you want to support fast rollbacks.
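As an illustration of the clean-up step, the sketch below derives top-level prefixes from a flat list of object keys and works out which ones can be deleted while keeping a chosen set, for instance the current prefix plus a few older ones for rollbacks. It deliberately avoids any cloud SDK: the key list is assumed to have been fetched already, and both function names are hypothetical.

```typescript
// Extract the distinct top-level "folders" from a flat list of object keys.
function topLevelPrefixes(keys: string[]): string[] {
  const prefixes: string[] = [];
  for (const key of keys) {
    const slash = key.indexOf("/");
    if (slash <= 0) {
      continue; // object stored at the root, not under a prefix
    }
    const prefix = key.slice(0, slash);
    if (prefixes.indexOf(prefix) === -1) {
      prefixes.push(prefix);
    }
  }
  return prefixes;
}

// Prefixes that are safe to delete: everything not explicitly kept.
function prefixesToDelete(allPrefixes: string[], keep: string[]): string[] {
  return allPrefixes.filter((prefix) => keep.indexOf(prefix) === -1);
}
```

Which older prefixes to keep is a policy decision. Since commit IDs carry no ordering, a real system would need to record deployment order somewhere to pick the most recent ones.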
One of the big advantages of using a CDN like CloudFlare is its ability to cache responses from the origin. Since this is a static website, the content only changes when I push a new version. CloudFlare Workers already take care of this, but there is also a Cache API for fine-grained control.
One important thing with atomic deployments is to automatically clear the cache upon release to ensure consistency. The user experience would be pretty bad if someone were served files from different versions at the same time.
On the backend side, I opted to control the cache manually and include the prefix of the current version in the cache key to determine whether a file is cached. Upon release, I don’t need to invalidate the cache: the prefix gets updated, the next request doesn’t match anymore, and thus the worker fetches the newest version from the origin. Since I don’t invalidate the cache, if I need to roll back for any reason, I can just change the prefix back.
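The effect of putting the version prefix in the cache key can be shown with a plain `Map` standing in for the edge cache. This is a simplified model rather than the Cache API itself, and the key format is an assumption made for the example.

```typescript
// Stand-in for the edge cache: maps a cache key to a cached response body.
type EdgeCache = Map<string, string>;

// Assumed key format: the version prefix followed by the request path.
function cacheKey(prefix: string, pathname: string): string {
  return `${prefix}${pathname}`;
}

function lookup(
  cache: EdgeCache,
  prefix: string,
  pathname: string
): string | undefined {
  return cache.get(cacheKey(prefix, pathname));
}

function store(
  cache: EdgeCache,
  prefix: string,
  pathname: string,
  body: string
): void {
  cache.set(cacheKey(prefix, pathname), body);
}
```

An entry stored under one prefix is simply never looked up once the prefix changes, which behaves like an invalidation. Changing the prefix back makes the old entries hit again, which is what makes rollbacks cheap.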
Client-side, I use a technique called cache-busting. Appending a hash of the file to the query string tricks the browser into thinking it’s a completely new file, so it bypasses the client-side cache.
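A cache-busting helper might look like the sketch below. The query parameter name `v` and the function name are assumptions, the content hash is expected to be computed at build time, and URLs containing a fragment (`#`) are not handled.

```typescript
// Append a content hash as a query parameter so a changed file gets a new URL.
function withCacheBuster(url: string, contentHash: string): string {
  const separator = url.indexOf("?") === -1 ? "?" : "&";
  return `${url}${separator}v=${encodeURIComponent(contentHash)}`;
}
```

With this, `/css/main.css` and a hash of `abc123` become `/css/main.css?v=abc123`. The URL only changes when the file's content does.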
In this first article about the blog, I explored the core architecture that makes this website multi-cloud and serverless at the same time – with a decent amount of custom engineering to recreate existing features from other platforms.
There are other things I’d like to talk about, such as the frontend and design, security features, and more. If you’re interested in knowing more, make sure to follow me on Twitter or check the syndication feed.