Forever hosting

A recent Hacker News thread, "Ask HN: Best way to host a website for 500 years?", prompts me to dust off this article that I’ve had sitting in my drafts folder for about four years now and haven’t been able to finish (apart from this offshoot piece, "The 100-year challenge", which is about how hard it is to make anything last over really long time scales). The basic question posed in the thread is this:

Say you wanted to host a personal page that can outlive you and be seen by the children of your grandchildren. Other than asking your progeny to keep paying the hosting bills, is there another way?

The thread devolves into two main categories of comments:

  1. Proposed technical solutions; or
  2. Observations that it’s unlikely that anybody is going to care 500 years from now about anything that you wrote.

You can see plenty of evidence in the HN thread that the first category is difficult, whether the challenge is technical, legal (setting up durable structures), or financial.

As far as the second category goes, just look at the growth of the web:

Year Websites
2021 1.9B+
2015 863M
2010 207M
2005 65M
2000 17M
1995 23K
1990 0

By any metric you might care to measure (sites, pages, users etc), the amount of "stuff" — and producers of "stuff" — on the internet is growing exponentially. It’s difficult enough to attract attention in the present day; exponential growth makes the odds of producing something that will be considered noteworthy 500 years from today vanishingly small.

But all this is very abstract. I can make this much more concrete by looking at how well I’ve done at preserving data over my own puny lifespan.

  • First up, on technical difficulty: I’ve tried hard to keep all the email I ever sent or received, and failed impressively, despite being obsessive about backups. Through numerous lossy migrations between applications, providers, and hosts, the oldest emails I can find date back only about 15 years, which means I’ve straight up lost my first 10 or so years of email.
  • Second, on noteworthiness: You would think I’d be interested in my own photos, but the sad truth is that even a "modestly sized" collection like mine (43.5K photos) is lethargy-inducingly large; I occasionally dig back through it looking for specific things, but computer-assisted search still ain’t that great despite our advances in machine learning — it’s hard for me to imagine my heirs having enough interest to trawl through all this stuff, with all the other stuff that will be competing for their attention in the future.

All of this leads me to conclude that while it is a fun intellectual exercise to think about how to make one’s data as durable as possible, one shouldn’t be deceived about any of this stuff having transcendental significance: none of this really matters enough to warrant making anything but a "reasonable best effort" at keeping things preserved, and one that doesn’t distract us too much as we head outside to go and smell the roses.

Writing is getting harder

For a while now I’ve been noticing that it is harder and harder to actually finish a blog post. I’ll arrive in my text editor with some inspiration and instead of sharing my flash of insight with the world, I find myself burying it under layer after layer of context and historical explanation. That is, instead of just saying, "I think this", I say, "I think this, but before I explain in detail what I mean, let me tell you how I got here — originally, I thought A, then B, and after a while C, D, and E". Not only does this make the posts longer and harder to write, it makes them harder to read as well, because it’s really hard to weave all of those elements into a coherent narrative — it’s all too easy for the piece to wind up being a meandering, rambling nostalgia-trek as opposed to a forceful statement. I’d rather be in the business of making memorable, forceful statements that stick in people’s minds than writing 20,000-word long-form blog posts that may or may not hold a reader’s interest to the end.

This writing style, which not only expresses an idea but also contrasts it with previous ideas, might have served me well in my 20s, when the format basically boiled down to "I used to think A but now I think B", or when writing at university, where your literature review was supposed to cover all the significant ideas that had been thought up to that date. But now, it’s a real hindrance to getting things out there. I mean, sure, there is (obviously!) some value in exploring nuances and contrasting different viewpoints, in seeing how parameters that may have seemed fixed ended up being flexible, and how experience, new information, and changing circumstances endlessly shape one’s view of the world. In fact, a growing awareness of this is what we’re referring to when we speak of the wisdom of elders. But no matter how interesting it may be, it’s no good if it gets in the way of actually publishing something.

Moving forward, I’m going to see if I can fight off this demon (ha, I actually wrote daemon for a second there 😂) and impose some discipline on myself. Shorter posts (like this one), even if it means jettisoning a bunch of potentially interesting context.

Simplifying my Ansible set-up

Ansible is the worst automation platform out there, except for all others.

— Sir Winston Churchill

You could say I have a "love-hate" relationship with Ansible. After using Puppet and Chef in work environments, I found them to be utter overkill for any personal projects. In contrast, Ansible promised to be a "radically simple IT automation platform" (filthy lies!), and compared to the others, it is. For my use case (maintaining a couple of EC2 instances), it works pretty well. There is no "Chef server" or "Puppet master" orchestrating things at the heart of the system: there is just a Git repo with some configuration files in it (just on my local laptop) and an Ansible executable that I can run directly and which will ssh up into EC2 to do the work.
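
To make this concrete, here is a minimal sketch of the kind of set-up I mean; the hostname, user and tasks are invented for illustration rather than lifted from my actual playbook:

    # playbook.yml: a hypothetical, minimal example. There is no server
    # component; you run this from your laptop and Ansible does everything
    # over SSH.
    - hosts: web.example.com
      remote_user: ec2-user
      become: true
      tasks:
        - name: Ensure nginx is installed
          ansible.builtin.yum:
            name: nginx
            state: present
        - name: Ensure nginx is running and enabled at boot
          ansible.builtin.systemd:
            name: nginx
            state: started
            enabled: true

    # Run directly, passing the host as an ad-hoc inventory (note the trailing comma):
    #   ansible-playbook -i web.example.com, playbook.yml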

But it is still pretty complicated. The project itself is huge, and its dependency footprint is big too. The whole thing is in Python, limiting my ability to debug or modify it when things go wrong (seeing as I am not a "Pythonista"). And it is pretty slow: every little command you run requires a new SSH connection to the server (even if you reduce the overhead by using SSH’s ControlMaster functionality, it’s still slow). In the end I’ve had to implement cumbersome workarounds to address the performance issues, like telling Ansible to upload a Bash script to the server that does something to 40 different Git repos all at once, instead of telling Ansible itself to do the work. It kind of feels like having a fancy mesh WiFi network in your home, but then running ethernet cables all over the floor connecting all the rooms together.
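
In playbook terms, the workaround looks something like this (the script path is a made-up stand-in; the real script touches around 40 repos in one pass):

    # One task, one SSH round-trip: push a local Bash script to the host and
    # run it there, instead of looping over each repo with a separate task
    # (and therefore a separate connection) per repo.
    - name: Update all Git repos in one go
      ansible.builtin.script: scripts/update-git-repos.sh
      args:
        executable: /bin/bash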

The sheer amount of code involved in Ansible makes upgrades scary. Last time I looked, a clean copy of the Ansible repo clocked in at well over 200 megabytes. For a while I was even using Ansible to set up my local laptop, but my trepidation about its footprint and the fear of things breaking on updates eventually led me to throw it out and write my own framework instead. All I need to do on my local machine is edit a file here and there, set up some links, maybe install some things or run some scripts, so my tiny home-grown tool suffices.

For my EC2 use case, however, I’m still not ready to throw out Ansible. I don’t want to have to deal with platform differences and network communications, which are two of the things that Ansible basically has totally figured out at this point.

Amazon has "Amazon Linux 2" now, and the "Amazon Linux" machines that I’ve been using for many years need to be migrated. You can’t just upgrade; you have to set up everything again. There have been some reasonably important changes between versions (like switching to systemd), which mean I may as well start from scratch and take the opportunity to redo, update and simplify things as much as possible. This is an opportunity to pay off technical debt, do some upgrades, and set things up "The Right Way™".

Before starting, I sought to simplify my arrangements on the instances as much as possible. For example, I had some static sites hosted on one of these machines which could be offloaded to GitHub Pages. And I had some private Git repos that I was backing up by taking EBS snapshots of their volumes, which I could also just mirror off to GitHub as private repos (and once I had that offsite backup, I could stop doing the EBS snapshots). And this in turn meant that I could simplify the volume structure: instead of having a separate XFS-formatted /data volume, I could just have everything on the root filesystem (XFS is now the default root filesystem on Amazon Linux 2, and I don’t even care about keeping things separate as I can now recreate any instance from scratch based on data available elsewhere).
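
The mirroring itself is just a one-time remote plus a mirror push (the remote name and repo below are placeholders):

    # Add a (placeholder) private GitHub repo as a remote and push every ref
    # to it as an offsite backup.
    git remote add github git@github.com:example/some-private-repo.git
    git push --mirror github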

I’ve always been skeptical of putting too many eggs into corporate baskets, taking great pains to minimize my dependence on Google, for example. For the longest time I didn’t push anything private to GitHub for this reason, even though their servers are most certainly safer and better maintained than my "lone wolf" amateur EC2 instances. But over the years, I’ve also realized that the real value of a lot of this private data that I’ve been pushing to my secret repos isn’t actually so great after all. It could be irrecoverably lost to virtually no consequence, and it could be leaked or exposed with only a little discomfort and inconvenience. Added to that, I actually started working for GitHub last month and I figure that if a company with a multi-trillion-dollar market cap like Microsoft is prepared to place a bet on GitHub, then little old me shouldn’t have any qualms about it — I have much less to lose, after all.

One of these EC2 instances hosts this blog, and I was able to simplify that too. When I set up the old instance (back in 2015) a large chunk of the content in the blog was written in "wikitext" format, and that was turned into HTML using a Sinatra (Ruby) microservice. Since then, I migrated all the wikitext to Markdown (a fun story in itself) and spun down the microservice. That means the instance no longer needs Ruby or RubyGems.

The other EC2 instance was running PHP for a couple of domains (www.wincent.com and secure.wincent.com). I simplified my set-up on that instance by making a static mirror of all the files and folding them into wincent.com itself (running on the other instance). This is the 5,000-file/1,000,000-line commit where I brought all that content across. The follow-up commit where I ran all the static HTML/"PHP" through Prettier is pretty epic, clocking in at over 3,000,000 lines. I also updated a quarter of a million links in this commit. Fun times.

The great thing about all these simplifications and migrations is that my instances are now close to being, effectively, "stateless". That is, I don’t really have to worry about backing them up any more because I can recreate them from scratch by a combination of bootstrapping with Ansible, and git push-ing data to them to seed them with content. If I lose my laptop and GitHub destroys my data then I’m in trouble, but I feel reasonably safe with three-fold redundancy (ie. the instance + my local copy + GitHub’s). It’s not infallible by any means, but it definitely meets the bar of "good enough"; at least, good enough that I’m not going to lose any sleep over all this.

Moving to Amazon Linux 2 was a pain in some ways (ie. having to rewrite Upstart scripts as systemd units) and great in others (eg. having access to recent versions of Monit, Redis and other software without having to build from source; in the end, the only software I had to actually build was a recent version of NodeJS on one of the hosts). Along the way, I also moved from acme.sh (which recently sold out to commercial interests) to acme-tiny (which sounds like my personal Let’s Encrypt spirit animal, being "a tiny, auditable script … currently less than 200 lines"), and made numerous improvements to make the certificate renewal process more robust. I even went so far as to finally set up a proper "non-root" IAM user for doing my admin work in the AWS console. Key pairs were rotated, security groups cleaned up, Subject Alternative Names trimmed, and so on. Basically, I took the opportunity to pay off as much tech debt as I could as I went.
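
Most of the Upstart-to-systemd rewrite boils down to playbook tasks along these lines; the unit name and template are hypothetical stand-ins rather than my real configuration:

    # Install a (hypothetical) systemd unit from a template, then reload
    # systemd and restart the service, but only when the unit file changes.
    - name: Install example.service unit
      ansible.builtin.template:
        src: example.service.j2
        dest: /etc/systemd/system/example.service
      notify: Restart example

    # The corresponding handler:
    - name: Restart example
      ansible.builtin.systemd:
        name: example
        state: restarted
        daemon_reload: true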

The above simplifications meant that my overall requirements were now basic enough that I could dispense with most of the abstractions that Ansible provides (like group variables, roles, and so on) and just put everything in a single playbook. This is really marvellous for maintenance: it is a 1.5k-line YAML file, but that includes everything (tasks, handlers and variables for two hosts), and it all reads 100% linearly with no abstraction clouding or concealing what’s actually happening; you can just read it from top to bottom and see exactly what is going to happen on both hosts. Now, there is some repetition in there that could be factored out, but the repetition in this case is what keeps the whole thing simple. I’m probably not going to touch it. Additionally, getting rid of roles means that all of my templates and files are consolidated in a single location in the repo root instead of being dispersed over a dozen or so subdirectories hidden three levels deep.
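
In outline (with invented hostnames and variables), the structure is nothing more than two plays in a single file:

    # site.yml, sketched: two plays, one per host, read top to bottom;
    # no roles, no group_vars, everything inline.
    - hosts: blog.example.com
      remote_user: ec2-user
      vars:
        deploy_root: /var/www        # inline variable, for illustration only
      tasks:
        - name: Show where this host deploys to
          ansible.builtin.debug:
            msg: "Deploying under {{ deploy_root }}"

    - hosts: other.example.com
      remote_user: ec2-user
      tasks:
        - name: Tasks for the second host simply follow in the same file
          ansible.builtin.debug:
            msg: "Nothing hidden behind an abstraction here"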

I was a bit worried that in moving from Ansible 2 to Ansible 4 I was going to have to deal with a huge amount of breakage, but in the end it wasn’t too bad at all. Most stuff still works, and I was able to do almost everything I need using the ansible.builtin collection alone (only dipping into the ansible.posix collection for one task on each host, concretely, using the ansible.posix.authorized_key module). I do find the whole collections thing to be unpleasantly over-engineered, and I wish I didn’t have to know that "Ansible Galaxy" was a thing, but in the end I was able to mostly pretend that Galaxy doesn’t exist, by adding the ansible.posix repo as a Git submodule checked out at vendor/ansible_collections/ansible/posix, and setting collections_paths = ./vendor in my ansible.cfg.
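
Spelled out, the Galaxy-avoidance trick amounts to one line of configuration pointing at the vendored checkout (paths exactly as described above):

    # ansible.cfg: look for collections under ./vendor, where the ansible.posix
    # submodule is checked out at vendor/ansible_collections/ansible/posix.
    [defaults]
    collections_paths = ./vendor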

A similar dance with Python, moving from virtualenv (a separate tool) to venv (bundled with Python) for creating a sandbox environment, allowed me to use the aws-cli tool from a submodule without having to reach out over the network with pip every time I wanted to do something. I still wish that isolation and reproducibility were easier to achieve in the Python ecosystem (and maybe it is, for experts), but I was able to get done what I needed to do, in the end.
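
For anyone curious, the dance boils down to something like the following; the submodule path is an assumption on my part, and pip still hits the network once at install time to pull in dependencies:

    # Create a sandbox using the venv module that ships with Python 3
    # (no separate virtualenv install required).
    python3 -m venv .venv

    # Install the AWS CLI from the local submodule checkout (hypothetical path).
    .venv/bin/pip install ./vendor/aws-cli

    # From then on, invoke the CLI from inside the sandbox; pip is out of the picture.
    .venv/bin/aws --version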

And with that, my migration from a pair of trusty EC2 instances that had been launched all the way back in 2015 comes to a close. We’ll see whether their 2021 successors also last nearly 6 years, and whether the move to "Amazon Linux 3" ends up being any more straightforward thanks to the simplification and updates I’ve undertaken now. Hopefully, major system components like systemd and yum will still be there, so the next update will be a breeze.