A recent Hacker News thread, "Ask HN: Best way to host a website for 500 years?", has prompted me to dust off this article, which has been sitting unfinished in my drafts folder for about four years now (apart from an offshoot piece, "The 100-year challenge", which is about how hard it is to make anything last over really long time scales). The basic question posed in the thread is this:
Say you wanted to host a personal page that can outlive you and be seen by the children of your grandchildren. Other than asking your progeny to keep paying the hosting bills, is there another way?
The thread devolves into two main categories of comments:
- Proposed technical solutions; or
- Observations that it’s unlikely that anybody is going to care 500 years from now about anything that you wrote.
You can see plenty of evidence in the HN thread that the first category is difficult, whether the obstacle is the technology itself, the need to set up durable legal structures, or the cost.
As far as the second category goes, just look at the growth of the web:
By any metric you might care to measure (sites, pages, users, etc.), the amount of "stuff" on the internet, and the number of people producing it, is growing exponentially. It’s difficult enough to attract attention in the present day; exponential growth makes the odds of producing something that will be considered noteworthy 500 years from today vanishingly small.
But all this is very abstract. I can make this much more concrete by looking at how well I’ve done at preserving data over my own puny lifespan.
- First up, on technical difficulty: I’ve tried hard to keep all the email I ever sent or received, and failed impressively, despite being obsessive about backups. After numerous lossy migrations between applications, providers, and hosts, the oldest emails I can find date back about 15 years, which means I’ve straight up lost my first 10 or so years of email.
- Second, on noteworthiness: You would think I’d be interested in my own photos, but the sad truth is that even a "modestly sized" collection like mine (43.5K photos) is lethargy-inducingly large. I occasionally dig back through it looking for specific things, but computer-assisted search still ain’t that great despite our advances in machine learning. It’s hard for me to imagine my heirs having enough interest to trawl through all this stuff, given everything else that will be competing for their attention in the future.
All of this leads me to conclude that while it is a fun intellectual exercise to think about how to make one’s data as durable as possible, one shouldn’t be deceived into thinking that any of this stuff has transcendental significance. None of it really matters enough to warrant anything more than a "reasonable best effort" at keeping things preserved, and one that doesn’t distract us too much as we head outside to go and smell the roses.