AI

Once in a while I take a stab at a big, uncertain topic — like COVID or Bitcoin — as a way of recording a snapshot of my thinking. Now it’s time to do the same for AI (Artificial Intelligence), in a post that will surely be massively out-of-date almost as soon as I’ve published it. Though it may be a doomed enterprise, I still want to do this, if nothing else because my own job as a software engineer is among those most likely to be dramatically affected by the rise of AI. And while I could go on about image and video generation or any of a number of other applications of the new wave of AI products, I’m mostly going to focus on the area that is currently most relevant to the business of software engineering; that is, LLMs (Large Language Models) as applied to tasks in and around software development.

The current state of AI in software development

In the world of programming, LLMs are being crammed and wedged into every available gap. I say "crammed" because the textual, conversational model doesn’t always feel like a natural fit within our existing user interfaces. Products like GitHub Copilot seek to make the interaction as natural as possible — for example, proposing completions when you do something like type a comment describing what the code should do — but fundamentally the LLM paradigm imposes a turn-based, conversational interaction pattern. You ask for something by constructing a prompt, and the LLM provides a (hopefully) reasonable continuation. In various places you see products trying to make this interaction seem more seamless and less turn-like — sometimes the AI agent is hidden behind a button, a menu, or a keyboard shortcut — but I generally find these attempts clumsy and intrusive.

And how good is this state of affairs? At the time of writing, the answer is "it depends". There are times when it can produce appallingly buggy but reasonable-seeming code (note: humans can do this too), and others where it knocks out exactly what you would have written yourself, given enough time. Use cases that have felt anywhere from "good" to "great" for me have been things like:

  1. Low-stakes stuff like Bash and Zsh scripts for local development. Shell scripts that run locally, using trusted input only, doing not-mission-critical things. Shells have all sorts of esoteric features and hard-to-remember syntax that an LLM can generally churn out quite rapidly; and even if it doesn’t work, the code it gives you is often close enough that it can give you an idea of what to do, or a hint about what part of the manual page you should be reading to find out about, say, a particular parameter expansion feature. The conversational model lends itself well to clarifying questions too. You might ask it to give you the incantation needed for your fancy shell prompt, and when it gives you something that looks indistinguishable from random noise, you can ask it to explain each part.
  2. React components. Once again, for low-stakes things (side-projects, for example), the LLM is going to do just fine here. I remember using an LLM after a period of many months of not doing React, and it helped me rapidly flesh out things like Error Boundary components that I would otherwise have had to read up on in order to refresh my memory.
  3. Dream interpretation. Ok, so I snuck in a non-programming use case. If you’ve ever had a weird dream and asked Google for help interpreting it, you’ll find yourself with nothing more than a bunch of links to low-quality "listicles" and SEO-motivated goop that you’ll have to wade into like a swamp, with little hope of actually coming out with useful answers; ask an LLM on the other hand, and you’ll obtain directed, on-point answers of a calibre equal to that of an experienced charlatan professional dream interpreter.
  4. Writing tests. Tests are often tedious things filled with painful boilerplate, but you want them to be that way (ie. if they fail, you want to be able to jump straight to the failing test and read it straightforwardly from top to bottom, as opposed to having to jump through hoops reverse-engineering layers of cleverness and indirection). An LLM is good for churning out these things, and the risk of it hallucinating and producing something that doesn’t actually verify the correct behavior is far more benign than a comparable flaw making it into the implementation code that’s going to run in production. The bar is lower here because humans are at least as capable of writing bad tests as LLMs are. This is probably because it’s harder to ship a flagrant but undetected implementation bug: if anybody actually uses the software, the bug will be flushed out in short order; on the other hand, all manner of disgusting tests can get shipped and live on for extended periods in a test suite as long as they remain green. We’ve all seen ostensibly green tests that ended up verifying the wrong behavior, not verifying anything meaningful at all, or being mere facsimiles of the form and structure of the thing they purport to test, but utterly failing to express, exercise, specify, or constrain the expected behavior.

But it’s not all roses. One of the problems with LLMs is that they’re only as good as the data used to train them. So, given a huge corpus of code written by humans (code with bugs), it’s only to be expected that LLM code can be buggy too. The dark art of tuning models can only get you so far, and curating the training data is hard to scale up without a kind of chicken-and-egg problem in which you rely on (untrustworthy) AI to select the best training material to feed into your AI model. In my first experiences with LLMs, I found they had two main failure modes: one was producing something that looked reasonable, appeared to be what I asked for, and was indeed "correct", but was subtly ill-suited for the task; the other was producing code that again had the right shape I’d expect to see in a solution, but which actually had some fatal bug or flaw (ie. was objectively "incorrect"). This means you have to be skeptical of everything that comes out of an LLM; just because the tool seemed "confident" about it is no guarantee of it actually being any good! And as anybody who has interacted with an LLM has seen, the apparent confidence with which they answer your questions is the flimsiest of veneers, rapidly blown away by the slightest puff of questioning air:

Programmer: Give me a function that sorts this list in descending order, lexicographically and case-insensitively.

Copilot: Sure thing, the function you ask for can be composed of the following elements… (shows and explains function in great detail).

Programmer: This function sorts the list in ascending order.

Copilot: Oh yes, that is correct. My apologies for farting out broken garbage like that. To correct the function, we must do the following…
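
(To make concrete what was asked for: in Lua, picking one language purely for illustration, the corrected comparator is a one-liner, and the bug in the first attempt amounts to having the comparison backwards.)

    -- `list` is assumed to be a table of strings.
    -- Sort descending, lexicographically, ignoring case.
    table.sort(list, function(a, b)
      return a:lower() > b:lower() -- ">" gives descending; "<" would give ascending
    end)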

In practice, the double-edged sword of current LLMs means that I mostly don’t use tools like GitHub Copilot in my day-to-day work, but I do make light use of ChatGPT, as I described in a recent YouTube video. As I’ve hinted at already, I’m more likely to use LLMs for low-stakes things (local scripts, tests), and only ever as scaffolding that I then scrutinize as closely as or more closely than I would code from a human colleague. Sadly, when I observe my own colleagues’ usage of Copilot I see that not everybody shares my cautious skepticism; some people are wary of the quality of LLM-generated code and vet it carefully, but others gushingly accept whatever reasonable-seeming hallucination it sharts out.

One thing I’m all too keenly aware of right now is that my approach to code review will need to change. When I look at a PR, I still look at it with the eyes of a human who thinks they are reading code written by another human. I allow all sorts of circumstantial factors to influence my level of attention (who wrote the code? what do I know about their strengths, weaknesses, and goals? etc), and I rarely stop to realize that some or all of what I’m reviewing may actually have been churned out by a machine. I’m sure this awareness will come naturally over time, but for now it takes a conscious effort to maintain.

Am I worried about losing my job?

I’m notoriously bad at predicting the future, but it seems it would be remiss of me not to at least contemplate the possibility of workforce reductions in the face of the rise of the AI juggernaut. I don’t think any LLM currently can consistently produce the kind of results I’d expect of a skilled colleague, but it’s certainly possible that that could change within a relatively short time-scale. It seems that right now the prudent course is to judiciously use AI to get your job done faster, allowing you to focus on the parts where you can clearly add more value than the machine can.

At the moment, LLMs are nowhere near being able to do the hard parts of my job, precisely because those parts require me to keep and access a huge amount of context that is not readily accessible to the machine itself. In my daily work, I routinely have to analyze and understand information coming from local sources (source code files, diffs) and other sources spread out spatially and temporally across Git repos (commit messages from different points in history, files spread across repositories and organizations), pull requests, issues, Google Docs, Slack conversations, documentation, lore, and many other places. But it’s only a matter of time before we’re able to provide our LLMs with enough of that context for them to become competitive with a competent human when it comes to those tricky bug fixes, nuanced feature decisions, and cross-cutting changes that require awareness not just of code but also of how distributed systems, teams, and processes are structured.

It’s quite possible that, as with other forms of automation, AI will displace humans when it comes to the low-level tasks, but leave room "up top" for human decision-makers to specialize in high-leverage activities. That is, humans getting the machines to do their bidding, or using the machines to imbue themselves with apparent "superpowers" to get stuff done more quickly. Does this mean that the number of programming jobs will go down? Or that we’ll just find new things — or harder things — to build with all that new capacity? Will it change the job market, compensation levels, supply and demand? I don’t have the answer to any of those questions, but it makes sense to remain alert and seek to keep pace with developments so as not to be left behind.

Where will this take us all?

There have been some Twitter memes going around about how AI is capable of churning out essentially unreadable code, and how we may find ourselves in a future where we no longer understand the systems that we maintain. To an extent, it’s already true that we have systems large and complicated enough that they are impossible for any one person to understand exhaustively, but AI might be able to build something considerably worse: code that compiles and apparently behaves as desired but is nevertheless not even readable at the local level, when parts of it are examined in isolation. Imagine a future where, in the same way that we don’t really know how LLMs "think", they write software systems for us that we also can’t explain. I don’t really want to live in a world like that (too scary), although it may be that that way lies the path to game-changing achievements like faster-than-light travel, usable fusion energy, room-temperature superconductors and so on. I think that at least in the short term we humans have to impose the discipline required to ensure that LLMs are used for "good", in the sense of producing readable, maintainable code. The end goal should be that LLMs help us to write the best software that we can, the kind of software we’d expect an expert human practitioner to produce. I am in no hurry to rush forwards into a brave new world where genius machines spit out magical software objects that I can’t pull apart, understand, or aspire to build myself.

The other thing I am worried about is what’s going to happen once the volume of published code produced by LLMs exceeds that produced by humans, especially given that we don’t have a good way of indicating the provenance of any particular piece — everything is becoming increasingly mixed up, and it is probably already too late to hope to rigorously label it all. I honestly don’t know how we’ll train models to produce "code that does X" once our training data becomes dominated by machine-generated examples of "code that does X". The possibility that we might converge inescapably on suboptimal implementations is just as concerning as the contrary possibility (that we might see convergence in the direction of ever greater quality and perfection) is exciting. There could well be an inflection point somewhere up ahead, if not a singularity, beyond which all hope of making useful predictions breaks down.

Where would this take us all in an ideal world?

At the moment, I see LLMs being used for many programming-adjacent applications; for example, AI summarization. There is something about these summaries that drains my soul. They end up being so pedestrian, so bland. I would rather read a thoughtful PR description written by a human than a mind-numbingly plain AI summary any day. Yet, in the mad rush to lead the race into the new frontier lands, companies are ramming things like summarization tools down our throats with the promise of productivity, in the hope of becoming winners in the AI gold rush.

Sadly, I don’t think the forces of free-market capitalism are going to drive AI towards the kinds of applications I really want, at least not in the short term, but here is a little wish list:

  • I’d like the autocomplete on my phone to be actually useful as opposed to excruciating. Relatedly, I’d like speech-to-text to be at least as good at hearing what I’m saying as a human listener. Even after all these years, our existing implementations feel like they’ve reached some kind of local maximum beyond which progress is exponentially harder. 99% of all messages I type on my phone require me to backspace and correct at least once. As things currently stand, I can’t imagine ever trusting a speech-to-text artifact without carefully reviewing it.
  • Instead of a web populated with unbounded expanses of soulless, AI-generated fluff, I want a search engine that can guide me towards the very best human-generated content. Instead of a dull AI summary, I’d like an AI that found, arranged, and quoted the best human content for me, in the same way a scholar or a librarian might curate the best academic source material.
  • If I must have an AI pair-programmer, I’d want it to be a whole lot more like a skilled colleague than things like Copilot currently are. Right now they feel like a student who is trying to game the system, producing answers that will earn the necessary marks rather than thinking deeply about, and caring about, producing the right answer[1].
  • AI can be useful not just for guiding one towards the best information on the public internet. Even on my personal computing device, I already have an unmanageably large quantity of data. Consider, for example, the 50,000 photos I have on my laptop, taken over the last 20 years. I’d like a trustworthy ally that I can rely on to sort and classify these; not the relatively superficial things like face detection that software has been able to do for a while now, but something capable of reliably doing things like "thinning" the photo library guided only by vague instructions like "reduce the amount of near-duplication in here by identifying groups of similar photos taken around the same time and place, and keep the best ones, discarding the others". Basically, the kind of careful sorting you could do yourself if only you had a spare few dozen hours and the patience and resolve to actually get through it all.

I’m bracing myself for a period of intensive upheaval, and I’m not necessarily expecting any of this transformation to lead humanity into an actually-better place. Will AI make us happier? I’m not holding my breath. I’d give this an excitement score of 4 out of 10. For comparison, my feelings around the birth of personal computing (say, in the 1980s) were a 10 out of 10, and the mainstream arrival of the internet (the 1990s) were a 9 out of 10. But to end on a positive note, I will say that we’ll probably continue to have some beautiful, monumental, human-made software achievements to be proud of and to continue using into the foreseeable future (that is, during my lifetime): things like Git, for example. I’m going to cherish those while I still can.


  1. And yes, I know I’m anthropomorphizing AI agents by using words like "thinking". At the moment we have only a primitive understanding of how consciousness works, but it seems clear to me that, within a finite timespan, machines will pass all the tests that we might subject them to in order to determine whether they are conscious. At that point, the distinction becomes meaningless: what is consciousness? It’s that thing that agents who appear to have consciousness have. ↩︎

25 years and counting

I’ve been publishing writing on the web for almost 25 years now — at least, the oldest snapshot I can find for a website of mine dates back to December 3, 1998 (it’s possible I published this even earlier, because the footer notes that I wrote the article in "November 1997"). I look back at those attempts at academic writing by my 22-year-old self and sometimes have to grimace at how strained the wording is, but I don’t feel that bad about it. I graduated in the end, after all.

Lately, life has been too busy to write anything on here. I have a number of topics I’d like to dip into, but I can’t do them justice in the time I’m willing and able to allocate them. So for now, this will have to do.

Command-T v6.0 — the Lua rewrite

For a while now I’ve wanted to do a ground-up rewrite of Command-T in Lua. After sitting on the back-burner for many months, I finally got around to doing some work on it. While the rewrite isn’t done yet, it is so close to being an "MVP"[1] now that I can talk about the new version without worrying too much about the risk of it being vaporware. So, let’s start.

History

This isn’t the first time I’ve written about Command-T on this blog. Back in 2016 I wrote about how I’d been optimizing the project over many years. As that post recounts, ever since I created Command-T in 2010, its primary goal has been to be the fastest fuzzy finder out there. Over the years, I’ve found many wins both small and large which have had a compounding effect. If you make something 10% faster, then 10% more, then you find a way to make it 2x faster than that, and then you find a way to make it 10x faster than that, the end result winds up being "ludicrously" fast. At the time I wrote the optimization post, some of the major wins included:

  • Writing the performance-critical sections (the matching and scoring code) in C.
  • Improving perceived performance, somewhat counterintuitively, by spending extra cycles exhaustively computing possible match scores, so that the results the user is searching for are more likely to appear at the top.
  • Memoizing intermediate results, to make the aforementioned "exhaustive computation" actually feasible.
  • Parallelizing the search across multiple threads.
  • Debouncing user input to improve UI responsiveness by avoiding wasteful computation.
  • Improving scanning speed (ie. finding candidate items to be searched) by delegating it to fast native executables (like find or git).
  • Avoiding scanning costs by querying an always up-to-date index provided by Watchman.
  • Reducing the cost of talking to Watchman by implementing support for its BSER (Binary Serialization Protocol) in C, rather than dealing with JSON.
  • Prescanning candidates to quickly eliminate non-matches; during this pre-scan, record the rightmost possible location for each character in the search term, which allows us to bail out early during the real matching process when a candidate can’t possibly match.
  • Recording bitmasks for both candidates and search terms so that we can quickly discard non-matches as users extend their search terms (the idea is sketched in code after this list).
  • Using smaller/faster data types (eg. float instead of double) where the additional precision isn’t beneficial.
  • Using small, size-limited heap data structures for each thread, keeping small partial result sets ordered as we go rather than needing a big and expensive sort over the entire result set at the end.
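
To give a feel for how cheap some of these wins are, here is a minimal Lua sketch of the bitmask trick from the list above. It is an illustration of the idea only (the real implementation lives in C inside Command-T), and the function names here are mine, not the project’s:

    -- Compute a 26-bit mask recording which letters appear in a string.
    local bit = require("bit") -- the LuaJIT "bit" library bundled with Neovim

    local function letter_mask(s)
      local mask = 0
      local lowered = s:lower()
      for i = 1, #lowered do
        local b = string.byte(lowered, i)
        if b >= 97 and b <= 122 then -- 'a' through 'z'
          mask = bit.bor(mask, bit.lshift(1, b - 97))
        end
      end
      return mask
    end

    -- A candidate can only match if it contains every letter in the term,
    -- so a single AND lets us reject most non-matches before scoring them.
    local function might_match(candidate_mask, term)
      local term_mask = letter_mask(term)
      return bit.band(candidate_mask, term_mask) == term_mask
    end

Masks get computed once per candidate and reused as the user extends the search term, which is what makes the rejection step effectively free.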

After all that, I was running out of ideas, short of porting bits of the C code selectively into assembly (and even then, I was doubtful I could hand-craft assembly that would be better than what the compiler would produce). There was one PR proposing switching to a trie data structure, which would allow the search space to be pruned much more aggressively, but at the cost of having to set up the structure in the first place; in the end that one remained forever in limbo because it wasn’t clear whether it actually would be a win across the board.

Why rewrite in Lua?

Neovim comes with Lua (or more precisely, LuaJIT), which is well known for being speedy. It’s an extremely minimal language that optimizes well. I previously saw huge wins from porting the Corpus plug-in from Vimscript to Lua (GIF demo). While I wasn’t planning on throwing away my C code and rewriting it in Lua, I could throw out a bunch of Ruby code — mostly responsible for managing the UI — and rewrite that. This, combined with the fact that Neovim now offers neat APIs for doing things like floating windows, means that a Lua-powered rewrite could be expected to have a much snappier UI.

The reason Command-T had Ruby code in it is that, in 2010, it was the easiest way to package C code in a form that could be accessed from Vim. You build a C extension that integrates with the Ruby VM (ie. you can call C functions to do things like create and manipulate arrays, access hashes, raise exceptions, call Ruby methods, and so on), and then you can call into the Ruby code from Vimscript. There is overhead in moving from Vimscript through Ruby into C and back again, but because most of the heavy lifting is done in C-land — the actual work of trawling through thousands or even millions of string bytes and producing scores for them — it ends up being blazingly fast compared to a native Vimscript or pure Ruby implementation.

The other nice thing about Ruby is that it is a "real" programming language, unlike Vimscript, which is a bespoke and idiosyncratic beast that you can use in exactly one place[2]. If you need a working Ruby layer in Vim just to get at the C code, you may as well leverage that Ruby layer once you have it. That gives you access to niceties like object-orientation, modules, and a relatively flexible and extensible programming model that allows you to write expressive, readable code.

The downside to all this is that Ruby installations are notoriously fragile inside Vim as soon as you start involving C code. You must compile the Command-T C extension with exactly the same version of Ruby as Vim itself uses. The slightest discrepancy will crash the program. In a world where people are on an eternal operating system upgrade train, constantly updating their editor with tools like Homebrew, and endlessly playing with Ruby versions via tools like RVM, rbenv, and chruby — not even a complete list, by the way — you wind up with an incredibly fragile and unstable platform upon which to build. Over the years I have received countless reports about "bugs" in Command-T that were actually failures to install it correctly. A glance through the closed issues on the Command-T issue tracker reveals dozens of reports of this kind; command-t#341 is a representative example. The basic formula is:

I can’t get Command-T to work (or it stopped working)…

(Various installation commands are run or re-run…)

I got it working in the end.

(Issue gets closed with no code changes being committed.)

This alone is probably the main reason why I have never heavily promoted Command-T. Over the years there have been other fuzzy finders that have more features, or are more popular, but none with performance that scales to working on repositories with millions of files, and none which provide such a robust and intuitive ranking of match results. Those are the features that I still care about the most to this day, and that’s why I keep on using Command-T. But I don’t want to actually promote it, nor do I want to keep adding on features to attract new users, because I know that the bigger the user base, the more support tickets related to version mismatches, and the more hair ripped out from frustrated scalps across the globe. So, I continue on, quietly using Neovim and Command-T to get my job done, and I don’t twiddle my editor versions or my Ruby version unless there’s good reason to.

At one point, I was considering a way out of this in the form of running the Ruby code outside of the Vim process itself. The idea was to run a commandtd daemon process and communicate with it using Vim’s job APIs. This would totally decouple the version of Ruby used by Vim from the version used by the daemon, and "solve" the installation woes once and for all. Users would still need to run a make command to build the daemon, but at least they could forget about versions. In the end, I didn’t pursue this idea to its conclusion: I didn’t like the complexity of having to manage a separate process, and I worried about the overhead of sending data back and forth via IPC. Finally, I figured that if I could just access the C code from Lua instead of Ruby, then I might be able to side-step my Ruby headaches.

So, I thought, let’s make a clean break. I’ll drop the Ruby requirement, and move wholesale over to Lua and Neovim (I’ve been using Neovim myself full-time now for about 5 years, if the first traces of evidence in my dotfiles repo are to be believed). Forget about Vim support, forget about Windows, and just go all-in on modern APIs. The nature of Git branches means that anybody wanting to continue using Vim or Windows or Ruby can do so just by pointing their plug-in manager or their Git submodule at the right branch; in the meantime, I’m going to ride off into a brave new world.

A huge amount of the Ruby code in Command-T is about managing windows, splits, buffers, and settings. Back in 2010 nobody had dreamed of putting floating windows inside Vim, so if you wanted to present a "UI" to the user you had to fake it. Command-T did this, basically, by:

  • Recording the position of all windows and splits.
  • Remembering the values of global settings that need to be manipulated in order to get the "UI" to behave as desired.
  • Creating a new buffer and window for showing the match listing.
  • Setting up global overrides as needed, along with other local settings.
  • Setting up mappings to intercept key presses; the "prompt" was actually just text rendered in Vim’s command line.
  • Cleaning up the prompt area after a file is selected, removing the match listing, restoring the global settings, and reestablishing the former geometry of windows and splits.

The code worked remarkably well because it was the product of extreme attention to detail and relentless refinement over the years. But it was an enormous hack, and it was incredibly ugly and annoying to maintain. In comparison, throwing up a floating window with the new APIs is an absolute breeze. No need to think about window geometry, no need to set up mappings, no need to construct an elaborate fake prompt. The importance of having a real prompt is not to be understated: with the old approach, Command-T couldn’t even support extremely natural things like the ability to paste a search query in a uniform and reliable way; with a real prompt, we get that "for free", along with all of the standard Vim motions and editing bindings.
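
To show just how little ceremony is involved, here is a minimal sketch of the kind of calls the new UI can be built on; the layout numbers and option choices are illustrative rather than what Command-T actually ships:

    -- Create a scratch buffer and float it over the editor.
    local buf = vim.api.nvim_create_buf(false, true) -- unlisted, scratch
    local win = vim.api.nvim_open_win(buf, true, {
      relative = "editor",
      width = math.floor(vim.o.columns * 0.6),
      height = 10,
      row = 1,
      col = math.floor(vim.o.columns * 0.2),
      style = "minimal",
      border = "single",
    }) -- keep `win` around so it can be closed later with nvim_win_close()

    -- A real prompt buffer gives us paste, motions, and editing bindings for free.
    vim.bo[buf].buftype = "prompt"
    vim.fn.prompt_setprompt(buf, "> ")

Everything the old implementation painstakingly faked is either unnecessary now or comes built in.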

Other wins

One thing about a clean rewrite is that it gives you a chance to reevaluate technical decisions. There are two examples that I’d like to highlight here.

The first is that I turned the C library from a piece of "Ruby-infested" C (that is, C code littered with calls to Ruby VM functions and using Ruby-defined data structures; example matcher.c) to a pure POSIX one (matcher.c). There is no mention of Lua in the C library, which means that any Ruby-VM-related overhead is gone now, replaced by nothing, and the library can be cleanly used from more places in the future, should people wish to do so. In the past, I extracted Command-T’s fast scoring algorithm into a Python package (still C, but adapted for the Python runtime instead of the Ruby one). Doing that was fiddly. With the new, pure POSIX library, grabbing the code and wrapping it up for any language would be a whole lot easier. Pleasingly, this new version is about 2x faster in benchmarks than the old one, which is pretty amazing considering how fast the old one was; maybe the Ruby-related overhead was more than I’d thought, or perhaps the LuaJIT FFI is unexpectedly awesome… And naturally, on revisiting code that had been iterated on for over a decade, and reworking it profoundly, I took advantage of the opportunity to improve readability, naming, structure, and a bunch of other things that you might classify under "spring cleaning". I also implemented some fast C-powered scanning functionality that had been proposed for the old version but never merged due to some doubts about performance. Overall, the C code is in much better shape.
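
As a rough illustration of why the FFI route is attractive, here is a hedged sketch of how a pure POSIX C function might be bound from LuaJIT; the function name, signature, and library name here are hypothetical, not Command-T’s actual interface:

    local ffi = require("ffi")

    -- Declare the C function we want to call (hypothetical signature).
    ffi.cdef([[
      float commandt_score(const char *candidate, const char *term, bool case_sensitive);
    ]])

    -- Load the shared library (assumes a libcommandt.so/.dylib on the search path).
    local lib = ffi.load("commandt")

    local function score(candidate, term)
      return lib.commandt_score(candidate, term, false)
    end

    print(score("lua/commandt/init.lua", "cmdt"))

There is no Ruby VM anywhere in that path: strings cross the boundary directly, and LuaJIT compiles the call site down to something very close to a plain C function call.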

The other aspect I noticed was the effect of moving from heavily object-oriented Ruby idioms to light-weight Lua ones. Lua mostly favors a functional style, but it does provide patterns for doing a form of object-oriented programming. Nevertheless, because OOP is not the default, I’ve found myself using it only when the use-case for it is strong; that basically means places where you want to encapsulate some data and some methods for acting on it, but you don’t need complex inheritance relationships or "mixins" or any other such fanciness. The Ruby code is probably more legible — Ruby is famously readable, after all, if you don’t go crazy with your metaprogramming — but there is so much less Lua code than there was Ruby code that I think the overall result is more intelligible. The other thing is that when I wrote Command-T in 2010, I was coming from Apple’s Objective-C ecosystem, and Rails too, both of which had their own spins on the "MVC" (Model-View-Controller) pattern, and which influenced the architecture. In 2022, however, I have the influence of React and its notion of "unidirectional data flow" to guide me whenever I have a question about where a particular piece of data should live, who should own it, and how updates to it should be propagated to other interested parties within the system. Overall, things seem clearer. My work-in-progress is still at a very "pre-alpha" stage, but I’m confident that the end result will be more robust than ever.
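
For the handful of places where encapsulation does pay its way, the pattern in question is Lua’s usual metatable-based one. This is a generic sketch (the "Prompt" name and its fields are illustrative, not necessarily a real Command-T module):

    local Prompt = {}
    Prompt.__index = Prompt

    -- Constructor: a table carrying state, with methods looked up via the metatable.
    function Prompt.new(options)
      local self = setmetatable({}, Prompt)
      self.query = options.query or ""
      self.on_change = options.on_change -- callback; data flows one way, outwards
      return self
    end

    function Prompt:insert(text)
      self.query = self.query .. text
      if self.on_change then
        self.on_change(self.query)
      end
    end

    return Prompt -- (as the last line of a module file)

No inheritance, no mixins: just data plus the functions that operate on it, with changes propagated outwards through an explicit callback, in keeping with the unidirectional flow mentioned above.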

It’s sometimes tempting to look at a rewrite and marvel, prematurely, at how much better and lighter it is. Think of it as the "Bucket Full of Stones v1.0", long creaking under the weight of all the stones inside it. You start afresh with "Bucket Full of Stones v2.0" and are amazed at how light and manoeuvrable the whole thing feels without any stones in it. As you add back stone after stone, it still feels pretty light, but eventually you discover that your bucket is as full as ever, and maybe it’s time to start thinking about "Bucket Full of Stones v3.0". Nevertheless, I still feel pretty good about the rewrite so far. It is much smaller in part because it only has a subset of the features, but the foundations really do look to be more solid this time around.

The upgrade path

This is where things get tricky. The Vim ecosystem encourages people to install plug-ins using plug-in managers that clone plug-in source from repositories. Users tend to track the main or master branch, so version numbers, SemVer, and the very concept of "releases" lose significance. You can maintain a changelog, but users might not even see it. In this scenario, how do you communicate breaking changes to users? Sadly, the most common answer seems to be, "You break their shit and let them figure it out for themselves". The other answer, and I think the right one, is that you simply don’t make breaking changes at all, ever, if you can help it. Put another way, as a maintainer, ya gotta do some hoop-jumping to avoid user pain[3]. Command-T is not the Linux kernel, but it stands to learn a lesson from it, about not breaking "user space".

My current plans for how to do this release with a minimum of pain are as follows:

  • The new version, version 6.0, will effectively include both the old Ruby and the new Lua implementations.
  • If the user opts in to continuing with the Ruby version, everything continues as before. It may be that I never remove the Ruby implementation from the source tree, as the cost of keeping it there isn’t really significant in any way.
  • If the user opts in to using the Lua version, they get that instead (see the configuration sketch after this list). For example, a command like :CommandT will map to the Lua implementation. A command that is not yet implemented in the Lua version, like :CommandTMRU, continues to map onto the Ruby implementation, for now. If you ever need to fall back and use the Ruby implementation, you can do that by spelling the command with a K instead of a C; that is, :CommandTBuffer will open the Lua-powered buffer finder, but :KommandTBuffer can be used to open the Ruby one.
  • If the user doesn’t explicitly opt in one way or another, the system will use the Ruby implementation and show a message prompting the user to make a decision; technically this message is the breaking change (a new message that will bother the user at startup until they take the step of configuring a preference) that requires the bump in version number to v6. As far as breaking changes go, this is about as innocuous as they come, but it is still one that I make reluctantly.
  • In version 7.0, this default will flip over in the opposite direction: if you haven’t specified an explicit preference, you’ll get the Lua version. By this time, however, I expect pretty much everybody actively using Command-T will already have set their preference. In 7.0 the aliased versions of the commands (eg. :KommandT) will go away.
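
For concreteness, opting in would look something like the following in an init.lua; the exact variable name is an assumption on my part and may differ in the shipped release:

    -- Hypothetical preference flag; set it before the plug-in loads.
    vim.g.CommandTPreferredImplementation = "lua"

People still configuring Neovim with Vimscript would use the equivalent let statement in an init.vim.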

A couple of things to note about this plan:

  1. All of the above applies on Neovim; if you’re running Vim you aren’t eligible to use the Lua backend, so you won’t see the deprecation prompt, and you’ll continue to use the Ruby version transparently.
  2. Maintaining two parallel implementations "forever" is only feasible because this is a hard fork. That is, there is no commitment to having an equal feature set in both implementations, having or fixing the same bugs, or even having the same settings. The Ruby backend, as a mature 12-year-old project, is mostly "done" at this point and I doubt I’ll do much more than fix critical bugs from here on. People who don’t want any part of all this can point their checkout at the 5-x-release branch and pretend none of this is happening. As an open source project, people are free to contribute pull requests, make a fork, or do whatever they see fit within the terms of the license.

How will all of this work? We’ll see. Last night I published v5.0.5, which may be the last release on that branch for a long while. As I write this, main doesn’t have any of the new stuff yet (currently, 81dba1e274) — the new stuff is all still sitting out on the pu (proposed updates) branch (currently, 9a4cbf954c). My plan is to keep baking that for a little while longer — a timespan probably measured in hours or days, but probably not in weeks or months — and then pull the trigger and merge it into main, at which point we’ll call it the "6.0.0-a.0" release. As I said above, this feels real close to being an MVP now, so it hopefully won’t be long.


  1. I’m defining "MVP" (Minimum Viable Product) here as having the subset of features that I use heavily on a daily basis: a file finder, a buffer finder, and a "help" finder (for searching the built-in Neovim :help). ↩︎

  2. In Vim, that is. Two places, if you count Neovim. ↩︎

  3. To make this more precise: users come first, so you default to hoop-jumping if necessary to avoid user pain; the only reason you might relent and actually break something is if the cost of hoop-jumping becomes so prohibitively high that it dissuades you from working on the project at all. ↩︎