With the recent announcement of Devin by Cognition Labs, an “Autonomous AI Engineer” sporting a name that screams entitlement and overconfidence, I feel it’s time to re-assert one of my core beliefs.
Right now, AI applications should be focused on enhancement instead of replacement.
It’s as if we’ve learned nothing from the decade-long discourse on autonomous vehicles, which are always “right around the corner”. And while this white whale has been left to a few outliers, the rest of the industry is already progressing towards full autonomy one level at a time, without trying to skip steps.
Today, level 2 vehicle autonomy is already widespread, and some manufacturers are starting to roll out level 3 under certain conditions. But we are far from the fabled level 5, promised on unrealistic timelines by some.
And with the recent announcements, software engineering appears to be in a similar bind. This prompted me to take the time to try and apply this six-level framework, validated by a century-old industry, to my favorite field.
The 6 levels of autonomous software development
Level 0: Coding in a text editor. The environment doesn't provide any assistance.
Level 1: Coding in an IDE with auto-complete and linting enabled. The environment provides deterministic assistance and guardrails in well-defined situations.
Level 2: Coding with GitHub Copilot or Supermaven enabled. Your IDE is now able to help you code on well-known tasks (like highways for cars), all while taking into account a small context window.
Level 3: Automating bigger tasks across multiple files. Something that you can start to achieve by mixing multiple infrastructure layers. I've already had success on some of these tasks by leveraging the OpenAI Assistants API, which provides chain-of-thought capabilities on top of custom tool calling.
Level 4: The software environment is fully autonomous, but its user can intervene at any time and correct the course of action. The "developer" (whatever the role will actually be called at this stage) is also able to revert to any previous level of autonomy whenever they wish.
Level 5: Set a destination through prompt and let the AI do the driving.
This is what projects like AutoGPT or Devin have been trying to do.
Unfortunately, aside from impressive-looking demos, their results mostly feel underwhelming to an expert eye. That's because these products haven't mastered the previous levels and are building on top of tools which aren't even able to get them there.
Enabling Level 5 requires a high level of accuracy in the final result under any situation.
What does level 4 look like?
As you may have noticed, I haven't mentioned any level 4 projects but multiple level 5 ones. That's simply because I don't know of any, and for a good reason.
Level 4 is incredibly difficult to both conceptualize and make. Imagining an autonomous agent is easy, but creating the means to take control at any time and change the configuration of every parameter of autonomy in real time is much harder. Much harder but necessary if we want to make the most out of those new systems.
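To make that idea a bit more concrete, here is a minimal sketch of what "changing the level of autonomy in real time" could look like in code. Everything in it is an assumption made for illustration: the `Autonomy` names, the `SteeringSession` class, and the per-level file limits are hypothetical and not taken from any existing tool. It only models the dial itself (who decides whether a step runs unattended), not a real agent.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Optional


class Autonomy(IntEnum):
    """The six levels described above (names are illustrative)."""
    TEXT_EDITOR = 0
    IDE_ASSIST = 1
    INLINE_COMPLETION = 2
    MULTI_FILE_TASKS = 3
    STEERABLE_AGENT = 4
    FULL_AUTONOMY = 5


# Hypothetical policy: how many files the AI may touch without asking.
# None means "no limit"; the numbers are arbitrary placeholders.
UNATTENDED_FILE_LIMIT: dict[Autonomy, Optional[int]] = {
    Autonomy.TEXT_EDITOR: 0,
    Autonomy.IDE_ASSIST: 0,
    Autonomy.INLINE_COMPLETION: 1,
    Autonomy.MULTI_FILE_TASKS: 5,
    Autonomy.STEERABLE_AGENT: None,
    Autonomy.FULL_AUTONOMY: None,
}


@dataclass
class Step:
    description: str
    files_touched: int


@dataclass
class SteeringSession:
    level: Autonomy = Autonomy.STEERABLE_AGENT
    log: list[str] = field(default_factory=list)

    def set_level(self, level: Autonomy) -> None:
        """The human can turn the dial up or down between any two steps."""
        self.log.append(f"level {int(self.level)} -> {int(level)}")
        self.level = level

    def needs_approval(self, step: Step) -> bool:
        """True when the step exceeds what the current level allows unattended."""
        limit = UNATTENDED_FILE_LIMIT[self.level]
        return limit is not None and step.files_touched > limit


def run(plan: list[Step], session: SteeringSession,
        interventions: dict[int, Autonomy]) -> list[tuple[str, str]]:
    """Walk the plan; `interventions` simulates the human changing the
    level right before the step with the given index."""
    outcome = []
    for index, step in enumerate(plan):
        if index in interventions:
            session.set_level(interventions[index])
        decision = "ask" if session.needs_approval(step) else "auto"
        outcome.append((step.description, decision))
    return outcome


plan = [
    Step("rename function", files_touched=3),
    Step("update tests", files_touched=2),
    Step("edit config", files_touched=1),
]
session = SteeringSession()
# The human drops from level 4 to level 2 just before the second step.
result = run(plan, session, {1: Autonomy.INLINE_COMPLETION})
```

In this toy run the first step goes through unattended at level 4, and once the human drops to level 2 the two-file step requires approval while the single-file one does not. The hard part of a real level 4 is everything this sketch leaves out: interrupting a step that is already in progress and exposing each parameter of autonomy, not just a single dial.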
And that’s what I call AI steering. And that’s also why those “fully autonomous” solutions are lazy, only chasing easy money from gullible investors and CEOs who drool at the idea of a cheap and silent workforce. Good thing I’m not the only one who wishes to preserve the fun in engineering.
Despite this, let’s end on a positive note, because people from adjacent domains are definitely smarter and making breakthroughs in their integration of AI:
Martin Nebelong's work on preserving artistic intent while leveraging image generation is impressive, and I strongly encourage you to follow him.
I was also wowed by the following demo from game developers. I love that the controller input perfectly illustrates my vision of Level 4 "AI steering".
And my goal is to share more of these novel practical usages as they come by, instead of focusing on the latest hype from a demo. So if you’ve heard of such interesting projects, make sure to reach out 🙏