Software Engineer Agents Are Coming — And It’s Not Optional
Companies that fail to adapt and integrate AI agents into their software engineering processes face an existential threat.
The new software engineer
Engineers, of course, have a soft spot for the idea of being replaced. Microsoft’s CTO, Meta’s CEO, and Anthropic’s CEO have all made bold predictions: AI will replace up to 90% of code and mid-level engineers in startlingly short timeframes. At first, that was hard to believe. Early coding assistants were clumsy, generating broken code for features they didn’t understand. But things improved fast.
Even so, the coding assistant paradigm hasn’t been convincing. Productivity gains are inconsistent, often negligible. One study from METR [1] even reported a 19% slowdown when engineers used assistants. Google’s own study [2] showed small improvements, up to a 21% reduction in task completion time, but none were statistically significant. In a space evolving weekly, results are impossible to generalize, and outcomes with assistants depend on many variables: novel interfaces, how we provide context, and our own expertise in using them. However, I believe those variables are not the real problem; assistants were always the wrong framing.
A couple of years ago, I recall thinking that fully spec-driven development would be the winning approach: focus humans on maintaining the core and the interfaces, and let AI regenerate the rest every time requirements change. Narrow context, tight constraints, maximum leverage. That still makes sense, but reality skipped ahead. The broad adoption of coding agents is relatively recent, and while they are subject to the same nuances as assistants, it is no longer merely possible that they will replace mid-level engineers; I believe it is the most likely scenario. The question isn’t whether this will happen, it’s how fast. And I am not referring to vibe-coding, but to production-grade software written mostly by agents such as Codex and the Claude CLI.
Not acknowledging that engineering work will look unrecognizable in ten years is naïve. The change is here, and it demands rapid experimentation with both organizational models and engineering processes.
The companies that figure this out will unleash engineers capable of achieving multiples of today’s output. Those that don’t will simply fall behind. Scary? Absolutely. Exciting? Without a doubt.
Short term, demand for junior talent will shrink; some companies will exploit agents just to cut headcount. But that is only half the story; there is a glass-half-full side. The productivity gains open doors to entirely new industries, and AI advances open others:
- Micro-scale, fully customized software. Software houses going after the long tail.
- Just-in-time features. First in software that is already customizable in precarious ways, such as Excel, Photoshop, etc.
- Self-healing systems. Starting with integrations, data processing, and extraction; later generalized through specialized frameworks.
- Assistants in all kinds of domains, and revolutions in interfaces.
- Even absurd-sounding things like “Artificial Resource departments” to keep the agents happy 😂
Alternatives may flood existing markets, built by much smaller teams or even solo developers. This echoes what happened in gaming around 2011, when tiny indie teams began shipping hits that rivaled studios. The future of software may follow a similar arc. Regardless, engineers will need to adapt their work methods and build processes with this in mind.
If your product depends on software and you don’t embrace this shift within the next 24 months, you may not be able to compete. This isn’t optional; it’s existential.
Much like the properties the METR article lists as dimensions that could explain slowdowns or speedups, results with agents heavily depend on variables that are under our control. I have compiled my own list which, unsurprisingly, also applies to developers: many of the challenges we face with agents are the same ones we encounter with inexperienced developers or those new to the problem at hand.
Context
Humans bury context in tribal knowledge. We tell ourselves it’s in a wiki—it never is. Business rules live in developers’ heads. Onboarding new devs is slow because they don’t know the history of that “weird old function” no one dares touch.
When presenting any problem to AI, we always overlook relevant context. As much as we like documenting and testing, the reality of the average codebase and engineering organization is that many practices are implicit, and some business rules are not documented anywhere but in the minds of a few developers. The time invested in establishing context for the agent has a huge impact. Instructions range from everyday mechanics, such as accessing databases, managing dependencies, running tests, and setting up testing infrastructure, to less obvious context, such as why old code still exists.
Agents face the same challenge as new developers, only worse: they can’t ask around. They sit quietly, like a junior developer afraid to ping the team, until someone notices. That’s why, at least until we build organizations and systems that give agents access to this organic context, well-documented organizations have an advantage, and front-loading context becomes an essential engineering skill.
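As a concrete sketch of what front-loading can look like, here is a minimal repo-level instructions file of the kind most agent CLIs read (Codex looks for AGENTS.md, the Claude CLI for CLAUDE.md). The commands and the legacy note are hypothetical examples, not a prescription:

```markdown
# AGENTS.md: context we'd otherwise explain on day one

## Everyday mechanics
- Install dependencies with `make deps`; run tests with `make test` (hypothetical targets).
- Integration tests expect a local database: `docker compose up -d db`.
- Never run migrations against shared environments.

## Tribal knowledge
- `legacy/billing_v1.py` looks dead but still backs invoices created before 2019. Do not touch it.
- Prices are stored as integer cents everywhere; a float in billing code is a bug.
```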
Architecture
Separation of concerns is required not only because of context size but also for organizational reasons. The same problems that lead us to tear monoliths apart, or to complain about 500 microservices, apply to organizations and agents alike.
Model context limits are still relatively small compared to the largest codebases. Developers face the same problem and rely on their ability to abstract away the parts they do not understand by making assumptions about them. Abstracting from complexity is a skill learned over time. Sometimes the codebase maintains clear interfaces that allow us to ignore implementation details entirely, and agents benefit from modular code and clear interfaces in the same way; it is just not the reality of most codebases. Being able to point an agent at specific parts without requiring it to infer, or even access, the others makes development faster. For agents, and particularly in monorepos ❤️, structure matters. Explaining the project's parts and providing tooling that can work on subparts of the project goes a long way.
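To illustrate, a hypothetical monorepo layout where each package is independently testable and carries its own instructions, so an agent can be pointed at `services/billing` without ever reading the rest:

```text
repo/
├── AGENTS.md             # global conventions and how the packages relate
├── services/
│   ├── billing/
│   │   ├── AGENTS.md     # package-specific context and commands
│   │   ├── Makefile      # `make test` runs only this package's tests
│   │   └── src/
│   └── auth/
│       └── ...
└── libs/
    └── shared-types/     # the interfaces the packages agree on
```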
As the marginal cost of shipping features plummets, the bar for processes and organization soars. Your codebase will need to sustain the equivalent of a significantly larger team working on it.
Leveraging strengths
Human engineers’ future edge may be creativity (hot topic 💣)—or maybe just motivation. But we already know where agents shine, for example:
- Combining rare skill sets (reverse engineering + cryptography, software engineering + RF design, etc.). They can go much deeper in any single field than the average developer.
- Large-scale, detail-heavy refactors. Even on large codebases, they are getting excellent at refactors that, although conceptually simple, demand holding a big mental stack of nested changes and being very thorough. I've noticed a significant leap here since the release of GPT-5.
- Brutal consistency across codebases.
- Immunity to boredom. I find it highly tedious to safely refactor big chunks of IaC, such as Terraform or Tofu repos, sometimes in very mechanical ways. Agents never tire of it.
Learning how to leverage these strengths will be very important in the mid-term (the next two years).
They also have weaknesses:
- Complete overconfidence (maybe due to a false sense of job security 😂); they can disappear down rabbit holes.
- Lack of proactivity and of the ability to ask questions spontaneously. Some of this is due to technical limitations: agents do not see our screens or hear our discussions. Their context is more static, much like a remote developer with no Zoom.
- Interface limitations, which require specialized setups to handle browsers or even simple interactive prompts (like Git opening an editor for a commit message, familiar to any developer who still hasn't learned ":wq").
- Humans advance software engineering by approaching problems from novel perspectives; it remains uncertain whether agents can similarly push the frontier, for instance by independently creating new frameworks and paradigms.
Objectivity & validation
The task “make a tennis game that is fun” exists for a reason—it’s subjective. Humans deal with fuzzy goals because we can ask, adapt, and align. Agents? Not so much.
The more objective and testable the task, the better the agent performs. When subjectivity is unavoidable, tight feedback loops are critical; otherwise, you’ll drown in a flood of mediocre solutions, even outside what we would consider creative domains. Startups, for instance, deal with a very high number of subjective tasks, where writing detailed definitions up front is counterproductive because nobody yet knows what the right solution is. The optimal strategy empowers the developers, favoring high-frequency iterations to disambiguate and assess solutions.
In creative domains, volume tends to be associated with low quality. The truth is that volume can be a powerful tool: it increases alignment at the beginning in order to raise quality at the end.
Ideally, we want to assess a solution automatically, and within time frames much faster than a development iteration. As with real "organic" engineers, it is much simpler to obtain good outcomes when the stakeholder can describe the expected result objectively. The more comprehensively and efficiently developers can run that validation, the better the first results will be.
Some tasks are not just complex but impossible or impractical to describe. You can train a developer on enjoyable game mechanics and visually appealing interfaces, but specifying those up front is laborious, and assessing them programmatically is almost impossible. In any process with a high degree of subjectivity, synchronous communication and frequent partial assessment become increasingly important; combined with the interface limitations of current agents, this makes good results hard to achieve.
Describing a task comprehensively is not just about its functional goal; it also needs to include any constraints required to accept a solution. Defining precisely what happens during the creation of a user, for example, may seem sufficient. However, as with a real engineer, you also expect that no new vulnerabilities are introduced. We should consider these hidden expectations from the outset, including in the architecture itself. Ideally, the system enforces the behavior, and anything else is ultimately checked in a CI pipeline or assumed to be part of "code reviews."
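As a sketch of what turning a hidden expectation into a check can look like for the user-creation example above (the `create_user` API and the hashing scheme are hypothetical), a test the agent can run on every iteration:

```python
# test_create_user.py: a hypothetical acceptance test an agent runs on each iteration.
from myapp.users import create_user  # hypothetical module under test


def test_create_user_functional_goal():
    user = create_user(email="ada@example.com", password="s3cret!")
    assert user.id is not None
    assert user.email == "ada@example.com"


def test_create_user_hidden_expectation():
    # The spec never says "hash the password", but accepting a solution requires it,
    # so we encode the constraint as a check instead of hoping the agent infers it.
    user = create_user(email="ada@example.com", password="s3cret!")
    assert user.password_hash != "s3cret!"           # never stored in plaintext
    assert user.password_hash.startswith("$argon2")  # hypothetical hashing scheme
```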
Sometimes implementation details are irrelevant, as long as we can assert that functional assessments are sufficient. In that case, the code does not even need to be maintained; it is simpler to rewrite it every time a new requirement appears. Frameworks help push in that direction, as does guidance on architecture or on using different paradigms (data-oriented, declarative, etc.). In general, an agent will make fewer mistakes when constrained from an architectural point of view, especially when starting from scratch. The skill of designing with these goals in mind becomes more relevant, and the developer's experience becomes a significant advantage.
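One hedged illustration of such an architectural constraint (all names hypothetical): humans maintain a small declarative boundary, and everything behind it is fair game for the agent to rewrite wholesale, as long as the contract and the test suite keep passing.

```python
# The maintained core: a declarative contract the humans own.
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateUser:
    email: str
    password: str


@dataclass(frozen=True)
class UserCreated:
    user_id: int
    email: str


# Everything below this line (storage, hashing, auditing) is regenerable:
# the agent may rewrite it from scratch whenever requirements change, as
# long as handle(CreateUser) -> UserCreated still satisfies the tests.
_users: dict[int, str] = {}


def handle(command: CreateUser) -> UserCreated:
    user_id = len(_users) + 1
    _users[user_id] = command.email
    return UserCreated(user_id=user_id, email=command.email)
```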
One setup
Unlike most developers, agents are fearless; they absolutely need to be sandboxed and isolated from anything destructive. It is not that they are dumb; they are just not subject to the incentives that prompt us to think twice about our actions. Arguably, this can change, but the solution is not immediately apparent: they don't seem bothered by writing incident reports, and they assume responsibility more gracefully than most of us.
Tooling is still messy. MCP servers sometimes solve genuine interface problems, but they also introduce serious risks, such as leaking source code or private data. They make perfect sense for models running remotely without access to a full environment; for agents with a shell, a CLI tool often works just as well. The rule is simple: only add tools that cover a specific gap caused by agent limitations. Compare GitHub's MCP server with simply using the GitHub CLI.
A classic example of a genuine gap is the browser. The developer needs one too, but they have eyes to look at a screen and hands to control a mouse. Tools must fill a functional or efficiency gap caused by agent limitations.
Regarding permissions, it sometimes makes sense for developers to delegate their own access, but in most cases that poses a considerable risk. Being able to sandbox agents and grant them scoped access is a must.
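A minimal sketch of what that isolation could look like, assuming Docker and a hypothetical agent CLI: the agent works on a disposable clone with a scoped, revocable token instead of the developer's credentials.

```python
import subprocess


def run_agent_sandboxed(workdir: str, task: str, scoped_token: str) -> None:
    """Run a (hypothetical) coding agent inside a throwaway container."""
    subprocess.run(
        [
            "docker", "run", "--rm",
            "--network", "none",       # strictest option; real setups allow egress to the model API only
            "--cap-drop", "ALL",       # drop all Linux capabilities
            "-v", f"{workdir}:/work",  # only a disposable clone of the repo is visible
            "-w", "/work",
            "-e", f"REPO_TOKEN={scoped_token}",  # scoped credential, not the developer's
            "agent-sandbox:latest",              # hypothetical image bundling the agent CLI
            "agent", "run", "--task", task,      # hypothetical agent invocation
        ],
        check=True,
    )
```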
And above all: standardized setups will matter. Just as code reviews and PRs became universal, agent processes will too.
The Stakes
At Enter, our mission is to leverage AI for critical work, including software engineering. Our goal isn’t to reduce engineering headcount. It’s to give engineers the infrastructure, guardrails, and workflows to run agents at peak efficiency.
Because the truth is stark:
- The companies that master agents will multiply their engineering force.
- The companies that don’t will fail.
This future isn’t optional. It’s inevitable.
Big thanks to @dmacvicar for the great feedback 🙏.