Agents Are Just Software: Why Frameworks Can’t Replace Engineering
TL;DR
Agent frameworks are exploding in popularity, but most of them obscure the real engineering challenges behind LLM-powered applications. An agent isn’t magic—it’s just the software around an LLM. That software needs to make decisions, handle errors, and scale reliably. While tools like LlamaIndex or LangSmith are great for demos, they can hinder production-readiness. Worse, they often ignore the core ingredient for reliable language systems: natural language processing. Real production agents require real software engineering—and a serious understanding of language.
Introduction: The Agent Hype Cycle
Every few months, a new buzzword takes center stage in the LLM ecosystem. Right now, it’s AI Agents.
From LangChain’s Agents to LangSmith to LlamaIndex, developers are rushing to build applications that feel more autonomous than traditional chatbots. But as a recent podcast put it: “An agent is something slightly more autonomous than a chatbot—and not just a RAG system.” Even the authors of LLMs in Production admit that the definition is fluid.
As a data engineer who’s worked closely with data science teams, I’ve spent the last year exploring GenAI on the side. I am doing everything I can to learn the best ways to interact with LLMs and to keep up to date with the rapidly changing landscape. And although AI is cool, as an engineer I am not really impressed with how it is being used.
So here’s my take—
Agents are just software that acts on the output of an LLM.
That’s it. No magic. No intelligence. Just control flow and decision-making layered on top of a language model. If you want to make it even cooler, take some data and combine that software with plain old machine learning.
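To make that concrete, here is a minimal sketch in Python. The `call_llm` function is a hypothetical stand-in for whatever model client you actually use; everything else is ordinary control flow.

```python
# A minimal sketch of the claim above: the "agent" is just ordinary
# control flow wrapped around a model call. `call_llm` is a hypothetical
# stand-in for your real client (an SDK call, a local model, etc.).

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in your model client here")

def agent(user_input: str) -> str:
    # Ask the model for a decision...
    decision = call_llm(f"Reply SEARCH or ANSWER for this request:\n{user_input}")
    # ...then plain software takes over: the model produced text,
    # and our code decides what happens next.
    if "SEARCH" in decision.upper():
        return f"(run a search for: {user_input})"
    return call_llm(user_input)
```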
What Agents Really Are
An “agent” is not a model behavior. It’s how you use model output to guide behavior in software.
The decision-making could be any of the following (a small rule-based example follows the list):
Hardcoded logic
Regex or rule-based pattern matching
A vector similarity lookup
Function calling or external API triggering
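Here is what the rule-based flavor might look like; the routes, patterns, and handler names are made up for illustration.

```python
import re

# Illustrative rule-based routing over model output. The "decision"
# lives in this table, not in the model; patterns and handler names
# are assumptions for the example.
ROUTES = [
    (re.compile(r"\b(refund|charge|bill)\b", re.I), "billing_handler"),
    (re.compile(r"\b(crash|error|bug)\b", re.I), "support_handler"),
]

def route(llm_output: str) -> str:
    for pattern, handler in ROUTES:
        if pattern.search(llm_output):
            return handler
    return "fallback_handler"
```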
In every case, the agent isn’t thinking—your code is. This distinction matters. The “intelligence” of an agent is really the structure you place around the model, not the model itself.
Frameworks Help You Move Fast—But Not Build Well
Agent frameworks can speed up development—especially for demos and hackathons. But that ease comes at a cost. Frameworks like LlamaIndex and LangSmith:
Abstract away critical logic
Hide edge cases
Make debugging difficult
Give a false sense of production-readiness
These tools help people prototype quickly, but they often enable developers without experience to ship unreliable systems. They’re more like scaffolding than a foundation; when that scaffolding starts to collapse under scale, performance, or edge cases, who’s left holding the bag? Your engineering team. And what do they start doing? Rebuilding whatever you built with the framework and re-engineering it for scale, performance, and edge cases.
NOTE: This can be said of most, if not all, “frameworks” in the software engineering space. There is a place for frameworks and abstraction, but I think it comes after foundational and fundamental learning, not before.
Don’t Ignore Natural Language Processing
One of the most dangerous things about these frameworks is how they obfuscate the need for NLP—actual Natural Language Processing science.
We’ve had decades of research into building systems that understand and respond to language. The techniques I have even a cursory knowledge of have only seen broad use for the past five or so years. They include (one is sketched after the list):
Prompt optimization
BERT-based classification and extraction
Fine-tuned models for structured language tasks
Rule-based and semantic parsing approaches
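To pick one item off that list: intent classification doesn’t need a generative model in the loop at all. Here is a sketch using Hugging Face’s `transformers` zero-shot pipeline, a close cousin of the BERT-based classification mentioned above; the example text and labels are illustrative.

```python
from transformers import pipeline

# Zero-shot text classification with an NLI model (a cousin of the
# BERT-based classifiers above): no generative LLM in the loop,
# cheap to run, and easy to test deterministically.
classifier = pipeline("zero-shot-classification",
                      model="facebook/bart-large-mnli")

result = classifier(
    "I was charged twice for my subscription this month.",
    candidate_labels=["billing", "technical support", "sales"],  # illustrative
)
print(result["labels"][0])  # highest-scoring intent, likely "billing"
```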
These techniques are:
Cheaper
More reliable
Well-understood by experts in linguistics and NLP
Good systems leverage language itself, not just the outputs of stochastic parrots. NLP provides precision, guardrails, structure, and grounded expectations.
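In code, those guardrails can be as simple as refusing to act on model output that doesn’t match the structure you expect. A minimal sketch, where the expected schema and the allowed actions are assumptions for illustration:

```python
import json
import re

# Guardrail sketch: validate model output before any software acts on
# it. The {"action": ...} schema and the allowed set are made up here.
ALLOWED_ACTIONS = {"search", "answer", "escalate"}

def parse_action(llm_output: str) -> dict:
    match = re.search(r"\{.*\}", llm_output, re.DOTALL)
    if match is None:
        raise ValueError("model did not return JSON")
    payload = json.loads(match.group(0))
    if payload.get("action") not in ALLOWED_ACTIONS:
        raise ValueError(f"unexpected action: {payload.get('action')!r}")
    return payload
```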
Want a more reliable agent? Don’t reach for another tool. Reach for a linguist.
Large language models work because they detect linguistic patterns. It’s the same reason mentalists can predict behavior and social engineering is so effective: language is leverage. If we ignore how it works, we’re skipping the most powerful tool we have. Understanding how language works helps us engineer linguistic patterns into more reliable systems.
Agents Need Engineering, Not Abstraction
If you plan to productionize an agent system, you’ll need all of the following (a sketch of the retry-and-logging piece comes after the list):
Observability (logs, metrics, tracing)
Rate limiting and abuse detection
Failover strategies and retries
Testing and safety checks
CI/CD and deployment practices
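As a taste of what a couple of those items look like in practice, here is a minimal retry-with-logging wrapper; `call_llm` is again a hypothetical client, and a real system would layer metrics and tracing on top.

```python
import logging
import time

log = logging.getLogger("agent")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("swap in your model client here")  # hypothetical

def call_with_retries(prompt: str, max_attempts: int = 3) -> str:
    # Retries with exponential backoff, plus the logs you'll need
    # when this misbehaves at 2 a.m.
    for attempt in range(1, max_attempts + 1):
        try:
            start = time.monotonic()
            response = call_llm(prompt)
            log.info("llm_call ok attempt=%d latency=%.2fs",
                     attempt, time.monotonic() - start)
            return response
        except Exception:
            log.warning("llm_call failed attempt=%d", attempt, exc_info=True)
            time.sleep(2 ** attempt)  # back off before the next try
    raise RuntimeError(f"LLM call failed after {max_attempts} attempts")
```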
These are the same problems we solve in distributed systems, cloud-based geo-redundant microservices, and software architecture. No framework will handle these for you at scale—not without a heavy price tag in computing cost, performance, or debugging complexity.
Software engineers, not just LLM enthusiasts, are the ones who will make agents viable in the long run.
Ask the Hard Question—Tool or Toy?
Before adopting any framework, ask:
Does this make debugging easier or harder?
Does it expose failure modes clearly?
Can I test it, monitor it, scale it?
Is it built for production or for demo day?
There’s nothing wrong with using a toy to explore a concept. But when you’re trying to build a business or deliver value at scale, you need tools—not toys.
Let’s Engineer, Not Just Prompt
Agents aren’t the future of AI; they’re just the next iteration of language-aware software. The hype is distracting us from the fundamentals: software engineering, observability, and real natural language understanding.
I know a lot of really good engineers who are working in the AI space. They are talented, and they do leverage tools and frameworks, but they have the context and experience to say “this is just a prototype, and I’ll need to change it when X happens.” That context is what gets lost in a hype cycle, and it is what we are losing as we promote AI agents. To move forward and secure the future of software engineering as a craft, we need fewer frameworks and more understanding. More attention to language. More discipline in design. Because if we don’t treat agents like real software, we’ll never get real systems.