
What’s the future of productivity: Pre-programmed, AI-enabled workflows or fully autonomous, general-purpose AI agents?
There are currently two schools of thought in the AI industry about the near future of AI agents. One view holds that AI works best when it’s combined with traditional code and conventional ways of accessing data. Think of it as “AI agents on rails”: the AI can make certain decisions, but its degrees of freedom are strongly limited by workflow code that guides it. This enables better consistency and deeper integration with traditional IT mechanisms, such as the rich APIs that already exist for connecting legacy systems.
Many providers of workflow tools have started to embrace this concept, and with a bit of “vibe coding” in code generator tools such as Cursor it’s quite easy to roll your own, fully custom AI workflow. In a way, the underlying AI models are really a tool used by traditional code.
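To make the “on rails” idea concrete, here is a minimal Python sketch. It assumes a hypothetical `fake_model` function standing in for a real LLM call, and the action names are invented for illustration. The point is only the shape: the model makes one narrow decision, and plain code checks that decision before anything happens.

```python
from typing import Callable

# Actions the workflow is willing to take; the model may only pick among these.
# (Illustrative names, not taken from any particular workflow tool.)
ALLOWED_ACTIONS = ("summarize", "archive", "escalate")


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; swap in your provider's client."""
    text = prompt.lower()
    if "urgent" in text:
        return "escalate"
    if "newsletter" in text:
        return "summarize"
    return "delete everything"  # deliberately off the rails


def decide_action(message: str, model: Callable[[str], str] = fake_model) -> str:
    """Ask the model for one narrow decision, then validate it in plain code."""
    prompt = f"Pick one of {ALLOWED_ACTIONS} for this message:\n{message}"
    answer = model(prompt).strip().lower()
    # The rails: anything outside the allowed set falls back to a safe default.
    return answer if answer in ALLOWED_ACTIONS else "archive"


if __name__ == "__main__":
    for msg in ("URGENT: server down", "Weekly newsletter from ACME", "Hello there"):
        print(msg, "->", decide_action(msg))
```

However creative the model’s answer is, the surrounding code only ever executes one of the three allowed actions.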
The other view holds that AI agents will be entirely autonomous and much more flexible. In other words, the AI is the dominant force and interacts with its environment directly. The agent gets prompted by the user and then figures out independently what exactly to do. Agents come up with a plan, execute each step in the way they see fit and can use tools to access other systems.
For this last integration part, the Model Context Protocol (MCP) is currently emerging as the standard for bridging the AI agent world and traditional APIs. Importantly, the user or creator of the agent doesn’t have to know what a particular API can do. The agent figures that out independently from the tool descriptions the server exposes.
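The autonomous pattern can be sketched in the same spirit. The example below does not use MCP or any real SDK. The tool registry, the `fake_planner` function and the tool names are all hypothetical, and the “planning” is a trivial keyword match where a real agent would use a model. It only illustrates the flow described above: discover tools from their descriptions, plan, then execute step by step.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: each tool carries a description the agent can read.
TOOLS: Dict[str, Tuple[str, Callable[[str], str]]] = {
    "search_contacts": (
        "find a contact's e-mail by name",
        lambda name: f"{name.lower()}@example.com",
    ),
    "draft_email": (
        "draft an e-mail to an address",
        lambda addr: f"Draft created for {addr}",
    ),
}


def list_tools() -> Dict[str, str]:
    """What the agent sees at runtime: tool names and descriptions only."""
    return {name: desc for name, (desc, _) in TOOLS.items()}


def fake_planner(goal: str, tools: Dict[str, str]) -> List[str]:
    """Hypothetical stand-in for the model's planning step.

    Picks every tool whose description shares a word with the goal,
    in registry order. A real agent would let the model decide.
    """
    words = set(goal.lower().split())
    return [name for name, desc in tools.items() if words & set(desc.split())]


def run_agent(goal: str, first_input: str) -> List[str]:
    """Discover tools, plan, then execute each step, feeding results forward."""
    plan = fake_planner(goal, list_tools())
    log, value = [], first_input
    for step in plan:
        _, func = TOOLS[step]
        value = func(value)
        log.append(f"{step} -> {value}")
    return log


if __name__ == "__main__":
    for line in run_agent("find Alice and draft an e-mail", "Alice"):
        print(line)
```

Note that nothing in `run_agent` names a specific tool: the plan comes entirely from what the registry advertises, which is the property that makes this style flexible and also harder to predict.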
Confusingly, both approaches are referred to as “AI agents” in the current market, but they are quite different. So which approach works best for which use cases?
Fully autonomous agents have made a lot of progress in just the last few weeks. New tools such as OpenAI’s o3 model or the Manus AI agent can plan and execute remarkably complex tasks. Several vendors (e.g. OpenAI, Google, xAI with Grok, You.com, Perplexity, Manus) now offer “Deep Research” capabilities that conduct extensive research jobs independently, searching the web and assembling information for you. And experimental browser-use agents can even operate a web browser for you to solve problems.
But can these agents run more complex workflows that go deeper and have to be consistent? As a test, I tried to implement several of my own vibe-coded AI workflows with general-purpose agents. The results were quite mixed.
One of my workflow agents that can research and draft a PowerPoint presentation about any topic is now topped in most cases by the flexibility of o3 and Manus AI. Both agents can create a presentation end-to-end with sometimes surprisingly good results. These broader tasks are probably best covered by general-purpose AI. Particularly when the input data can take on many forms (such as web search results), the flexibility of a general-purpose agent is a huge plus. It can not only decide how best to use the data, but also find a way forward much more dynamically to get the best results.
But in other use cases the pre-programmed workflows still prevailed:
- Finding relevant industry events based on precise criteria
- Summarizing e-mail newsletters
- Conducting an in-depth competitor analysis
- Running semantic search on a large table
In all these cases the general-purpose tools showed some promise, but ultimately failed.
A big problem was the scope and precision of processing. For example, Claude with an MCP connector to Gmail was able to summarize newsletters in my inbox very nicely, but it stopped every time after newsletter no. 10 or so, while the pre-programmed workflow can process any number of inputs. Somehow the agent seems to have a limit on the length of its processing run. Similarly, the competitor analysis always stopped at a small number of relevant companies (often fewer than a third of what the workflow AI was finding), and the agents struggled to find correct numerical data, such as fundraising information. In some cases, the agents ended up hallucinating wildly.
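The reason a pre-programmed workflow does not run out of steam is simple: the loop lives in ordinary code, and the model only ever sees one item at a time. A minimal sketch, assuming a hypothetical `fake_summarize` function in place of a real LLM call and made-up newsletter texts:

```python
from typing import Callable, Dict, Iterable, List, Tuple


def fake_summarize(text: str) -> str:
    """Hypothetical stand-in for an LLM summarization call."""
    if not text.strip():
        raise ValueError("empty newsletter")
    # Pretend the first sentence is the summary.
    return text.split(".")[0].strip() + "."


def summarize_all(
    newsletters: Iterable[str],
    summarize: Callable[[str], str] = fake_summarize,
) -> Tuple[Dict[int, str], List[int]]:
    """Summarize every input, one model call per item.

    The loop is plain code, so it covers all inputs regardless of how many
    there are. Items that fail are recorded instead of stopping the run.
    """
    summaries: Dict[int, str] = {}
    failed: List[int] = []
    for index, text in enumerate(newsletters):
        try:
            summaries[index] = summarize(text)
        except ValueError:
            failed.append(index)
    return summaries, failed


if __name__ == "__main__":
    inbox = [f"Issue {i}. More text here." for i in range(25)]
    inbox[7] = ""  # simulate one broken input
    done, failed = summarize_all(inbox)
    print(f"{len(done)} summarized, failed indices: {failed}")
```

Because every input is accounted for as either summarized or failed, it is easy to verify afterwards that nothing was silently skipped.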

A huge concern was the lack of reliability. A semantic search over a table of company profile data with OpenAI’s o3 and Google Gemini 2.5 delivered plausible and very well summarized results, but unfortunately they were different each time I ran the query, despite using the exact same prompt and data. This is not likely to instill much trust in users.
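One way to put a number on this kind of run-to-run drift is to repeat the same query several times and measure how much the result sets overlap. The sketch below uses mean pairwise Jaccard similarity. The company names are invented for illustration and are not results from the tests described here.

```python
from itertools import combinations
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def run_stability(runs: List[Set[str]]) -> float:
    """Mean pairwise Jaccard overlap across repeated runs (1.0 = identical)."""
    pairs = list(combinations(runs, 2))
    if not pairs:
        return 1.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Made-up results of three runs of the same query on the same table.
    runs = [
        {"Acme", "Globex", "Initech"},
        {"Acme", "Globex", "Umbrella"},
        {"Acme", "Hooli", "Initech"},
    ]
    print(f"stability: {run_stability(runs):.2f}")
```

A deterministic workflow would score 1.0 on this measure by construction. How low a score is still acceptable depends on the use case.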
Similarly, accessing external systems with MCP is still quite a bit of a mess. Apart from a complicated setup process and the usual questions around security (do you really want to have a fully autonomous AI access your CRM with read/write privileges?), the results tended to be quite inconsistent. This approach is clearly promising, but will need to mature quite a bit.
Bottom line: Both the pre-programmed and the general-purpose approaches have their justification in today’s market. For broad, fairly unpredictable tasks where exact precision is not crucial, general-purpose agents such as o3 or Manus can already be tremendously useful. But for anything that needs more reliability and depth, a healthy dose of good old deterministic code will do the job.
To use an analogy: Pre-programmed workflows are a bit like SaaS. They bring structure and deep capabilities, but also limitations. General-purpose agents are like a spreadsheet: Wide open to do pretty much anything, but a lot can go wrong if the user doesn’t know what they’re doing (and is not able or willing to check results). Obviously both approaches can and will co-exist, but you have to choose wisely to get the right tools for each job.
However, as always with AI, the really interesting question is not where we are today, but where we are going. Judging from the recent progress of general-purpose AI agents, I wouldn’t bet against this category. The improvements in planning scope, precision of instruction following and reduction of hallucinations have already been tremendous. MCP and other tool integration approaches are still flimsy, but it’s clear that the industry needs these things and is investing massively, which sooner or later will lead to mature platforms. I wouldn’t be surprised if general-purpose agents took over many of today’s structured workflows in the next couple of years.