Agents are the New Software
We don't need any other software, really. If we just let an AI agent run, it can operate for 16 hours without mistakes. GPT-3.5 could manage that for about 30 seconds. That is the pace at which things are moving.
Gramener All Hands · May 2026
How AI shifts bottlenecks, why verifiability and context are the new scarce assets, and what each role should do next.
Recently, a financial research client said: "Can we create a CIM?" — a Confidential Information Memorandum, the kind of heavyweight document banks and investors use to evaluate acquisitions.
We took an earlier CIM as a reference and gave the AI a single prompt.
The entire slide deck was researched and created in approximately five minutes, and there did not appear to be a single factual error in it. A task that once required days of analyst work.
Or consider teacher-coaching. A client said: "We have teachers conducting classes. We want to give them feedback. What should we tell the coach?" In a few hours, we processed hundreds of videos and told coaches things like: this teacher frequently reminded students about upcoming assessments out of urgency, and should instead let them come to their own conclusions.
Another client — Cengage — said: "From textbooks, we want to build test banks, instruction manuals, study guides, explainers, and so on." We put it into an agent pipeline and said: "For each chapter, create a test bank." It builds them one after another. All it takes is one instruction.
In other words, generation has become easy. But if generation has become easy, something else downstream becomes hard.
When research becomes easy, the new question is: what about the private data that specific teams hold? What does it take to integrate it and build connectors to it? Data stories have become easy to generate, but who is going to verify them?
Times of India reached out and said: "Can you create data stories for one of our features called Statoistics?" I said, "Okay, here are 30 data stories", and that took half an hour. But verifying those 30 stories is a nightmare for them. Or take dashboards: a minute is all it takes to produce one, but each one has to be integrated both upstream and downstream. Pipelines can be created extremely easily, but how do they fit into the production process?
The bottlenecks are constantly shifting.
| Hard before AI | Hard now |
|---|---|
| Research | Private data access |
| Data stories | Verification |
| Dashboards | Integration |
| Pipelines | Workflows |
Use AI for everything. That will help you find the new bottleneck. And where there are new bottlenecks, that's where there is scarcity — build your assets there.
This is one of the most common questions I get. Everywhere I go, people are asking versions of the same thing: how do we know the AI's output can be trusted?
Ankor Rai and I delivered a talk at Prudential on verifiability. There are different ways to verify AI output: an LLM as judge, a human in the loop, citations back to sources, and executable code or rules.
When you tell an LLM to calculate a bunch of numbers directly, it might make a mistake. When you tell it to write code to solve the same problem, it almost never does. First, LLMs are very good at code. Second, code either works or it doesn't, and if it works, it has almost certainly done the job right.
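The pattern is simple: run the model's code instead of trusting its arithmetic. A minimal Python sketch, where `generated_code` stands in for what an LLM might return (in practice you would request code from the model and execute it in a sandbox):

```python
# Sketch of the "ask for code, not answers" verification pattern.
# `generated_code` stands in for an LLM's response.

def run_generated_code(code: str, func_name: str, *args):
    """Execute model-written code in a scratch namespace and call the named function."""
    namespace: dict = {}
    exec(code, namespace)  # untrusted code: sandbox this in production
    return namespace[func_name](*args)

# Imagine the model was asked: "Write a function `total` that sums the
# squares of 1..n." Instead of trusting its arithmetic, we run its code.
generated_code = """
def total(n):
    return sum(i * i for i in range(1, n + 1))
"""

result = run_generated_code(generated_code, "total", 10)  # 385
```

If the code runs and passes a spot-check, the answer is deterministic rather than hallucinated.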
This rule-based approach can extend to many domains. For instance, there is a language called InsurLE. You can take an insurance claim and convert it into a set of rules. Then you have two pieces of code: "Here is my policy" and "Here is the claim." The question — Does this claim match the policy? — gets a verifiable yes or no. No hallucination. No ambiguity.
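The talk doesn't reproduce the InsurLE rules themselves; here is a hypothetical Python sketch of the same idea, with illustrative field names. Policy and claim are plain data, and the match is deterministic code, so the yes/no answer cannot be hallucinated:

```python
# Hypothetical sketch of rule-based claim verification (not actual InsurLE).
from dataclasses import dataclass

@dataclass
class Policy:
    covered_events: set   # e.g. {"fire", "flood"}
    max_payout: int
    deductible: int

@dataclass
class Claim:
    event: str
    amount: int

def claim_matches(policy: Policy, claim: Claim) -> bool:
    """Deterministic check: covered event, payable amount within the cap."""
    if claim.event not in policy.covered_events:
        return False
    payable = claim.amount - policy.deductible
    return 0 < payable <= policy.max_payout

policy = Policy(covered_events={"fire", "flood"}, max_payout=50_000, deductible=500)
ok = claim_matches(policy, Claim(event="fire", amount=12_000))    # covered
no = claim_matches(policy, Claim(event="theft", amount=12_000))   # not covered
```

The LLM's job shrinks to translating documents into this structured form; the verdict itself comes from code.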
When AI starts generating content at scale, people will generate ten times the volume. You can be the person who comes in and verifies it. Knowing the different ways to check whether AI output is good is power. Building that skill is an asset.
Save all your prompts. My prompts are in a public repository. Convert them into what are called skills — instructions that the agent can use when needed.
For instance, everything I've learned over the last 15 years about data storytelling is in one "Narrative Data Story" skill file. It's a long document capturing everything I know. What that means is that we have encapsulated intelligence.
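A skill file is just a document the agent loads on demand. A hypothetical sketch of what such a file might contain; the headings and rules here are illustrative, not a prescribed schema:

```markdown
# Skill: Narrative Data Story

## When to use
The user asks for a data story, narrative report, or insight summary.

## Steps
1. Lead with the single most surprising number, not the methodology.
2. Structure as: headline insight, supporting evidence, caveat, action.
3. Quantify every claim and link each number to its source table.

## Exclusions
- Never report a trend from fewer than three data points.
- Flag data-quality gaps; don't hide them.
```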
Every conversation you have with your team in a meeting, every process you execute, every operational method you know — these are the kinds of things you can extract and save, and start using.
Here is an example. I asked the AI who in my network I should reconnect with.
It came back with a prioritised list: IAS officers from a recent training session, leaders from government think tanks, clients from past engagements. A series of people I should be in touch with. Imagine using this to set up client meetings, to identify who in your network you should be talking to, or to prepare for those meetings.
This is the prompt I use to prepare every day for meetings. I give it a standard prompt with historical context and live context — today's agenda, recent updates. This morning, for instance, it said in one sentence exactly what frame of mind I needed to go into a particular meeting with, what the counterpart needed to know, and what the agreed next steps were.
It's a small thing — one sentence of prep. But even for personal productivity, it makes a huge difference when you're moving between six meetings a day.
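The prompt itself isn't reproduced in the talk. A minimal Python sketch of the pattern it describes, combining a fixed instruction with historical and live context (the field names and wording are illustrative):

```python
# Hypothetical sketch: assemble a daily meeting-prep prompt from a fixed
# instruction plus stored and live context. All names are illustrative.

STANDARD_PROMPT = (
    "Summarise, in one sentence each: the frame of mind I should bring, "
    "what the counterpart needs to know, and the agreed next steps."
)

def build_prep_prompt(history: str, agenda: str, updates: str) -> str:
    """Combine the fixed instruction with historical and live context."""
    return "\n\n".join([
        STANDARD_PROMPT,
        f"Historical context:\n{history}",
        f"Today's agenda:\n{agenda}",
        f"Recent updates:\n{updates}",
    ])

prompt = build_prep_prompt(
    history="Last call: client asked for a verification plan.",
    agenda="Review the draft verification plan.",
    updates="Draft shared yesterday; no comments so far.",
)
```

The value is in the saved context, not the glue code: the same standard prompt runs every morning with fresh inputs.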
You can also create several derived formats from a single piece of content. For a consumer goods client, we took a dry business report and converted it into a rap song. A very different format, very engaging, and it drew a fair bit of interest. The same content can take multiple formats: video, sketchnote, report, data story, dashboard. Stop shipping one thing.
Whatever context you can save, save it. Any kind of text, any process, any operational method, any knowledge: put it into a document or a text file, and you will start building assets that are reusable in a variety of ways.
So, what does that mean for you practically?
Your role is changing. Here's how.
Stop delivering models & notebooks. Build agent-verifiable analytical workflows. Your value is no longer the model. It's knowing what to verify and how.
Learn agent harness engineering — Git checkpoints, Docker isolation, MCP connectors, CLI-friendly tools, AGENTS.md. The agent writes the code. You own the architecture it runs inside.
Become excellent at examples, exclusions, acceptance criteria, and failure cases. These are what agents need to do your work well. You're writing their job description now.
Stop selling AI features. Sell verified outcomes. "We built it" is table stakes. "We can prove it works, and we'll own it if it doesn't" is the pitch.
Treat prompts, rubrics, validation rules, and postmortems as project assets. Version-controlled, reusable, owned. Not email threads.
Shift from case studies to living demos. Every document you publish can now spawn a podcast, a sketchnote, a video, a quiz. Stop shipping one thing.
Train through games and agent-native challenges, not slide decks. If someone can't complete a task with an agent within a time limit, that's the signal. Not whether they passed a module.
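The "agent harness" in the developer pivot above is mostly configuration. A hypothetical sketch of an `AGENTS.md` (the file agents read for repo-specific instructions; the contents here are illustrative, not a fixed schema):

```markdown
# AGENTS.md

## Environment
- Run everything inside the provided Docker container; never install globally.
- Commit a Git checkpoint after each passing test run.

## Verification
- Run `make test` before declaring a task done.
- If a check fails twice, stop and report instead of retrying blindly.

## Boundaries
- Do not touch files under `deploy/`.
- Ask before changing any public API.
```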
"Use AI for everything. You will find the next bottleneck. The bottleneck is where scarcity lies, and that is where you need to build assets."
— Anand S, Gramener All Hands, May 2026

AI agents can now run for hours autonomously. What GPT-3.5 managed for 30 seconds, frontier models can sustain for 16 hours. The pace of capability improvement is non-linear. What's impossible today will be trivial next quarter.
Generation is no longer the bottleneck. Research, data stories, dashboards, and pipelines are all easy to generate. The bottleneck has shifted to private data access, verification, integration, and workflows. That's where the scarce value now lives.
Verifiability is a competitive asset. The most common client concern is not capability — it's trust. LLM-as-judge, human-in-the-loop, citations, and executable code/rules are the four verification strategies. The ability to pick the right one is rare and valuable.
Code beats arithmetic. When an LLM writes code to solve a numerical problem rather than computing directly, accuracy is dramatically higher — because code either works or it doesn't.
Context is the new moat. Save your prompts, processes, and domain expertise in reusable documents and skills. Every conversation, postmortem, and meeting transcript is raw material for an asset. The people who systematically capture and reuse context will compound their advantage.
One piece of content, ten formats. Documents, reports, and data can now spawn podcasts, sketchnotes, videos, quizzes, and dashboards automatically. The teams that publish multiple derived formats from a single source will vastly out-reach those who ship one thing.
Every role has a specific pivot. Data scientists: shift from models to verifiable workflows. Developers: own the architecture agents run inside. Business analysts: master examples and acceptance criteria. Sales: sell verified outcomes, not AI features. PM: version-control your prompts and rubrics.
The one-sentence summary: Use AI for everything. Find the new bottleneck. Build assets there.