How to Design APIs for an AI World
Hey, Luca here, welcome to a new edition of Refactoring!
A thorough analysis of how AI changes what is needed from a good API, with real-world examples + some speculations about the future.
As engineers, we've been building applications for different audiences throughout the decades. At first, we only built for humans — creating visual interfaces, buttons, and forms. Then came APIs, and we started building for humans and other software — predictable systems talking to each other through well-defined contracts.

Now it feels like we are entering a third era, where a relevant consumer is neither human nor traditional software. It's AI. And here is the thing: AI agents don't behave like either of their predecessors. On one side, AI looks like regular software: LLM calls are fundamentally API calls, and our basic mental models about load, latency, and cost still apply. On the other side, AI brings a degree of understanding and… chaos that, in many respects, makes it closer to how humans do things. AI agents can:
It is obvious that to get the most value out of AI, we need to ensure it can interact with the rest of our tech stack. Advancements like MCP have driven progress, but it's also clear that simply piping LLMs into APIs the same way we always have won't get us the most value out of this collaboration.

We need to rethink how we approach API design with a new consumer in mind: one that is neither software nor human. Because, in a way, AI agents are both. While we are firmly in the realm of speculation here, I believe we have been working with AI for long enough to set some coordinates that are unlikely to change, and to consider what AI-first APIs may look like.

To help me with this, I am also bringing in none other than Ankit Sobti, co-founder and CTO of Postman. Postman is the world's leading API platform, and Ankit has been working at the forefront of this space for more than 10 years. So here is what we will cover today:
Let's dive in!

Disclaimer: I am a fan of what Ankit and the team are building at Postman, and I am grateful they partnered on this piece. However, I will only write my unbiased opinion about the practices and tools covered here, including Postman. Learn more about their new MCP Catalog and MCP Generator below 👇

✨ Thriving in ambiguity

Every major shift in how we build interfaces has been about identifying and removing friction. The transition from human-first to API-first wasn't just about mobile apps or microservices — it was about recognizing that human interpretation was a bottleneck. We moved from clicking buttons to calling endpoints to eliminate the need for a human to translate intent into action.

Traditional APIs are contracts between deterministic systems. They're the software equivalent of a vending machine: insert exact change, press B4, get your snack. The entire API-first movement is built on the premise of bottling intent into a predictable process.

Now, in a way, AI is bringing human-like interpretation back. When an LLM calls an API, it's not following a hard-coded path — it's reasoning (kinda) about what to do. This creates a paradox: we spent two decades removing ambiguity from our interfaces, and now our newest consumers thrive on understanding ambiguity. It's literally their superpower: they can parse poorly formatted JSON, understand inconsistent field names, guess what undocumented params probably do, and move past all kinds of grammar/syntax mistakes.

So the question becomes: how do we design systems that make the most out of this?

🤖 Designing APIs for AI

AI operates under different constraints and capabilities than traditional software. These, in turn, are going to change API design. As engineers, here are some topics we now need to take into account:
Let's explore each of these and their engineering implications:

1) Token economy 💰

Here's a constraint traditional APIs have (largely) never faced: every byte costs money. When an LLM processes your API response, those tokens aren't free. So, for example, should field names be long and self-descriptive, or short and cheap? Verbose names cost tokens on every single call. On the other hand, terse, abbreviated responses lose the self-documenting nature that helps AI understand APIs. You're optimizing for both comprehension and compression — a tradeoff that didn't exist before. To navigate this, some teams are experimenting with:
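To make the tradeoff concrete, here is a rough sketch comparing the token cost of verbose vs. compact field names. The ~4-characters-per-token heuristic and the field names are my own illustration; real counts depend on the model's actual tokenizer.

```python
import json

def approx_tokens(payload: dict) -> int:
    """Very rough token estimate: ~4 characters per token for English/JSON.
    Real counts depend on the model's tokenizer."""
    return len(json.dumps(payload, separators=(",", ":"))) // 4

# Self-documenting, but the LLM pays for these names on every single call.
verbose = {
    "customer_identifier": "cus_123",
    "subscription_status": "active",
    "monthly_recurring_revenue_usd": 49,
}

# Cheaper, but the model has to guess what "mrr" means.
compact = {"cid": "cus_123", "status": "active", "mrr": 49}

saved = approx_tokens(verbose) - approx_tokens(compact)
```

Multiply that per-response saving by thousands of agent calls per day and the naming decision becomes a real line item, which is exactly why it is now a design question rather than a style question.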
2) High latency ⏱️

Traditional API calls are fast — milliseconds. LLMs are slow — seconds. This completely changes how you think about API orchestration. Consider a typical AI workflow:
What used to be sub-second can now easily become 5-10 seconds. This has real implications:
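One common mitigation is to overlap independent calls instead of awaiting them one by one. A minimal sketch (the call names and simulated latencies are mine, for illustration):

```python
import asyncio

async def call_api(name: str, delay_s: float) -> str:
    """Stand-in for a slow LLM or tool call; delay_s simulates latency."""
    await asyncio.sleep(delay_s)
    return f"{name}: ok"

async def sequential() -> list[str]:
    # Total wall-clock time is the SUM of latencies.
    return [await call_api("search", 0.1), await call_api("lookup", 0.1)]

async def parallel() -> list[str]:
    # Independent calls overlap, so wall-clock time is the MAX of latencies.
    return list(await asyncio.gather(
        call_api("search", 0.1),
        call_api("lookup", 0.1),
    ))
```

When every hop costs seconds instead of milliseconds, this kind of fan-out (plus streaming partial results to the user) stops being an optimization and becomes table stakes.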
3) Self-healing error handling 🔄

When traditional systems hit an error, they just fail. When AI hits an error, things get interesting. We are seeing AI consumers that:
This adaptability is chaotic but potentially powerful, and API design can take advantage of it:
Consider two styles of error response. The traditional one is a bare status code and a terse message, e.g. a 400 "Bad Request". The AI-optimized one also spells out which field failed, the expected format, a hint for retrying, and any alternative endpoint.
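As an illustrative sketch (all field names and the toy endpoint are hypothetical, not from any specific API), here are the two shapes side by side, plus a client that acts on the recovery hints:

```python
# Traditional error response: a dead end for the caller.
traditional_error = {"code": 400, "message": "Bad Request"}

# AI-optimized error response: machine-readable recovery hints.
ai_optimized_error = {
    "code": 400,
    "error": "invalid_date_format",
    "message": "Field 'start_date' must be ISO 8601 (YYYY-MM-DD).",
    "expected_format": "YYYY-MM-DD",
    "retry_hint": "Reformat 'start_date' and resend the same request.",
}

def call_endpoint(payload: dict) -> dict:
    """Toy server: rejects US-style dates but explains how to recover."""
    if "/" in payload.get("start_date", ""):
        return ai_optimized_error
    return {"ok": True}

def self_healing_call(payload: dict) -> dict:
    """A client (or agent) that acts on the hints instead of just failing."""
    resp = call_endpoint(payload)
    if resp.get("error") == "invalid_date_format":
        # A real agent would let the LLM do the reformatting; we hard-code it.
        month, day, year = payload["start_date"].split("/")
        resp = call_endpoint({**payload, "start_date": f"{year}-{month}-{day}"})
    return resp
```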
In the second example, the additional info would be too difficult for regular software to parse — you'd need to code all the possible cases. Meanwhile, AI can actually act on it pretty easily. It might retry with the correct format, try the alternative endpoint, or explain the issue to the user. So, in this case, you are not just reporting errors: you are enabling recovery.

4) Non-deterministic operations 🎲

Traditional APIs assume clients behave predictably. AI does not. An LLM might:
This forces a rethink of API statefulness and control:
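One well-established pattern that helps here is idempotency keys (popularized by payment APIs such as Stripe's): if a non-deterministic client retries or replays a mutating call, the server returns the original result instead of performing the side effect twice. A minimal in-memory sketch (names are mine):

```python
# Server-side store of results, keyed by client-supplied idempotency key.
_processed: dict[str, dict] = {}

def charge(idempotency_key: str, amount_cents: int) -> dict:
    """Replay-safe mutation: a repeated key returns the cached result
    instead of charging the customer again."""
    if idempotency_key in _processed:
        return _processed[idempotency_key]
    result = {"charged": amount_cents}  # the actual side effect would go here
    _processed[idempotency_key] = result
    return result
```

This turns "the LLM might call this twice" from a correctness bug into a non-event, which is exactly the kind of control surface AI-facing APIs need.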
Finally, a single AI conversation might spawn dozens of parallel API calls. Traditional rate limiting breaks down when your "single user" is actually an LLM orchestrating complex flows. So teams are experimenting with:
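One such experiment could be to budget model tokens per window rather than raw request counts, so a burst of parallel agent calls is judged by its total cost. A hedged sketch (the class and thresholds are mine, not a known product):

```python
import time

class TokenBudgetLimiter:
    """'Semantic' rate limiting: allow requests while the estimated token
    spend stays within a per-window budget."""

    def __init__(self, tokens_per_window: int, window_s: float = 60.0):
        self.budget = tokens_per_window
        self.window_s = window_s
        self.spent = 0
        self.window_start = time.monotonic()

    def allow(self, estimated_tokens: int) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window_s:
            # New window: reset the spend counter.
            self.window_start, self.spent = now, 0
        if self.spent + estimated_tokens > self.budget:
            return False
        self.spent += estimated_tokens
        return True
```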
For example, Amazon SageMaker offers options like "provisioned concurrency" to handle predictable bursts, a pattern that could evolve into more nuanced "semantic" or "burst-allowance" rate limiting tailored for AI agents' parallel thinking.

5) Documentation as runtime 📚

Your documentation is no longer just developer guidance — it's part of your operational system. Unlike humans, who read docs once and internalize patterns, AI processes documentation with nearly every decision. This changes what documentation means:
This shift means documentation requires the same rigor as code:
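For instance, doc examples can be verified in CI the same way code is. A minimal sketch (the example payload and required fields are hypothetical): every JSON example in the docs must parse and carry the fields the live API actually returns, so stale docs break the build instead of silently misleading an agent.

```python
import json

# An example payload as it would appear in the API docs (hypothetical).
DOC_EXAMPLE = '{"user_id": "u_42", "plan": "pro"}'

# Fields the live API contract actually guarantees.
REQUIRED_FIELDS = {"user_id", "plan"}

def doc_example_is_valid(raw: str) -> bool:
    """CI check: the doc example must be valid JSON and match the contract."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return REQUIRED_FIELDS.issubset(payload)
```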
This is a brave new world, and some companies are starting to adapt. For example, Stripe's new agent toolkit explicitly acknowledges how AI interprets their documentation and examples.

🔗 MCP — the current state of AI-first APIs

How does all of this relate to MCP, the emerging standard on how AI should use APIs and services? Actually, not a lot. MCP is about the connection problem rather than the API design challenge. It's plumbing, not an architectural overhaul. But let's look at it briefly 👇

1) What is MCP?

MCP is an open protocol introduced by Anthropic that standardizes how AI assistants connect with external data sources and tools. Think of it as a universal adapter that lets AI systems interact with any service — from databases to APIs to local files — through a consistent interface. At its core, MCP defines three things:
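On the wire, these exchanges are JSON-RPC 2.0 messages. Paraphrased shapes below (the tool name and arguments are hypothetical; consult the official MCP schema for the authoritative message formats):

```python
# A client asking an MCP server what tools it exposes.
list_tools_request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A client invoking one of those tools with structured arguments.
call_tool_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
        "name": "search_issues",              # hypothetical tool name
        "arguments": {"query": "login bug"},  # schema declared by the server
    },
}
```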
At its core, it's pretty simple. Instead of every AI tool creating custom integrations for Slack, GitHub, or your internal database, developers write an MCP server once, and any MCP-compatible AI can use it.

Small plug: Postman recently launched an MCP catalog to showcase these integrations, making it easier for developers to find and share MCP servers. They also debuted their own MCP generator to easily create MCP servers.

2) How MCP addresses our API challenges

Looking back at the problems we identified, MCP actually tackles several key challenges, including:
But it doesn’t address others, such as:
To me, this is 100% ok, because 1) it is unclear whether the scope of MCP should include all or any of these (and when in doubt, it's always better to go with a smaller scope), and 2) it's more important to converge on a standard than to design a perfect standard. Fortunately, it feels like we are converging on MCP pretty fast, which suggests the industry was quite hungry for it.

🔮 Beyond APIs — the agent-to-agent future

We've spent this entire article assuming APIs are still... APIs. We are acknowledging our new consumers are smart, but we are still assuming our APIs are dumb: fixed endpoints, predetermined schemas, static docs. Explicit state in responses, good batching, or semantic rate limiting feel like good incremental improvements, but do they feel AI-native? If you ask me, not really.

What is AI-native, then? No one knows for sure, and any speculation tends to age extremely poorly, but consider this: if we are putting intelligence on the consumer side (LLMs calling APIs), it feels natural to me that, at some point, we will have intelligence on the provider side too.

Most of the constraints and capabilities we discussed today are about adaptation, flexibility, and non-determinism. The workarounds we discussed (e.g. explicit state, retry instructions, verbose errors) are not bad, but they feel like ways to constrain this non-determinism — to predict where it can go and create systems around it. But this is the old way. The end game might be, instead, to have providers that are just as flexible as consumers, and can adapt on the spot: adapt to constraints on latency and cost, adapt to non-deterministic sequences of calls, and help with self-healing by problem-solving with the other party.

This is largely what Google is envisioning and designing for with the Agent2Agent protocol. A critical quote from Google 👇
So, this agent-to-agent future isn't just hypothetical, but rather likely, if we reflect on all of this from first principles. I believe that, as good engineers, we need to design by keeping a foot in the present (MCP and the incremental improvements we discussed) and a foot in the future (A2A), accounting for what may come next.

📌 Bottom line

And that's it for today! We've covered a lot of ground — from the evolution of application interfaces to the emerging agent-to-agent future. Here are some takeaways:
So whether you're building APIs today or planning for tomorrow, the message is clear: the age of static, rigid interfaces is ending. The future belongs to systems that can think, adapt, and collaborate. Time to start building them!

See you next week 👋

Luca