Should You Build AI Agent Capabilities? A Decision Framework for Product Managers

Your CEO asks: "Should we become an agent capability?" You have one quarter, one team, finite resources. Build for AI agents or ship features your customers asked for? The five dimensions that determine which path fits your product.

You're a PM at a SaaS company. Your product has a great UI, strong workflows, happy customers paying $50/month per seat. Your CEO forwards you an article about Apple Intelligence. Your head of sales asks if your product will work with Copilot. Your engineering lead sends you a link to the Model Context Protocol.

Everyone wants to know the same thing: "Should we become an agent capability?"

You have one quarter. One team. Finite engineering hours. You can either:

  • Ship the features your customers are asking for
  • Build capabilities for AI agents that might use your product

The wrong answer costs you six months.

Part 1 established that software is becoming liquid. This essay helps you decide if your product should flow with it.


The Question PMs Are Actually Asking

"Should we build for agents?" sounds strategic. It's not.

The real questions are:

  • Where do I allocate engineering resources?
  • What's the opportunity cost of building this vs improving the core product?
  • How do I know if this generates revenue?

You're not deciding whether agents are the future. You're deciding whether building for agents fits your product, your customers, and your business model right now.

But first, let's understand what "building for agents" actually means.


What Being Agentic Actually Means

An agent capability is a service that executes operations autonomously without requiring a user interface session.

Key difference: Traditional products control the user session. Agent capabilities respond to stateless API calls from autonomous systems.

Traditional UI product:

  1. User opens your app
  2. User navigates your interface
  3. User clicks buttons, reads screens, makes decisions
  4. User completes their task
  5. User closes your app

You control the session, design the flow, decide what they see.

Agent capability:

  1. Agent determines it needs your functionality (you don't control when)
  2. Agent calls your API with parameters (you don't control the context)
  3. Your service executes and returns a result
  4. Agent uses your result in a larger workflow (you don't control what happens next)
  5. Agent moves on to the next capability

You don't control the session. You don't design the flow. You might not even know which agent called you or why.

Example - Email sentiment analysis:

UI version: User logs into your dashboard, imports their email, your UI shows sentiment score and suggested tone, user reads and decides.

Agent version: Agent calls analyze_sentiment(email_text), your API returns {score: 0.72, recommended_action: "acknowledge_concern"}, agent uses this in workflow (search emails → analyze → draft response → post to Slack). Your service ran for 200 milliseconds. User never saw your product.
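
The agent-version flow can be sketched as a stateless function. This is a minimal illustration, not anyone's actual API: the function name and response fields follow the example above, and a toy keyword heuristic stands in for a real sentiment model.

```python
# Hypothetical sketch of the agent-facing capability described above.
# A toy keyword heuristic stands in for a real sentiment model.

def analyze_sentiment(email_text: str) -> dict:
    """Stateless capability: one call in, one structured result out.
    No session, no UI; the agent only ever sees this return value."""
    positive = {"thanks", "great", "appreciate", "happy"}
    negative = {"frustrated", "problem", "disappointed", "urgent"}
    words = {w.strip(".,!?").lower() for w in email_text.split()}
    score = 0.5 + 0.1 * len(words & positive) - 0.1 * len(words & negative)
    score = max(0.0, min(1.0, score))
    return {
        "score": round(score, 2),
        # The agent acts on this field without any human reading a screen.
        "recommended_action": "acknowledge_concern" if score < 0.5 else "proceed",
    }

print(analyze_sentiment("I'm frustrated, this problem is urgent."))
# -> {'score': 0.2, 'recommended_action': 'acknowledge_concern'}
```

Note what's absent: no login, no rendered screen, no opportunity to explain the score. The structured response is the entire product surface.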

What this means:

You're not building a new interface. You're exposing your functionality as composable operations that agents can use without human supervision.

The implications:

  • No session context - Each call is stateless. You don't know what happened before or what will happen after.
  • No user guidance - You can't show tips, tutorials, or help text. Your API documentation is your only UI.
  • No brand presence - Your logo doesn't appear. Your carefully designed screens don't render.
  • No error recovery - If your API fails, the agent might retry, skip you, or fail the entire workflow. You can't ask the user "what did you mean?"

This is what you're deciding to build when you "become an agent capability."

Not a better UI. Not a faster app. A service that works when nobody's looking at your product.

The Five Dimensions To Assess Your Product

Use these five dimensions to assess whether agent capabilities fit your product:

Dimension 1: Revenue Model

The question: How do you charge today?

Why it matters: Agent-driven usage doesn't map to traditional pricing models.

If you charge per seat, agents break your model. One person using your product via agents can generate 100x the API calls of a typical user. Are you pricing for that? Can you?

If you charge usage-based, agents might work in your favor. More usage = more revenue. But only if your infrastructure can scale and your margins hold at higher volumes.

What to assess:

  • Can your pricing absorb 10x usage variability? A $50/month customer could spike to $500/month when agents use your product. Do your customers expect usage-based pricing? Will they pay?
  • What happens to margins at usage-based pricing? If platform orchestration adds latency, if agents retry failed calls, if you're paying per API request to your own dependencies—your cost structure changes. High-margin seat licenses might become low-margin usage-based revenue.
  • Can you maintain two surfaces? You're now maintaining UI and API surfaces simultaneously. Every feature needs versioning, deprecation paths, and breaking change management for both. Does your team have capacity for this?

Real-World Example: Slack

  • Current model: $7.25/user/month (seat-based)
  • Agent challenge: One person generates 100x API calls via agents
  • Problem: Pricing doesn't capture value, can't switch without alienating customers

Dimension 2: Customer Behavior

The question: How do your customers approve new software?

Why it matters: Agent capabilities get approved differently than UI products.

If IT approves your software through 6-month procurement cycles, they'll apply the same process to agent capabilities. Security questionnaires, vendor risk assessments, compliance reviews. Building for agents makes enterprise sales more complex.

Agent discoverability works differently than traditional product discovery.

When someone asks an agent "analyze my email sentiment," the agent searches for a capability that matches that intent. It reads capability descriptions, matches semantic meaning, and calls what fits. The user never browsed a list or read your marketing copy.

Your optimization changes: clear capability names and accurate descriptions matter more than screenshots or review scores. The agent finds you based on whether your description matches what the user asked for.
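
The matching step can be sketched with a toy registry. Real agents use embeddings or an LLM to match intent against descriptions, but the principle is the same: the description text is what gets matched. The capability names and the word-overlap scoring here are illustrative, not any platform's actual mechanism.

```python
# Hypothetical capability registry; entries and scoring are illustrative.
CAPABILITIES = [
    {"name": "analyze_sentiment",
     "description": "analyze the sentiment and tone of an email message"},
    {"name": "create_invoice",
     "description": "create and send an invoice to a customer"},
]

def find_capability(user_request: str) -> str:
    """Rank capabilities by word overlap with the user's request.
    Real agents use semantic similarity, but either way the description
    is what gets matched, not screenshots or marketing copy."""
    request_words = set(user_request.lower().split())
    def overlap(cap: dict) -> int:
        return len(request_words & set(cap["description"].split()))
    return max(CAPABILITIES, key=overlap)["name"]

print(find_capability("analyze my email sentiment"))  # -> analyze_sentiment
```

A vague description ("email tools") loses to a specific one ("analyze the sentiment and tone of an email message") every time the user's words are specific.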

If developers already integrate your APIs directly, agent capabilities might already exist informally; you would just be formalizing them. Your challenge is to standardize what already works.

What to assess:

  • How long does it take to add your product to a customer's tech stack? One week? One month? Six months? Agent capabilities inherit that timeline plus additional security reviews for autonomous access.
  • Do customers use APIs or only UIs? If they're API-first, exposing capabilities is incremental. If they've never touched your API, you're changing how they think about your product.
  • Who configures access controls? If IT locks down everything, they'll lock down agent access too. You need admin controls, audit logs, and permission models that fit their compliance requirements.

Real-World Example: Salesforce vs Notion

  • Salesforce (Enterprise): IT approval process applies to agent capabilities—security questionnaires, vendor risk, compliance reviews
  • Notion (SMB/Prosumer): Discovery through app stores—must optimize for semantic discoverability, not enterprise procurement
  • Key difference: Same capability, different approval and discovery mechanisms

Dimension 3: Risk Profile

The question: What's the blast radius if an agent misbehaves?

Why it matters: Not all capabilities carry equal risk.

Read-only capabilities have low blast radius. An agent that searches your product, retrieves data, summarizes information—the worst case is information disclosure. Serious, but contained.

Write capabilities have medium blast radius. An agent that creates records, sends notifications, posts updates—it can create noise, spam users, or generate bad data. Annoying, sometimes costly, but usually reversible.

Delete capabilities have high blast radius. An agent that archives emails, removes calendar events, purges records—it can cause data loss. Often irreversible. Sometimes catastrophic.

What to assess:

  • If you exposed your product to agents today with no restrictions, what's the worst that could happen? Walk through the scenario. Does it result in mild annoyance or legal liability?
  • Can you separate read from write from delete permissions? Not at the product level—at the capability level. "View calendar" and "delete event" should require separate approvals, not blanket "calendar access."
  • What happens when an agent makes 1,000 API calls in 10 seconds? Does your rate limiting catch it? Does your infrastructure handle it? Or does it bring down your service?

Real-World Example: Gmail

  • Low risk (read-only): "Search messages"—information disclosure only
  • Medium risk (write): "Send email"—creates data but reversible
  • High risk (delete): "Delete all messages from last month"—destructive, irreversible
  • Implementation: Google separates these into distinct capabilities with separate permission grants
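
A per-capability permission model along these lines can be sketched as follows. The scope names and capability registry are hypothetical, not Gmail's actual implementation; the point is that grants attach to individual capabilities, never to a blanket "calendar access" or "mail access".

```python
# Hypothetical per-capability permission model; names are illustrative.
from enum import Enum

class Scope(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"

# Each capability declares the single scope it requires.
CAPABILITY_SCOPES = {
    "view_calendar": Scope.READ,
    "create_event": Scope.WRITE,
    "delete_event": Scope.DELETE,
}

def authorize(agent_grants: set, capability: str) -> bool:
    """An agent granted only READ and WRITE cannot call delete_event,
    even though all three capabilities live in the same product."""
    return CAPABILITY_SCOPES[capability] in agent_grants

grants = {Scope.READ, Scope.WRITE}  # DELETE was never approved
print(authorize(grants, "view_calendar"))  # -> True
print(authorize(grants, "delete_event"))   # -> False
```

Structuring grants this way lets IT approve low-blast-radius capabilities immediately while holding destructive ones for a separate review.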

Dimension 4: Moat Location

The question: Where is your defensibility?

Why it matters: If UI disappears, what's left?

Some products are defensible because of their interface. Figma's collaborative canvas, Notion's block-based editor. Take away the interface and you're left with "storage with an API." Not defensible.

Some products are defensible because of their data. Your customer's transaction history, their social graph, their proprietary knowledge base. The UI matters, but the data is the moat. Exposing capabilities strengthens your position—more ways to access the data = more lock-in.

Some products are defensible because of their algorithm. Recommendation engines, fraud detection models, pricing optimizers. The UI is just presentation. The moat is the capability itself. Agent access changes nothing.

What to assess:

  • If a competitor exposed the same capabilities with a better API, would customers switch? If yes, your moat isn't in the capability—it's somewhere else (brand, UI, integrations, incumbency).
  • Does your product create data that becomes more valuable over time? Customer behavior, historical trends, trained models. If agents can access that through your capabilities, you're increasing lock-in. If they can't, you're missing the opportunity.
  • Can you expose capabilities without exposing your differentiation? If your edge is how you present information, stripping away UI strips away your moat. If your edge is what you know or how you process it, agents amplify your moat.

Real-World Example: Figma vs Stripe

  • Figma (UI moat): Collaborative canvas, real-time cursors, component systems—strip away UI and compete with every design-to-code tool
  • Stripe (Algorithm moat): Fraud detection and payment reliability—agents calling APIs get same value, UI never mattered
  • Pattern: UI moats get commoditized, algorithm/data moats strengthen

Dimension 5: Time Horizon

The question: When do your customers actually need this?

Why it matters: Building too early wastes resources. Building too late means catching up.

Some customers are asking for agent integration right now. Developers want MCP servers, CLI tools, and API-first access. They're your early adopters. Small in number, vocal in feedback, willing to tolerate rough edges.

Some customers will care in 6-12 months. They've heard about agents, they're curious, but they're not blocked. They'll adopt when it's polished, when competitors have it, when the ROI is clear.

Some customers won't care for years. They're happy with the UI. They don't trust AI. They're in regulated industries that move slowly. Building for agents might attract new customers, but it won't retain current ones.

What to assess:

  • Are customers asking for this? Not "have they heard about agents" but "are they actively requesting integration?" If three customers asked this month, that's signal. If zero asked this year, building for them is speculative.
  • Are competitors shipping agent capabilities? If they are and customers notice, you're playing catch-up.
  • What's the cost of being 6 months late vs 6 months early? Early means wasted effort if adoption is slow. Late means lost customers if competitors ship first. Which risk is more expensive?

Real-World Example: GitHub Copilot vs DocuSign

  • GitHub Copilot (Ship now): Users asking for MCP integration today—want agents to commit code, open PRs, manage issues
  • DocuSign (Wait 12 months): Enterprise customers not asking yet—legal teams want humans approving contracts, can watch competitors handle compliance first
  • Pattern: Developer tools move now, enterprise tools wait for demand

Assessment Summary

Use the five dimensions to determine fit:

  • Revenue Model: Can your pricing model handle 10x usage spikes?
  • Customer Behavior: How do customers approve new software (IT vs self-serve)?
  • Risk Profile: What's the blast radius (read vs write vs delete)?
  • Moat Location: Is your defensibility in UI or data/algorithms?
  • Time Horizon: When do customers need this (now vs 12 months)?

If 3+ dimensions signal "poor fit," the answer is probably "not yet."


The Timing Reality

The question isn't when agents become mainstream. It's which slot you're competing for.

Enterprise IT departments don't pre-approve unlimited vendors. They approve 10-15 and lock the list. Adding an 11th vendor requires security reviews, compliance documentation, executive approval—6 to 12 months minimum. Once those slots fill, you're not competing with other vendors. You're displacing someone already integrated.

Platform partnerships follow the same pattern. Microsoft Copilot didn't approve 200 partners on day one. The first wave gets distribution and trust. The second wave competes with companies that already have both.

Even open marketplaces reward whoever ships first. Developers discover MCP servers through reputation—stars, downloads, community trust. Early publishers build that when discovery is easy. Late entrants start from zero in a crowded registry where nobody's looking past page one.

By the time the "right approach" becomes obvious, the slots will be filled. The companies positioning now aren't making perfect bets—they're making early bets that compound.

Key Question: Which slot are you competing for? And is it still open?

The Three Validation Questions

Before prototyping, answer these three questions:

Question 1: Where does your revenue actually come from?

Not your target market. Your current paying customers. Enterprises through procurement? SMBs through app stores? Developers through GitHub?

The path that fits your aspirational customer might kill your current business. If 80% of revenue comes from enterprises with 12-month contracts and you optimize for consumer virality, you're alienating the customers who pay your bills to chase customers who might not.

Question 2: How do your customers already buy software?

If they go through 6-month IT approval, open marketplaces won't work. If they expect App Store downloads, enterprise vendor meshes are the wrong abstraction. If they're developers who prefer CLIs, UI approaches will frustrate them.

Match the mechanism to how money flows.

Question 3: What can you ship in 2 weeks to learn?

Not "what's the right long-term approach." What can you implement fast enough to start learning from real usage?

App Intents if you're iOS-native with Shortcuts support. MCP server if you have developers using Claude or Cursor. REST API with semantic docs if you have enterprise SSO and admin controls.

Ship the smallest version that lets real users discover and use one capability.
Then measure:

  • Did they find it?
  • Did they trust it?
  • Did they use it again?

If all three are yes, expand. If any are no, you learned what doesn't work before investing a quarter.

The companies that ship in 2026 will have 12 months of usage data by mid-2027. The companies perfecting their positioning will have slides.


What You're Actually Deciding

You're deciding: "Do we spend one quarter learning what works for our specific product, customers, and business model?"

The five dimensions don't give you a yes or no. They show which constraints matter for your decision.

Sometimes the answer is "not yet." No customer demand. No competitive pressure. Resources better spent elsewhere. That's fine. Revisit in three months when the landscape changes or customers start inquiring.

The shift to liquid software is real. Whether you shift with it depends on whether it fits your product.
