
MCP: The Differential for Modern APIs and Systems

· 14 min read
Steve Manuel
CEO, Co-founder @ Dylibso
TL;DR

MCP enables vastly more resilient system integrations by acting as a differential between APIs, absorbing changes and inconsistencies that would typically break traditional integrations. With "the prompt is the program," we can build more adaptable, intention-based integrations that focus on what needs to be done rather than how it's technically implemented.

Try it now →

The Problem with Traditional API Integration​

When engineers connect software systems, they're engaging in an unspoken contract: "I will send you data in exactly this format, and you will do exactly this with it." This rigid dependency creates brittle integrations that break when either side makes changes.

Traditional API integration looks like this:

// System A rigidly calls System B with exact parameters
const response = await fetch("https://api.system-b.com/v2/widgets", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer " + token,
  },
  body: JSON.stringify({
    name: widgetName,
    color: "#FF5733",
    dimensions: {
      height: 100,
      width: 50,
    },
    metadata: {
      created_by: userId,
      department: "engineering",
    },
  }),
});

// If System B changes ANY part of this structure, this code breaks

This approach has several fundamental problems:

  1. Version Lock: Systems get locked into specific API versions
  2. Brittle Dependencies: Small changes can cause catastrophic failures
  3. High Maintenance Burden: Keeping integrations working requires constant vigilance and updates
  4. Implementation Details Exposure: Systems need to know too much about each other

The Automobile Differential: A Mechanical Analogy​

To understand how MCP addresses these issues, let's examine a mechanical engineering breakthrough: the differential gear in automobiles.

Before the differential, cars had a serious problem. When turning a corner, the outer wheel needs to travel farther than the inner wheel. With both wheels rigidly connected to the same axle, this created enormous stress, causing wheels to slip, skid, and wear out prematurely.

The differential solved this by allowing wheels on the same axle to rotate at different speeds while still delivering power to both. It absorbed the differences between what each wheel needed.

MCP: The API Differential​

MCP functions as an API differential – it sits between systems and absorbs their differences, allowing them to effectively work together despite technical inconsistencies.

How does this work? Through intention-based instructions rather than implementation-specific calls.

Traditional API Calls vs. MCP Calls​

Traditional API:

// Extremely specific requirements
await client.createWidget({
  name: "Customer Dashboard",
  type: "analytics",
  visibility: "team",
  access_level: 3,
  resource_id: "res_8675309",
  parent_id: "dash_12345",
  metadata: {
    created_by: "user_abc123",
    department_id: 42,
  },
});

MCP Approach:

// High-level intention
"Create a new analytics widget called 'Customer Dashboard'
for the marketing team's main dashboard"

With the MCP approach, the client doesn't need to know all the specific parameters upfront. Instead, it engages in a discovery-based workflow:

How MCP Clients Resolve Required Parameters​

Tool Discovery, Dependency Analysis, Parameter Resolution​

When an MCP Client receives a high-level instruction like creating a dashboard widget, it follows a sophisticated process:

  1. Tool Discovery: The client first queries the MCP Server for available tools:
// MCP Client requests available tools
const toolList = await listTools();

// Server returns available tools with their descriptions
[
  {
    "name": "create_widget",
    "description": "Creates a new widget on a dashboard",
    "inputSchema": {
      "type": "object",
      "required": ["name", "dashboard_id", "widget_type"],
      "properties": {
        "name": {
          "type": "string",
          "description": "Display name for the widget",
        },
        "dashboard_id": {
          "type": "string",
          "description": "ID of the dashboard to add the widget to",
        },
        "widget_type": { ... },
        "team_id": { ... },
        // Other parameters...
      },
    },
  },
  {
    "name": "list_dashboards",
    "description": "List available dashboards, optionally filtered by team",
    "inputSchema": { ... },
  },
  {
    "name": "get_user_info",
    "description": "Get information about the current user or a specified user",
    "inputSchema": { ... },
  },
  {
    "name": "get_team_info",
    "description": "Get information about a team",
    "inputSchema": { ... },
  },
];
  2. Dependency Analysis: The LLM analyzes the instruction and the available tools, recognizing that to create a widget, it needs:

    • The dashboard ID for "marketing team's main dashboard"
    • Possibly the team ID for the "marketing team"
  3. Parameter Resolution: The LLM plans and executes a sequence of dependent calls:

(This is deterministic code for demonstration, but the model infers these steps on its own!)

// First, get the marketing team's ID
const teamResponse = await callTool("get_team_info", {
  team_name: "marketing",
});
// teamResponse = { team_id: "team_mktg123", department_id: 42, ... }

// Next, find the marketing team's main dashboard
const dashboardsResponse = await callTool("list_dashboards", {
  team_name: "marketing",
});
// dashboardsResponse = [
//   { id: "dash_12345", name: "Main Dashboard", is_primary: true, ... },
//   { id: "dash_67890", name: "Campaign Performance", ... }
// ]

// Filter for the main dashboard
const mainDashboard = dashboardsResponse.find((d) => d.is_primary) ||
  dashboardsResponse.find((d) => d.name.toLowerCase().includes("main"));

// Finally, create the widget with all required parameters
const widgetResponse = await callTool("create_widget", {
  name: "Customer Dashboard",
  dashboard_id: mainDashboard.id,
  widget_type: "analytics",
  team_id: teamResponse.team_id,
});
  4. Semantic Mapping: The MCP Server handles translating from the standardized tool parameters to the specific API requirements, which might involve:
    • Translating team_id to the internal department_id format
    • Setting the appropriate access_level based on team permissions
    • Generating a unique resource_id
    • Populating metadata based on contextual information

This approach is revolutionary because:

  1. Resilient to Changes: If the underlying API changes (e.g., requiring new parameters or renaming fields), only the MCP Server needs to update – the high-level client instruction stays the same
  2. Intent Preservation: The focus remains on what needs to be accomplished, not how
  3. Progressive Enhancement: New API capabilities can be leveraged without client changes
  4. Contextual Intelligence: The LLM can make smart decisions about which dashboard is the "main" one based on naming, flags, or other context

With MCP, the instruction doesn't change even when the underlying API changes dramatically. The MCP Server handles the translation from high-level intent to specific API requirements.
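
To make that translation concrete, here is a minimal sketch of what a create_widget handler inside an MCP Server might do. The lookupDepartmentId, defaultAccessLevelFor, and generateResourceId helpers and the legacyApi client are illustrative assumptions, not part of any real SDK:

// Hypothetical server-side handler: maps the semantic tool parameters onto the
// implementation-specific fields the underlying API actually requires.
async function handleCreateWidget({ name, dashboard_id, widget_type, team_id }) {
  // Resolve internal identifiers the legacy API expects (assumed helpers)
  const department_id = await lookupDepartmentId(team_id);
  const access_level = await defaultAccessLevelFor(team_id);

  return legacyApi.createWidget({
    name,
    type: widget_type,
    visibility: "team",
    access_level,
    resource_id: generateResourceId(), // server-generated, never supplied by the client
    parent_id: dashboard_id,
    metadata: { department_id },
  });
}

If the backend later renames parent_id or changes how access levels are assigned, only this handler changes; the client's high-level instruction stays exactly the same.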

Why This Matters for System Resilience​

In an MCP-enabled world, system upgrades, API changes, and even complete backend replacements can happen with minimal disruption to connected systems.

Consider a company migrating from Salesforce to HubSpot. Traditionally, this would require rewriting dozens or hundreds of integrations. With MCP:

  1. The high-level instructions ("Create a new contact for John Smith from Acme Corp") stay the same
  2. Only the MCP Server implementation changes to target the new system
  3. Existing systems continue functioning with minimal disruption
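
In other words, only the server-side handler differs between the two backends. As a minimal sketch (the salesforceClient and hubspotClient objects below are hypothetical wrappers, not the vendors' actual SDKs), the same semantic create_contact tool could be backed by either system:

// Semantic tool both servers expose: create_contact({ firstName, lastName, company })

// MCP Server implementation targeting Salesforce (hypothetical client wrapper)
async function createContactSalesforce({ firstName, lastName, company }) {
  const account = await salesforceClient.findOrCreateAccount(company);
  return salesforceClient.createContact({
    FirstName: firstName,
    LastName: lastName,
    AccountId: account.id,
  });
}

// MCP Server implementation targeting HubSpot (hypothetical client wrapper)
async function createContactHubSpot({ firstName, lastName, company }) {
  return hubspotClient.createContact({
    firstname: firstName,
    lastname: lastName,
    company: company,
  });
}

The instruction "Create a new contact for John Smith from Acme Corp" never changes; only which of these handlers sits behind the MCP Server does.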

The Technical Magic Behind MCP's Differential Capabilities​

MCP achieves this resilience through several key mechanisms:

1. Dynamic Tool Discovery​

Unlike traditional SDKs where available methods are fixed at compile/build time, MCP Clients discover tools dynamically at runtime. This means:

  • New capabilities can be added without client updates
  • Deprecated features can be gracefully removed
  • Feature flags can be applied at the server level
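
As a rough sketch of what runtime discovery buys you in practice (reusing the illustrative listTools and callTool helpers from earlier; the create_widgets_bulk tool here is hypothetical), a client can re-discover tools on every run and pick up new capabilities the moment a server exposes them:

// Widgets we want to create (resolved earlier in the conversation)
const widgets = [{ name: "Customer Dashboard", widget_type: "analytics", dashboard_id: "dash_12345" }];

// Tools are discovered at runtime, not fixed at build time
const tools = await listTools();

// If the server has started exposing a (hypothetical) bulk tool, use it;
// otherwise fall back to the original single-widget tool.
if (tools.some((t) => t.name === "create_widgets_bulk")) {
  await callTool("create_widgets_bulk", { widgets });
} else {
  for (const widget of widgets) {
    await callTool("create_widget", widget);
  }
}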

2. Semantic Parameter Mapping​

Semantic Parameter Mapping enables systems to communicate based on meaning rather than rigid structure. This approach drastically improves resilience and adaptability in integrations.

In traditional API integration, parameters are often implementation-specific and tightly coupled to the underlying data model. For example, a CRM API might require a customer record to be created with fields like cust_fname, cust_lname, and cust_type_id - names that reflect internal database schema rather than their semantic meaning.

With MCP's semantic parameter mapping, tools are defined with parameters that reflect their conceptual purpose, not their technical implementation:

Traditional API Parameters:

{
  "cust_fname": "Jane",
  "cust_lname": "Smith",
  "cust_type_id": 3,
  "cust_status_cd": "A",
  "addr_line1": "123 Main St",
  "addr_city": "Springfield",
  "addr_state_cd": "IL",
  "addr_zip": "62701",
  "cust_src_id": 7,
  "rep_emp_id": "EMP82736"
}

MCP Tool Description with Semantic Parameters:

{
  "name": "create_customer",
  "description": "Create a new customer record in the CRM system",
  "inputSchema": {
    "type": "object",
    "properties": {
      "firstName": {
        "type": "string",
        "description": "Customer's first or given name"
      },
      "lastName": {
        "type": "string",
        "description": "Customer's last or family name"
      },
      "customerType": {
        "type": "string",
        "enum": ["individual", "business", "government", "non-profit"],
        "description": "The category of customer being created"
      },
      "address": {
        "type": "object",
        "description": "Customer's primary address",
        "properties": {
          "street": {
            "type": "string",
            "description": "Street address including number and name"
          },
          "city": {
            "type": "string",
            "description": "City name"
          },
          "state": {
            "type": "string",
            "description": "State, province, or region"
          },
          "postalCode": {
            "type": "string",
            "description": "ZIP or postal code"
          },
          "country": {
            "type": "string",
            "description": "Country name",
            "default": "United States"
          }
        }
      },
      "source": {
        "type": "string",
        "description": "How the customer was acquired (e.g., 'website', 'referral', 'trade show')"
      },
      "assignedRepresentative": {
        "type": "string",
        "description": "Name or identifier of the sales representative assigned to this customer",
        "required": false
      }
    },
    "required": ["firstName", "lastName", "customerType"]
  }
}

The key differences in the semantic approach:

  1. Human-readable parameter names: Using firstName instead of cust_fname makes the parameters self-descriptive

  2. Hierarchical organization: Related parameters like address fields are nested in a logical structure

  3. Descriptive enumerations: Instead of opaque codes (like cust_type_id: 3), semantically meaningful values like "business" are used

  4. Clear descriptions: Each parameter includes a description of its purpose rather than just its data type

  5. Meaningful defaults: When appropriate, semantic defaults can be provided

This semantic approach provides tremendous advantages:

Future-proofing​

If the underlying CRM changes its internal codes or structure, only the mapping function in the MCP Server needs to be updated

Interoperability​

Multiple different CRM systems could implement MCP Servers with the same semantic parameters, allowing seamless switching between backends

Conceptual clarity​

People and AI systems can more easily understand the parameters and their purpose

Field validation​

Semantic validation becomes possible (e.g., ensuring state names are valid) rather than just type checking

Extensibility​

New parameters can be added to the semantic schema without breaking existing integrations

When combined with intent-based execution, semantic parameter mapping creates a powerful abstraction layer that shields systems from the implementation details of their integration partners, making the entire ecosystem more adaptable and resilient to change.
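
As a purely illustrative sketch, the mapping function inside one CRM's MCP Server might translate the semantic create_customer parameters into the legacy field names shown earlier. The code table and the lookupSourceId helper are assumptions about one possible backend, not a real API:

// Hypothetical translation from semantic parameters to the legacy CRM payload
const CUSTOMER_TYPE_CODES = { individual: 1, business: 3, government: 4, "non-profit": 5 };

function toLegacyCustomerPayload(params) {
  return {
    cust_fname: params.firstName,
    cust_lname: params.lastName,
    cust_type_id: CUSTOMER_TYPE_CODES[params.customerType],
    cust_status_cd: "A",                        // assume new customers start as Active
    addr_line1: params.address?.street,
    addr_city: params.address?.city,
    addr_state_cd: params.address?.state,
    addr_zip: params.address?.postalCode,
    cust_src_id: lookupSourceId(params.source), // assumed helper: source name -> numeric code
    rep_emp_id: params.assignedRepresentative,
  };
}

If the CRM changes its internal codes or column names, only this table and function are touched; every MCP Client keeps sending the same semantic parameters.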

3. Intent-Based Execution​

Intent-based execution is perhaps the most transformative aspect of MCP. Let me walk you through a detailed example that illustrates how this works in practice.

Imagine a scenario where a business wants to "send a quarterly performance report to all department heads." This seemingly simple task involves multiple steps and systems in a traditional integration context:

Traditional Integration Approach:

  1. Query the HR system to identify department heads
  2. Access the financial system to gather quarterly performance data
  3. Generate a PDF report using a reporting engine
  4. Connect to the email system to send personalized emails with attachments
  5. Log the communication in the CRM system

Each of these steps would require detailed knowledge of the respective APIs, authentication methods, data formats, and error handling. If any system changes its API, the entire integration could break.

With MCP's Intent-Based Execution:

The MCP Client (like Tasks) might simply receive the instruction:

"Send our Q1 2024 performance report to all department heads. Include YoY comparisons and highlight areas exceeding targets by more than 10%."

Behind the scenes, the MCP Client would:

  1. Recognize the high-level intent and determine this requires multiple tool calls
  2. Query the MCP Server for available tools related to reporting, employee data, and communications
  3. Based on the tool descriptions, construct a workflow:

(Again, not executed code, but to illustrate the inferred logic the LLM runs!)

// In step 2 listed above, the MCP Client has all the
// static identifiers and parameters from the available
// tool descriptions to be used in this code

// First, identify who the department heads are
const departmentHeads = await callTool("get_employees", {
  filters: { position_type: "department_head", status: "active" },
});

// Get financial performance data for Q1 2024
const financialData = await callTool("get_financial_report", {
  period: "q1_2024",
  metrics: ["revenue", "expenses", "profit_margin", "growth"],
  comparisons: ["year_over_year"],
});

// The LLM analyzes the data to identify high-performing areas
const highlights = financialData.metrics.filter(
  (metric) => metric.year_over_year_change > 10,
);

// Generate a report with the appropriate formatting and emphasis
const report = await callTool("create_report", {
  title: "Q1 2024 Performance Report",
  data: financialData,
  highlights: highlights,
  format: "pdf",
  template: "quarterly_executive",
});

// Send the report to each department head with a personalized message
for (const head of departmentHeads) {
  await callTool("send_email", {
    recipient: head.email,
    subject: "Q1 2024 Performance Report",
    body: `Dear ${head.name},\n\nPlease find attached our Q1 2024 performance report. Your department ${head.department} showed ${
      highlights.some((h) => h.department === head.department)
        ? "exceptional performance in some areas"
        : "consistent results"
    }.\n\nRegards,\nExecutive Team`,
    attachments: [report.file_id],
    log_to_crm: true,
  });
}

The crucial difference is that the MCP Server for each system is responsible for translating these semantic, intent-based calls into whatever specific API calls its system requires.

For example, the HR system's MCP Server might translate get_employees with a position or role filter into a complex SQL query or LDAP search, while the reporting system's MCP Server might convert create_report into a series of API calls to a business intelligence platform.

If any of these backend systems change:

  • The HR system might switch from an on-premise solution to Workday
  • The financial system might upgrade to a new version with a completely different API
  • The reporting engine might be replaced with a different vendor
  • The email system might move from Exchange to Gmail

None of these changes would affect the high-level intent-based instruction. Only the corresponding MCP Servers would need to be updated to translate the same semantic calls into the new underlying system's language.

This is the true power of intent-based execution with MCP - it decouples what you want to accomplish from the technical details of how to accomplish it, creating resilient integrations that can withstand significant changes in the underlying technology landscape.

Building Resilient Systems with MCP​

To leverage MCP as a differential in your own systems:

  1. Focus on intent over implementation: Design your tools around what they accomplish, not how
  2. Embrace semantic parameters: Name and structure parameters based on their meaning, not your current implementation
  3. Build for adaptation: Assume underlying APIs will change and design your MCP Servers to absorb these changes

The Future of System Integration​

As we move toward a world where "the prompt is the program," traditional rigid API contracts will increasingly be replaced by intent-based interfaces. MCP provides a standardized protocol for this transition.

The implications are profound:

  • Reduced integration maintenance: Systems connected via MCP require less ongoing maintenance
  • Faster adoption of new technologies: Backend systems can be replaced without disrupting front-end experiences
  • Greater composability: Systems can be combined in ways their original designers never anticipated
  • Longer component lifespan: Software components can remain useful far longer despite ecosystem changes

The differential revolutionized transportation by solving a mechanical impedance mismatch. MCP is poised to do the same for software integration by solving the API impedance mismatch that has plagued systems for decades.

The future of integration isn't more rigid contracts – it's more flexible, intent-based communication between systems that can adapt as technology evolves.

Get Started with MCP​

Ready to build more resilient integrations? Here's how to start:

  1. Explore the MCP specification at mcp.run
  2. Install mcpx, or generate an SSE URL from your Profile to start using MCP tools immediately in any MCP Client
  3. Consider which of your systems would benefit from an MCP interface
  4. Join our community to learn MCP best practices

MCP March Madness

· 27 min read
Steve Manuel
CEO, Co-founder @ Dylibso


Announcing the first MCP March Madness Tournament!​

This month, we're hosting a face-off like you've never seen before. If "AI Athletes" wasn't on your 2025 bingo card, don't worry... it's not quite like that.

We're putting the best of today's API-driven companies head-to-head in a matchup to see how their MCP servers perform when tool-equipped Large Language Models (LLMs) are tasked with a series of challenges.

How It Works​

Each week in March, we will showcase two competing companies in a test to see how well AI can use their product through API interactions.

This will involve a Task that describes specific work requiring the use of the product via its API.

For example:

"Create a document titled 'World Populations' and add a table containing the world's largest countries including their name, populations, and ranked by population size. Provide the URL to this new document in your response."

In this example matchup, we'd run this exact same Task twice - first using Notion tools, and in another run, Google Docs. We'll compare the outputs, side-effects, and the time spent executing. Using models from Anthropic and OpenAI, we'll record each run so you can verify our findings!

We're keeping these tasks fairly simple to test the model's accuracy when calling a small set of tools. But don't worry - things will get spicier in the Grand Finale!

Additionally, we're releasing our first evaluation framework for MCP-based tool calling, which we'll use to run these tests. We really want to exercise the tools and prompts as fairly as possible, and we'll publish all evaluation data for complete transparency.

Selecting the MCPs​

As a prerequisite, we're using MCPs available on our public registry at www.mcp.run. This means they may not have been created directly by the API providers themselves. Anyone is able to publish MCP servlets (WebAssembly-based, secure & portable MCP Servers) to the registry. However, to ensure these matchups are as fair as possible, we're using servlets that have been generated from the official OpenAPI specifications provided by each company.

MCP Servers [...] generated from the official OpenAPI specification provided by each company.

Wait, did I read that right?

Yes, you read that right! We'll be making this generator available later this month... so sign up and follow along for that announcement.

So, this isn't just a test of how well AI can use an API - it's also a test of how comprehensive and well-designed a platform's API specification is!

NEW: Our Tool Use Eval Framework​

As mentioned above, we're excited to share our new evaluation framework for LLM tool calling!

mcpx-eval is a framework for evaluating LLM tool calling using mcp.run tools. The primary focus is to compare the results of open-ended prompts, such as mcp.run Tasks. We're thrilled to provide this resource to help users make better-informed decisions when selecting LLMs to pair with mcp.run tools.

If you're interested in this kind of technology, check out the repository on GitHub, and read our full announcement for an in-depth look at the framework.

The Grand Finale​

After 3 rounds of 1-vs-1 MCP face-offs, we're taking things up a notch. We'll put the winning MCP Servers on one team and the challengers on another. We'll create a new Task with more sophisticated work that requires using all 3 platforms' APIs to complete a complex challenge.

May the best MCP win!


2025 Schedule​

Round 1: Supabase vs. Neon​

📅 March 5, 2025


We put two of the best Postgres platforms head-to-head to kick off MCP March Madness! Watch the matchup in real time as each platform executes a Task that ensures a project is ready to use, then generates and runs the SQL needed to set up a database for our NewsletterOS application.

Here's the prompt:

Pre-requisite:
- a database and or project to use inside my account
- name: newsletterOS

Create the tables necessary to act as the primary transactional database for a Newsletter Management System, where its many publishers manage the creation of newsletters and the subscribers to each newsletter.

I expect to be able to work with tables of information including data on:
- publisher
- subscribers
- newsletters
- subscriptions (mapping subscribers to newsletters)
- newsletter_release (contents of newsletter, etc)
- activity (maps publisher to an enum of activity types & JSON)

Execute the necessary queries to set my database up with this schema.

In each Task, we attach the Supabase and Neon mcp.run servlets to the prompt, giving our Task access to manage those respective accounts on our behalf via their APIs.

See how Supabase handles our Task as Claude Sonnet 3.5 uses MCP server tools:

Next, see how Neon handles the same Task, leveraging Claude Sonnet 3.5 and the Neon MCP server tools we generated from their OpenAPI spec.

The results are in... and the winner is...

πŸ† Supabase πŸ†

Summary​

Unfortunately, Neon was unable to complete the Task as-is using only the functionality exposed via its official OpenAPI spec. But Neon can (and hopefully will!) make it possible for an OpenAPI consumer to run SQL this way. As noted in the video, their hand-written MCP Server does support this. We'd love to make the mcp.run servlet just as feature-rich, so any Agent or AI app in any language or framework (even running on mobile devices!) can work just as seamlessly.

Eval Results​

In addition to the Tasks we ran, we also executed this prompt with our own eval framework mcpx-eval. We configure this eval using the following, and when it runs we provide the profile where the framework can load and call the right tools:

name = "neon-vs-supabase"
max-tool-calls = 100

prompt = """
Pre-requisite:
- a database and or project to use inside my account
- name: newsletterOS

Create the tables and any seed data necessary to act as the primary transactional database for a Newsletter Management System, where its many publishers manage the creation of newsletters and the subscribers to each newsletter.

Using tools, please create tables for:
- publisher
- subscribers
- newsletters
- subscriptions (mapping subscribers to newsletters)
- newsletter_release (contents of newsletter, etc)
- activity (maps publisher to an enum of activity types & JSON)


Execute the necessary commands to set my database up for this.

When all the tables are created, output the queries to describe the database.
"""

check="""
Use tools and the output of the LLM to check that the tables described in the <prompt> have been created.

When selecting tools you should never in any case use the search tool.
"""

ignore-tools = [
"v1_create_a_sso_provider",
"v1_update_a_sso_provider",
]

Supabase vs. Neon (with OpenAI GPT-4o)

Supabase mcpx-eval: OpenAI GPT-4o


Neon (left) outperforms Supabase (right) on the accuracy dimension by a few points - likely due to better OpenAPI spec descriptions, and potentially more specific endpoints. These materialize as tool calls and tool descriptions, which are provided as context to the inference run, and make a big difference.

In all, both platforms did great, and if Neon adds query execution support via OpenAPI, we'd be very excited to put it to use.

Next up!​

Stick around for another match-up next week... details below ⬇️


Round 2: Resend vs. Loops​

📅 March 12, 2025


Everybody loves email, right? Today we're comparing some popular email platforms to see how well their APIs are designed for AI usage. Can an Agent or AI app successfully carry out our task? Let's see!

Here's the prompt:

Unless it already exists, create a new audience for my new newsletter called: "{{ audience }}"

Once it is created, add a test contact: "{{ name }} {{ email }}".

Then, send that contact a test email with some well-designed email-optimized HTML that you generate. Make the content and the design/theme relevant based on the name of the newsletter for which you created the audience.

Notice how we have parameterized this prompt with replacement parameters! This allows mcp.run Tasks to be dynamically updated with values - especially helpful when triggering them from an HTTP call or Webhook.

In each Task, we attach the Resend and Loops mcp.run servlets to the prompt, giving our Task access to manage those respective accounts on our behalf via their APIs.

See how Resend handles our Task using its MCP server tools:

Next, see how Loops handles the same Task, leveraging the Loops MCP server tools we generated from their OpenAPI spec.

The results are in... and the winner is...

πŸ† Resend πŸ†

Summary​

Similar to Neon in Round 1, Loops was unable to complete the Task as-is using only the functionality exposed via its official OpenAPI spec. Hopefully Loops will add the missing API surface area so an AI application or Agent can send transactional email along with a new template on the fly.

Resend was clearly designed to be extremely flexible, and the model was able to figure out exactly what it needed to do in order to perfectly complete our Task.

Eval Results​

In addition to the Tasks we ran, we also executed this prompt with our own eval framework mcpx-eval. We configure this eval using the following, and when it runs we provide the profile where the framework can load and call the right tools:

name = "loops-vs-resend"

prompt = """
Unless it already exists, create a new audience for my new newsletter called: "cat-facts"
Once it is created, add a test contact: "Zach [email protected]"

Then, send that contact a test email with some well-designed email-optimized HTML that you generate. Make the content and the design/theme relevant based on the name of the newsletter for which you created the audience.
"""

check="""
Use tools to check that the audience exists and that the email was sent correctly
"""

Resend vs. Loops (with OpenAI GPT-4o)

Resend vs. Loops mcpx-eval: Claude Sonnet 3.7


Resend (left) outperforms Loops (right) across the board - partly because Loops is missing functionality needed to complete the task, but also likely because Resend's OpenAPI spec is extremely comprehensive and includes very rich descriptions and detail.

Remember, all of this makes its way into the context of the inference request, and influences how the model decides to respond with a tool request. The better your descriptions, the more accurately the model will use your tool!

Next up!​

Stick around for another match-up next week... details below ⬇️


📅 March 19, 2025

Round 3: Perplexity vs. Brave Search

If you're looking for something like DeepResearch, without the "PhD-level reasoning" or the price tag that goes along with it, then this is the round for you!

Perplexity is a household name, and packs a punch for sourcing relevant and recent information on any subject. Through its Sonar API, we can programmatically make our way through the web. Brave exposes its powerful, more traditional search engine via API. Which one can deliver the best results for us when asked to find recent news and information about a given topic?

Here's the prompt:

We need to find the latest, most interesting and important news for people who have subscribed to our "{{ topic }}" newsletter. 

To do this, search the web for news and information about {{ topic }}, do many searches for newly encountered & highly related terms, associated people, and other related insights that would be interesting to our subscribers.

These subscribers are very aware of {{ topic }} space and what is happening, so when we find a good source on the web, also add some intelligent & researched prose around the article or content. Limit this to just a sentence or two, and include it in the output you provide.

Output all of the links you find on the web and your expert additional prose in a Markdown format so it can be read and approved by a 3rd party.

Notice how we have parameterized this prompt with replacement parameters! This allows mcp.run Tasks to be dynamically updated with values - especially helpful when triggering them from an HTTP call or Webhook.

In each Task, we attach the Perplexity and Brave Search mcp.run servlets to the prompt, giving our Task access to manage those respective accounts on our behalf via their APIs.

This round, we've combined the Task runs into a single video. Check them out:


Content Output​

Here's the full, rendered output from each of the Tasks run in the video. What do you think, which did a better job finding us results for "AI Agent" news to include in a newsletter?

AI Agents Newsletter: Latest Developments - March 2025

NVIDIA's Game-Changing AI Agent Infrastructure

NVIDIA AI-Q Blueprint and AgentIQ Toolkit

NVIDIA unveiled AI-Q, a comprehensive Blueprint for developing agentic systems that's reshaping how enterprises build AI agents. The framework integrates NVIDIA's accelerated computing with partner storage platforms and software tools.

The AI-Q Blueprint represents NVIDIA's strategic move to dominate the enterprise AI agent infrastructure market, positioning them as the essential foundation for companies building sophisticated agent systems.

Llama Nemotron Model Family

NVIDIA launched the Llama Nemotron family of open reasoning AI models designed specifically for agent development. Available in three sizes (Nano: 8B, Super: 49B, and Ultra: 253B parameters), these models offer advanced reasoning capabilities with up to 20% improved accuracy over base Llama models.

These models are particularly significant as they offer hybrid reasoning capabilities that let developers toggle reasoning on/off to optimize token usage and costsβ€”a critical feature for enterprise deployment that could accelerate adoption.

Enterprise AI Agent Adoption

Industry Implementation Examples

  • Yum Brands is deploying voice ordering AI agents in restaurants, with plans to roll out to 500 locations this year.
  • Visa is using AI agents to streamline cybersecurity operations and automate phishing email analysis.
  • Rolls-Royce has implemented AI agents to assist service desk workers and streamline operations.

While these implementations show promising use cases, the ROI metrics remain mixedβ€”only about a third of C-suite leaders report substantial ROI in areas like employee productivity (36%) and cost reduction, suggesting we're still in early stages of effective deployment.

Zoom's AI Companion Enhancements

Zoom introduced new agentic AI capabilities for its AI Companion, including calendar management, clip generation, and advanced document creation. A custom AI Companion add-on is launching in April at $12/user/month.

Zoom's approach of integrating AI agents directly into existing workflows rather than as standalone tools could be the key to avoiding the "productivity leak" problem, where 72% of time saved by AI doesn't convert to additional throughput.

Developer Tools and Frameworks

OpenAI Agents SDK

OpenAI released a new set of tools specifically designed for building AI agents, including a new Responses API that combines chat capabilities with tool use, built-in tools for web search, file search, and computer use, and an open-source Agents SDK for orchestrating single-agent and multi-agent workflows.

This release significantly lowers the barrier to entry for developers building sophisticated agent systems and could accelerate the proliferation of specialized AI agents across industries.

Eclipse Foundation Theia AI

The Eclipse Foundation announced two new open-source AI development tools: Theia AI (an open framework for integrating LLMs into custom tools and IDEs) and an AI-powered Theia IDE built on Theia AI.

As an open-source alternative to proprietary development environments, Theia AI could become the foundation for a new generation of community-driven AI agent development tools.

Research Breakthroughs

Multi-Agent Systems

Recent research has focused on improving inter-agent communication and cooperation, particularly in autonomous driving systems using LLMs. The development of scalable multi-agent frameworks like Nexus aims to make MAS development more accessible and efficient.

The shift toward multi-agent systems represents a fundamental evolution in AI agent architecture, moving from single-purpose tools to collaborative systems that can tackle complex, multi-step problems.

SYMBIOSIS Framework

Cabrera et al. (2025) introduced the SYMBIOSIS framework, which combines systems thinking with AI to bridge epistemic gaps and enable AI systems to reason about complex adaptive systems in socio-technical contexts.

This framework addresses one of the most significant limitations of current AI agentsβ€”their inability to understand and navigate complex social systemsβ€”and could lead to more contextually aware and socially intelligent agents.

Ethical and Regulatory Developments

EU AI Act Implementation

The EU AI Act, expected to be fully implemented by 2025, introduces a risk-based approach to regulating AI with stricter requirements for high-risk applications, including mandatory risk assessments, human oversight mechanisms, and transparency requirements.

As the first comprehensive AI regulation globally, the EU AI Act will likely set the standard for AI agent governance worldwide, potentially creating compliance challenges for companies operating across borders.

Industry Standards Emerging

Organizations like ISO/IEC JTC 1/SC 42 and the NIST AI Risk Management Framework are developing guidelines for AI governance, including specific considerations for autonomous agents.

These standards will be crucial for establishing common practices around AI agent development and deployment, potentially reducing fragmentation in approaches to AI safety and ethics.

This newsletter provides a snapshot of the rapidly evolving AI Agent landscape. As always, we welcome your feedback and suggestions for future topics.


This is a close one...​

As noted in the recording, both Perplexity and Brave Search servlets did a great job. It's difficult to say who wins off vibes alone... so let's use some 🧪 science! Leveraging mcpx-eval to help us decide removes the subjective component of declaring a winner.

Eval Results​

Here's the configuration for the eval we ran:

name = "perplexity-vs-brave"

prompt = """
We need to find the latest, most interesting and important news for people who have subscribed to our AI newsletter.

To do this, search the web for news and information about AI, do many searches for newly encountered & highly related terms, associated people, and other related insights that would be interesting to our subscribers.

These subscribers are very aware of AI space and what is happening, so when we find a good source on the web, also add some intelligent & researched prose around the article or content. Limit this to just a sentence or two, and include it in the output you provide.

Output all of the links you find on the web and your expert additional prose in a Markdown format so it can be read and approved by a 3rd party.

Only use tools available to you, do not use the mcp.run search tool
"""

check="""
Searches should be performed to collect information about AI, the result should be a well formatted and easily understood markdown document
"""

expected-tools = [
"brave-web-search",
"brave-image-search",
"perplexity-chat"
]

Perplexity vs. Brave (with Claude Sonnet 3.7)

Perplexity vs. Brave mcpx-eval: Claude Sonnet 3.7


Perplexity (left) outperforms Brave (right) on practically all dimensions, except that it does end up hallucinating every once in a while. This is a hard one to judge, but if we only look at the data, the results are in Perplexity's favor. We want to highlight that Brave did an incredible job here though, and for most search tasks, we would highly recommend it.

The results are in... and the winner is...

πŸ† Perplexity πŸ†

Next up!​

We're on the road to the Grand Finale...

Originally, the Grand Finale was going to combine the top 3 MCP servlets and put them against the bottom 3. However, since Neon and Loops were unable to complete their tasks, we figured we'd do something a little more interesting.

Tune in next week to see a "headless application" at work. Combining Supabase, Resend and Perplexity MCPs to collectively carry out a sophisticated task.

Can we really put AI to work?

We'll find out next week!

...details below ⬇️


Grand Finale: 3 vs. 3 Showdown​

📅 March 26, 2025

Without further ado, let's get right into it! The grand finale combines Supabase, Resend, and Perplexity into a mega MCP Task that effectively produces an entire application running a newsletter management system, "newsletterOS".

See what we're able to accomplish with just a single prompt, using our powerful automation platform and the epic MCP tools attached:

Here's the prompt we used to run this Task:

Prerequisites: 
- "newsletterOS" project and database

Create a new newsletter called {{ topic }} in my database. Also create an audience in Resend for {{ topic }}, and add a test subscriber contact to this audience:

name: Joe Smith
email: [email protected]

In the database, add this same contact as a subscriber to the {{ topic }} newsletter.

Now, find the latest, most interesting and important news for people who have subscribed to our {{ topic }} newsletter.

To do this, search the web for news and information about {{ topic }}, do many searches for newly encountered & highly related terms, associated people, and other related insights that would be interesting to our subscribers. Use 3 - 5 news items to include in the newsletter.

These subscribers are very aware of {{ topic }} space and what is happening, so when we find a good source on the web, also add some intelligent & researched prose around the article or content. Limit this to just a sentence or two, and include it in the output you provide.

Convert your output to email-optimized HTML so it can be rendered in common email clients. Then store this HTML into the newsletter release in the database. Send a test email from "[email protected]" using the same contents to our test subscriber contact for verification.

Keep in touch​

We'll be putting out more content and demos, so please follow along for more announcements:

Follow @dylibso on X

Subscribe to @dylibso on YouTube

Thanks!


How to Follow Along​

We'll publish a new post on this blog for each round and update this page with the results. To stay updated on all announcements, follow @dylibso on X.

We'll record and upload all matchups to our YouTube channel, so subscribe to watch the competitions as they go live.

Want to Participate?​

Interested in how your own API might perform? Or curious about running similar evaluations on your own tools? Contact us to learn more about our evaluation framework and how you can get involved!


It should be painfully obvious, but we are in no way affiliated with the NCAA. Good luck to the college athletes in their completely separate, unrelated tournament this month!

What is an AI Runtime?

· 6 min read
Steve Manuel
CEO, Co-founder @ Dylibso

Understanding AI runtimes is easier if we first understand traditional programming runtimes. Let's take a quick tour through what a typical programming language runtime is composed of.

Traditional Runtimes: More Than Just Execution​

Think about Node.js. When you write JavaScript code, you're not just writing pure computation - you're usually building something that needs to interact with the real world. Node.js provides this bridge between your code and the system through its runtime environment.

// Node.js example
const http = require("http");
const fs = require("fs");

// Your code can now talk to the network and filesystem
const server = http.createServer((req, res) => {
  fs.readFile("index.html", (err, data) => {
    res.end(data);
  });
});

The magic here isn't in the JavaScript language itself - it's in the runtime's standard library. Node.js provides modules like http, fs, crypto, and process that let your code interact with the outside world. Without these, JavaScript would be limited to pure computation like math and string manipulation.

A standard library is what makes a programming language practically useful. Node.js is not powerful just because of its syntax - it's powerful because of its libraries.

Enter the World of LLMs: Pure Computation Needs Tools​

Now, let's map this to Large Language Models (LLMs). An LLM by itself is like JavaScript without Node.js, or Python without its standard library. It can do amazing things with text and reasoning, but it can't:

  • Read files
  • Make network requests
  • Access databases
  • Perform calculations with guaranteed accuracy
  • Interact with APIs

This is where AI runtimes come in.

An AI runtime serves a similar purpose to Node.js or the Python interpreter, but instead of executing code, it:

  1. Takes prompts as its "program"
  2. Provides tools as its "standard library"
  3. Handles the complexity of:
    • Tool discovery and linking
    • Context management
    • Memory handling
    • Tool output parsing and injection

Here's a conceptual example:

"Analyze the sales data from our database"

(Assume we are using a tool to connect to Supabase or Neon or similar services)

The runtime needs to:

  1. Parse the prompt
  2. Understand which tools are needed
  3. Link those tools to the LLM's context
  4. Execute the prompt
  5. Handle tool calls and responses
  6. Manage the entire conversation flow

The Linking Problem​

Just as a C++ compiler needs to link object files and shared libraries, an AI runtime needs to solve a similar problem: how to connect LLM outputs to tool inputs, and tool outputs back to the LLM's context.

This involves:

  • Function calling conventions (how does the LLM know how to use a tool?)
  • Input/output parsing
  • Error handling
  • Context management
  • Memory limitations
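
A deliberately simplified sketch of that loop, assuming a chat-style model client (model.chat here is a stand-in, not a real API) and the same illustrative callTool helper used throughout this post, looks something like this:

// Minimal conceptual runtime loop: send the prompt, execute any tool call the
// model requests, inject the result back into context, and repeat until done.
async function runPrompt(prompt, tools) {
  const context = [{ role: "user", content: prompt }];

  while (true) {
    const reply = await model.chat({ messages: context, tools }); // assumed model client

    if (!reply.toolCall) {
      return reply.text; // no more tools needed - the task is complete
    }

    // The "linking" step: LLM output becomes tool input...
    const result = await callTool(reply.toolCall.name, reply.toolCall.arguments);

    // ...and tool output becomes new LLM context.
    context.push({
      role: "tool",
      name: reply.toolCall.name,
      content: JSON.stringify(result),
    });
  }
}

A production runtime adds everything this sketch leaves out: error handling, context window management, memory, and limits on how many tool calls a single prompt may make.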

Why This Matters​

The rise of AI runtimes represents a pivotal shift in how we interact with AI technology. While the concept that "The Prompt is the Program" is powerful, the current landscape of AI development tools presents a significant barrier to entry. Let's break this down:

The Current State: Developer-Centric Tools​

Most existing AI infrastructure tools like LangChain, LlamaIndex, and similar frameworks are built primarily for software engineers. They require:

  • Python or JavaScript programming expertise
  • Understanding of software architecture
  • Knowledge of API integrations
  • Ability to manage development environments
  • Experience with version control and deployment

While these tools are powerful, they effectively lock out vast segments of potential users who could benefit from AI automation.

Democratizing AI: Beyond Engineering​

The real promise of AI runtimes lies in their potential to democratize AI tool usage across organizations. Consider these roles:

  • Business Process Operations (BPO)

    • Automating document processing
    • Streamlining customer service workflows
    • Managing data entry and validation
  • Legal Teams

    • Contract analysis automation
    • Compliance checking
    • Document review and summarization
  • Human Resources

    • Resume screening and categorization
    • Employee onboarding automation
    • Policy document analysis
  • Finance Departments

    • Automated report generation
    • Transaction categorization
    • Audit trail analysis
  • Marketing Teams

    • Content generation and optimization
    • Market research analysis
    • Campaign performance reporting

The Next Evolution: Universal AI Runtimes​

This is where platforms such as mcp.run's Tasks are breaking new ground. By providing a runtime environment that executes prompts and tools without requiring coding expertise, it makes AI integration accessible to everyone. Key advantages include:

  1. Natural Language Interface

    • Users can create automation using plain English prompts
    • No programming required
    • Intuitive tool selection and configuration
  2. Flexible Triggering

    • Manual execution through user interface
    • Webhook-based automation
    • Scheduled runs for recurring tasks
  3. Enterprise Integration

    • Connection to existing business tools
    • Secure data handling
    • Scalable execution

Real-World Applications​

Consider these practical examples:

# Marketing Analysis Task
"Every Monday at 9 AM, analyze our social media metrics,
compare them to last week's performance, and send a
summary to the #marketing channel"

Equipped with a "marketing" profile with Sprout Social and Slack tools installed, the runtime knows exactly when to execute these tools' functions, what inputs to pass, and how to use their outputs to carry out the task at hand.

# Sales Lead Router
"When a new contact submits our web form, analyze their company's website for deal sizing, and assign them to a rep based on this mapping:

small business: Zach S.
mid-market: Ben E.
enterprise: Steve M.

Then send a summary of the lead and the assignment to our #sales channel."

Similarly, equipped with a "sales" profile with web search and Slack tools installed, this prompt would automatically use the right tools at the right time.

The Future of Work​

This democratization of AI tools through universal runtimes is reshaping how organizations operate. When "The Prompt is the Program," everyone becomes capable of creating sophisticated automation workflows. This leads to:

  • Reduced technical barriers
  • Faster implementation of AI solutions
  • More efficient resource utilization
  • Increased innovation across departments
  • Better cross-functional collaboration

The true power of AI runtimes isn't just in executing prompts and linking tools - it's in making these capabilities accessible to everyone who can benefit from them, regardless of their technical background.

The Future​

Along with the AI runtime, we're already seeing progress on many related fronts:

  • Standardization of tool interfaces
  • Rich ecosystems of pre-built tools
  • Best practices for runtime architecture
  • Performance optimizations
  • Security considerations

Just as the JavaScript ecosystem exploded with Node.js, we're at the beginning of a similar revolution in AI tooling and infrastructure.


If this is interesting, check out our own AI runtime, Tasks.

Sign up and start building today!

Tasks: the AI Runtime for Prompts + Tools

· 3 min read
Steve Manuel
CEO, Co-founder @ Dylibso

Shortly after Anthropic launched the Model Context Protocol (MCP), we released mcp.run - a managed platform that makes it simple to host and install secure MCP Servers. The platform quickly gained traction, winning Anthropic's San Francisco MCP Hackathon and facilitating millions of tool downloads globally.

Since then, MCP has expanded well beyond Claude Desktop, finding its way into products like Sourcegraph, Cursor, Cline, Goose, and many others. While these implementations have proven valuable for developers, we wanted to make MCP accessible to everyone.

We asked ourselves: "How can we make MCP useful for all users, regardless of their technical background?"

Introducing Tasks​

TL;DR Watch Tasks in action:

Today, we're excited to announce Tasks - an AI Runtime that executes prompts and tools, allowing anyone to create intelligent operations and workflows that integrate seamlessly with their existing software.

The concept is simple: provide Tasks with a prompt and tools, and it creates a smart, hosted service that carries out your instructions using the tools you've selected.

What Makes Tasks Special?​

Tasks combine the intuitive interface of AI chat applications with the automation capabilities of AI agents through two key components: prompts and tools.

Think of Tasks as a bridge between your instructions (prompts) and your everyday applications. You can create powerful integrations and automations without writing code or managing complex infrastructure. Tasks can be triggered in three ways:

  • Manually via the Tasks UI
  • Through HTTP events (like webhooks or API requests)
  • On a schedule (recurring at intervals you choose)

Here are some practical examples of what Tasks can do:

  • Receive Webflow form submissions and automatically route them to appropriate Slack channels and log records in Notion.

  • Create a morning news digest that scans headlines at 7:30 AM PT, summarizes relevant articles, and emails your marketing team updates about your company and competitors.

  • Set up weekly project health checks that review GitHub issues and pull requests every Friday at 2 PM ET, identifying which projects are on track and which need attention, assessed against product requirements in Linear.

  • Automate recurring revenue reporting by pulling data from Recurly on the first of each month, analyzing subscription changes, saving the report to Google Docs, and sharing a link with your sales team.

While we've all benefited from conversational AI tools like Claude and ChatGPT, their potential extends far beyond simple chat interactions. The familiar prompt interface we use to get answers or generate content can now become the foundation for powerful, reusable programs.

Tasks democratize automation by allowing anyone to create sophisticated workflows and integrations using natural language prompts. Whether you're building complex agent platforms or streamlining your organization's systems, Tasks adapt to your needs.

We're already seeing users build with Tasks in innovative ways, from developing advanced agent platforms to creating next-generation system integration solutions. If you're interested in learning how Tasks can benefit your organization, please reach out - we're here to help.

Try it out!​

Sign up and try Tasks today. We can't wait to see what you'll build!

On Microsoft's AI Vision

· 3 min read
Steve Manuel
CEO, Co-founder @ Dylibso

Yesterday, Microsoft CEO Satya Nadella announced a major reorganization focused on AI platforms and tools, signaling the next phase of the AI revolution. Reading between the lines of Microsoft's announcement and comparing it to the emerging universal tools ecosystem, there are fascinating parallels that highlight why standardized, portable AI tools are critical for enterprise success.

The Platform Vision​

Nadella's memo emphasizes that we're witnessing an unprecedented transformation:

"Thirty years of change is being compressed into three years!"

He outlines a future where AI reshapes every layer of the application stack, requiring:

  • New UI/UX patterns
  • Runtimes to build and orchestrate agents
  • Reimagined management and observability layers

The Universal Tools Answer​

This is where universal tools and platforms like mcp.run become critical enablers of this vision. Let's break down how:

1. Standardization & Scalability​

The new AI ecosystem demands standardized ways for systems to interact. Universal tools provide:

  • Consistent interfaces for AI systems to access external services
  • Dynamic capability updates without infrastructure changes
  • Scalable architecture for adding new AI capabilities

2. Security & Enterprise Readiness​

As AI capabilities expand, security becomes paramount. Modern tool architectures offer:

  • Built-in security controls and permissions management
  • Sandboxed execution environments (like WebAssembly)
  • Enterprise-grade safety across business processes

3. Cross-Platform & Universal Access​

Microsoft emphasizes AI reshaping every application layer. Universal tools support this by:

  • Running anywhere WebAssembly is supported (browser, mobile, edge, server)
  • Ensuring consistent behavior across platforms
  • Enabling truly portable AI capabilities

4. Developer Experience & Tool Integration​

The future requires new ways of building and deploying AI capabilities:

  • Unified management systems for AI tools
  • Streamlined deployment and updates
  • Rapid development and integration of new capabilities

5. Future-Proofing​

As Nadella notes, we're seeing an unprecedented rate of change. Universal tools help by:

  • Ensuring portability as AI expands to new environments
  • Enabling discovery and updates without infrastructure rebuilds
  • Creating standardized patterns for AI tool development

Bridging Vision and Reality​

Microsoft's reorganization announcement highlights the massive transformation happening in enterprise software. The success of this transformation will depend on having the right tools and platforms to implement these grand visions.

Universal tools provide the practical foundation needed to:

  1. Safely adapt AI capabilities across different contexts
  2. Maintain security and control
  3. Enable rapid innovation and deployment
  4. Support cross-platform compatibility

The Path Forward​

As we enter what Nadella calls "the next innings of this AI platform shift," the role of universal tools becomes increasingly critical. They provide the standardized, secure, and portable layer needed to implement ambitious AI platform visions across different environments and use cases.

For enterprises looking to succeed in this AI transformation, investing in universal tools and standardized approaches isn't just good practiceβ€”it's becoming essential for success.

We're working with companies and agencies looking to enrich AI applications with tools - if you're considering how agents play a role in your infrastructure or business operations, don't hesitate to reach out!


This post analyzes the intersection of Microsoft's January 14, 2025 AI platform announcement with emerging universal tools patterns. For more information, see Microsoft's blog post and mcp.run's overview of universal tools.

Satya's App-ocalypse, Saved by MCP

· 5 min read
Steve Manuel
CEO, Co-founder @ Dylibso

"The notion that business applications exist, that's probably where they'll all collapse in the agent era."​

The embedded video starts at the App-ocalypse; go here for the full video.

If you haven't seen this interview yet, it's well worth watching. Bill and Brad have a knack for bringing out insightful perspectives from their guests, and this episode is no exception.

How does SaaS collapse?​

Satya refers to how most SaaS products are fundamentally composed of two elements: "business logic" and "data storage". To vastly oversimplify, most SaaS architectures look like this:

SaaS Architecture

Satya proposes that the upcoming wave of agents will not only eliminate the UI (designed for human use, click-ops style), but will also move the "CRUD" (Create - Read - Update - Delete) logic entirely to the LLM layer. This shifts the paradigm to agents communicating directly with databases or data services.

As someone who has built many systems that could be reduced to "SaaS", I believe we still have significant runway where the CRUD layer remains separate from the LLM layer. For instance, getting LLMs to reliably handle user authentication or consistently execute precise domain-specific workflows requires substantial context and specialization. While this logic will persist, certain layers will inevitably collapse.

The Collapse of the UI (as we know it)​

Satya's key insight is that many SaaS applications won't require human users for the majority of operations. This raises a crucial question: in a system without human users, what form does the user interface take?

This transformation represents a fundamental shift, one in which the Model Context Protocol (MCP) will play the biggest role in software since Docker.

The agent becomes the UI.​

However, this transition isn't automatic. We need a translation and management layer between the API/DB and the agent using the software. While REST APIs and GraphQL exist for agents to use, MCP addresses how these APIs are used. It also manages how local code and libraries are accessed (e.g., math calculations, data validation, regular expressions, and any code not called over the network).

MCP defines how intelligent machines interact with the CRUD layer. This approach preserves deterministic code execution rather than relying on probabilistic generative outputs for business logic.
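To make that concrete, here is a minimal sketch of what keeping business logic deterministic looks like from the agent's side: the model decides what to do, but the actual record update runs as ordinary code behind an MCP tool. This assumes the official TypeScript SDK (@modelcontextprotocol/sdk); the server command, tool name, and arguments are hypothetical, and exact method names may differ across SDK versions.

// Minimal sketch (assumes @modelcontextprotocol/sdk; server and tool names are hypothetical)
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Connect to a locally running MCP Server that owns the CRUD logic
const transport = new StdioClientTransport({ command: "my-crm-mcp-server" });
const client = new Client({ name: "crm-agent", version: "1.0.0" });
await client.connect(transport);

// The model derives the tool name and arguments from the user's intent;
// the update itself is deterministic code inside the server, not generated text.
const result = await client.callTool({
  name: "update_customer_record", // hypothetical tool
  arguments: { customerId: "cus_123", plan: "enterprise" },
});
console.log(result.content);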

Covering Your SaaS​

If you're running a SaaS company and haven't considered how agent-based usage will disrupt the next 1-3 years, here's where to start:

  • Reimagine your product as purely an API. What does this look like? Can every operation be performed programmatically?

  • Optimize for machine comprehension and access. MCP translates your SaaS app operations into standardized, machine-readable instructions. Publishing a Servlet offers the most straightforward path to achieve this.

  • Plan for exponential usage increases. Machines will interact with your product orders of magnitude faster than human users. How will your infrastructure handle this scale?

While the exact timeline remains uncertain, these preparations will position you for the inevitable shift in how SaaS (and software generally) is consumed in an agent-driven world. The challenge of scaling and designing effective machine-to-machine interfaces is exciting and will force us to think differently.

The Next Mode of Software Delivery​

There's significant advantage in preparing early for agent-based consumers. Just as presence in the Apple App Store during the late 2000s provided an adoption boost, we're approaching another such opportunity.

In a world where we're not delivering human-operated UIs, and APIs aren't solely for programmer integration, what are we delivering?

MCP-based Delivery​

If today's UI is designed for humans, and tomorrow's UI becomes the agent, the architecture evolves to this:

SaaS via MCP

MCP provides the essential translation layer for cross-system software integration. The protocol standardizes how AI-enabled applications or agents interact with any software system.

Your SaaS remains viable as long as it can interface with an MCP Client.​

Implementation requires developing your application as an MCP Server. While several approaches exist, developing and publishing an MCP Servlet offers the most efficient, secure, and portable solution.

As an MCP Server, you can respond to client queries that guide the use of your software. For instance, agents utilize "function calling" or "tool use" to interact with external code or APIs. MCP defines the client-server messages that list available tools. This tool list enables clients to make specific calls with well-defined input parameters derived from context or user prompts.

A Tool follows this structure:

{
  name: string;          // Unique identifier for the tool
  description?: string;  // Human-readable description
  inputSchema: {         // JSON Schema for the tool's parameters
    type: "object",
    properties: { ... }  // Tool-specific parameters
  }
}

For example, a Tool enabling agents to create GitHub issues might look like this:

{
  "name": "github_create_issue",
  "description": "Create a GitHub issue",
  "inputSchema": {
    "type": "object",
    "properties": {
      "title": { "type": "string" },
      "body": { "type": "string" },
      "labels": { "type": "array", "items": { "type": "string" } }
    }
  }
}

With this specification, AI-enabled applications or agents can programmatically construct tool-calling messages with the required title, body, and labels for the github_create_issue tool, submitting requests to the MCP Server-implemented GitHub interface.
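For illustration, the resulting tool-call request might look roughly like the following JSON-RPC message. The method and parameter names follow the MCP specification's tools/call shape; the id and argument values here are made up.

{
  "jsonrpc": "2.0",
  "id": 42,
  "method": "tools/call",
  "params": {
    "name": "github_create_issue",
    "arguments": {
      "title": "Dashboard widget fails to load",
      "body": "Steps to reproduce: open the marketing dashboard and ...",
      "labels": ["bug", "dashboard"]
    }
  }
}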

Prepare Now​

Hundreds of applications and systems are already implementing MCP delivery, showing promising adoption. While we have far to go, Satya isn't describing a distant futureβ€”this transformation is happening now.

Just as we "dockerized" applications for cloud migration, implementing MCP will preserve SaaS through the App-ocalypse.

The sooner, the better.


If you're interested in MCP and want to learn more about bringing your software to agent-based usage, please reach out. Alternatively, start now by implementing access to your SaaS/library/executable through publishing to mcp.run.

Universal Tools For AI

Β· 9 min read
Steve Manuel
CEO, Co-founder @ Dylibso
TL;DR

Announcing mcpx, the extensible MCP Server, and mcp.run, its "app store" and registry for servlets. Search, install & manage secure & portable tools for AI, wherever it goes - desktop, mobile, edge, server, etc.

Try it now β†’

A few weeks ago, Anthropic announced the Model Context Protocol (MCP). They describe it as:

[...] a new standard for connecting AI assistants to the systems where data lives, including content repositories, business tools, and development environments.

While this is an accurate depiction of its utility, I feel that it significantly undersells what is yet to come from MCP and its implementers.

In my view, what Docker (containers) did to the world of cloud computing, MCP will do to the world of AI-enabled systems.

Both Docker and MCP give machines a standard way to encapsulate code along with instructions about how to run it. Where the two clearly diverge (aside from one being a packaging technology and the other a protocol) is that AI applications are already finding their way into many more environments than those where containers are the optimal software package.

AI deployment diversity has already surpassed that of the cloud. MCP gives us a way to deliver and integrate our software with AI systems everywhere!

AI applications, agents, and everything in-between need deterministic execution in order to achieve enriched capabilities beyond probabilistic outputs from today's models. A programmer can empower a model with deterministic execution by creating tools and supplying them to the model.

If this concept is unfamiliar, please refer to Anthropic's overview in this guide.
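As a concrete example of "supplying tools to the model", here is roughly what tool use looks like with Anthropic's Messages API via the @anthropic-ai/sdk package. The get_invoice_total tool is hypothetical and the model name is a placeholder; this is a sketch of the pattern, not a prescribed implementation.

import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

// Describe a deterministic capability to the model; the model decides when to call it,
// but the actual computation happens in your code, not in the model.
const response = await anthropic.messages.create({
  model: "claude-3-5-sonnet-latest", // placeholder model name
  max_tokens: 1024,
  tools: [
    {
      name: "get_invoice_total", // hypothetical tool
      description: "Return the total amount for an invoice by ID",
      input_schema: {
        type: "object",
        properties: { invoiceId: { type: "string" } },
        required: ["invoiceId"],
      },
    },
  ],
  messages: [{ role: "user", content: "How much was invoice INV-1234?" }],
});

// If the model chose to use the tool, response.content will include a tool_use block
// with the arguments it derived from the prompt.
console.log(response.content);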

So, part of what we're announcing today is the concept of a portable Wasm servlet, an executable code artifact that is dynamically & securely installed into an MCP Server. These Wasm servlets intercept MCP Server calls and enable you to pack tons of new tools into a single MCP Server.

More on this below, but briefly: Wasm servlets are loaded into our extensible MCP Server: mcpx, and are managed by the corresponding registry and control plane: mcp.run.

If any of the predictions about the impact of AI on software prove true, it is reasonable to expect a multitude of software to implement at least one side of this new protocol.

MCP Adoption​

Since the announcement, developers around the world have created and implemented MCP servers and clients at an astonishing pace; from simple calculators and web scrapers, to full integrations for platforms like Cloudflare and Browserbase, and to data sources like Obsidian and Notion.

Anthropic certainly had its flagship product, Claude Desktop, in mind as an MCP Client implementation -- a beneficiary of these new capabilities connecting it to the outside world. But by opening up the protocol beyond their own interest, they have paved the way for many other MCP Client implementations to leverage the same Server implementations and share these incredible new capabilities.

So, whether you use Claude Desktop, Sourcegraph Cody, Continue.dev, or any other AI application implementing MCP, you can install an MCP Server and start working with these MCP tools from the comfort of a chat, IDE, etc.

Want to manage your Cloudflare Workers and Databases?

β†’ Install the Cloudflare MCP Server.

Want to automate browsing the web from Claude?

β†’ Install the Browserbase MCP Server.

Want to use your latest notes from your meetings and summarize a follow-up email?

β†’ Install the Obsidian MCP Server.

Exciting as this is, we hit a fairly low ceiling when every new tool is an additional full-fledged MCP Server to download and spawn.

The Problem​

Since every one of these MCP Servers is a standalone executable with complete system access on your precious local machine, security and resource management alarms should be sounding very loudly.

MCP Server-Client Architecture Issues

As the sprawl continues, every bit of code, every library, app, database, and API will have an MCP Server implementation. Standalone executables may work when n is small: you can keep track of what you've installed, update them, observe them, and review their code. But when n grows to something big -- 10, 50, 100, 1000 -- what happens then?

The appetite for these tools is only going to increase, and at some point soon, things are going to get messy.

The Solution​

Today we're excited to share two new pieces of this MCP puzzle: mcpx (the extensible, dynamically updatable MCP Server) and mcp.run (a corresponding control plane and registry) for MCP-compatible "servlets" (executable code that can be loaded into mcpx). Together, these provide a secure, portable means of tool use, leveraging MCP to remain open and broadly accessible. If you build your MCP Server as a "servlet" on mcp.run, it will be usable in the most contexts possible.

Dynamically Re-programmable MCP Server - mcpx

How?​

All mcpx servlets are actually WebAssembly modules under the hood. This means that they can run on any platform, on any operating system, processor, web browser, or device. How long will it be until the first MCP Client application is running on a mobile phone? At that point your native MCP Server implementation becomes far less useful.

MCP Servers over HTTP / SSE​

Can we call these tools via HTTP APIs? Yes, the protocol already specifies a transport to call MCP Servers over a network. But it's not implemented in Claude Desktop or any MCP Client I've come across. For now, you will likely be using the local transport, where both the MCP Client and Server are on your own device.

MCP.RUN "Serverless Mode"

Installed servlets are ready to be called over the network. Soon you'll be able to call any servlet using that transport in addition to downloading & executing it locally.

Portability​

One major differentiator and benefit to choosing mcp.run and targeting your MCP servers to mcpx servlets is portability. AI applications are going to live in every corner of the world, in all the systems we interact with today. The tools these apps make calls to must be able to run wherever they are needed - in many cases, fully local to the model or other core AI application.

If you're working on an MCP Server, ask yourself: can your current implementation easily run inside a database? In a browser? In a web app? In a Cloudflare Worker? On an IoT device? On a mobile phone? mcp.run servlets can!

We're not far from seeing models and AI applications run in all those places too.

By publishing MCP servlets, you are future-proofing your work and ensuring that wherever AI goes, your tools can too.

mcpx​

To solve the sprawling MCP Server problem, mcpx takes a different approach: it is a dynamic, re-programmable server. You install it once, and then via mcp.run you can install new tools without ever touching the client configuration again.
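To illustrate the "install once" idea: an MCP Client such as Claude Desktop is configured with a single server entry for mcpx, and everything else happens through mcp.run. The mcpServers structure below matches Claude Desktop's configuration format, but the command value is deliberately a placeholder; see the Quickstart for the actual install and launch commands.

{
  "mcpServers": {
    "mcpx": {
      "command": "<path-or-command-for-mcpx>",
      "args": []
    }
  }
}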

We're calling the tools/prompts/resources (as defined by the protocol) "servlets", which are managed and executed by mcpx. Any servlet installed to mcpx is immediately available to use by any MCP Client, and can even be discovered dynamically at runtime.

You can think of mcpx kind of like npm or pip, and mcp.run as the registry and control plane to manage your servlets.

mcp.run​

We all like to share, right? To share servlets, we need a place to keep them. mcp.run is a publishing destination for your MCP servlets. Today, your servlets are public, but soon we will have the ability to selectively assign access or keep them private at your discretion.

As mentioned above, currently servlets are installed locally and executed by a client on your machine. In the future, we plan on enabling servlets to run in more environments, such as expanding mcp.run to act as a serverless environment to remotely execute your tools and return the results over HTTP.

You may even be able to call them yourself, outside the context of an MCP Client as a webhook or general HTTP endpoint!

Early Access​

Last week, mcpx and mcp.run won Anthropic's MCP Hackathon in San Francisco! This signaled to our team that we should add the remaining polish and stability to take these components into production and share them with you.

Today, we're inviting everyone to join us. So, please head to the Quickstart page for instructions on how to install mcpx and start installing and publishing servlets!

Here's a quick list of things you should try out:

  • have Claude Desktop log you in to mcp.run
  • get Claude to search for a tool in the registry (it will realize it needs new tools on its own!)
  • install and configure a tool on mcp.run, then call it from Claude (no new MCP Server needed)
  • publish a servlet, compiling your library or app to WebAssembly (reach out if you need help!)

As things are still very early, we expect you to hit rough edges here and there. Things are pretty streamlined, but please reach out if you run into anything too weird. Your feedback (good and bad) is welcomed and appreciated.

We're very excited about how MCP is going to impact software integration, and want to make it as widely adopted and supported as possible -- if you're interested in implementing MCP and need help, please reach out.