Sampling
Deprecated — SEP-2577
Call your LLM provider's API directly from your server instead. Sampling is deprecated as of protocol version 2026-07-28 (SEP-2577) and stays functional on 2025-era connections for at least twelve months — see the deprecated features registry.
Replace sampling with a direct provider call
Sampling routes an LLM call through the connected client: a tool handler sends a prompt, the host runs it through a model it controls, and the handler resumes with the completion. The 2026-07-28 revision removes the server-to-client request channel that carries it.
Migrate by importing your LLM provider's SDK into the server and calling it from the tool handler with your own API key. The handler keeps its shape; the requestSampling call is the only line that changes, and you stop depending on what the client supports.
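A minimal sketch of that migration, assuming you wrap whichever provider SDK you install behind a small function. `CompleteFn`, `buildSummarizePrompt`, `makeSummarizeHandler`, and `stubProvider` are illustrative names, not SDK exports, and the stub stands in for the real provider call so the sketch runs without an API key.

```typescript
// Hypothetical wrapper type: adapt your provider SDK's completion call to this shape.
type CompleteFn = (prompt: string, maxTokens: number) => Promise<{ model: string; text: string }>;

type ToolResult = { content: Array<{ type: 'text'; text: string }> };

// Builds the same prompt the sampling-based handler sends.
function buildSummarizePrompt(text: string): string {
    return `Summarize in one sentence:\n\n${text}`;
}

// Returns a handler that calls the provider directly instead of requestSampling.
function makeSummarizeHandler(complete: CompleteFn) {
    return async ({ text }: { text: string }): Promise<ToolResult> => {
        const reply = await complete(buildSummarizePrompt(text), 500);
        return { content: [{ type: 'text', text: `Model (${reply.model}): ${reply.text}` }] };
    };
}

// Stand-in provider for local runs; replace the body with a call into your provider's SDK.
const stubProvider: CompleteFn = async (prompt, maxTokens) => ({
    model: 'stub-model',
    text: prompt.slice(0, maxTokens)
});
```

The function returned by `makeSummarizeHandler(stubProvider)` can be passed to `server.registerTool` in place of the sampling-based handler. Injecting the provider keeps the API key and SDK choice out of the handler body.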
Request a completion from the client
ctx.mcpReq.requestSampling sends a sampling/createMessage request to the connected client from inside a tool handler. The client runs the messages through its model and resolves the call with the completion.
server.registerTool(
    'summarize',
    {
        description: 'Summarize text using the client LLM',
        inputSchema: z.object({ text: z.string() })
    },
    async ({ text }, ctx) => {
        const response = await ctx.mcpReq.requestSampling({
            messages: [{ role: 'user', content: { type: 'text', text: `Summarize in one sentence:\n\n${text}` } }],
            maxTokens: 500
        });
        return { content: [{ type: 'text', text: `Model (${response.model}): ${JSON.stringify(response.content)}` }] };
    }
);
The handler blocks until the client answers, so your server never holds the key for the model that does the work; the host does.
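The handler above stringifies `response.content` as-is. A sketch of returning only the text instead, assuming the content is a single block tagged by `type`. `SamplingContent` and `textFromContent` are illustrative names, and the type is a simplified local stand-in rather than the SDK's own definition.

```typescript
// Simplified local stand-in for a sampling content block; the SDK's type may carry more fields.
type SamplingContent =
    | { type: 'text'; text: string }
    | { type: 'image' | 'audio'; data: string; mimeType: string };

// Returns the text of a text block, or a short placeholder for non-text blocks.
function textFromContent(content: SamplingContent): string {
    return content.type === 'text' ? content.text : `[${content.type} content, ${content.mimeType}]`;
}
```

In the handler, `textFromContent(response.content)` would take the place of `JSON.stringify(response.content)`, provided the SDK's content type is compatible with this simplified one.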
INFO
On a 2026-07-28 connection, requestSampling throws. The replacement on that revision is to return an embedded createMessage request from the handler; the input_required page covers that form. Era differences are listed in Protocol versions.
Read the model's reply
The response is a CreateMessageResult: the client decides which model fulfils the request and returns its name as model, plus the assistant role and one content block. The handler above folds it into its tool result, so calling summarize from a client whose model is named host-model returns:
[
    {
        type: 'text',
        text: 'Model (host-model): {"type":"text","text":"Sampling lets a tool ask the client for a completion."}'
    }
]
Require the sampling capability
requestSampling only works against a client that declared the sampling capability and registered a sampling/createMessage handler — Handle requests from the server covers that side.
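That dependency can also be checked by hand. A minimal sketch, assuming you already hold the client's declared capabilities as a plain object in which a `sampling` key is present when the client supports sampling. `DeclaredCapabilities` and `clientSupportsSampling` are illustrative names, and how to obtain the object from the SDK is not shown.

```typescript
// Simplified local stand-in for the capabilities a client declares during initialization.
type DeclaredCapabilities = { sampling?: object; [other: string]: unknown };

// True when the client declared the sampling capability (an object, possibly empty).
function clientSupportsSampling(capabilities: DeclaredCapabilities | undefined): boolean {
    return capabilities?.sampling !== undefined;
}
```

A handler could use a check like this to choose a non-sampling code path up front rather than catching a failure.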
Pass enforceStrictCapabilities: true to the McpServer constructor and the SDK checks the client's declared capabilities before it sends any server-initiated request. Against a client that never declared sampling, requestSampling then throws inside your handler, and the call comes back as an ordinary isError tool result:
{
    content: [
        {
            type: 'text',
            text: 'Client does not support sampling (required for sampling/createMessage)'
        }
    ],
    isError: true
}
Recap
- Sampling is deprecated (SEP-2577); the migration target is a direct LLM provider call from your server.
- ctx.mcpReq.requestSampling({ messages, maxTokens }) asks the connected client's model for a completion mid-handler.
- The client picks the model; the result carries model, role, and content.
- On a 2026-07-28 connection requestSampling throws; the embedded-request form lives on the input_required page.
- The client must declare the sampling capability; enforceStrictCapabilities: true rejects the request before it reaches the wire when it did not.