Defer Loading MCP Server Tools

When an agent is connected to many MCP servers, each with dozens of tools, the full set of tool definitions can consume a large portion of the context window before the user even sends a message. The Preload Tools setting on each MCP server controls whether those definitions are loaded upfront or discovered on demand. By default, Preload Tools is off — the same default used for skills. The agent only sees each MCP server’s name and description at startup and loads individual tool schemas as needed. This keeps the context window lean while still giving the agent access to the full breadth of tools when it needs them.

The problem

Each tool definition — name, description, input schema, output schema — consumes tokens from the available context window.
With many MCP servers and tools, the context window can fill up before any user interaction happens.
Most interactions only need a small subset of the available tools, not all of them.
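
To get a feel for the cost described above, here is a rough back-of-the-envelope sketch. It assumes about four characters per token, which is a crude rule of thumb rather than a real tokenizer, and the tool definition it measures is hypothetical, written only for illustration.

```python
import json

# Assumption: ~4 characters per token. This is a crude heuristic,
# not the tokenizer any particular model actually uses.
CHARS_PER_TOKEN = 4


def estimate_tokens(tool_definition: dict) -> int:
    """Roughly estimate the tokens one serialized tool definition occupies."""
    serialized = json.dumps(tool_definition)
    return -(-len(serialized) // CHARS_PER_TOKEN)  # ceiling division


def estimate_catalog_tokens(tool_definitions: list[dict]) -> int:
    """Sum the rough estimate over every tool definition in a catalog."""
    return sum(estimate_tokens(d) for d in tool_definitions)


# Hypothetical tool definition, for illustration only.
example_tool = {
    "name": "list_applications",
    "description": "List applications in a workspace.",
    "inputSchema": {
        "type": "object",
        "properties": {"workspaceFqn": {"type": "string"}},
    },
}

per_tool = estimate_tokens(example_tool)
# Pretend a server exposes 100 tools of similar size.
catalog = estimate_catalog_tokens([example_tool] * 100)
print(f"~{per_tool} tokens per tool, ~{catalog} tokens for 100 tools")
```

Real definitions with detailed descriptions and nested schemas are usually much larger than this toy one, so the upfront cost grows quickly with the number of tools.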

Comparison showing context window usage with Preload Tools off versus on across many MCP servers

How it works

Preload Tools	Behavior	Best for
Off (default)	Only the MCP server’s name and description are in context. Tool schemas are discovered on demand via meta tools.	Large tool catalogs, servers used only occasionally
On	Full tool definitions are loaded into context at startup.	Small, frequently-used servers where every turn calls a tool
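
The difference between the two settings can be sketched as follows. This is a minimal illustration, not the actual Agent Harness implementation: the `McpServer` class, the `build_startup_context` function, and the tool names and descriptions are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class McpServer:
    """Hypothetical in-memory stand-in for a configured MCP server."""
    name: str
    description: str
    preload_tools: bool = False  # off by default, as described above
    tools: dict[str, dict] = field(default_factory=dict)


def build_startup_context(servers: list[McpServer]) -> list[dict]:
    """Return what the model would see at startup for each server."""
    context = []
    for server in servers:
        entry = {"mcp_server": server.name, "description": server.description}
        if server.preload_tools:
            # Preload Tools on: full tool definitions go into context.
            entry["tools"] = list(server.tools.values())
        # Preload Tools off: only name and description are included.
        context.append(entry)
    return context


# Hypothetical servers; tool names and descriptions are placeholders.
servers = [
    McpServer(
        name="truefoundry-mcp",
        description="Placeholder description",
        tools={f"tool_{i}": {"name": f"tool_{i}"} for i in range(100)},
    ),
    McpServer(
        name="web_search",
        description="Placeholder description",
        preload_tools=True,
        tools={"tool_a": {"name": "tool_a"}, "tool_b": {"name": "tool_b"}},
    ),
]

startup_context = build_startup_context(servers)
for entry in startup_context:
    print(entry["mcp_server"], len(entry.get("tools", [])), "tool definitions in context")
```

In this sketch the large server contributes no tool definitions at startup, while the small preloaded server contributes both of its definitions.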

When Preload Tools is off, the model has only each MCP server’s name and description in context. Individual tool definitions are skipped; instead, the agent is given meta tools for discovering and invoking them (list_tools, get_tool_info, and call_tool in the example below). The Agent Harness loads definitions on demand and runs the actual MCP tool when asked.
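
The on-demand flow can be sketched with three meta tools backed by an in-memory registry. The meta tool names and the `mcp_server`, `tool_name`, and `input` arguments mirror the example in this page; the `MetaToolHarness` class, its `register` method, and the stub handler are assumptions made for illustration and do not describe the real Agent Harness.

```python
from typing import Any, Callable


class MetaToolHarness:
    """Hypothetical harness exposing list_tools, get_tool_info, and call_tool."""

    def __init__(self) -> None:
        # server name -> tool name -> (definition, handler)
        self._servers: dict[str, dict[str, tuple[dict, Callable[..., Any]]]] = {}

    def register(self, mcp_server: str, tool_name: str,
                 definition: dict, handler: Callable[..., Any]) -> None:
        """Add a tool to the registry (illustrative helper, not a real API)."""
        self._servers.setdefault(mcp_server, {})[tool_name] = (definition, handler)

    def list_tools(self, mcp_server: str) -> list[str]:
        """Return only tool names, keeping the response small."""
        return list(self._servers[mcp_server])

    def get_tool_info(self, mcp_server: str, tool_name: str) -> dict:
        """Return the full definition for a single tool, on demand."""
        definition, _ = self._servers[mcp_server][tool_name]
        return definition

    def call_tool(self, mcp_server: str, tool_name: str, tool_input: dict) -> Any:
        """Run the underlying tool with the given input."""
        _, handler = self._servers[mcp_server][tool_name]
        return handler(**tool_input)


def fake_list_applications(workspaceFqn: str, applicationName: str) -> dict:
    """Stub handler returning a canned response instead of calling a server."""
    return {"data": [{"name": applicationName}], "pagination": {}}


harness = MetaToolHarness()
harness.register(
    "truefoundry-mcp",
    "list_applications",
    {"name": "list_applications", "inputSchema": {"type": "object"}},
    fake_list_applications,
)

tool_names = harness.list_tools("truefoundry-mcp")
tool_info = harness.get_tool_info("truefoundry-mcp", "list_applications")
result = harness.call_tool(
    "truefoundry-mcp",
    "list_applications",
    {
        "workspaceFqn": "tfy-test-cluster:test-ws",
        "applicationName": "nomic-embed-multimodal-7b-1",
    },
)
print(tool_names, result["data"][0]["name"])
```

Only the definition of the one tool the agent asks about ever enters the conversation; the rest of the catalog stays out of context.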

Example

In this example, truefoundry-mcp (100+ tools) is added with Preload Tools off, and a web_search MCP server (2 tools) is added with Preload Tools on.

User

What vllm version is “nomic-embed-multimodal-7b-1” app in “tfy-test-cluster:test-ws” workspace using? Can you please get me the github release notes link of that version?

Assistant

The agent first lists all tools available on the truefoundry-mcp MCP server.

list_tools

Tool Call

{
  "mcp_server": "truefoundry-mcp"
}

Tool Response

truefoundry-mcp: list_applications,search_workspaces,get_workspace,get_application,list_app_metric_charts, ...

Assistant

The agent then gets the tool information for the list_applications tool.

get_tool_info

Tool Call

{
  "mcp_server": "truefoundry-mcp",
  "tool_name": "list_applications"
}

Tool Response

{
  "outputSchema": {
    "type": "object",
    "properties": {
      "data": { "type": "array", ... },
      "pagination": { "type": "object", ... }
    },
    "required": ["data", "pagination"]
  }
}

Assistant

The agent then calls the list_applications tool using the meta tool call_tool to find the application by its name.

Now I have the tool info. Let me find the application.

call_tool

Tool Call

{
  "mcp_server": "truefoundry-mcp",
  "tool_name": "list_applications",
  "input": {
    "workspaceFqn": "tfy-test-cluster:test-ws",
    "applicationName": "nomic-embed-multimodal-7b-1"
  }
}

Tool Response

{
  "data": [
    {
      "name": "nomic-embed-multimodal-7b-1",
      "type": "service",
      "lastDeployment": {
        "manifest": {
          ...
          "image": {
            "type": "image",
            "image_uri": "public.ecr.aws/truefoundrycloud/vllm/vllm-openai:v0.13.0"
            ...
          },
          ...
        }
      }
    }
  ],
  "pagination": { ... }
}

Assistant

I can see the vLLM version from the image URI: v0.13.0. Now let me get the GitHub release notes link for that version.

Tool Call

{
  "urls": [
    "https://github.com/vllm-project/vllm/releases/tag/v0.13.0"
  ],
  "objective": "Get the GitHub release notes page URL for vLLM v0.13.0"
}

Tool Response

{
  "extract_id": "extract_e71ecabe5c934b75a34f99c891613f69",
  "results": [
    {
      "url": "https://github.com/vllm-project/vllm/releases/tag/v0.13.0",
      "title": "Release v0.13.0 · vllm-project/vllm · GitHub"
    }
  ]
}

Assistant

GitHub Release Notes: 🔗 https://github.com/vllm-project/vllm/releases/tag/v0.13.0 …