
Nous Research added Tool Search to its open-source Hermes Agent to load MCP tool schemas on demand, reducing context-window bloat and cutting tool-definition token use by about 85%.
Nous Research has introduced Tool Search, a progressive-disclosure capability for the open-source Hermes Agent that delays loading full Model Context Protocol (MCP) tool schemas until the model actually needs them. The change targets growing context-window overload in multi-tool deployments by supplying only the schema elements required per turn, reducing schema noise that can overwhelm long-context models and increase latency and cost.
Tool Search is implemented as an opt-in bridge that replaces the normal tools array with three actions: tool_search(query, limit?), tool_describe(name) to retrieve a single tool’s parameters, and tool_call(name, arguments) to invoke a deferred tool. A typical interaction has the model issue tool_search, call tool_describe to fetch parameter details, then run tool_call; all hooks, guardrails and approval prompts operate against the real underlying tool name rather than the bridge interface.
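The three-action flow described above can be sketched in a few lines. This is a minimal illustration, not Hermes code: the REGISTRY dictionary, its two example tools, and the word-overlap search are all hypothetical stand-ins, and only the three action names and their arguments come from the description above.

```python
from typing import Any

# Hypothetical registry: real tool name -> description, parameters, handler.
REGISTRY: dict[str, dict[str, Any]] = {
    "github_create_issue": {
        "description": "Create an issue in a GitHub repository",
        "parameters": {"repo": "string", "title": "string"},
        "handler": lambda repo, title: f"created '{title}' in {repo}",
    },
    "slack_post_message": {
        "description": "Post a message to a Slack channel",
        "parameters": {"channel": "string", "text": "string"},
        "handler": lambda channel, text: f"posted to {channel}: {text}",
    },
}


def tool_search(query: str, limit: int = 5) -> list[dict[str, str]]:
    """Return only names and descriptions of tools sharing a word with the query."""
    words = set(query.lower().split())
    hits = []
    for name, tool in REGISTRY.items():
        text = f"{name.replace('_', ' ')} {tool['description']}".lower()
        overlap = len(words & set(text.split()))
        if overlap:
            hits.append((overlap, name))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))  # best overlap first
    return [
        {"name": name, "description": REGISTRY[name]["description"]}
        for _, name in hits[:limit]
    ]


def tool_describe(name: str) -> dict[str, Any]:
    """Fetch the parameter details for one tool, only when the model asks."""
    return {"name": name, "parameters": REGISTRY[name]["parameters"]}


def tool_call(name: str, arguments: dict[str, Any]) -> Any:
    """Invoke the deferred tool; hooks and approvals would key on `name`."""
    return REGISTRY[name]["handler"](**arguments)


if __name__ == "__main__":
    found = tool_search("create github issue")
    print(found)
    print(tool_describe(found[0]["name"]))
    print(tool_call(found[0]["name"], {"repo": "acme/app", "title": "Bug"}))
```

Because tool_call dispatches on the real tool name, any guardrail or approval prompt attached to that name still sees the same identifier it would without the bridge.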
The feature responds to real operational pain points. One Hermes deployment example with five MCP servers and 34 tools produced average prompt sizes of around 45,000 tokens per turn, with roughly 22,000 of those tokens, about half, coming from tool-schema overhead. Anthropic engineering data cited total tool-definition accumulations of up to 134,000 tokens before optimization and measured an "MCP Tools Tax" of roughly 15,000 to 60,000 tokens per turn for typical multi-server setups, a burden that raises latency and inference cost.
Anthropic’s internal MCP evaluations attribute substantial accuracy gains to removing irrelevant tool schemas from context. In those tests, Claude Opus 4 accuracy increased from 49% to 74% with Tool Search enabled, and Opus 4.5 rose from 79.5% to 88.1%. Anthropic also measured about an 85% reduction in tool-definition token usage while preserving access to the full tool catalog, noting that the reduced schema exposure cuts model "decision paralysis" and false positives when choosing among many options.
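To put the figures in proportion, the back-of-the-envelope calculation below combines the Hermes deployment example (45,000 prompt tokens, 22,000 of them schema) with Anthropic's roughly 85% reduction. Applying that percentage to the Hermes example is purely a hypothetical assumption for illustration. Neither source reports this combined result.

```python
# Figures quoted in the article; combining them is a hypothetical exercise.
prompt_tokens = 45_000      # average prompt size per turn (Hermes example)
schema_tokens = 22_000      # portion spent on tool schemas (Hermes example)
reduction_percent = 85      # Anthropic's measured tool-definition reduction

# Share of the prompt taken up by tool schemas, as a percentage.
schema_share = round(100 * schema_tokens / prompt_tokens, 1)

# Integer arithmetic avoids floating-point noise in the token counts.
remaining_schema = schema_tokens * (100 - reduction_percent) // 100
projected_prompt = prompt_tokens - schema_tokens + remaining_schema

print(f"schema share of prompt: {schema_share}%")
print(f"schema tokens after reduction: {remaining_schema}")
print(f"projected prompt size: {projected_prompt}")
```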
Under the hood, Hermes builds a searchable catalog of tool names, descriptions and parameter names and uses BM25 retrieval to match the model’s query. If BM25 yields no positive scores, the system falls back to a literal substring match on tool names to avoid degenerate zero-IDF cases. The catalog is rebuilt from current tool definitions on every assembly so it remains stateless across turns and avoids index drift relative to the live registry.
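The retrieval step can be sketched as follows. This is an illustrative reimplementation, not the Hermes source: the tool dictionaries, the function names, and the k1 and b values are assumptions, and the classic BM25 IDF formula is used here because it exhibits the zero-IDF case (a term found in exactly half the catalog scores zero). Which IDF variant Hermes actually uses is not stated.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics, so 'fs_read_file' -> fs, read, file."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_catalog(tools: list[dict]) -> list[tuple[str, list[str]]]:
    """One searchable document per tool: name, description and parameter names."""
    return [
        (t["name"], tokenize(" ".join([t["name"], t["description"], *t["parameters"]])))
        for t in tools
    ]


def bm25_scores(query: str, catalog, k1: float = 1.5, b: float = 0.75) -> dict[str, float]:
    if not catalog:
        return {}
    n_docs = len(catalog)
    avg_len = sum(len(doc) for _, doc in catalog) / n_docs
    doc_freq = Counter(term for _, doc in catalog for term in set(doc))
    scores = {}
    for name, doc in catalog:
        counts = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            tf = counts.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Classic IDF: zero when a term is in half the docs, negative above that.
            idf = math.log((n_docs - df + 0.5) / (df + 0.5))
            score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores[name] = score
    return scores


def search(query: str, tools: list[dict], limit: int = 5) -> list[str]:
    catalog = build_catalog(tools)  # rebuilt on every call, so no state to drift
    scores = bm25_scores(query, catalog)
    ranked = sorted((n for n, s in scores.items() if s > 0), key=lambda n: -scores[n])
    if not ranked:
        # No positive BM25 score: fall back to a literal substring match on names.
        needle = query.lower()
        ranked = [t["name"] for t in tools if needle in t["name"].lower()]
    return ranked[:limit]


# Hypothetical tool definitions used only for this example.
TOOLS = [
    {"name": "github_create_issue", "description": "Create an issue in a repository",
     "parameters": ["repo", "title"]},
    {"name": "slack_post_message", "description": "Post a message to a channel",
     "parameters": ["channel", "text"]},
    {"name": "fs_read_file", "description": "Read a file from disk",
     "parameters": ["path"]},
]

if __name__ == "__main__":
    print(search("post message", TOOLS))
    print(search("slack", TOOLS[:2]))  # two-tool catalog: IDF is zero, fallback applies
```

In the two-tool call, "slack" appears in one of two documents, so its IDF is log(1) = 0 and BM25 alone would return nothing. The substring fallback is what recovers the match.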
Tool Search runs in auto mode by default and activates only when deferrable tool schemas would consume at least 10% of the active model’s context window; the threshold is re-evaluated each turn and, if not met, the tools array is passed through unchanged. Practically, sessions with few MCP tools or very long-context models may never trigger Tool Search, while larger multi-server deployments stand to see token, accuracy and cost benefits; Anthropic noted cache-miss generation costs of roughly $0.07–$0.10 per turn as one operational cost driver.
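The auto-mode gate amounts to a per-turn comparison. The sketch below assumes the schema token count and context-window size are already known. The function names, the BRIDGE_TOOLS list, and the simplification that every tool is deferrable are hypothetical, and only the 10% threshold and the pass-through behavior come from the description above.

```python
BRIDGE_TOOLS = ["tool_search", "tool_describe", "tool_call"]


def should_defer(schema_tokens: int, context_window: int, threshold_percent: int = 10) -> bool:
    """True when deferrable schemas would use at least threshold_percent of the window."""
    # Integer comparison avoids floating-point rounding at the exact boundary.
    return schema_tokens * 100 >= threshold_percent * context_window


def assemble_tools(tools: list[str], schema_tokens: int, context_window: int) -> list[str]:
    """Evaluated every turn: swap in the bridge, or pass the tools array through unchanged."""
    if should_defer(schema_tokens, context_window):
        return list(BRIDGE_TOOLS)
    return tools


if __name__ == "__main__":
    tools = ["github_create_issue", "slack_post_message"]
    # 22,000 schema tokens against a 200,000-token window is 11%: bridge activates.
    print(assemble_tools(tools, 22_000, 200_000))
    # The same schemas against a 1,000,000-token window are 2.2%: passed through.
    print(assemble_tools(tools, 22_000, 1_000_000))
```

The second call shows why very long-context models may never trigger the feature: the same schema load falls well under the threshold once the window is large enough.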