# Agent platform comparisons


|                  |                                                                                  |
|------------------|----------------------------------------------------------------------------------|
| **Author**       | Rhyannon Rodriguez                                                               |
| **Last updated** | 2026-05-08                                                                       |
| **Methodology**  | [Agent Ecosystem Testing](https://rhyannonjoy.github.io/agent-ecosystem-testing) |

This page provides an overview of observed agent web fetch retrieval behavior across agent platforms. To clarify observed behavior, we break down the retrieval process into three key components that form most agent web fetch pipelines:

- [Retrieval](#retrieval): How and when an agent fetches content
- [Truncation](#truncation): What gets lost and whether agents report it
- [Summarization](#summarization): What happens to content between retrieval and generation

These observations inform the size thresholds and pipeline assumptions in the [Agent-Friendly Documentation Spec](/spec/), particularly [Category 3: Page Size and Truncation Risk](/spec/#category-3-page-size-and-truncation-risk).

## Retrieval

The web fetch gap isn't in retrieval, but in what follows: how agents attend to various content types
during generation, whether that's context window handling, chunking losses, or summarization. Platform
links lead to each tool's official documentation.

| **Platform** | **Prompt Syntax** | **Invocation Pattern** | **Retrieval Behavior** |
| ---------- | ------------ | -------------------- | ----------------------- |
| [Claude API web fetch](https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool) | Enable tool to augment Claude's context with URL | _Mid-generation deterministic_: tool requires enablement in API request, includes URL validation and results cache, may or may not provide live web content | _Visibility high_: only platform where response body includes raw tool result; no JavaScript execution, CSS-heavy pages and/or SPAs often return little to no prose |
| [Claude Code](https://code.claude.com/docs/en/overview) | `WebFetch` invoked automatically with prompt URL | _Mid-generation deterministic_: returns cached result if available, otherwise fetches live | _Visibility high_: Markdown result returned directly if trusted, text/markdown, <100k; otherwise a smaller LLM extracts relevant content before passing to Claude; no JavaScript execution |
| [Cursor](https://cursor.com/docs) | _No web fetch behavior publicly documented_, `@Web` context attachment redundant, agents don't correct misuse | _Mid-generation nondeterministic_: `Auto` default setting autonomous LLM and fetch method selection per request | _Visibility low_: fetch method not explicitly named, no JavaScript execution, CSS-heavy pages and/or SPAs often return little to no prose; prefers Markdown, content negotation documented with `Accept: text/markdown`; sends full browser fingerprint: `User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36` with Chrome client hints `Sec-Ch-Ua`, `Sec-Fetch-*` |
| [Gemini API URL context](https://ai.google.dev/gemini-api/docs/url-context) | Enable tool to augment Gemini's context with URL, request requires `url_context` with full, unnested URLs | _Pre-generation injection deterministic_: two-step process, fetches from internal cache, if unsuccessful, then live fetch; documentation includes parsing limitations | _Visibility low_: retrieved content injected into context without a testable field, retrieval orchestration and generation process opaque; `url_context_metadata` order _nondeterministic_, authoritative signal `url_retrieval_status`, `tool_use_prompt_token_count` only size proxy |
| [GitHub Copilot](https://code.visualstudio.com/docs/copilot/overview) | _No web fetch behavior publicly documented_, prompt with URL | _Mid-generation nondeterministic_: `Auto` default setting autonomous LLM and fetch method selection per request | _Visibility medium_: intermittently tools named via error, `fetch_webpage` returns relevance-ranked excerpts with elision markers, occasional nonlinear, inaccurate reassembly, `curl` byte-perfect retrieval, but no prose; content negotiation tool-dependent, presents as a browser, but overclaims `User-Agent`: `Mozilla/5.0`, `AppleWebKit` `Accept`: full HTML, `curl/8.7.1` no preference,`Accept`: `*/*` |
| [MCP Fetch (reference server)](https://pypi.org/project/mcp-server-fetch/) | `url` required; `max_length`, `start_index`, `raw` optional | _Mid-generation deterministic_: fetch invoked automatically with URL | _Visibility high_: returns extracted contents as Markdown; supports chunked reading via `start_index`, allowing LLM to page through content until it finds what's needed; no JavaScript execution |
| [OpenAI web search](https://developers.openai.com/api/docs/guides/tools-web-search) | Chat Completions API augments `GPT`'s search with URL, Responses API for `web_search` | _Mid-generation nondeterministic: integration and agent-dependent_: static facts and trivial math don't invoke the tool; Chat Completions search implicit, Responses `web_search_preview` conditional, control cached/indexed or live content `external_web_access` | _Visibility low_: Responses `response.output`'s `web_search_call` names tools, but search context not equal to LLM context window; no JavaScript execution; `search_context_size: low/medium/high` controls context amount, Chat Completions latency lever consistent, but Responses inconsistent |
| [Windsurf Cascade](https://docs.windsurf.com/windsurf/cascade/web-search) | Web and docs search partially documented, `@web` directive redundant with URL, agents don't correct misuse | _Mid-generation deterministic_: autonomous two-stage pipeline designed to emulate human browsing and skimming, documentation acknowledges not all pages parseable | _Visibility medium_: `read_url_content` returns chunk index with summaries, metadata and requires sequential `view_content_chunk` calls; `curl` substitution for CSS-heavy pages, SPAs return ~20–35% of expected size, little or no prose; agents used `@web`'s `web_search` as verification once every ~60 turns; presentation transparent about using crawler-scaper, but underdelivers, `User-Agent`: [Colly](https://github.com/gocolly/colly) |

## Truncation

Pipelines are lossy by design in attempt to balance token cost, speed, and access to fresh content.
Agents intermittently acknowledge architectural constraints, misattribute truncation causes, or
self-report completeness when content is incomplete or unusable. Platform links lead to empirical
testing analysis and/or tool documentation.


| **Platform** | **Truncation Limit** | **Observations** |
| ---------- | ----------------- | ------- |
| [Claude API web fetch](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anthropic-claude-api-web-fetch-tool/claude-interpreted-vs-raw) | ~20,700 chars and/or ~100 KB of rendered content _default unset_ | `max_content_tokens` approximate, setting 5,000 returned 17,186 chars, truncation occurs mid-token. Default limit identified in raw track, self-report attributed missing content to JavaScript rendering, masking character limit. |
| [Claude Code](https://giuseppegurgone.com/claude-webfetch) | ~100,000 chars | Trusted sites serving `text/markdown` under 100K chars bypass summarization, while content over 100K chars are passed to a summarization LLM. |
| [Cursor](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anysphere-cursor/cursor-interpreted-vs-raw) | 28 KB–240 KB+ _method-dependent_, _nondeterministic filtering_ | `WebFetch MCP` ~28 KB, `urllib` ~72 KB, unknown path 245 KB+, `curl` no ceiling detected; appears to apply structure-aware content filtering, navigation and CSS stripped, but content selection heuristic presents as complete, so agents don't report truncation. |
| [Gemini API URL context](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/google-gemini-url-context-tool/gemini-interpreted-vs-raw) | _No fixed ceiling or silent dropping detected_, 20 URLs hard limit per request | API-layer rejection returns `400` and doesn't consume tokens; retrieval-layer failure completes the request, but records `URL_RETRIEVAL_STATUS_ERROR`. Format support inconsistent with documentation: PDF fails, YouTube succeeds, JSON nondeterministic; Google Docs fail consistently. |
| [GitHub Copilot](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/microsoft-github-copilot/copilot-interpreted-vs-raw) | _No fixed ceiling detected_, _nondeterministic excerpting_, tested 6.68M tokens | Pipeline with `fetch_webpage` discards whole sections or more granularly before generation, `curl` delivers all raw bytes but unreadable, chat rendering cutoff visible in output, not persisted as requested, but agents don't reliably report these results as truncation. |
| [MCP Fetch (reference server)](https://pypi.org/project/mcp-server-fetch/) | Default 5,000 chars | Default `max_length` is 5,000 chars, but configurable up to 1,000,000; uniquely user-controlled truncation. |
| [OpenAI web search](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/open-ai-web-search-tool/chatgpt-interpreted-vs-raw) | _No fixed ceiling or silent dropping detected_ | Raw source count stable at 12 regardless of `search_context_size` setting. Query construction not temporally aware, internal queries append training-era date strings despite running in 2026. Documented domain filtering limits not functional in Python SDK. |
| [Windsurf Cascade](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/cognition-windsurf-cascade/cascade-interpreted-explicit-vs-raw) | _No fixed ceiling detected at retrieval stage_, _nondeterministic agent-dependent write ceiling_ | Full retrieval agent and doc-size-dependent. Agents often retrieve fully under ~14 chunks, spotty at ~35, sparse sampling at 50+. Chunk index summary population not guaranteed, those present often include byte-count loss notices. Unique read-write asymmetry. Agents often self-report full retrieval, but fail to prove it with a write task or report truncation. |

## Summarization

Processing layer observability vary by implementation. Platforms often offer user-configured subagents
while turn-by-turn chat interactions abstract any default orchestrator-subagent relationships away.
Observable outputs from default settings primarily inform the conclusions below.

| **Platform** | **Processing Layer** | **Inference** |
| ---------- | -------------------- | ------------ |
| [Claude API web fetch](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anthropic-claude-api-web-fetch-tool/methodology) | _Dynamic filtering optional_, `web_fetch_20260209` | Server-side tool called directly with inspectable tool result in response. [Dynamic filtering](https://platform.claude.com/docs/en/agents-and-tools/tool-use/web-fetch-tool) available with certain LLMs in which Claude writes, executes code to filter before content reaches the context window, but it's not default behavior. |
| [Claude Code](https://giuseppegurgone.com/claude-webfetch) | _Summarization threshold-triggered_ | Content under ~100K chars from trusted text/markdown sources reaches the context window directly without intermediate processing, but content exceeding this threshold goes through a summarization LLM that may lose information. |
| [Cursor](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/anysphere-cursor/methodology) | _Inferred via filtering, undocumented for web fetch_ | Codebase research, terminal commands, and browser automation requests trigger [built-in subagents](https://cursor.com/docs/subagents) `explore`, `bash`, and `browser`. Test prompts likely invoked `explore` and `bash` alongside web fetch. Backend routing and structure-aware content filtering suggest a pre-generation processing layer, not a passive, linear pipeline. |
| [Gemini API URL context](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/google-gemini-url-context-tool/methodology) | _API layer pipeline, undocumented_ | Pre-generation injection suggests processing occurs before LLM invocation. No transformation layer between retrieval and generation; LLM receives content directly and any summarization occurs as part of generation, not as an intermediate pipeline stage. |
| [GitHub Copilot](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/microsoft-github-copilot/methodology) | _Inferred via relevance-ranking, undocumented for web fetch_ | Reassembled excerpts, outputs that don't note discarded content, browser masquerading, and tool substitution patterns suggests an orchestrator-subagent relationship and not a linear, passive pipeline. Agent loop descriptions vary by implementation. [VS Code-Copilot docs](https://code.visualstudio.com/docs/copilot/agents/subagents) describe subagent delegation as _main agent-initiated_ for complex tasks with further config available, but [Copilot SDK docs](https://docs.github.com/en/copilot/how-tos/copilot-sdk/use-copilot-sdk/custom-agents) only describe subagents as configurable, and not default architecture. |
| MCP Fetch (reference server) | _None_ hard truncation at `max_length` | Passive, linear pipeline without a processing layer. |
| [OpenAI web search](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/open-ai-web-search-tool/methodology) | _Differs by API surface, undocumented_ | Chat Completions autonomously retrieves, but Responses' LLM actively manages search in the chain of thought with `open_page` and `find_in_page`, suggesting a processing layer, but not explicitly documented or named in either API responses. |
| [Windsurf Cascade](https://rhyannonjoy.github.io/agent-ecosystem-testing/docs/cognition-windsurf-cascade/methodology) | _Inferred via chunking, undocumented for web and docs search_ | Codebase research triggers [built-in subagent Fast Context](https://docs.windsurf.com/context-awareness/fast-context). Test prompts likely invoked Fast Context alongside web search. Chunk analysis, tool substitution, terminal execution, and workspace referencing suggest an extensive processing layer, not a passive, linear pipeline. |

