Perplexity :: Week11 :: Special Series :: AI Token Compression and Task Delegation Research :: The Model-Agnostic AI Hand-Off Packet

The Model-Agnostic AI Hand-Off Packet

Design Guide for Trivial, Lossless Inter-Agent Task Delegation

Prepared June 2026. Sources favor publications from 2024–2026. Uncertainty is flagged inline.


Executive Summary

No universal, ratified standard for a single-task AI hand-off packet exists as of mid-2026, but a clear convergence is underway. Google's Agent2Agent (A2A) protocol (v0.3.0, now under the Linux Foundation) and Anthropic/industry's Model Context Protocol (MCP) together describe 90% of the structural requirements. IBM's competing Agent Communication Protocol (ACP) formally merged into A2A in August 2025. The IETF is actively receiving drafts but has not ratified a working group charter yet. This report synthesizes the emerging consensus into an actionable, model-agnostic hand-off packet design — answering all five research questions with 6-part analysis per key technique.[^1][^2]


1. What Should Go INTO the Packet, and How Should the Answer Come BACK?

1.1 The Canonical Outbound Packet

A hand-off packet is, before anything else, a data contract. The AWS Well-Architected Agentic AI Lens states directly: "A handoff is a data contract before it is a workflow step. The sending agent must know what to include in the payload, while the receiving agent must know what to expect."[^3]

Based on convergent guidance from AWS, agent pattern libraries, the Inkeep handoff protocol documentation, and the Gravity/Fast agent handoff patterns analysis, a minimal-viable outbound packet requires six required sections plus optional extension fields:[^4][^5]

{
  "packet_schema": "handoff/v1",
  "packet_id": "uuid-v7",
  "task": {
    "id": "original-task-uuid",
    "goal": "Outcome-stated objective, not workflow steps",
    "skill_hint": "optional: which skill/capability is being invoked",
    "deadline_iso": "2026-06-29T18:00:00Z",
    "scope_boundary": "Explicit list of what is IN scope",
    "out_of_scope": ["List of what is explicitly NOT to be done"]
  },
  "context": {
    "summary": "Compressed 200-token synthesis of prior work — conclusions, not transcript",
    "decisions_made": [
      { "decision": "Used approach X", "rationale": "Y", "alternatives_rejected": ["Z"] }
    ],
    "constraints": ["Hard limits: budget, domain, format, language"],
    "prior_failures": ["Approaches already tried that failed — do not retry"]
  },
  "state": {
    "work_completed": ["Specific completed sub-tasks"],
    "open_questions": ["Blockers the receiver must resolve"],
    "artifacts": [
      { "type": "file|data|text", "ref": "uri-or-inline", "mime": "application/json" }
    ]
  },
  "return_contract": {
    "output_schema": { "$ref": "inline-json-schema-or-uri" },
    "output_format": "json|markdown|text",
    "return_path": "webhook-url-or-polling-id",
    "acceptance_criteria": "Explicit success definition"
  },
  "metadata": {
    "sender_id": "agent-or-system-name",
    "sender_framework": "optional: LangGraph/CrewAI/etc.",
    "protocol_version": "1.0",
    "trace_id": "opentelemetry-compatible-trace-id"
  }
}

The four non-negotiable fields that every pattern library agrees on are: state (what to act on), context (compressed prior work), intent (outcome, not workflow), and return path (how to resume the caller). Missing any one silently drops or duplicates work.[^4]

1.2 The Canonical Return Packet

The receiving agent's result must be machine-parseable by the orchestrator without post-processing. Based on Inkeep's handoff protocol documentation and the agent handoff patterns literature, the return packet should mirror the outbound structure:[^5][^4]

{
  "packet_schema": "handoff-result/v1",
  "packet_id": "uuid-v7",
  "in_reply_to": "original-packet_id",
  "task_id": "original-task-uuid",
  "status": "completed|failed|partial|needs_input",
  "result": {
    "summary": "One-paragraph human-readable outcome",
    "artifacts": [
      { "type": "data", "mime": "application/json", "content": { } }
    ],
    "findings": ["Structured conclusions — not exploration log"],
    "needs_attention": ["Items the caller must act on"],
    "unresolved": ["Open questions the receiver could not resolve"]
  },
  "execution_log": {
    "confidence_level": 0.85,
    "approaches_tried": ["optional: for audit"],
    "token_usage": { "input": 1200, "output": 340 }
  },
  "metadata": {
    "receiver_id": "agent-id",
    "completed_at": "iso8601",
    "trace_id": "same-as-inbound-trace-id"
  }
}

Key design rule for return packets: return structured conclusions, not exploration logs. The agentpatterns.ai analysis of multi-agent system failures specifically names "raw transcript forwarding" as the primary anti-pattern at return time — it bloats the context and pollutes the next stage's reasoning.[^6][^7]

1.3 Six-Part Analysis

Dimension Guidance
What it does Packages task, context, state, and a return contract into one self-contained unit
Why it's needed Without it, receiving agent re-derives work the delegating agent already finished[^3]
Key mechanism JSON schema enforcement + explicit return contract field
Failure mode Omitting the return_contract field turns every failure into a debugging dig
Model agnosticism Pure JSON; no framework calls; any model that reads text can consume it
Status Emerging consensus, not yet a ratified standard[^8]

2. What Commonly Makes a Delegated AI Result Hard to Re-Ingest?

2.1 The Seven Re-Ingestion Failure Modes

Research across production multi-agent systems identifies a consistent taxonomy of re-ingestion problems. The Carnegie Mellon benchmarks cited in a 2026 production analysis found that leading agents complete only 30–35% of multi-step tasks end-to-end without human intervention — and the bottleneck is structural, not capability-related.[^9]

Failure Mode 1: Prose Wrapping Around JSON

The most common failure: the model is asked to return JSON but prepends conversational text ("Okay, I've analyzed this and here is the result: ...") before the JSON blob. Every downstream parser breaks on the prefix text. Observed across n8n, LangChain, and raw OpenAI API usage.[^10][^11][^12][^13]

Fix: Grammar-constrained generation (token-level enforcement) or schema-validated tool calls, not prompt instructions alone. Prompting alone fails 5–20% of the time in production.[^14][^9]

Failure Mode 2: Schema Drift Between Invocations

The same prompt returns date as a string one run and as an object the next. This "all-or-nothing failure" means one malformed field makes the entire 200-line JSON unparseable.[^15]

Fix: Pydantic model or JSON Schema validation at the receiver boundary with a retry cascade — not at prompt-writing time.[^16][^17]

Failure Mode 3: Double-Wrapping / Nesting

When agent frameworks pass the output through an intermediate parser, the output key gets nested inside another output key, producing {"output": {"output": {...}}} instead of {"output": {...}}. Intermittent and extremely hard to reproduce because it depends on internal framework memory state.[^18]

Fix: Keep the main agent output as plain text; run structured extraction in a separate downstream LLM chain node, not inline with the agent.[^18]

Failure Mode 4: Context Loss / "Lost in the Middle"

Research consistently shows LLMs recall information near the start and end of a prompt much better than information in the middle. In a long-running delegation chain, critical constraints and decisions stated early sink into the middle of an expanding context and get quietly ignored — a phenomenon called "context drift".[^19][^20][^21][^22]

Fix: Surface constraints and scope boundary at both the top and bottom of the system prompt in the hand-off packet. Keep context summaries under 200 tokens.[^22][^5]

Failure Mode 5: Omitting the do_not_retry List

Without an explicit list of failed approaches, the receiving agent re-explores paths the delegating agent already exhausted — causing circular work repetition that can consume thousands of tokens before producing any new output.[^6]

Fix: Mandatory prior_failures / do_not_retry array in the outbound packet's context block.[^6]

Failure Mode 6: Ambiguous Confidence

Structured fields imply certainty. An agent that fills findings with a well-formatted JSON array may hide that its conclusions were tentative; the downstream agent reads structure as authoritative. This is especially dangerous in chains of 3+ agents where false confidence compounds.[^7]

Fix: Require a confidence_level float (0.0–1.0) in the return packet's execution_log block; pair with explicit hedging in the unresolved array.

Failure Mode 7: Context Rot from Over-Long History

When the full conversation transcript is forwarded between agents, the receiving agent's context fills with the sender's reasoning process rather than its conclusions. Veseli et al. (2025) found that context degrades non-linearly when the context window is more than 50% full.[^23][^19]

Fix: Summarize at every boundary — pass the 200-token compressed summary, not the transcript. For large artifacts (> 2–3 KB), write to file and pass the path.[^5]

2.2 Six-Part Analysis

Dimension Guidance
What it does Identifies where hand-off packets structurally fail
Why it matters Schema drift and unstructured output failures are the #1 cause of production agent failure — 88% of enterprise AI initiatives never reach full production[^9]
Key mechanism Grammar-constrained generation + mandatory prior_failures and confidence_level fields
Failure mode Trusting prompt-only JSON enforcement; it works in demos, not production
Model agnosticism All seven failure modes are universal across ChatGPT/Claude/Gemini
Status Well-established; multiple production post-mortems confirm[^16][^9][^17]

3. How Do You Keep the Packet Model-Agnostic?

3.1 The Four Agnosticism Properties

A packet is model-agnostic when any LLM-based agent — regardless of vendor, version, or framework — can receive and act on it correctly without bespoke adaptation. Four properties are required.

Property 1: Pure JSON with Inline JSON Schema

Use JSON (not YAML, not XML, not custom DSLs). JSON is the native interchange format of every major LLM API (OpenAI, Anthropic, Google, Mistral, Cohere, AWS Bedrock). Including an inline $schema reference or an output_schema JSON Schema object in the return contract field gives any schema-aware validator a machine-readable contract without requiring the receiving agent to know who sent the packet.[^24][^3]

Assumption: The orchestrator controls output validation, not the receiving model. This is the correct architecture regardless of model.

Property 2: Outcome-Stated Intent, Not Workflow-Stated

State intent as outcome: "goal": "Produce a 3-paragraph executive summary of the attached report" — not "goal": "Open the file, extract the intro, then summarize each section". Workflow-stated goals are brittle across models because different models parse step sequences differently; outcome-stated goals are stable. This principle appears in AWS's Well-Architected guidance as "explicit handoff schemas".[^24][^4]

Property 3: No Framework-Specific Fields

Never include fields like langchain_run_id, openai_thread_id, or framework-specific enum values in the core packet. All framework-specific metadata goes in a metadata.extensions object, making it ignorable by a receiving agent on a different stack. The A2A protocol spec explicitly states agents "collaborate based on declared capabilities and exchanged information, without needing to share their internal thoughts, plans, or tool implementations".[^25][^26][^27]

Property 4: Modality-Agnostic Artifact References

Reference large artifacts by URI ("ref": "s3://..." or "ref": "file://...") rather than inlining them, and include a mime field. This follows A2A's Part model (TextPart, FilePart, DataPart) and allows a receiving agent to fetch only what it needs in the modality it supports. Different models have different context window sizes and multimodal capabilities; URI references abstract over this.[^27]

3.2 What Model Agnosticism Does NOT Mean

It does not mean using the lowest common denominator. A model-agnostic packet is a structured contract — it may include rich schema constraints that a weaker model ignores but a stronger model respects. The packet format itself does not degrade; the receiving model's compliance may vary. Test against the weakest target model in your ecosystem.

3.3 Six-Part Analysis

Dimension Guidance
What it does Ensures any compliant agent can parse and act on the packet
Why it's needed Even within one vendor (e.g., GPT-4.1 vs GPT-4o), behavioral differences mean framework-specific fields cause silent degradation
Key mechanism Pure JSON + outcome-stated goals + URI artifact references + extensions-only framework fields
Failure mode Including vendor-specific thread IDs or framework enums in the core schema
Model agnosticism Self-referentially guaranteed by design
Status Established principle; A2A spec states it explicitly[^27]

4. How Do You Bound Scope So the Receiving AI Doesn't Wander Off-Task?

4.1 Scope as a Security and Reliability Property

Scope bounding is not a UX nicety — it is an architecture decision with security implications. The agentpatterns.ai security boundary analysis states: "The breadth of an agent's task description is also the breadth of its attack surface. Narrowing scope is a security decision, not a UX detail."[^28]

With broad latitude, injected instructions can plausibly extend the task without contradicting anything — no stated boundary exists to defend. With narrow scope, injected instructions must directly contradict a stated constraint, which is far harder to disguise. This is the parameterized query analogy: tight instructions separate task structure from data just as parameterized SQL separates query structure from user input.[^28]

4.2 The Five Scope-Bounding Mechanisms

Mechanism 1: Explicit In-Scope / Out-of-Scope Arrays

Include both a scope_boundary array (what is permitted) and an out_of_scope array (what is explicitly forbidden). Do not rely on the receiving agent inferring what is out of scope from what is in scope — name it.[^29][^28]

"scope_boundary": ["Summarize the attached PDF — no other sources"],
"out_of_scope": ["Web search", "File writes", "Tool calls other than read_file"]

Mechanism 2: Tool Allowlist in the Packet Header

The packet's metadata block should include a permitted_tools array listing only the tools the receiving agent may invoke. This mirrors the Anthropic MCP connector's allowed_tools configuration and the Claude Code community's request for per-agent tool filtering. Claude Code GitHub issue #4380 documents the exact failure: showing all tools to a subagent causes "decision paralysis" and wrong tool selection.[^30][^31]

Mechanism 3: Termination Condition

Include an explicit acceptance_criteria in the return_contract block. A task without a defined "done" condition runs until the agent decides it is done — which is nondeterministic. The acceptance criteria field is what the orchestrator evaluates to gate result acceptance.[^26][^5]

Mechanism 4: Deadline

Every hand-off should carry a deadline_iso timestamp. Making deadlines explicit lets the receiver plan its own timeouts and lets a downstream supervisor detect stuck work without polling. AWS explicitly lists "deadlock detection and timeout handling" as a required agentic operations pattern.[^3][^4]

Mechanism 5: Confidence Threshold in Return Contract

Set a minimum confidence_level the receiver must report in the return packet. If it cannot meet the threshold, it should return "status": "needs_input" rather than a low-confidence result. This gates human escalation without hardcoding it.[^3][^5]

4.3 The Anti-Pattern

"Use your judgment" instructions are the highest-risk scope pattern for agents processing untrusted content — they explicitly authorize redirection by providing no boundary to defend. Avoid all delegated-judgment framing in the goal field.[^28]

4.4 Six-Part Analysis

Dimension Guidance
What it does Prevents task scope creep, prompt injection, and nondeterministic termination
Why it's needed Without explicit scope, an agent's attack surface equals its task description[^28]
Key mechanism Explicit scope_boundary, out_of_scope, permitted_tools, acceptance_criteria, and deadline_iso fields
Failure mode "Do whatever is needed" goals; no deadline; no acceptance criteria
Model agnosticism Works identically across all models — pure instruction text
Status Established principle; OWASP, IBM/ETH Zurich/Google research, and AWS all confirm[^32][^33][^28]

5. Current Standards and Formats for Inter-AI / Agent-to-Agent Task Exchange

5.1 The Protocol Landscape as of June 2026

The inter-agent protocol space has gone from zero standards to a dozen competing proposals in 18 months. As of mid-2026, four protocols have significant traction, with one emerging dominant for task delegation specifically.

Protocol 1: Google Agent2Agent (A2A) Protocol — Dominant

Status: Open standard, v0.3.0, under the Linux Foundation. Released April 2025, donated to Linux Foundation June 2025, IBM's ACP formally merged into A2A in August 2025. As of early 2026, over 150 organizations support it, including Google, Microsoft, AWS, Salesforce, SAP, ServiceNow, Workday, and IBM. Native integration in Azure AI Foundry, Amazon Bedrock AgentCore, and Google Cloud. A2A v0.3.0 spec is available at a2a-protocol.org.[^34][^27][^1]

Core design: Agent-to-agent task delegation over JSON-RPC 2.0, gRPC, or HTTP+JSON. Each agent publishes an Agent Card at /.well-known/agent-card.json declaring name, skills, input/output MIME types, and authentication requirements. The Task is the atomic unit — it carries a unique ID, state (submitted/working/completed/failed), message history, and output Artifacts (TextPart, FilePart, DataPart). Authentication uses standard OAuth 2.0 / API keys / mTLS — no new auth protocol.[^27]

What A2A covers: Agent discovery, task lifecycle management, streaming (SSE), push notifications for async tasks, capability negotiation. What it does not cover: The internal content of the task payload — that is left to implementers, which is where this report's packet design fills the gap.[^27]

Six-Part Analysis (A2A):

Dimension Guidance
What it does Standardizes the transport and lifecycle of inter-agent task delegation
Why it matters 150+ org adoption makes it the de facto standard for new enterprise multi-agent builds[^34]
Key mechanism Agent Card + Task object + JSON-RPC 2.0 over HTTPS
Failure mode A2A specifies the envelope, not the letter — the payload content is still up to the implementer
Model agnosticism Explicitly designed for "opaque" agents — no shared internals required[^27]
Status Published standard, v0.3.0, active development — use for any new inter-agent build[^27][^34]

Protocol 2: Anthropic Model Context Protocol (MCP) — Complementary Layer

Status: Open standard, ratified SEP-1649 November 2025, 97M+ downloads as of early 2026. Introduced by Anthropic in 2024.[^35][^36]

Core design: MCP defines how an agent connects to tools — databases, file systems, APIs — not how agents connect to other agents. "MCP connects an agent to its tools — vertical. A2A connects agents to other agents — horizontal." MCP uses JSON-RPC over stdio or HTTP, with servers exposing Tools, Resources, and Prompts. Anthropic's API now supports MCP connector natively.[^30][^34]

Relation to hand-off packets: An agent receiving a hand-off packet may use MCP to invoke its own tools during execution. The hand-off packet's permitted_tools array can reference MCP tool names as the allowlist.

Six-Part Analysis (MCP):

Dimension Guidance
What it does Standardizes agent-to-tool connection
Why it matters Broadest client adoption (Claude Desktop, Cursor, Cline, OpenAI Agents SDK)[^37]
Key mechanism JSON-RPC, tool/resource/prompt primitives
Failure mode Conflating MCP with A2A — they solve different layers
Model agnosticism Works with any model that supports tool calling
Status Ratified standard — use for agent-to-tool connections[^37]

Protocol 3: IBM Agent Communication Protocol (ACP) — Deprecated, Merged into A2A

ACP was launched by IBM in March 2025 as a REST-native alternative to A2A, donated to the Linux Foundation alongside BeeAI. In August 2025, the ACP team formally merged with A2A. Do not start new projects on ACP.[^1]

Protocol 4: IETF Drafts — Fragmented, No RFC Yet

As of mid-2026, over a dozen IETF Internet-Drafts address AI agent coordination, including:

  • draft-cui-ai-agent-task-00: Task-oriented coordination requirements[^38][^39]
  • draft-cui-ai-agent-discovery-invocation-01: Agent discovery protocol (AIDIP)[^40]
  • draft-klrc-aiagent-auth-00: AI agent authentication/authorization[^41]
  • draft-somoza-dmsc-atn-agent-trust-negotiation-00: Agent Trust Negotiation (ATN) protocol[^42]

Critical caveat: As of June 2026, none of these drafts carry IETF consensus — they are individual submissions with no formal standing. At least one (agents.txt) has already expired. The IETF is working on a potential working group charter, but no RFC exists.[^39][^8][^2][^38]

Six-Part Analysis (IETF Drafts):

Dimension Guidance
What it does Proposes requirements for agent task coordination at the network standards layer
Why it matters Signals that standardization bodies are engaging — but it's early[^2]
Key mechanism Individual Internet-Drafts; no ratified RFC
Failure mode Treating drafts as standards — all carry explicit "not endorsed by the IETF" disclaimers[^38]
Model agnosticism Intended to be universal, but too early to implement against
Status ⚠️ Fragmented, experimental — monitor but do not build against[^8][^2]

Protocol 5: OpenAI Agents SDK Handoffs — Framework-Specific

OpenAI's Agents SDK exposes handoff() as a first-class construct: a tool call that transfers control to a named agent, with an optional input_filter to control what history the receiving agent sees and an inputJsonSchema for typed payloads. The SDK supports both agents-as-tools (orchestrator-subagent pattern) and peer handoffs.[^43][^44][^29]

This is framework-specific and works only within the OpenAI Agents SDK ecosystem. However, the input/output schema patterns are directly applicable to the model-agnostic packet design above.[^43]

Protocol 6: Works With Agents — Handoff Protocol Layer 4 — Emerging Community Spec

A community-authored spec (CC BY 4.0, v1.0.0) defines a Handoff Protocol as "Layer 4 — Session" in an Agent OSI model. It specifies handoff_id (UUID v7), sender identity with Ed25519 signature, context_pack, and a full lifecycle (request → ack/reject → progress → completion). Framework-agnostic by design, explicitly targeting Hermes, Claude Code, Codex, and Copilot interop.[^25]

Status: Community spec, not a standards-body document. Valuable reference implementation; uncertain adoption path.[^25]


6. Putting It All Together: The Ideal Hand-Off Packet

6.1 Reference Architecture

The complete architecture stack for a model-agnostic hand-off uses three layers:

  1. Transport Layer: A2A protocol (JSON-RPC 2.0 over HTTPS) for inter-agent task lifecycle management and capability discovery
  2. Packet Layer: The hand-off packet JSON format described in Section 1 — the content that travels inside an A2A Task's DataPart
  3. Enforcement Layer: Grammar-constrained generation or schema-validated tool calls at both the sending and receiving boundary; Pydantic / JSON Schema validation with retry cascade

6.2 The Minimal Model-Agnostic Hand-Off Template

A minimum viable packet that works across ChatGPT, Claude, Gemini, and any other instruction-following model:

{
  "packet_schema": "handoff/v1",
  "packet_id": "uuid-v7",
  "task": {
    "id": "task-uuid",
    "goal": "OUTCOME-STATED: what done looks like, not how to get there",
    "deadline_iso": "ISO8601",
    "scope_boundary": ["Explicit list of what IS in scope"],
    "out_of_scope": ["Explicit list of what is NOT to be done"],
    "permitted_tools": ["tool_a", "tool_b"]
  },
  "context": {
    "summary": "≤200 token compressed synthesis of relevant prior work",
    "decisions_made": [{"decision": "...", "rationale": "..."}],
    "constraints": ["Hard limits that cannot be violated"],
    "prior_failures": ["Do not retry: approach X failed because Y"]
  },
  "state": {
    "work_completed": ["Completed sub-tasks"],
    "open_questions": ["Blockers receiver must resolve"],
    "artifacts": [{"type": "data|file|text", "ref": "uri", "mime": "..."}]
  },
  "return_contract": {
    "output_schema": { },
    "return_path": "webhook-or-polling-id",
    "acceptance_criteria": "Explicit definition of success",
    "min_confidence": 0.8
  }
}

6.3 Six-Part Analysis: The Complete Packet

Dimension Guidance
What it does Packages all information a receiving agent needs to act without re-deriving prior work
Why it's needed Structured handoffs eliminate the #1 and #2 causes of multi-agent pipeline failure: context loss and schema drift[^9][^45]
Key mechanism Compressed context + outcome-stated goal + explicit scope + typed return contract
Failure mode The packet is the contract — if the orchestrator does not validate return packets against output_schema, the contract is unenforceable
Model agnosticism Pure JSON; no framework fields in core schema; URI artifact refs abstract modality
Status Design synthesis from current best practices; not yet a single ratified standard[^8]

Source Map

Source Type Recency Trust Level
A2A Protocol Spec v0.3.0 (a2a-protocol.org)[^27] Normative spec Active 2026 ★★★★★
AWS Well-Architected Agentic AI Lens[^3][^24] Official vendor guidance 2025–2026 ★★★★★
arXiv 2505.02279 — Survey of Agent Interoperability Protocols[^35] Peer-reviewed survey May 2025 ★★★★★
Google A2A announcement + I/O 2025 updates[^46][^47][^48] Official vendor Apr–May 2025 ★★★★★
Gravity/Fast — 8 Agent Handoff Patterns[^4] Practitioner reference May 2026 ★★★★☆
agentpatterns.ai — Handoff Protocols + Scope Security[^7][^28] Pattern library 2025–2026 ★★★★☆
Inkeep Handoff Protocol / Procedural Patterns[^5] Practitioner reference Nov 2024 ★★★★☆
Lorg — Handoff Protocol Pattern[^6] Community pattern 2025–2026 ★★★☆☆
works-with-agents.dev Handoff Spec[^25] Community spec May 2026 ★★★☆☆
agentmarketcap.ai — Structured Output Reliability 2026[^9] Industry blog Apr 2026 ★★★☆☆
dev.to — JSON Parsing Problem[^16] Practitioner blog Mar 2026 ★★★☆☆
n8n community threads on double-wrapping[^11][^18][^12] Community forum 2025 ★★☆☆☆
IETF Internet-Drafts (agent task, auth, trust)[^38][^39][^42][^41] Draft proposals (not ratified) 2025–2026 ★★★☆☆ (as signals)
Corvair Agentic AI Risk Catalog R-MC-03[^45] Risk catalog 2025–2026 ★★★☆☆
"Lost in the Middle" — atlan.com / ai-tldr.dev[^21][^22] Research synthesis Jun 2026 ★★★★☆

State of the Art as of June 29, 2026

Convergent reality: The community has reached rough consensus on what a hand-off packet should contain (the fields described in Section 1) but has not agreed on a single schema format. A2A v0.3.0 specifies the transport and task lifecycle envelope; it deliberately leaves the payload content open, creating a 1-layer gap this report fills.

The dominant stack: A2A (horizontal, agent-to-agent) + MCP (vertical, agent-to-tool) is the architecture that Google, Microsoft Azure AI Foundry, AWS Bedrock AgentCore, Anthropic, Salesforce, SAP, and IBM have converged on as of mid-2026. Any new multi-agent system should be designed against this stack.[^48][^34][^3]

The open problems:

  1. No universal registry/discovery for non-enterprise agents (11 competing IETF drafts, none ratified)[^8]
  2. Grammar-constrained generation is available on OpenAI, Gemini, and open-source inference servers (vLLM, llama.cpp) but requires model-specific configuration — not yet transparently model-agnostic[^49][^50]
  3. Trust / provenance across organizational boundaries is addressed by A2A v0.3.0's signed Agent Cards (JWS per RFC 7515) and the ATN draft, but cross-org trust chains remain implementation-defined[^42][^27]
  4. Scope bounding relies entirely on the sender crafting tight instructions — there is no runtime enforcement layer at the protocol level equivalent to, say, parameterized SQL

Uncertainty flag: The IETF agent-to-agent mailing list and potential working group charter may produce an RFC that supersedes A2A or formalizes its structure. Monitor datatracker.ietf.org and the A2A GitHub at github.com/a2aproject/A2A for changes. The landscape has been moving fast enough that a six-month-old source may be materially outdated.[^51][^2]


References

  1. Agent Communication Protocol (ACP): Review, Radar ... - Tekai - Agent Communication Protocol (ACP) reviewed and rated hold on the Tekai technology radar. See analys...

  2. Agentic AI communications: Identifying the standards we ... - When it comes to standards work around agentic AI, we’re at an exciting threshold. As more tools eme...

  3. AGENTOPS01-BP02 Design multi-agent handoff procedures with ... - Without a structured context package, the receiving agent re-derives work the previous agent already...

  4. AI Agent Handoff Patterns: 8 Contracts for Passing Work ... - Eight named patterns for AI agent handoffs: state transfer, context compression, human escalation. E...

  5. Handoff Protocol and Procedural Patterns - Inkeep Open Source Docs - Structure context passing between agents with handoff packets and return packets. Use procedural pat...

  6. Handoff Protocol Pattern - Lorg - PATTERN · multi-agent, handoff, context-transfer · Score 85/100

  7. Agent Handoff Protocols: Passing Work Between Agents - Explicit contracts defining what each pipeline stage produces and expects — preventing context and i...

  8. 11 Competing IETF Drafts, 1 Expiring April 10 - Global Chat - There are 11 competing IETF drafts for AI agent discovery and policy standardization. The first one ...

  9. Structured Output Reliability Engineering for Production AI Agents in ... - How JSON schema enforcement, typed tool calls, and retry cascade patterns are eliminating the #1 cau...

  10. Agent Output Not Machine-Parseable Downstream - Your agent wraps JSON in markdown or adds prose commentary, breaking the downstream parser. Here's h...

  11. Parsing JSON Output Issues with n8n AI Agent and Structured Parser - Hello! We understand you’re facing an inconsistent output format issue when using n8n’s AI Agent mod...

  12. Issue using Structured Output Parser with Cluster AI Agent ... - I’m using the Cluster AI Agent node with a Structured Output Parser to generate technical qualificat...

  13. AI Agent Ignores Prompt Instructions, Outputs Extra Text Causing ... - Hi everyone, I'm using the AI Agent node in n8n. My goal is to have it process data and output a str...

  14. Beyond JSON Mode: Getting Reliable Structured Outputs from LLMs ... - Prompt-only JSON extraction fails 5–20% of the time in production. A practical breakdown of all four...

  15. LLMs suck at generating large, structured data. Tips on how to get ... - LLMs are great at generating text. They're terrible at generating structured data reliably. If you'v...

  16. The JSON Parsing Problem That's Killing Your AI Agent ... - Every AI agent operator hits this wall: you prompt your LLM to return structured data, and it gives....

  17. Why Most AI Agents Fail at Structured Output and How to ... - Learn why AI agents fail at structured output and master Pydantic, JSON schemas, and retry strategie...

  18. AI Agent response sometimes nests / double wraps "output ... - Describe the problem/error/question I’m seeing an intermittent bug whereby the input to my Structure...

  19. Why One AI Agent Isn't Enough: Subagent Delegation and Context ... - Long-running AI agent sessions suffer from context drift — quality degrades as the context window fi...

  20. Context Management: The Agent Reliability Variable Nobody ... - Zartis - Large context windows solved capacity but birthed 'context rot.' Discover why frontier models degrad...

  21. Lost-in-the-Middle Problem: Why Context Position Matters - Atlan - The 'lost-in-the-middle' problem occurs when LLMs prioritize the beginning and end of a context wind...

  22. The 'Lost in the Middle' Problem, Explained | AI/TLDR - Long-context models recall the start and end of a prompt better than the middle. Why it happens and ...

  23. Context Rot: Why AI Gets Worse the Longer You Chat ... - Product Talk - When the context is greater than 50% full, Veseli et al (2025) found a different pattern: Context de...

  24. AGENTREL02-BP04 Develop clear instruction protocols for ... - Ad-hoc prompts interpreted slightly differently by each model call produce unpredictable behavior, a...

  25. Handoff Protocol — Layer 4 — Works With Agents - Specification: Handoff Protocol — Layer 4. CC BY 4.0.

  26. Agent Handoff Protocol Documentation Spec for Multi-Agent AI ... - Specification for documenting agent handoff protocols in multi-agent AI systems—trigger conditions, ...

  27. Specification - A2A Protocol - The Agent2Agent protocol is an open standard that allows different AI agents to securely communicate...

  28. Narrow Task Scope as a Security Boundary for AI Agents - Narrow task scope limits attack surface and blast radius. Tight instructions force injections to con...

  29. Orchestration and handoffs | OpenAI API - Learn how to orchestrate multiple agents with handoffs and agents-as-tools in the OpenAI Agents SDK.

  30. MCP connector

  31. Feature: Per-agent MCP tool filtering to improve agent focus and accuracy · Issue #4380 · anthropics/claude-code - Main problem Claude Code shows ALL MCP tools to main agent and sub agents, causing: Decision paralys...

  32. Design Patterns for Securing LLM Agents against Prompt Injections - This new paper by 11 authors from organizations including IBM, Invariant Labs, ETH Zurich, Google an...

  33. LLM Prompt Injection Prevention - OWASP Cheat Sheet Series - Prompt injection is a vulnerability in Large Language Model (LLM) applications that allows attackers...

  34. 3 - A2A vs MCP: The Two Protocols Powering Multi-Agent AI in 2026 - From episode 2: the 2026 production default for multi-agent systems is supervisor-worker — one agent...

  35. A Survey of Agent Interoperability Protocols: Model Context ... - arXiv - Large language model (LLM)-powered autonomous agents demand robust, standardized protocols to integr...

  36. AI Agent Protocol Ecosystem Map 2026: Complete Visual - Visual ecosystem map of the AI agent protocol landscape: MCP (97M downloads), A2A (50+ partners), AC...

  37. MCP vs A2A vs agents.json: agent-protocol discovery compared - The three agent-discovery protocols compared — MCP (Model Context Protocol), A2A (Agent-to-Agent), a...

  38. Task-oriented Coordination Requirements for AI Agent Protocols - AI agent communication requires intelligent task level coordination to manage dynamic workloads acro...

  39. Task-oriented Coordination Requirements for AI Agent Protocols - AI agent communication requires intelligent task level coordination to manage dynamic workloads acro...

  40. draft-cui-ai-agent-discovery-invocation-01 - IETF Datatracker - Internet-Draft AIDIP February 2026 · This Internet-Draft is submitted in full conformance with the p...

  41. AI Agent Authentication and Authorization - IETF - This document proposes a model for authentication and authorization of AI agent interactions. It lev...

  42. Agent Trust Negotiation: Capability, Delegation, and Provenance Binding for AI Agents - This document defines the Agent Trust Negotiation (ATN) protocol. ATN sits above whatever mechanism ...

  43. Handoffs - OpenAI Agents SDK

  44. Handoff | OpenAI Agents SDK - GitHub Pages

  45. R-MC-03: Context Loss in Delegation | Agentic AI Risk Catalog - When one agent delegates work to another agent, critical context is lost with each handoff. Receivin...

  46. Announcing the Agent2Agent Protocol (A2A) - The A2A protocol will allow AI agents to communicate with each other, securely exchange information,...

  47. Google Open-Sources Agent2Agent Protocol for Agentic ... - Google released the Agent2Agent (A2A) Protocol, an open-source specification for building AI agents ...

  48. What's new with Agents: ADK, Agent Engine, and A2A Enhancements - Build advanced, intelligent agents with ease using Google's ADK, Agent Engine, and A2A protocol, des...

  49. Structured Outputs from LLMs — JSON, Pydantic & Schema ... - Force LLMs to return valid JSON, match Pydantic schemas, and integrate with typed pipelines. OpenAI,...

  50. Evaluating LLM Structured Output Modes 2026 - Future AGI - Evaluating LLM Structured Output Modes (2026). Compare OpenAI strict, Anthropic JSON, Gemini schema,...

  51. Agent2Agent (A2A) is an open protocol enabling ... - GitHub - The Agent2Agent (A2A) protocol addresses a critical challenge in the AI landscape: enabling gen AI a...

Previous
Previous

Claude :: Week11 :: Special Series :: AI Task Delegation Research :: The Model-Agnostic Hand-Off Packet: Designing Lossless Single-Task Delegation Between Different AIs

Next
Next

Perplexity :: Week10 :: Special Series :: AI Token Compression and Task Delegation Research :: Lazy-Boot Agent Design: Load Only What You Need, Defer the Rest