The insight that changes everything
Here's what you learn the hard way when you actually run dual agents: agents doing shared work don't need natural language. Every time Agent A tells Agent B something in full sentences, you're burning tokens on ceremony. The work itself — data handoffs, status flags, structured results — can travel as compressed JSON. Natural language is for humans.
This creates a two-channel architecture:
```mermaid
graph TB
    Human["🧑 HUMAN LAYER<br/>NL, context-rich, expensive"]
    Orch["ORCHESTRATOR<br/>routes work, aggregates results,<br/>translates for humans"]
    Analyst["ANALYST AGENT<br/>MCP: database,<br/>web search, calculators"]
    Writer["WRITER AGENT<br/>MCP: templates,<br/>style guides, publishing APIs"]

    Orch -- "Report: expand to NL<br/>(only here)" --> Human
    Orch -- "A2A DataPart<br/>structured JSON<br/>~50-200 tokens" --> Analyst
    Orch -- "A2A DataPart<br/>structured JSON<br/>~50-200 tokens" --> Writer

    style Human fill:#1A1A2E,stroke:#E94560,color:#E8E8EC
    style Orch fill:#1A1A2E,stroke:#E94560,color:#E8E8EC
    style Analyst fill:#0A0A0F,stroke:#8888A0,color:#E8E8EC
    style Writer fill:#0A0A0F,stroke:#8888A0,color:#E8E8EC
```
The token math
| Channel | Format | Token Cost | Purpose |
|---|---|---|---|
| Agent → Agent (work) | DataPart (JSON) | Low (~50-200) | Pass data, status, instructions |
| Agent → Human (report) | TextPart (NL) | High (~500-2000) | Explain what happened, provide context |
Over a 6-message workflow, structured A2A messaging saves ~83% of tokens compared to natural language. At scale (100 runs/day), that's 500K-800K tokens/day saved. Real money.
The lab project
tng-advanced-lab/
├── shared/
│ ├── schemas.py # Shared data schemas (the "contract")
│ └── token_counter.py # Token counting utility
├── agents/
│ ├── analyst_agent.py # Specialist: research & analysis (A2A server)
│ └── writer_agent.py # Specialist: content production (A2A server)
├── orchestrator.py # Coordinator + human reporting layer
└── run_all.py # Convenience launcher
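The token figures throughout this lab come from shared/token_counter.py. Here is a minimal sketch of what that utility could look like, assuming tiktoken is installed; the encoding choice and helper names are illustrative, not the lab's exact code:

```python
# shared/token_counter.py (illustrative sketch, not the lab's exact code)
import json

import tiktoken

# cl100k_base is a common choice; swap in your model's own tokenizer if it differs
_ENCODING = tiktoken.get_encoding("cl100k_base")


def count_tokens(text: str) -> int:
    """Count tokens in a string with the configured encoding."""
    return len(_ENCODING.encode(text))


def count_message_tokens(payload: dict | str) -> int:
    """Count tokens for an A2A payload; dicts are serialized compactly first."""
    if isinstance(payload, dict):
        payload = json.dumps(payload, separators=(",", ":"))
    return count_tokens(payload)
```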
Shared Schemas — The Agent Contract
Both agents agree on data shapes upfront. No natural language needed — just structured JSON that both sides understand. This is the foundation.
"""
Shared data schemas for inter-agent communication.
THE KEY INSIGHT: These schemas ARE the agent-to-agent language.
Instead of Agent A writing "I found 3 market opportunities in the
AI agent space..." (800 tokens), it sends:
{
"analysis_id": "a1",
"opportunities": [
{"name": "...", "market_size": "...", "score": 8}
],
"token_cost": 120
}
Same information. 85% fewer tokens.
"""
from dataclasses import dataclass, asdict
import json
@dataclass
class MarketSignal:
"""A single market observation — compressed to essentials."""
topic: str
signal_type: str # "opportunity" | "threat" | "shift" | "trend"
confidence: float # 0.0 - 1.0
evidence: str # One-line source/proof
impact_score: int # 1-10
time_horizon: str # "now" | "3mo" | "6mo" | "12mo+"
@dataclass
class AnalysisResult:
"""The Analyst Agent's complete output — structured, not narrated."""
analysis_id: str
query: str
signals: list # List of MarketSignal dicts
summary_stats: dict
recommended_action: str
metadata: dict
def to_json(self) -> str:
return json.dumps(asdict(self), indent=None, separators=(',', ':'))
@dataclass
class WriteRequest:
"""Instructions for the Writer Agent — structured brief, not prose."""
request_id: str
content_type: str # "newsletter_section" | "tweet" | "summary"
analysis_data: dict
tone: str # "builder" | "executive" | "technical"
max_words: int
format_hints: list
@dataclass
class TokenLedger:
"""Tracks token spending across channels."""
agent_to_agent_tokens: int = 0 # Structured JSON (cheap)
agent_to_human_tokens: int = 0 # Natural language (expensive)
@property
def total(self):
return self.agent_to_agent_tokens + self.agent_to_human_tokens
@property
def efficiency_ratio(self):
if self.total == 0: return 0
return self.agent_to_agent_tokens / self.total
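To sanity-check the compression claim, here is a small usage sketch built on the schemas above. The field values are made up for illustration, and count_tokens is the hypothetical helper from the token counter sketch:

```python
# Build one AnalysisResult and measure the wire cost of its compact JSON form
from dataclasses import asdict

from shared.schemas import AnalysisResult, MarketSignal
from shared.token_counter import count_tokens

signal = MarketSignal(
    topic="AI agent tooling",
    signal_type="opportunity",
    confidence=0.8,
    evidence="Several A2A-compatible frameworks shipped this quarter",
    impact_score=8,
    time_horizon="6mo",
)

result = AnalysisResult(
    analysis_id="a1",
    query="Where is the AI agent market heading?",
    signals=[asdict(signal)],
    summary_stats={"signal_count": 1, "avg_impact": 8.0},
    recommended_action="Prioritize 'AI agent tooling'",
    metadata={"agent": "analyst-v1"},
)

compact = result.to_json()      # compact JSON, no indentation or extra spaces
print(count_tokens(compact))    # typically well under 200 tokens
```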
The Analyst Agent (A2A Server)
The critical design choice: when this agent returns results to the orchestrator, it sends structured DataPart JSON — not a natural language summary. The orchestrator gets pure data. Zero wasted tokens.
In production, replace the simulated tools with real MCP servers: mcp-server-sqlite for database access, mcp-server-brave-search for research, custom MCP servers wrapping financial data APIs.
# The key pattern — return DataPart, not TextPart:
result_data = {
"analysis_id": f"analysis-{uuid.uuid4().hex[:8]}",
"query": query_text,
"signals": signals, # Structured list, not paragraphs
"summary_stats": stats, # Numbers, not prose
"recommended_action": f"Prioritize '{top_opp}' — highest impact",
"metadata": {
"processing_time_ms": processing_time,
"sources_checked": len(signals),
"agent": "analyst-v1"
}
}
task.artifacts = [
Artifact(
name="analysis-result",
parts=[
# DataPart = structured JSON (low tokens, machine-readable)
Part(root=DataPart(data=result_data))
]
)
]
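On the receiving end, the orchestrator can read that DataPart straight off the artifact and turn it into a WriteRequest with no prose in between. A sketch of that handoff, assuming the Part(root=DataPart(...)) structure shown above; exact attribute paths may differ across a2a-sdk versions:

```python
# Orchestrator side: consume the Analyst's DataPart and brief the Writer.
# Attribute access mirrors Part(root=DataPart(data=...)) above; adjust to your SDK version.
import uuid
from dataclasses import asdict

from shared.schemas import WriteRequest


def build_write_request(task) -> dict:
    # Pull the structured payload straight off the analysis artifact
    analysis_data = task.artifacts[0].parts[0].root.data

    brief = WriteRequest(
        request_id=f"write-{uuid.uuid4().hex[:8]}",
        content_type="newsletter_section",
        analysis_data=analysis_data,     # pass the data through untouched
        tone="builder",
        max_words=120,
        format_hints=["lead with the top opportunity", "one stat per signal"],
    )
    # Ready to ship as another DataPart: still no natural language on the wire
    return asdict(brief)
```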
Three terminals, one system
# Terminal 1 — Analyst Agent (A2A server on port 8001)
python agents/analyst_agent.py
# Terminal 2 — Writer Agent (A2A server on port 8002)
python agents/writer_agent.py
# Terminal 3 — Orchestrator (A2A client)
python orchestrator.py
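If you'd rather not juggle three terminals, run_all.py handles the launching. A rough subprocess-based sketch; treat the startup delay and other details as illustrative:

```python
# run_all.py (convenience launcher): an illustrative sketch
import subprocess
import sys
import time

AGENTS = [
    [sys.executable, "agents/analyst_agent.py"],   # A2A server on port 8001
    [sys.executable, "agents/writer_agent.py"],    # A2A server on port 8002
]


def main() -> None:
    procs = [subprocess.Popen(cmd) for cmd in AGENTS]
    time.sleep(3)  # crude wait for the A2A servers to come up; a health check is nicer
    try:
        subprocess.run([sys.executable, "orchestrator.py"], check=True)
    finally:
        for proc in procs:
            proc.terminate()


if __name__ == "__main__":
    main()
```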
What you should see
🎯 The New Guard — Advanced Lab 3
Dual-Agent Architecture: Structured Work + Human Reporting
============================================================
📡 Phase 1: Agent Discovery via A2A...
✅ Analyst Agent → Skills: ['Market Signal Analysis']
✅ Writer Agent → Skills: ['Content Producer']
🔧 Phase 2: Agent↔Agent Work Channel (structured JSON)...
📤 → Analyst: query (12 tokens)
📥 ← Analyst: 4 signals (248 tokens)
📤 → Writer: newsletter_section brief (285 tokens)
📥 ← Writer: 67w newsletter_section (198 tokens)
📋 Phase 3: Human Report (natural language expansion)...
THIS is the only place we pay for natural language.
════════════════════════════════════════════════════════
📊 TOKEN EFFICIENCY REPORT
════════════════════════════════════════════════════════
Agent↔Agent (structured JSON): 1165 tokens
Agent→Human (natural language): 487 tokens
TOTAL spent: 1652 tokens
────────────────────────────────────────────────────────
💡 If agent↔agent used natural language: ~6,990 tokens
Structured JSON actually used: ~1,165 tokens
SAVED: ~5,825 tokens (83% reduction)
════════════════════════════════════════════════════════
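Phase 3 is the single point where structured data becomes prose. A sketch of how the orchestrator might do that expansion while logging both channels in the TokenLedger; expand_for_human is a hypothetical stand-in for whatever LLM call you use:

```python
# Orchestrator, Phase 3: the only place natural language gets generated (sketch).
from shared.schemas import TokenLedger
from shared.token_counter import count_message_tokens  # hypothetical helper sketched earlier

ledger = TokenLedger()


def report_to_human(analysis_data: dict, draft_text: str, expand_for_human) -> str:
    # Work channel: the structured payload that traveled agent-to-agent
    ledger.agent_to_agent_tokens += count_message_tokens(analysis_data)

    # Reporting channel: one NL expansion, paid for the human reader only
    report = expand_for_human(analysis_data, draft_text)
    ledger.agent_to_human_tokens += count_message_tokens(report)

    print(f"Structured share of spend: {ledger.efficiency_ratio:.0%}")
    return report
```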
The architecture principles
- Agents doing shared work should never generate natural language for each other. JSON is the language of work. NL is the language of reporting.
- A2A's DataPart was designed for this. It's the structured data channel that makes agent-to-agent communication token-efficient by design.
- The human reporting layer is a distinct architectural concern. It sits at the boundary between the agent world and the human world. The only place NL expansion happens.
- Schemas are the contract. When both agents agree on data shapes upfront, you eliminate parsing ambiguity. The data is the meaning.
- This pattern scales. With 3 agents exchanging 10 messages each: ~5K-8K tokens saved per workflow. At 100 runs/day: 500K-800K tokens/day. Real money.
Extending to production
- Replace simulated MCP tools with real ones: `mcp-server-sqlite`, `mcp-server-fetch`, `mcp-server-brave-search`
- Add real LLM reasoning in agents — internal calls stay within the agent; inter-agent messages stay structured JSON
- Containerize with Docker — each agent becomes a container, with its Agent Card served at `/.well-known/agent.json` (see the discovery sketch after this list)
- Add streaming — switch from `message/send` to `message/stream` for real-time status via SSE
- Add auth — Agent Cards declare the auth scheme; use short-lived tokens per task
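Whatever the deployment looks like, discovery stays the same: fetch each agent's card from its well-known URL. A hedged sketch using httpx; the ports match this lab, and the Agent Card field names may vary by A2A version:

```python
# Phase 1 style discovery against containerized (or local) agents: a sketch
import httpx

AGENT_URLS = ["http://localhost:8001", "http://localhost:8002"]


def discover_agents() -> list[dict]:
    cards = []
    for base in AGENT_URLS:
        resp = httpx.get(f"{base}/.well-known/agent.json", timeout=5.0)
        resp.raise_for_status()
        card = resp.json()
        skills = [skill.get("name") for skill in card.get("skills", [])]
        print(f"✅ {card.get('name', base)} → Skills: {skills}")
        cards.append(card)
    return cards
```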