The insight that changes everything

Here's what you learn the hard way when you actually run dual agents: agents doing shared work don't need natural language. Every time Agent A tells Agent B something in full sentences, you're burning tokens on ceremony. The work itself — data handoffs, status flags, structured results — can travel as compressed JSON. Natural language is for humans.

This creates a two-channel architecture:

graph TB
  Human["🧑 HUMAN LAYER
  NL, context-rich, expensive"]
  Orch["ORCHESTRATOR
  routes work, aggregates results,
  translates for humans"]
  Analyst["ANALYST AGENT
  MCP: database,
  web search, calculators"]
  Writer["WRITER AGENT
  MCP: templates,
  style guides, publishing APIs"]

  Orch -- "Report: expand to NL
  (only here)" --> Human
  Orch -- "A2A DataPart
  structured JSON
  ~50-200 tokens" --> Analyst
  Orch -- "A2A DataPart
  structured JSON
  ~50-200 tokens" --> Writer

  style Human fill:#1A1A2E,stroke:#E94560,color:#E8E8EC
  style Orch fill:#1A1A2E,stroke:#E94560,color:#E8E8EC
  style Analyst fill:#0A0A0F,stroke:#8888A0,color:#E8E8EC
  style Writer fill:#0A0A0F,stroke:#8888A0,color:#E8E8EC

The token math

| Channel | Format | Token Cost | Purpose |
|---|---|---|---|
| Agent → Agent (work) | DataPart (JSON) | Low (~50-200) | Pass data, status, instructions |
| Agent → Human (report) | TextPart (NL) | High (~500-2000) | Explain what happened, provide context |

Over a 6-message workflow, structured A2A messaging saves ~83% of tokens compared to natural language. At scale (100 runs/day), that's 500K-800K tokens/day saved. Real money.
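The percentage comes straight from the workflow totals measured later in this post; a quick sketch of the arithmetic:

```python
# The 83% figure is plain arithmetic over the lab's measured totals.
nl_tokens = 6990       # estimated cost if every work message were natural language
json_tokens = 1165     # structured DataPart JSON actually spent
saved = nl_tokens - json_tokens
reduction = saved / nl_tokens
print(f"Saved {saved:,} tokens ({reduction:.0%} reduction)")   # Saved 5,825 tokens (83% reduction)

# Projected at 100 workflow runs/day
print(f"~{saved * 100:,} tokens/day")                          # ~582,500 tokens/day
```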


The lab project

tng-advanced-lab/
├── shared/
│   ├── schemas.py          # Shared data schemas (the "contract")
│   └── token_counter.py    # Token counting utility
├── agents/
│   ├── analyst_agent.py    # Specialist: research & analysis (A2A server)
│   └── writer_agent.py     # Specialist: content production (A2A server)
├── orchestrator.py          # Coordinator + human reporting layer
└── run_all.py               # Convenience launcher
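
The `token_counter.py` utility isn't shown in this post; a minimal sketch of what it could look like (the function name and the ~4-chars-per-token heuristic are assumptions, not the lab's actual code):

```python
# shared/token_counter.py — hypothetical sketch, not the lab's real code.
# Production code would use tiktoken or the model provider's tokenizer;
# ~4 characters per token is a common rough heuristic for English and JSON.

def count_tokens(text: str) -> int:
    """Approximate token count for a string of prose or serialized JSON."""
    return max(1, len(text) // 4)

# Compare the two channels carrying the same fact:
structured = '{"analysis_id":"a1","score":8}'
narrated = "I finished the analysis (id a1); the opportunity scored 8 out of 10."
print(count_tokens(structured), "vs", count_tokens(narrated))
```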

Shared Schemas — The Agent Contract

Both agents agree on data shapes upfront. No natural language needed — just structured JSON that both sides understand. This is the foundation.

"""
Shared data schemas for inter-agent communication.

THE KEY INSIGHT: These schemas ARE the agent-to-agent language.
Instead of Agent A writing "I found 3 market opportunities in the
AI agent space..." (800 tokens), it sends:
{
    "analysis_id": "a1",
    "opportunities": [
        {"name": "...", "market_size": "...", "score": 8}
    ],
    "token_cost": 120
}
Same information. 85% fewer tokens.
"""

from dataclasses import dataclass, asdict
import json

@dataclass
class MarketSignal:
    """A single market observation — compressed to essentials."""
    topic: str
    signal_type: str          # "opportunity" | "threat" | "shift" | "trend"
    confidence: float         # 0.0 - 1.0
    evidence: str             # One-line source/proof
    impact_score: int         # 1-10
    time_horizon: str         # "now" | "3mo" | "6mo" | "12mo+"

@dataclass
class AnalysisResult:
    """The Analyst Agent's complete output — structured, not narrated."""
    analysis_id: str
    query: str
    signals: list             # List of MarketSignal dicts
    summary_stats: dict
    recommended_action: str
    metadata: dict

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=None, separators=(',', ':'))

@dataclass
class WriteRequest:
    """Instructions for the Writer Agent — structured brief, not prose."""
    request_id: str
    content_type: str         # "newsletter_section" | "tweet" | "summary"
    analysis_data: dict
    tone: str                 # "builder" | "executive" | "technical"
    max_words: int
    format_hints: list

@dataclass
class TokenLedger:
    """Tracks token spending across channels."""
    agent_to_agent_tokens: int = 0    # Structured JSON (cheap)
    agent_to_human_tokens: int = 0    # Natural language (expensive)

    @property
    def total(self):
        return self.agent_to_agent_tokens + self.agent_to_human_tokens

    @property
    def efficiency_ratio(self):
        if self.total == 0:
            return 0.0
        return self.agent_to_agent_tokens / self.total
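
The contract in miniature, as a standalone sketch (the payload values are invented): the sender serializes compact JSON, and the receiver reads fields, never prose.

```python
import json

# What goes over the wire is compact JSON... (example values are invented)
payload = {
    "analysis_id": "a1",
    "signals": [{"topic": "AI agent tooling", "impact_score": 8,
                 "signal_type": "opportunity", "time_horizon": "6mo"}],
    "recommended_action": "Prioritize 'AI agent tooling'",
}
wire = json.dumps(payload, separators=(',', ':'))

# ...and the receiving agent needs no NLP — just field access on agreed shapes.
received = json.loads(wire)
top = max(received["signals"], key=lambda s: s["impact_score"])
print(top["topic"], "~", len(wire) // 4, "tokens (at ~4 chars/token)")
```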

The Analyst Agent (A2A Server)

The critical design choice: when this agent returns results to the orchestrator, it sends structured DataPart JSON — not a natural language summary. The orchestrator gets pure data. Zero wasted tokens.

In production, replace the simulated tools with real MCP servers: mcp-server-sqlite for database access, mcp-server-brave-search for research, custom MCP servers wrapping financial data APIs.

# The key pattern — return DataPart, not TextPart:

result_data = {
    "analysis_id": f"analysis-{uuid.uuid4().hex[:8]}",
    "query": query_text,
    "signals": signals,       # Structured list, not paragraphs
    "summary_stats": stats,   # Numbers, not prose
    "recommended_action": f"Prioritize '{top_opp}' — highest impact",
    "metadata": {
        "processing_time_ms": processing_time,
        "sources_checked": len(signals),
        "agent": "analyst-v1"
    }
}

task.artifacts = [
    Artifact(
        name="analysis-result",
        parts=[
            # DataPart = structured JSON (low tokens, machine-readable)
            Part(root=DataPart(data=result_data))
        ]
    )
]
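
On the other end, the orchestrator pulls the dict straight out of the artifact. Here's a sketch of that receiving side, written with stand-in dataclasses so it runs by itself; with the real SDK these would be the same `Artifact`, `Part`, and `DataPart` types used above, and attribute names may differ across SDK versions.

```python
from dataclasses import dataclass

# Stand-in types mirroring the a2a shapes used in the server snippet above,
# redeclared here only so this sketch runs standalone.
@dataclass
class DataPart:
    data: dict

@dataclass
class Part:
    root: object   # the union member: DataPart, TextPart, ...

@dataclass
class Artifact:
    name: str
    parts: list

artifact = Artifact(
    name="analysis-result",
    parts=[Part(root=DataPart(data={"analysis_id": "a1",
                                    "summary_stats": {"signal_count": 4}}))],
)

# Orchestrator side: structured data comes straight off the artifact —
# there is no natural-language parsing step anywhere in the work channel.
for part in artifact.parts:
    if isinstance(part.root, DataPart):
        result = part.root.data
        print(result["analysis_id"], result["summary_stats"]["signal_count"])
```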

Three terminals, one system

# Terminal 1 — Analyst Agent (A2A server on port 8001)
python agents/analyst_agent.py

# Terminal 2 — Writer Agent (A2A server on port 8002)
python agents/writer_agent.py

# Terminal 3 — Orchestrator (A2A client)
python orchestrator.py
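
`run_all.py` isn't shown in this post; a minimal launcher sketch (only the file names come from the project tree above — the rest is a hypothetical implementation):

```python
# run_all.py — convenience launcher sketch (hypothetical implementation).
import subprocess
import sys
import time

SERVERS = ["agents/analyst_agent.py", "agents/writer_agent.py"]

def launch_all(readiness_wait: float = 2.0) -> int:
    """Start both A2A servers, run the orchestrator, then clean up."""
    procs = [subprocess.Popen([sys.executable, path]) for path in SERVERS]
    time.sleep(readiness_wait)  # crude; poll the agents' card URLs in practice
    try:
        return subprocess.run([sys.executable, "orchestrator.py"]).returncode
    finally:
        for proc in procs:
            proc.terminate()
            proc.wait()
```

Wrapped in the usual `if __name__ == "__main__":` guard, this collapses the three terminals into one command.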

What you should see

🎯 The New Guard — Advanced Lab 3
   Dual-Agent Architecture: Structured Work + Human Reporting
============================================================

📡 Phase 1: Agent Discovery via A2A...
   ✅ Analyst Agent  →  Skills: ['Market Signal Analysis']
   ✅ Writer Agent   →  Skills: ['Content Producer']

🔧 Phase 2: Agent↔Agent Work Channel (structured JSON)...
   📤 → Analyst: query (12 tokens)
   📥 ← Analyst: 4 signals (248 tokens)
   📤 → Writer: newsletter_section brief (285 tokens)
   📥 ← Writer: 67w newsletter_section (198 tokens)

📋 Phase 3: Human Report (natural language expansion)...
   THIS is the only place we pay for natural language.

════════════════════════════════════════════════════════
  📊 TOKEN EFFICIENCY REPORT
════════════════════════════════════════════════════════
  Agent↔Agent (structured JSON):      1165 tokens
  Agent→Human (natural language):      487 tokens
  TOTAL spent:                        1652 tokens
────────────────────────────────────────────────────────
  💡 If agent↔agent used natural language: ~6,990 tokens
     Structured JSON actually used:        ~1,165 tokens
     SAVED: ~5,825 tokens (83% reduction)
════════════════════════════════════════════════════════

The architecture principles

  1. Agents doing shared work should never generate natural language for each other. JSON is the language of work. NL is the language of reporting.
  2. A2A's DataPart was designed for this. It's the structured data channel that makes agent-to-agent communication token-efficient by design.
  3. The human reporting layer is a distinct architectural concern. It sits at the boundary between the agent world and the human world. The only place NL expansion happens.
  4. Schemas are the contract. When both agents agree on data shapes upfront, you eliminate parsing ambiguity. The data is the meaning.
  5. This pattern scales. With 3 agents exchanging 10 messages each: ~5K-8K tokens saved per workflow. At 100 runs/day: 500K-800K tokens/day. Real money.

Extending to production

  1. Replace simulated MCP tools with real ones: mcp-server-sqlite, mcp-server-fetch, mcp-server-brave-search
  2. Add real LLM reasoning in agents — internal calls stay within the agent; inter-agent messages stay structured JSON
  3. Containerize with Docker — each agent becomes a container, Agent Cards served at /.well-known/agent.json
  4. Add streaming — switch from message/send to message/stream for real-time status via SSE
  5. Add auth — Agent Cards declare auth scheme, short-lived tokens per task
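
For steps 3-5, each containerized agent serves an Agent Card describing its endpoint, capabilities, and auth. A sketch of the Analyst's card as a Python dict — field names follow the public A2A spec, but the exact schema varies by spec/SDK version, and the values here are assumptions for this lab:

```python
# Sketch of the Analyst's Agent Card, served at /.well-known/agent.json.
# Field names follow the A2A spec; verify against your SDK version.
ANALYST_AGENT_CARD = {
    "name": "Analyst Agent",
    "description": "Market signal research & analysis",
    "url": "http://localhost:8001/",
    "version": "1.0.0",
    "capabilities": {"streaming": True},        # step 4: message/stream via SSE
    "authentication": {"schemes": ["bearer"]},  # step 5: short-lived tokens per task
    "defaultInputModes": ["application/json"],
    "defaultOutputModes": ["application/json"],
    "skills": [{
        "id": "market-signal-analysis",
        "name": "Market Signal Analysis",
        "description": "Returns structured MarketSignal data via DataPart",
    }],
}
print(ANALYST_AGENT_CARD["name"], "→", ANALYST_AGENT_CARD["url"])
```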