#14 · Part 4 · 2025-07-28 · 30 min

Multi-Agent Orchestration: When 6 AIs Build Your Codebase

How we coordinated 6 Claude Code instances to refactor 17 blockchain servers simultaneously


Historical Context (November 2025): Built July 2, 2025—one week after Anthropic released DXT. This documents our approach to coordinating 6 AI agents for parallel blockchain server development when manual building was the only established method. We were among the first wave of developers building production MCP infrastructure, inventing coordination patterns in real-time as the packaging ecosystem was being born.

Date: July 28, 2025 · Built: Early July 2025 · Inspired by: McKay Wrigley & Jason Kneen’s tmux multi-agent setups · Author: Myron Koch & Claude Code · Category: AI Development Workflows

The Inspiration

Credit where it’s due: McKay Wrigley and Jason Kneen pioneered the tmux multi-agent pattern.

I saw their setups - multiple Claude Code instances running in tmux windows, each working on different parts of the codebase simultaneously. It was brilliant but had one problem: no orchestration layer.

Agents would step on each other's files, duplicate work, and lose track of who owned what. There was no shared protocol for coordination.

The insight: What if we built an MCP server specifically to orchestrate multi-agent workflows?

The Scaling Problem

We had a problem: 17 blockchain MCP servers to refactor to the MBPS v2.1 standard, with a solo estimate of roughly 32 days of work.

Solution: Run 6 Claude Code instances in parallel with orchestration.

Agent 1: Ethereum + Polygon servers
Agent 2: Solana + Bitcoin servers
Agent 3: Cosmos + Osmosis servers
Agent 4: BSC + Avalanche servers
Agent 5: Testing infrastructure
Agent 6: Documentation + validation

The Setup

Hardware Requirements

# What we needed
- MacBook Pro M3 Max
- 128GB RAM (yes, really)
- 2TB SSD (for all the git branches)
- 6 terminal windows
- Lots of coffee

# Resource monitoring
top -o MEM
# 6 Claude Code instances: ~45GB RAM
# 17 server dev environments: ~8GB RAM
# VSCode + browsers: ~15GB RAM
# Total: ~70GB actively used

Directory Structure

workspace/
├── agent-1-ethereum/
│   ├── ethereum-sepolia-mcp-server/
│   └── polygon-amoy-mcp-server/
├── agent-2-solana/
│   ├── solana-devnet-mcp-server/
│   └── bitcoin-testnet-mcp-server/
├── agent-3-cosmos/
│   ├── cosmos-theta-mcp-server/
│   └── osmosis-mcp-server/
├── agent-4-bsc/
│   ├── bsc-testnet-mcp-server/
│   └── avalanche-fuji-mcp-server/
├── agent-5-testing/
│   └── test-infrastructure/
└── agent-6-docs/
    └── documentation/

Each agent got its own workspace to prevent file conflicts.

The Coordination Protocol

Rule 1: Clear Boundaries

// agents-config.json
{
  "agent-1": {
    "name": "Ethereum Specialist",
    "servers": ["ethereum-sepolia", "polygon-amoy"],
    "allowed_files": [
      "servers/testnet/ethereum-sepolia-mcp-server/**",
      "servers/testnet/polygon-amoy-mcp-server/**"
    ],
    "forbidden_files": [
      "servers/testnet/solana-*/**",
      "servers/testnet/bitcoin-*/**",
      "servers/mainnet/**"
    ],
    "primary_tasks": [
      "Refactor to modular architecture",
      "Implement MBPS v2.1 tools",
      "TypeScript migration"
    ]
  },
  // ... 5 more agent configs
}
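
Boundaries only help if they are enforced. Here is a minimal sketch of a pre-edit check against this config; the function names and the simplified glob handling (`*` and `**` only) are ours, not from any real tool:

```typescript
// Hypothetical boundary check against agents-config.json entries.
// Glob support is deliberately minimal: "*" matches within a path
// segment, "**" matches across segments.

function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters
    .replace(/\*\*/g, "\u0000")           // stash "**" before handling "*"
    .replace(/\*/g, "[^/]*")              // "*": within one segment
    .replace(/\u0000/g, ".*");            // "**": across segments
  return new RegExp(`^${escaped}$`);
}

function canTouch(file: string, allowed: string[], forbidden: string[]): boolean {
  if (forbidden.some((g) => globToRegExp(g).test(file))) return false;
  return allowed.some((g) => globToRegExp(g).test(file));
}

// Agent 1 may edit its Ethereum server but not Solana's:
const allowed = ["servers/testnet/ethereum-sepolia-mcp-server/**"];
const forbidden = ["servers/testnet/solana-*/**", "servers/mainnet/**"];
```

Running a check like this before every file write turns `forbidden_files` from a suggestion into a hard stop.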

Rule 2: Communication Channel

We used a shared agent-status.md file:

# Multi-Agent Status Board
Last Updated: 2025-07-07 14:32:00

## Agent 1 (Ethereum/Polygon)
- **Status**: Working on Ethereum tool extraction
- **Progress**: 60% complete
- **Current File**: ethereum-sepolia-mcp-server/src/tools/core/eth-get-balance.ts
- **ETA**: 2 hours to modular completion
- **Blockers**: None
- **Next**: Polygon server after Ethereum done

## Agent 2 (Solana/Bitcoin)
- **Status**: Testing Solana token transfers
- **Progress**: 75% complete
- **Current File**: solana-devnet-mcp-server/tests/tokens.test.ts
- **ETA**: 1.5 hours to testing completion
- **Blockers**: Waiting for Agent 5's test framework
- **Next**: Bitcoin UTXO handling

## Agent 3 (Cosmos/Osmosis)
- **Status**: Implementing Osmosis staking tools
- **Progress**: 90% complete
- **Current File**: osmosis-mcp-server/src/tools/staking/osmo-delegate.ts
- **ETA**: 30 minutes
- **Blockers**: None
- **Next**: Cosmos IBC transfers

## Agent 4 (BSC/Avalanche)
- **Status**: BSC NFT tools complete, starting Avalanche
- **Progress**: 45% complete
- **Current File**: avalanche-fuji-mcp-server/src/index.ts
- **ETA**: 3 hours
- **Blockers**: None
- **Next**: Avalanche C-Chain integration

## Agent 5 (Testing)
- **Status**: Building universal test suite
- **Progress**: 80% complete
- **Current File**: test-infrastructure/smoke-tests.ts
- **ETA**: 1 hour
- **Blockers**: None
- **Next**: Deploy tests to all servers

## Agent 6 (Docs/Validation)
- **Status**: Writing MBPS v2.1 compliance guide
- **Progress**: 70% complete
- **Current File**: docs/mbps-v2.1-migration-guide.md
- **ETA**: 2 hours
- **Blockers**: None
- **Next**: Validate all servers against standard

---
**Update this file after every major milestone!**
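
A status board only works if it stays fresh. Here is a small sketch (ours, not part of the actual setup) that flags a stale board, assuming the exact "Last Updated:" format above:

```typescript
// Flags a stale status board. Assumes the "Last Updated: YYYY-MM-DD HH:MM:SS"
// line format shown above; the 30-minute default matches our update cadence.

function lastUpdatedFrom(statusMd: string): Date | null {
  const m = statusMd.match(/^Last Updated:\s*(.+)$/m);
  return m ? new Date(m[1].trim().replace(" ", "T")) : null;
}

function isStale(statusMd: string, now: Date, maxMinutes = 30): boolean {
  const updated = lastUpdatedFrom(statusMd);
  if (!updated || Number.isNaN(updated.getTime())) return true; // unreadable counts as stale
  return (now.getTime() - updated.getTime()) / 60_000 > maxMinutes;
}
```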

Rule 3: Git Workflow

# Each agent gets its own feature branch
git checkout -b agent-1/ethereum-polygon-refactor
git checkout -b agent-2/solana-bitcoin-refactor
git checkout -b agent-3/cosmos-osmosis-refactor
git checkout -b agent-4/bsc-avalanche-refactor
git checkout -b agent-5/testing-infrastructure
git checkout -b agent-6/documentation

# Agents never merge directly to main
# Human orchestrator (me) reviews and merges
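
A convention this simple is easy to check mechanically. Here is a sketch of a branch-name guard you could wire into a pre-push hook; the regex is our reading of the convention, not an existing script:

```typescript
// Validates the agent-N/kebab-case-description branch convention.
// The 1-6 agent range matches this particular setup.

const AGENT_BRANCH = /^agent-[1-6]\/[a-z0-9]+(-[a-z0-9]+)*$/;

function isAgentBranch(branch: string): boolean {
  return AGENT_BRANCH.test(branch);
}
```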

The Orchestration Commands

I gave each agent specific, bounded instructions:

Agent 1 Command

You are Agent 1: Ethereum/Polygon Specialist

Your ONLY responsibility:
1. Refactor ethereum-sepolia-mcp-server to modular architecture
2. Refactor polygon-amoy-mcp-server to modular architecture
3. Implement all 25 MBPS v2.1 core tools for both
4. Migrate both to TypeScript
5. Create comprehensive test suites

Files you can modify:
- servers/testnet/ethereum-sepolia-mcp-server/**
- servers/testnet/polygon-amoy-mcp-server/**

Files you CANNOT touch:
- Any other servers
- Shared documentation (Agent 6 handles this)
- Test infrastructure (Agent 5 handles this)

Update agent-status.md every 30 minutes with your progress.

When blocked, write to agent-blockers.md with:
- What you're blocked on
- Which agent can unblock you
- Estimated impact if not unblocked

Begin with ethereum-sepolia-mcp-server tool extraction.

Agent 5 Command (Testing Specialist)

You are Agent 5: Testing Infrastructure Specialist

Your ONLY responsibility:
1. Create universal test suite that works for ALL blockchain servers
2. Implement smoke tests, integration tests, core tests
3. Create test utilities and mocks
4. Set up Jest configuration templates
5. Deploy tests to all 17 servers

You are a SUPPORT agent. Your work enables other agents.

Priority order:
1. Create core test infrastructure first
2. Deploy to Agent 1's servers (Ethereum/Polygon)
3. Deploy to Agent 2's servers (Solana/Bitcoin)
4. Continue deployment as agents request

Communicate via agent-status.md when test framework is ready.

Agent 6 Command (Documentation Specialist)

You are Agent 6: Documentation & Validation Specialist

Your ONLY responsibility:
1. Write comprehensive MBPS v2.1 migration guides
2. Create tool implementation templates
3. Document naming conventions
4. Build validation scripts
5. Review all agents' work for MBPS compliance

You are a QUALITY CONTROL agent.

Your workflow:
1. Create documentation first (guides, templates)
2. Review each agent's PRs before merge
3. Run validation scripts on all servers
4. Document any deviations from standard

You have VETO power over any PR that violates MBPS v2.1.

The Communication Patterns

Pattern 1: Status Updates

Every agent updated agent-status.md every 30 minutes:

## Agent 3 (Cosmos/Osmosis) - 14:00 Update
- Completed: Osmosis staking delegation tool
- Testing: Tool registration in index.ts
- Found: Bug in validator address validation
- Fixed: Added Bech32 validation
- Next: Implement undelegation tool
- ETA: 20 minutes

Pattern 2: Blocker Protocol

When an agent got blocked, they wrote to agent-blockers.md:

# Active Blockers

## Blocker #1 (Agent 2 → Agent 5)
- **Reporter**: Agent 2 (Solana/Bitcoin)
- **Blocked On**: Test framework for SPL token transfers
- **Impact**: Cannot validate token transfer implementation
- **Needs**: Jest mock for @solana/spl-token Connection
- **ETA if unblocked**: Can complete testing in 1 hour
- **Status**: URGENT - blocking 3 tools

## Blocker #2 (Agent 4 → Agent 6)
- **Reporter**: Agent 4 (BSC/Avalanche)
- **Blocked On**: Clarification on BEP-20 tool naming
- **Impact**: Low - can proceed with tentative naming
- **Needs**: Confirmation: `bsc_transfer_token` or `bsc_transfer_bep20`?
- **ETA if unblocked**: Immediate
- **Status**: LOW PRIORITY

I (human orchestrator) would then:

  1. Read blocker file every 15 minutes
  2. Route urgent blockers to the right agent
  3. Make decisions on naming/architectural questions
  4. Unblock agents ASAP
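
Step 2, routing urgent blockers first, is mechanical enough to sketch. This parser assumes the agent-blockers.md layout shown above; the code itself is illustrative, not something we shipped:

```typescript
// Parses agent-blockers.md sections and sorts URGENT entries first,
// assuming the "## Blocker ..." / "- **Status**: ..." layout above.

interface Blocker {
  title: string;
  status: string;
}

function parseBlockers(md: string): Blocker[] {
  return md
    .split(/^## /m)
    .slice(1) // drop everything before the first "## " heading
    .filter((s) => s.startsWith("Blocker"))
    .map((s) => {
      const title = s.split("\n")[0].trim();
      const m = s.match(/\*\*Status\*\*:\s*(.+)/);
      return { title, status: m ? m[1].trim() : "UNKNOWN" };
    });
}

function triage(md: string): Blocker[] {
  const rank = (b: Blocker) => (b.status.startsWith("URGENT") ? 0 : 1);
  return parseBlockers(md).sort((a, b) => rank(a) - rank(b)); // stable sort
}
```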

Pattern 3: Merge Requests

Agents signaled completion via agent-ready-for-review.md:

# Ready for Review

## Agent 1: Ethereum Server Complete ✅
- **Branch**: agent-1/ethereum-polygon-refactor
- **Commit**: a7f3d9e
- **Changes**:
  - 47 tools implemented
  - Full TypeScript conversion
  - Test coverage: 94%
  - MBPS v2.1 compliant: YES
- **Validation**:
  - All tests passing: ✅
  - MCP inspector verified: ✅
  - Tool naming validated: ✅
  - No console.log statements: ✅
- **Ready for merge**: YES
- **Agent 6 review**: APPROVED

## Agent 2: Solana Server Complete ✅
- **Branch**: agent-2/solana-bitcoin-refactor
- **Commit**: b4e8c3a
[... similar details]

The Conflicts We Hit

Conflict 1: Shared Utility Functions

Problem:

Agent 1: Created src/utils/logger.ts in Ethereum server
Agent 2: Created src/utils/logger.ts in Solana server
Agent 5: Created test-infrastructure/utils/logger.ts

All different implementations!

Solution:

# I intervened
# Created canonical logger in agent-6/docs
# Agents 1-4 copied from canonical source
# Enforced: "No shared code. Copy-paste is fine."
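
The canonical logger itself was simple. Here is a reconstructed sketch of its shape (not the exact file); the one non-negotiable detail is that an MCP server on stdio transport must log to stderr, because stdout carries the JSON-RPC protocol stream:

```typescript
// Reconstructed sketch of a canonical structured logger. On an MCP server
// using stdio transport, stdout carries the JSON-RPC protocol stream, so
// all logging must go to stderr (hence the "no console.log" merge check).

type Level = "debug" | "info" | "warn" | "error";

function log(level: Level, message: string, meta?: Record<string, unknown>): string {
  const line = JSON.stringify({
    ts: new Date().toISOString(),
    level,
    message,
    ...meta,
  });
  process.stderr.write(line + "\n"); // never stdout on a stdio MCP server
  return line; // returned to make the logger easy to test
}
```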

Conflict 2: Git Branch Conflicts

Problem:

Agent 3: Modified servers/mainnet/osmosis-mcp-server
Agent 6: Also modified same file for documentation examples

Git conflict on same lines!

Solution:

# My merge protocol
1. Pull Agent 3's branch
2. Review changes
3. Merge to main
4. Agent 6: Rebase on updated main
5. Agent 6: Re-apply documentation changes
6. Human review: Ensure both changes preserved

Conflict 3: Naming Convention Disagreements

Problem:

Agent 1: Implemented eth_transferToken (camelCase)
Agent 2: Implemented sol_transfer_token (snake_case)
Agent 4: Implemented bsc_TransferToken (PascalCase)

Three different naming conventions!

Solution:

# I wrote definitive naming standard
# Posted to all agents via agent-announcements.md

ANNOUNCEMENT: Tool Naming Standard (MANDATORY)

All tool names MUST follow this pattern:
{prefix}_{action}_{resource}

Examples:
✅ eth_transfer_token
✅ sol_swap_tokens
✅ bsc_get_balance

❌ eth_transferToken
❌ solTransferToken
❌ bsc_Transfer_Token

Agent 6 will REJECT any PR violating this standard.

All agents: Please validate your existing tools.

All agents fixed their naming in < 10 minutes.
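
Agent 6's rejection rule can be expressed as one regex. This is our reading of the standard (lowercase snake_case, at least three segments), sketched rather than taken from the real validation script:

```typescript
// Checks the {prefix}_{action}_{resource} naming standard:
// lowercase snake_case, at least three segments.

const TOOL_NAME = /^[a-z][a-z0-9]*(_[a-z][a-z0-9]*){2,}$/;

function isValidToolName(name: string): boolean {
  return TOOL_NAME.test(name);
}
```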

The Productivity Metrics

Timeline Comparison

Solo Development (Estimated):

Ethereum server: 4 days
Solana server: 4 days
Bitcoin server: 5 days (UTXO complexity)
Cosmos servers: 6 days
BSC server: 3 days
Avalanche server: 3 days
Testing infrastructure: 4 days
Documentation: 3 days

Total: 32 days (6.4 weeks)

Multi-Agent Development (Actual):

Day 1-2: Initial refactoring (all agents in parallel)
Day 3-4: Tool implementation
Day 5-6: Testing and validation
Day 7: Integration and merge

Total: 7 days (1.4 weeks)

Speedup: 4.5x faster

Code Production Rate

Solo: ~1,200 lines/day
Multi-agent: ~5,400 lines/day

Daily output:
- Agent 1: 900 lines
- Agent 2: 850 lines
- Agent 3: 1,100 lines
- Agent 4: 800 lines
- Agent 5: 950 lines
- Agent 6: 800 lines
Total: 5,400 lines/day

Quality Metrics

Bugs found in review: 23
(Solo development usually: ~40 bugs for same scope)

Test coverage: 92% average
(Solo development usually: ~75%)

MBPS v2.1 compliance: 100%
(Agent 6 caught all violations before merge)

Why better quality?
- Agent 5 built tests while others built features
- Agent 6 reviewed EVERYTHING
- Agents didn't get tired or sloppy
- Consistent patterns across all servers

The Unexpected Benefits

Benefit 1: Pattern Propagation

Agent 1: Discovered elegant error handling pattern
Agent 1: Posted to agent-patterns.md
Agents 2-4: Immediately adopted same pattern

Result: All 17 servers have identical error handling
Without intervention, each solo session would have different patterns.

Benefit 2: Cross-Pollination

Agent 3: Cosmos IBC transfer implementation
Agent 6: "This pattern would work for Ethereum bridge transfers"
Agent 1: Adapted IBC pattern for Ethereum L2 bridges

One good idea → propagated to all EVM chains in 2 hours

Benefit 3: Specialized Expertise

Agent 2 became THE Solana expert
Agent 3 became THE Cosmos expert
Agent 5 became THE testing expert

Each agent developed deep context in their domain.
Better than context-switching across all chains.

The Challenges We Faced

Challenge 1: Context Drift

Problem: After 4 hours, Agent 2 “forgot” it was working on Bitcoin.

Agent 2: "I've completed the Ethereum NFT tools"
Me: "You're Agent 2. You handle Solana/Bitcoin, not Ethereum."
Agent 2: "Oh! Right. Let me check my assignment..."

Solution: Added role reminders to every prompt:

YOU ARE AGENT 2: SOLANA/BITCOIN SPECIALIST
Your ONLY servers: solana-devnet, bitcoin-testnet
NOT your servers: ethereum, polygon, cosmos, bsc, avalanche

Challenge 2: Merge Conflicts

Despite separate workspaces, we still hit conflicts:

# Both agents modified the same root-level files
agent-1 modified: package.json (adding Ethereum deps)
agent-3 modified: package.json (adding Cosmos deps)

# Resolution
Me: Created package.json per server
No more shared root package.json

Challenge 3: Communication Overhead

Time spent reading agent-status.md: 15 min/hour
Time spent resolving blockers: 10 min/hour
Time spent reviewing PRs: 20 min/hour

Total: 45 minutes per hour on orchestration
But: 6 agents * 55 min productive work = 5.5 hours of work per hour
Net productivity: 5.5 - 0.75 (my orchestration hour) = ~4.75x
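
The arithmetic above, as a small sketch; the inputs are the numbers measured in this project, not constants of multi-agent work:

```typescript
// Estimates net productivity: N agents' productive hours per wall-clock hour,
// minus the human orchestrator's overhead for that hour.

function netSpeedup(
  agents: number,
  productiveMinPerAgentHour: number,
  orchestratorOverheadHours: number
): number {
  const agentHours = (agents * productiveMinPerAgentHour) / 60;
  return agentHours - orchestratorOverheadHours;
}
```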

Challenge 4: Inconsistent Quality

Agent 1: Excellent TypeScript, perfect tests
Agent 4: Sloppy types, incomplete tests
Agent 6: Rejected Agent 4's PR

Agent 4 (revision 2): Much better, but still issues
Agent 6: Rejected again

Agent 4 (revision 3): Finally approved

Lesson: Agent 6 (quality control) was ESSENTIAL. Without it, quality would have varied wildly.

The Orchestration Workflow

My typical hour looked like:

00:00 - Review agent-status.md
        Check all agents progressing

05:00 - Scan agent-blockers.md
        Route urgent blockers

10:00 - Review agent-1's completed tool
        Provide feedback

20:00 - Merge agent-2's completed server
        Test integration

35:00 - Answer agent-4's architecture question
        Update agent-announcements.md

45:00 - Review agent-6's validation report
        Approve 3 PRs for merge

55:00 - Update master status doc
        Plan next hour's priorities

I was a project manager, not a developer.

The Tools We Built

Tool 1: Agent Status Dashboard

#!/bin/bash
# scripts/agent-dashboard.sh

clear
echo "════════════════════════════════════════════════════════"
echo "  MULTI-AGENT ORCHESTRATION DASHBOARD"
echo "════════════════════════════════════════════════════════"
echo ""

# Agent status
for i in {1..6}; do
  # Status lines are markdown-bold ("- **Status**: ..."), so match the
  # literal asterisks and take everything after the first colon.
  status=$(grep -A 6 "^## Agent $i" agent-status.md | grep -m1 '\*\*Status\*\*' | cut -d: -f2-)
  progress=$(grep -A 6 "^## Agent $i" agent-status.md | grep -m1 '\*\*Progress\*\*' | cut -d: -f2-)
  echo "Agent $i: $status ($progress)"
done

echo ""
echo "════════════════════════════════════════════════════════"
echo "  ACTIVE BLOCKERS"
echo "════════════════════════════════════════════════════════"

blocker_count=$(grep -c "^## Blocker" agent-blockers.md)
echo "Total: $blocker_count"

if [ "$blocker_count" -gt 0 ]; then
  # Status is the sixth line after each heading, so -A 6 (not -A 3)
  grep -A 6 "^## Blocker" agent-blockers.md | grep '\*\*Status\*\*' | cut -d: -f2-
fi

echo ""
echo "Last updated: $(date)"

Tool 2: Automatic Blocker Notifications

// scripts/watch-blockers.ts
import fs from 'fs';
import { exec } from 'child_process';

let lastBlockerCount = 0;

setInterval(() => {
  const content = fs.readFileSync('agent-blockers.md', 'utf8');
  const blockerCount = (content.match(/^## Blocker/gm) || []).length;

  if (blockerCount > lastBlockerCount) {
    // New blocker detected
    // Status lines are markdown-bold ("**Status**: URGENT ..."), so match the asterisks
    const urgentBlockers = content.match(/\*\*Status\*\*:\s*URGENT/g);
    
    if (urgentBlockers) {
      // Send notification
      exec(`osascript -e 'display notification "URGENT BLOCKER!" with title "Agent Blocked"'`);
      console.log('🚨 URGENT BLOCKER DETECTED!');
    }
  }

  lastBlockerCount = blockerCount;
}, 30000); // Check every 30 seconds

Tool 3: Merge Readiness Validator

// scripts/validate-merge-readiness.ts
interface MergeReadiness {
  branch: string;
  testsPass: boolean;
  typeScriptCompiles: boolean;
  mcpInspectorVerified: boolean;
  toolNamingValid: boolean;
  noConsoleLogs: boolean;
  agent6Approved: boolean;
}

async function validateMergeReadiness(branch: string): Promise<MergeReadiness> {
  console.log(`Validating ${branch}...`);

  const result: MergeReadiness = {
    branch,
    testsPass: await runTests(branch),
    typeScriptCompiles: await checkBuild(branch),
    mcpInspectorVerified: await checkInspector(branch),
    toolNamingValid: await validateToolNames(branch),
    noConsoleLogs: await checkConsoleStatements(branch),
    agent6Approved: await checkApproval(branch)
  };

  // Every check except the branch name must be true
  const { branch: _branch, ...checks } = result;
  const ready = Object.values(checks).every(Boolean);

  if (ready) {
    console.log('✅ READY FOR MERGE');
  } else {
    console.log('❌ NOT READY');
    Object.entries(result).forEach(([key, value]) => {
      if (value === false) {
        console.log(`  - ${key}: FAILED`);
      }
    });
  }

  return result;
}

The Golden Rules We Learned

Rule 1: Clear Boundaries Are Everything

Without clear boundaries:
- Agents step on each other's toes
- Duplicate work
- Merge conflicts

With clear boundaries:
- Each agent owns specific servers
- No overlap
- Clean merges

Rule 2: Communication Must Be Structured

Bad: "Agent 2, how's it going?"
Good: Agent updates status file every 30 minutes

Bad: Ad-hoc questions between agents
Good: Blocker protocol with severity levels

Rule 3: Quality Control Agent Is Mandatory

Without Agent 6:
- Inconsistent patterns
- Naming violations
- Low test coverage

With Agent 6:
- Uniform quality
- MBPS v2.1 compliance
- Professional output

Rule 4: Testing Agent Enables Everyone

Without Agent 5:
- Each agent builds own tests
- Duplicate effort
- Inconsistent test quality

With Agent 5:
- Universal test framework
- Agents focus on features
- Consistent test patterns

Rule 5: Human Orchestrator Is Critical

Agents cannot self-organize (yet).
Need human to:
- Resolve conflicts
- Make architectural decisions
- Merge PRs
- Handle exceptions

The Cost Analysis

Resource Costs

Claude Code Pro: $20/month per agent
Total: $120/month

Development time saved:
Solo: 32 days
Multi-agent: 7 days
Saved: 25 days

At $500/day contractor rate:
Saved: 25 × $500 = $12,500

ROI: $12,500 / $120 = 104x return

Hardware Costs

MacBook Pro M3 Max (128GB): $4,000

Could have used:
6 × AWS EC2 instances (128GB RAM)
Cost: ~$2,400/month

One-time hardware: Better investment
Amortized over 2 years: $167/month

Time Costs

My orchestration time: 45 min/hour
Total project: 7 days × 8 hours = 56 hours
Orchestration time: 56 × 0.75 = 42 hours

But without orchestration:
Solo development: 32 × 8 = 256 hours

Time saved: 214 hours
My time cost: 42 hours
Net time saved: 172 hours

When Multi-Agent Works

Multi-agent is GREAT for:

- Large, parallelizable work with clear boundaries (like 17 similar servers)
- Repetitive refactors where patterns can propagate across agents
- Projects with a written standard agents can be validated against

Multi-agent is TERRIBLE for:

- Tightly coupled code where every change touches shared files
- Exploratory design work with no settled architecture
- Small tasks where orchestration overhead exceeds the work itself

The Future Possibilities

Auto-Orchestration

// Future: AI orchestrator instead of human
class AIOrchestrator {
  async assignTask(task: string): Promise<string> {
    // Analyze task complexity
    const complexity = await this.analyzeComplexity(task);
    
    // Determine best agent
    const agent = await this.selectAgent(complexity);
    
    // Assign with context
    return await this.delegateToAgent(agent, task);
  }

  async resolveBlocker(blocker: Blocker): Promise<void> {
    // Identify blocking agent and blocked agent
    // Route to correct resolver
    // Provide solution
  }
}

Agent Specialization Learning

// Agents learn from each other's work
class AgentLearning {
  async learnFromPattern(pattern: CodePattern): Promise<void> {
    // Extract successful patterns from Agent 1
    const patterns = await this.extractPatterns(agent1.completedWork);
    
    // Share with all other agents
    await this.distributePatterns(patterns, [agent2, agent3, agent4]);
    
    // Agents apply learned patterns automatically
  }
}

Dynamic Agent Scaling

Start with 2 agents
Task velocity too slow? → Spawn Agent 3
Agent 3 completed tasks quickly? → Spawn Agent 4
Blockers piling up? → Spawn QA Agent
Documentation falling behind? → Spawn Doc Agent

Auto-scale based on workload.
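
A sketch of what that scaling policy could look like; the thresholds and metric names here are hypothetical, not from a real implementation:

```typescript
// Hypothetical auto-scaling heuristic for the rules described above.
// All thresholds are illustrative placeholders.

interface WorkloadMetrics {
  openTasks: number;
  activeAgents: number;
  openBlockers: number;
  docsBacklog: number; // undocumented features waiting on a doc agent
}

type ScaleAction = "spawn-dev-agent" | "spawn-qa-agent" | "spawn-doc-agent" | "hold";

function nextScaleAction(m: WorkloadMetrics): ScaleAction {
  if (m.openBlockers >= 3) return "spawn-qa-agent"; // blockers piling up
  if (m.docsBacklog >= 5) return "spawn-doc-agent"; // docs falling behind
  const tasksPerAgent = m.openTasks / Math.max(m.activeAgents, 1);
  if (tasksPerAgent > 4) return "spawn-dev-agent"; // velocity too slow
  return "hold";
}
```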

The Reality Check

Multi-agent development is NOT:

- Fully autonomous (agents cannot self-organize yet)
- Hands-off (orchestration took 45 minutes of every hour)
- A replacement for engineering judgment

Multi-agent development IS:

- Structured parallelism with strict boundaries
- A 4-5x productivity multiplier for parallelizable work
- A workflow that turns the developer into a project manager

The Checklist

Before attempting multi-agent development, ask:

- Do your tasks split into independent, non-overlapping workspaces?
- Do you have the hardware to run every instance (we used ~70GB RAM)?
- Can you dedicate roughly 45 minutes per hour to orchestration?
- Do you have a written standard (like MBPS v2.1) for agents to follow?
- Have you assigned a dedicated testing agent and a quality-control agent?

If YES to all: Multi-agent can 4-5x your productivity. If NO to any: Stick with single-agent development.

The Lessons Learned

  1. Communication protocols are more important than code
  2. Quality control agent is non-negotiable
  3. Testing agent enables everyone else
  4. Human orchestrator is still essential
  5. Clear boundaries prevent 90% of conflicts
  6. Structured status updates scale better than ad-hoc
  7. Git branch per agent is mandatory
  8. Merge conflicts are inevitable, plan for them
  9. Pattern propagation is a superpower
  10. Multi-agent is 4-5x faster, not 6x (overhead)

The Numbers

Final Metrics:

- 17 servers refactored in 7 days (vs. an estimated 32 days solo)
- 4.5x speedup at ~5,400 lines/day
- 92% average test coverage
- 100% MBPS v2.1 compliance
- 23 bugs caught in review, all before merge

Would we do it again? Absolutely.

This is part of our ongoing series documenting architectural patterns and insights from building the Blockchain MCP Server Ecosystem. Sometimes the best way to build at scale is to build in parallel.