DeepSky ATC: 50-Agent Coordination Verified

March 21, 2026

Milestone: GAT Successfully Scaled to 50 Agents

Reward validation:

  • Theoretical maximum: +25,000 total ((+350 progress for a 70 km diagonal flight, +150 bonus/solo) × 50 agents)
  • Empirical success: episode_reward_max reaching +24,130
  • Mean reward climbing to +16,000
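
The arithmetic behind these figures can be checked with a short sketch. The constant and function names below are hypothetical and are not taken from the training code; only the numeric values come from the notes above.

```python
# Values from the reward validation notes; all names here are illustrative.
PROGRESS_PER_AGENT = 350  # progress reward for a full 70 km diagonal flight
BONUS_PER_AGENT = 150     # the "+150 bonus/solo" term
NUM_AGENTS = 50


def theoretical_max(progress=PROGRESS_PER_AGENT,
                    bonus=BONUS_PER_AGENT,
                    num_agents=NUM_AGENTS):
    """Upper bound on the summed episode reward if every agent scores fully."""
    return (progress + bonus) * num_agents


def fraction_of_max(episode_reward, max_reward):
    """Share of the theoretical maximum that an episode reward reaches."""
    return episode_reward / max_reward


if __name__ == "__main__":
    max_reward = theoretical_max()
    print(max_reward)                                    # 25000
    print(round(fraction_of_max(24_130, max_reward), 3))
    print(round(fraction_of_max(16_000, max_reward), 3))
```

Under these values the best observed episode (+24,130) reaches about 96.5% of the theoretical maximum, and the +16,000 mean about 64%.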

Performance metrics:

  • Mean reward: ~380 per agent
  • Training: 100M agent steps (Iteration 200+)
  • Peak performance achieved

PhD Rationale

This result shows that the GAT architecture generalized the “Zipper Merge” logic from 8 agents to 50.

The high mean reward indicates the policy is robust across random spawn geometries, not just on cherry-picked success cases.


Stage 4 Configuration

Random scenario, 50 agents:

  • Bounds: 50km × 50km box
  • Arrival threshold: 2km
  • Exception zone: 5km
  • Max episode steps: 400

Agents spawn at random positions with random headings and must navigate to the center goal zone while maintaining separation (5 NM minimum).
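
A minimal sketch of this configuration and spawn procedure follows. The class, field, and function names are hypothetical. The sketch also assumes a coordinate origin at one corner of the box, uniform sampling of position and heading, and arrival measured as Euclidean distance to the box center; the notes above do not specify any of these.

```python
import math
import random
from dataclasses import dataclass

NM_TO_KM = 1.852  # one nautical mile in kilometres (exact by definition)


@dataclass(frozen=True)
class Stage4Config:
    """Stage 4 values from the notes; field names are illustrative."""
    num_agents: int = 50
    bounds_km: float = 50.0            # side length of the square box
    arrival_threshold_km: float = 2.0
    exception_zone_km: float = 5.0
    max_episode_steps: int = 400
    min_separation_nm: float = 5.0

    @property
    def min_separation_km(self):
        return self.min_separation_nm * NM_TO_KM

    @property
    def goal_xy(self):
        # Assumption: the goal zone sits at the geometric center of the box.
        half = self.bounds_km / 2.0
        return (half, half)


def spawn_agents(cfg, seed=0):
    """Sample (x_km, y_km, heading_deg) uniformly for each agent (assumed)."""
    rng = random.Random(seed)
    agents = []
    for _ in range(cfg.num_agents):
        x = rng.uniform(0.0, cfg.bounds_km)
        y = rng.uniform(0.0, cfg.bounds_km)
        heading_deg = rng.uniform(0.0, 360.0)
        agents.append((x, y, heading_deg))
    return agents


def has_arrived(x, y, cfg):
    """True when a position lies within the arrival threshold of the goal."""
    gx, gy = cfg.goal_xy
    return math.hypot(x - gx, y - gy) <= cfg.arrival_threshold_km
```

For example, `spawn_agents(Stage4Config(), seed=1)` returns 50 tuples inside the box, and `Stage4Config().min_separation_km` converts the 5 NM separation minimum to roughly 9.26 km.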


Key Finding

The GAT attention mechanism enables coordination at scale. The policy learned to:

  • Prioritize high closure-rate threats over nearby parallel traffic
  • Coordinate zipper merge in dense airspace
  • Operate under roughly 20% observability (K=10 neighbors out of 49) without catastrophic performance degradation
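
To make the terms in this list concrete, the sketch below computes a closure rate between two aircraft and picks the K observed neighbors. It is an illustration under stated assumptions, not the project's observation code: the notes do not say how the K=10 neighbors are chosen, so nearest-by-distance selection is assumed here, and all function names are hypothetical.

```python
import math

K_NEIGHBORS = 10  # from the notes: K=10 of the 49 other agents


def closure_rate(own_pos, own_vel, other_pos, other_vel):
    """Rate at which the range to another aircraft shrinks.

    Positive means closing, zero means constant range (for example,
    parallel traffic at equal speed), negative means opening.
    """
    rx, ry = other_pos[0] - own_pos[0], other_pos[1] - own_pos[1]
    vx, vy = other_vel[0] - own_vel[0], other_vel[1] - own_vel[1]
    dist = math.hypot(rx, ry)
    if dist == 0.0:
        return 0.0
    # Range rate is (r . v_rel) / |r|; closure rate is its negative.
    return -(rx * vx + ry * vy) / dist


def k_nearest(own_index, positions, k=K_NEIGHBORS):
    """Indices of the k closest other agents (assumed selection rule)."""
    own = positions[own_index]
    others = [j for j in range(len(positions)) if j != own_index]
    others.sort(key=lambda j: math.dist(own, positions[j]))
    return others[:k]


def observability(k, num_agents):
    """Fraction of the other agents that a single agent observes."""
    return k / (num_agents - 1)
```

With K=10 and 50 agents, `observability(10, 50)` is 10/49, or about 20.4%. A head-on pair yields a positive closure rate, while co-moving parallel traffic yields zero regardless of how close it is, which is the distinction the first bullet describes.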

Final GAT production run complete. Ready for baseline comparisons.