DeepSky ATC: 50-Agent Coordination Verified
Milestone: GAT Successfully Scaled to 50 Agents
Reward validation:
- Theoretical maximum: +25,000 total (+350 progress for 70km diagonal flights + 150 bonus/solo × 50 agents)
- Empirical success:
episode_reward_maxreaching +24,130 - Mean reward climbing to +16,000
Performance metrics:
- Mean reward: ~380 per agent
- Training: 100M agent steps (Iteration 200+)
- Peak performance achieved
PhD Rationale
This proves the GAT architecture successfully generalized the “Zipper Merge” logic from 8 agents to 50.
High mean reward indicates the policy is robust across random spawn geometries. Not just cherry-picked success cases.
Stage 4 Configuration
Random scenario, 50 agents:
- Bounds: 50km × 50km box
- Arrival threshold: 2km
- Exception zone: 5km
- Max episode steps: 400
Agents spawn at random positions with random headings. Must navigate to center goal zone while maintaining separation (5 NM minimum).
Key Finding
GAT attention mechanism enables coordination at scale. The policy learned to:
- Prioritize high closure-rate threats over nearby parallel traffic
- Coordinate zipper merge in dense airspace
- Maintain 20% observability (K=10 neighbors out of 49) without catastrophic performance degradation
Final GAT production run complete. Ready for baseline comparisons.