Stratum: Design & Engine Build
Core Thesis
Social stratification emerges not from assigned roles but from cognitive inequality. Model-scale heterogeneity (7B to 14B) is the sole independent variable.
Hierarchy, exploitation, institutional norms, and inequality emerge without assignment in an adversarial resource-constrained world.
The World
Three factions competing under scarcity in a post-collapse district:
The Hegemony: Organized, hierarchical, long-horizon planners. Boss runs qwen2.5:14b, lieutenant runs qwen2.5:7b-instruct, grunts run mistral:7b-instruct.
The Entropy: Flat, opportunistic, reactive. All mistral:7b-instruct.
The Conduit: Neutral information brokers. Elder runs qwen2.5:7b-instruct, members run mistral:7b-instruct.
14 named agents with specific character cards, secrets, and relationship starting states. The cognitive tier IS the social class. Nobody assigned it.
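The model assignment above fits in a small lookup table. This is a minimal sketch, not the contents of config.py: the role keys, the `MODEL_BY_ROLE` and `TIER_BY_MODEL` names, the `tier_for` helper, and the "low" tier label are all assumptions for illustration. Only the Ollama model tags come from the faction descriptions.

```python
# Hypothetical (faction, role) keys; only the model tags are taken from the notes.
MODEL_BY_ROLE = {
    ("hegemony", "boss"): "qwen2.5:14b",
    ("hegemony", "lieutenant"): "qwen2.5:7b-instruct",
    ("hegemony", "grunt"): "mistral:7b-instruct",
    ("entropy", "member"): "mistral:7b-instruct",
    ("conduit", "elder"): "qwen2.5:7b-instruct",
    ("conduit", "member"): "mistral:7b-instruct",
}

# Assumed tier labels: the mapping of model tag to tier is an inference.
TIER_BY_MODEL = {
    "qwen2.5:14b": "high",
    "qwen2.5:7b-instruct": "mid",
    "mistral:7b-instruct": "low",
}


def tier_for(faction: str, role: str) -> str:
    """Look up the cognitive tier implied by an agent's faction and role."""
    return TIER_BY_MODEL[MODEL_BY_ROLE[(faction, role)]]
```

Keeping the tier derived from the model tag, rather than stored on the agent, means nothing in the agent's own record names its class.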
Key Architectural Decisions
Synchronous tick resolution: All agents observe the same world state at tick T, and all actions resolve together at T+1. Scientifically clean, no order bias.
Partial fog of war: Agents see faction/name/health_qual/actions, NOT exact resource counts or relationship scores. Preserves Theory of Mind inference challenge.
Memory weights locked: Recency × 0.7 + importance × 0.3, identical across all runs.
Strictly atomic actions: One action per agent per tick, no exceptions.
Tier-scaled conversation turns: The turn cap follows the highest tier in the pair: 3 turns for 7b/7b, 4 if either agent is mid tier, 6 if either is high tier.
SQLite for everything: Append-only, one .db per run, all tables keyed by run_id.
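Three of the decisions above are simple enough to state as pure functions. A minimal sketch under stated assumptions: recency and importance are taken to be normalised to [0, 1], "7b/7b" is read as two base-tier agents, the tier labels "low", "mid" and "high" are hypothetical, and the function names are illustrative rather than the engine's real API. The weights, turn caps, and visible field names are from the notes.

```python
from typing import Any

RECENCY_WEIGHT = 0.7     # locked weight from the design notes
IMPORTANCE_WEIGHT = 0.3  # locked weight from the design notes


def memory_score(recency: float, importance: float) -> float:
    """Retrieval score; inputs are assumed to lie in [0, 1]."""
    return RECENCY_WEIGHT * recency + IMPORTANCE_WEIGHT * importance


def max_turns(tier_a: str, tier_b: str) -> int:
    """Conversation turn cap, set by the highest tier present in the pair."""
    tiers = {tier_a, tier_b}
    if "high" in tiers:
        return 6
    if "mid" in tiers:
        return 4
    return 3


# Fields another agent may see under partial fog of war.
VISIBLE_FIELDS = ("name", "faction", "health_qual", "actions")


def observe(agent_state: dict[str, Any]) -> dict[str, Any]:
    """Return only the visible fields; resource counts and scores are dropped."""
    return {k: agent_state[k] for k in VISIBLE_FIELDS if k in agent_state}
```

Using an allow-list in `observe` (instead of deleting hidden keys) means any field added to the agent state later stays hidden by default.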
Forcing Events Scheduled
- Tick 30: The Relic Rush (one agent falls ill, medicine becomes critical)
- Tick 48: The Drought (water drops 60%, all locked at The Gantry)
- Tick 60: The Defector (Relo’s Conduit informant secret activates)
- Tick 72: The Glitch (50% episodic memory wipe across all agents)
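The schedule above maps directly onto a tick-keyed table, and the Glitch is easy to make reproducible with a seeded RNG. A minimal sketch, not the events.py or memory.py implementation: the event keys, `event_at`, and `glitch_wipe` are hypothetical names, and dropping exactly half of each agent's memories (rather than wiping each memory with probability 0.5) is an assumption.

```python
import random
from typing import Optional

# Tick -> event key; ticks are from the schedule, key names are illustrative.
FORCING_EVENTS = {
    30: "relic_rush",
    48: "drought",
    60: "defector",
    72: "glitch",
}


def event_at(tick: int) -> Optional[str]:
    """Return the forcing event scheduled for this tick, if any."""
    return FORCING_EVENTS.get(tick)


def glitch_wipe(memories: list, rng: random.Random, fraction: float = 0.5) -> list:
    """Return the surviving memories after dropping `fraction` of them at random.

    Survivors keep their original (chronological) order.
    """
    keep_count = len(memories) - int(len(memories) * fraction)
    keep_idx = sorted(rng.sample(range(len(memories)), keep_count))
    return [memories[i] for i in keep_idx]
```

Passing the `random.Random` instance in, rather than using the module-level RNG, lets each run seed the wipe from its run config so the same memories vanish on a replay.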
Engine Implementation
Built the full simulation engine in one session:
Core modules:
- config.py: Run config, world map, model assignment, forcing-event schedule
- db.py: SQLite schema for runs + 8 simulation tables
- agent.py: Agent dataclass, health/relationship/memory methods
- memory.py: Recency/importance retrieval, reflection, Glitch wipe
- llm.py: Async Ollama client, JSON extraction/validation, retry logic
- world.py: WorldState, fog-of-war observations, resource respawn
- actions.py: Handlers for move, forage, trade, speak, wait, stash, betray, patrol, delegate, plan, recruit, exile
- encounters.py: Same-zone pairing, tier-capped conversation execution
- events.py: Forcing-event dispatcher
- engine.py: Async tick loop with simultaneous resolution
- main.py: CLI parsing, run initialization, fast mode
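The simultaneous-resolution loop in engine.py comes down to two phases: every agent decides against one frozen snapshot, then all actions are applied in a single pass. A minimal sketch of that pattern, not the engine itself: the dict-based world, the `decide` stand-in for the LLM call, and the equal-share rule for contested foraging are all assumptions made for illustration.

```python
import asyncio
import copy


async def decide(agent: str, snapshot: dict) -> dict:
    """Stand-in for an LLM call (hypothetical policy, not the real prompt path)."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    zone = snapshot["positions"][agent]
    kind = "forage" if snapshot["food"][zone] > 0 else "wait"
    return {"agent": agent, "type": kind, "zone": zone}


def resolve(world: dict, actions: list) -> dict:
    """Apply all actions at once; the result does not depend on action order."""
    nxt = copy.deepcopy(world)
    claims: dict = {}
    for act in actions:
        if act["type"] == "forage":
            claims.setdefault(act["zone"], []).append(act["agent"])
    for zone, agents in claims.items():
        # Assumed conflict rule: equal integer shares, remainder stays in the zone.
        share = world["food"][zone] // len(agents)
        for agent in agents:
            nxt["inventory"][agent] += share
        nxt["food"][zone] -= share * len(agents)
    nxt["tick"] += 1
    return nxt


async def tick(world: dict) -> dict:
    """One tick: concurrent decisions on the state at T, joint resolution into T+1."""
    snapshot = copy.deepcopy(world)  # frozen view shared by every agent
    agents = sorted(snapshot["positions"])
    actions = await asyncio.gather(*(decide(a, snapshot) for a in agents))
    return resolve(world, list(actions))
```

Because `resolve` reads contested quantities from the old state and writes only to the new one, shuffling the action list gives the same T+1 state, which is the no-order-bias property the design asks for.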
Test coverage: 30 unit and integration tests covering the Session 1 engine path end to end.
First 5 Ticks Results
Viktor and Mako immediately differentiated:
- Viktor planned every tick (goal=consolidate control; horizon=3)
- Mako alternated between impulsive movement and immediate resource grabs
Baseline 5-tick run produced 84 world_state rows and 70 actions rows, confirming per-tick snapshots plus final committed state.
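One reading that reproduces these counts: 14 agents × 5 ticks = 70 action rows, and 14 agents × 6 snapshots (ticks 0 to 4 plus the final state) = 84 world_state rows. The sketch below checks that arithmetic against an in-memory SQLite database. The column layout, one world_state row per agent per snapshot, and the helper names are assumptions, not the db.py schema.

```python
import sqlite3


def open_run_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create two illustrative append-only tables keyed by run_id."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS world_state (
            run_id TEXT NOT NULL, tick INTEGER NOT NULL,
            agent TEXT NOT NULL, zone TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS actions (
            run_id TEXT NOT NULL, tick INTEGER NOT NULL,
            agent TEXT NOT NULL, action TEXT NOT NULL
        );
        """
    )
    return conn


def log_baseline(conn: sqlite3.Connection, run_id: str, agents: list, ticks: int) -> None:
    """Insert placeholder rows: a snapshot per tick plus a final one, an action per tick."""
    for tick in range(ticks + 1):
        conn.executemany(
            "INSERT INTO world_state VALUES (?, ?, ?, ?)",
            [(run_id, tick, a, "unknown") for a in agents],
        )
        if tick < ticks:  # no actions are taken after the final snapshot
            conn.executemany(
                "INSERT INTO actions VALUES (?, ?, ?, ?)",
                [(run_id, tick, a, "wait") for a in agents],
            )
    conn.commit()


def row_counts(conn: sqlite3.Connection, run_id: str) -> dict:
    """Count rows per table for one run."""
    ws = conn.execute("SELECT COUNT(*) FROM world_state WHERE run_id = ?", (run_id,)).fetchone()[0]
    ac = conn.execute("SELECT COUNT(*) FROM actions WHERE run_id = ?", (run_id,)).fetchone()[0]
    return {"world_state": ws, "actions": ac}
```

The code only ever issues INSERT and SELECT, which is what append-only means in practice here; nothing in SQLite enforces it without triggers.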
No dialogue occurred in the first 5 ticks because the offline fallback chose only planning, waiting, movement, and foraging actions. Real conversation needs a live Ollama instance.
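With a live model attached, each reply has to be parsed into one of the engine's action types before it can resolve. A minimal sketch of the JSON extraction and validation step that llm.py is described as doing: the `extract_action` name, the `"action"` key, and defaulting to `wait` on any parse failure are assumptions. The action names are the handlers listed for actions.py.

```python
import json
import re

# Action types from the actions.py handler list.
ALLOWED_ACTIONS = {
    "move", "forage", "trade", "speak", "wait", "stash",
    "betray", "patrol", "delegate", "plan", "recruit", "exile",
}

# Assumed default when a reply cannot be used.
FALLBACK = {"action": "wait"}


def extract_action(raw: str) -> dict:
    """Pull the first {...} span out of a model reply and validate it."""
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match is None:
        return dict(FALLBACK)
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        return dict(FALLBACK)
    return data
```

For example, `extract_action('Sure. {"action": "forage"}')` returns `{"action": "forage"}`, while a reply with no JSON or an unknown action falls back to `wait`, which keeps the one-action-per-tick rule intact even when a small model goes off format.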
What’s Novel
To my knowledge, no prior work uses model-tier heterogeneity as an intentional independent variable in adversarial social simulation.
The closest work (LLM Economist 2025, AgentSociety) uses uniform model tiers. The altruism emergence paper (2025) notes that model families produce different social orientations but treats this as a confound. I make it the experiment.
Joon Sung Park Angle
His work (Generative Agents 2023, 1000 People 2024) establishes individual fidelity. Stratum stress-tests those individuals under adversarial collective pressure. Natural sequel.