Polisim

Layered LLM-driven multi-agent simulation engine. Define worlds in YAML, recreate social dynamics, pause and intervene mid-simulation.

A layered, multi-entity simulation engine · LLM agents · YAML-defined worlds · pausable / intervenable / retrospectively analyzable

License: MIT · Python 3.10+


What is Polisim?

Polisim is a simulation engine for running multi-entity scenarios where LLM agents make decisions inside a structured, observable, interruptible world.

You describe the world in a YAML file (entities, actions, message types, relations), declare a scenario (specific instances + initial state + breakpoints), and write a small Python rules module (how actions become effects). Polisim then:

  • Drives a tick-based simulation where each entity's decision can come from an LLM, a deterministic rule, or a random fallback
  • Records every event to an append-only log + per-tick snapshots, so any moment is replayable / inspectable
  • Lets you pause at any tick, intervene by injecting messages or forcing actions, then resume
  • Generates a dual-track analysis report at the end: a deterministic Phase A track (turning points, entity trajectories, environment dynamics) plus an optional LLM-enhanced Phase C track (situation judgment, narrative summary, action suggestions)

Built for narrative simulations of markets, negotiations, opinion dynamics, and organizational decision-making: any domain where you want to ask "what happens when these agents interact under these rules?"
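
As an illustration of the append-only log idea, here is a minimal sketch of how events could be written as JSON lines and folded back into state for any tick. The field names (tick, type, actor, attribute, old, new) and both helper functions are assumptions made for this example, not Polisim's actual event schema or API; only the attribute_changed event name is borrowed from the sample output in Quick start.

```python
import json
import tempfile
from pathlib import Path


def append_event(log_path: Path, event: dict) -> None:
    """Append one event as a single JSON line; earlier lines are never rewritten."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(event, sort_keys=True) + "\n")


def replay(log_path: Path, initial_state: dict, up_to_tick: int) -> dict:
    """Rebuild entity attributes as of the end of `up_to_tick` by folding over the log."""
    state = {entity: dict(attrs) for entity, attrs in initial_state.items()}
    with log_path.open(encoding="utf-8") as fh:
        for line in fh:
            event = json.loads(line)
            if event["tick"] > up_to_tick:
                break  # events are appended in tick order, so we can stop early
            if event["type"] == "attribute_changed":
                state[event["actor"]][event["attribute"]] = event["new"]
    return state


# Hypothetical events; the schema is an assumption for this sketch.
log_file = Path(tempfile.mkdtemp()) / "events.jsonl"
for tick, attribute, old, new in [
    (1, "cash", 100, 80),
    (1, "reputation", 50, 55),
    (2, "cash", 80, 60),
]:
    append_event(log_file, {
        "tick": tick, "type": "attribute_changed", "actor": "company_b",
        "attribute": attribute, "old": old, "new": new,
    })

initial = {"company_b": {"cash": 100, "reputation": 50}}
state_t1 = replay(log_file, initial, up_to_tick=1)
state_t2 = replay(log_file, initial, up_to_tick=2)
```

Because the log is only ever appended to, replaying it up to tick 1 and up to tick 2 gives two independent views of the same run without mutating the initial state.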

Key features

  • 🏛 6-layer architecture with strict layer boundaries (World / Scenario / Rules / Runtime / Event Log / Analysis)
  • 🤖 Three decision modes coexist: llm (any OpenAI-compatible provider), rule (deterministic), random (sampling)
  • 📜 YAML-defined worlds with full schema validation (JSON Schema + Pydantic + cross-layer semantic checks)
  • ⏸ Pause / resume / intervene at any tick: force actions, inject messages, modify entity attributes mid-run
  • 📊 Dual-track analysis: deterministic statistics (always on) + optional LLM-narrated summary
  • 🧪 670 passing tests, including end-to-end CLI tests for two production-ready scenarios
  • 🛠 Extensible by design: bring your own rules module, your own LLM provider, your own analysis enhancers
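
To make the three coexisting decision modes concrete, the sketch below dispatches on a per-entity mode string and falls back to random sampling when the LLM reply is not a valid action. The function names, the cash threshold in the rule, and the prompt wording are all assumptions for illustration; the engine's real dispatch lives in core/ and may look quite different.

```python
import random
from typing import Callable, Optional


def rule_decision(view: dict, actions: list[str]) -> str:
    """Hypothetical deterministic rule: promote only when cash is comfortable."""
    if "promote" in actions and view.get("cash", 0) >= 50:
        return "promote"
    return "do_nothing"


def decide(
    mode: str,
    view: dict,
    actions: list[str],
    rng: random.Random,
    ask_llm: Optional[Callable[[str], str]] = None,
) -> tuple[str, str]:
    """Return (action, mode_actually_used) for one entity on one tick."""
    if mode not in ("llm", "rule", "random"):
        raise ValueError(f"unknown decision mode: {mode!r}")
    if mode == "rule":
        return rule_decision(view, actions), "rule"
    if mode == "llm" and ask_llm is not None:
        reply = ask_llm(f"State: {view}. Reply with one of {actions}.").strip()
        if reply in actions:
            return reply, "llm"
    # Explicit random mode, or fallback when the LLM is missing or replied off-menu.
    return rng.choice(actions), "random"


ACTIONS = ["promote", "do_nothing"]
rng = random.Random(0)

by_rule = decide("rule", {"cash": 40}, ACTIONS, rng)
by_llm = decide("llm", {"cash": 40}, ACTIONS, rng, ask_llm=lambda prompt: "promote")
by_fallback = decide("llm", {"cash": 40}, ACTIONS, rng, ask_llm=lambda prompt: "buy the moon")
```

Returning the mode that was actually used keeps the fallback visible in the log, in the same spirit as the mode= field in the sample output.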

Quick start

Installation

git clone https://github.com/Kaka-cheaper/Polisim.git
cd Polisim
pip install -e ".[dev]"

Requires Python 3.10+.

Run the minimal market scenario (5 ticks)

python -m cli run scenarios/minimal_market/scenario.yaml --ticks 5

Sample output:

[run_id] minimal-market_2026-04-26T10-30-00_a1b2
[t=  1] decision_proposed actor=company_a action=do_nothing mode=llm
[t=  1] decision_proposed actor=company_b action=promote mode=rule
[t=  1] action_executed actor=company_b
[t=  1] attribute_changed actor=company_b cash: 100 -> 80
[t=  1] attribute_changed actor=company_b reputation: 50 -> 55
[t=  1] snapshot_saved tick=1
...
[final] tick=5 events=42 entities=2
[analysis] final.md (Phase A) -> runs/minimal-market_.../analysis/

Run with the real OpenAI API (Phase B; PowerShell syntax shown)

$env:OPENAI_API_KEY = "sk-..."  # or any OpenAI-compatible endpoint
python -m cli run scenarios/three_party_negotiation/scenario.yaml `
    --llm-provider openai --ticks 8

Add LLM-enhanced analysis (Phase C)

python -m cli run scenarios/minimal_market/scenario.yaml `
    --llm-enhance --output-language zh-CN --prompt-history-size 3

The final report at runs/<run_id>/analysis/final.md will include four LLM-generated sections in the configured language:

  • World overview: plain-language explanation of the initial entities, attributes, relations and scenario goal
  • Narrative summary: trajectory described in natural language with tick references
  • Situation judgement: final state assessment with cited evidence (specific tick + attribute change)
  • Next action suggestions: actionable suggestions, each embedding supporting evidence

--prompt-history-size N controls how many recent decisions (D-016) are injected into each LLM prompt's actor_view.recent_decisions (default 3, range 0-10).
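
A minimal sketch of the windowing this flag implies: keep the last N decisions, reject values outside 0-10, and attach the result under recent_decisions. The helper names and the surrounding actor_view layout (the attributes key, the shape of each decision record) are assumptions for this example; only the recent_decisions key, the default of 3, and the 0-10 range come from the description above.

```python
def recent_decisions(history: list[dict], size: int = 3) -> list[dict]:
    """Return the last `size` decisions, oldest first."""
    if not 0 <= size <= 10:
        raise ValueError(f"prompt history size must be in 0..10, got {size}")
    # history[-0:] would return the whole list, so size 0 needs its own branch.
    return list(history[-size:]) if size else []


def build_actor_view(attributes: dict, history: list[dict], size: int = 3) -> dict:
    """Assemble a hypothetical actor_view payload for one LLM prompt."""
    return {
        "attributes": dict(attributes),
        "recent_decisions": recent_decisions(history, size),
    }


# Hypothetical decision history for one entity over five ticks.
history = [
    {"tick": tick, "action": action}
    for tick, action in enumerate(
        ["do_nothing", "promote", "promote", "do_nothing", "promote"], start=1
    )
]
view = build_actor_view({"cash": 40, "reputation": 65}, history)
view_without_history = build_actor_view({"cash": 40, "reputation": 65}, history, size=0)
```

With the default window, the prompt would see only ticks 3 to 5; a size of 0 drops history from the prompt entirely.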

Architecture

flowchart TB
    subgraph Definition["Definition (static, declared once)"]
        World["World Definition<br/>(YAML)<br/><i>entity_types / action_types /<br/>message_types / relation_types</i>"]
        Scenario["Scenario<br/>(YAML)<br/><i>entities / relations /<br/>scheduled_events / breakpoints</i>"]
    end

    subgraph Execution["Execution (per-tick orchestration)"]
        Rules["Rules<br/>(Python)<br/><i>validate_action /<br/>resolve_effects /<br/>actions_handled</i>"]
        Runtime["Runtime<br/>(orchestrator)<br/><i>tick loop / activation /<br/>conflict resolution /<br/>pause+intervene</i>"]
        Provider["LLM Provider<br/>(pluggable)<br/><i>OpenAI / Mock /<br/>any compatible API</i>"]
    end

    subgraph Output["Output (append-only, replay-friendly)"]
        EventLog["Event Log<br/>(JSONL)<br/><i>per-tick events +<br/>snapshots</i>"]
        Analysis["Analysis<br/>(Markdown + JSON)<br/><i>Phase A: deterministic +<br/>Phase C: LLM-enhanced</i>"]
    end

    World --> Runtime
    Scenario --> Runtime
    Rules --> Runtime
    Provider -.->|llm decisions| Runtime
    Runtime --> EventLog
    EventLog --> Analysis

The 6 layers communicate only through declared interfaces, with no cross-layer pollution. This makes each layer independently testable and replaceable: swap rules without touching runtime, swap LLM providers without touching scenarios, swap analysis without touching the event log.
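
The provider boundary is a good place to see what "declared interface" can mean in practice. The sketch below assumes an abstract base class with a single complete method; the project structure does list an LLMProvider ABC with OpenAI and Mock implementations, but the method name, the constructor arguments, and ScriptedProvider are hypothetical here.

```python
from abc import ABC, abstractmethod


class LLMProvider(ABC):
    """The only surface the runtime side of this sketch depends on."""

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Return the model's raw text reply for one prompt."""


class MockProvider(LLMProvider):
    """Always returns the same canned reply and records the prompts it saw."""

    def __init__(self, reply: str) -> None:
        self.reply = reply
        self.prompts: list[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return self.reply


class ScriptedProvider(LLMProvider):
    """Hypothetical second implementation: replays a fixed script of replies."""

    def __init__(self, replies: list[str]) -> None:
        self._replies = iter(replies)

    def complete(self, prompt: str) -> str:
        return next(self._replies)


def propose_action(provider: LLMProvider, actor: str, actions: list[str]) -> str:
    """Runtime-side call site: unchanged no matter which provider is plugged in."""
    prompt = f"You are {actor}. Choose one of: {', '.join(actions)}"
    return provider.complete(prompt).strip()


ACTIONS = ["promote", "do_nothing"]
mock = MockProvider("promote")
scripted = ScriptedProvider(["do_nothing", "promote"])

from_mock = propose_action(mock, "company_a", ACTIONS)
from_script = [propose_action(scripted, "company_a", ACTIONS) for _ in range(2)]
```

Swapping MockProvider for ScriptedProvider changes nothing in propose_action, which is the property the layer boundaries are meant to preserve.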

Built-in scenarios

1. minimal_market: Walkthrough scenario (2 entities, 5 ticks)

Two competing companies (company_a LLM-driven, company_b rule-driven) decide between promote and do_nothing based on cash, reputation, and a global market_pressure environment variable. Demonstrates: basic action effects, attribute clamping, scheduled events, snapshot lifecycle.

python -m cli run scenarios/minimal_market/scenario.yaml

→ See scenarios/minimal_market/
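
For orientation, here is a rough sketch of what a rules module for this scenario could look like. The names validate_action, resolve_effects and actions_handled come from the architecture diagram, but their signatures, the AttributeChange record, and the clamping range of 0 to 100 are assumptions. The promote effect (cash -20, reputation +5) is inferred from the sample output in Quick start, and the class deliberately does not subclass the project's BaseRules, whose interface is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeChange:
    """Hypothetical effect record: one attribute moving from old to new."""
    actor: str
    attribute: str
    old: int
    new: int


def clamp(value: int, low: int = 0, high: int = 100) -> int:
    """Keep an attribute inside its allowed range (range assumed for this sketch)."""
    return max(low, min(high, value))


class MinimalMarketRules:
    actions_handled = ("promote", "do_nothing")
    PROMOTE_COST = 20             # inferred from "cash: 100 -> 80"
    PROMOTE_REPUTATION_GAIN = 5   # inferred from "reputation: 50 -> 55"

    def validate_action(self, actor_state: dict, action: str) -> bool:
        """Reject unknown actions and promotions the actor cannot afford."""
        if action not in self.actions_handled:
            return False
        return action != "promote" or actor_state["cash"] >= self.PROMOTE_COST

    def resolve_effects(self, actor: str, actor_state: dict, action: str) -> list[AttributeChange]:
        """Translate one validated action into attribute changes."""
        if action != "promote":
            return []
        deltas = {"cash": -self.PROMOTE_COST, "reputation": self.PROMOTE_REPUTATION_GAIN}
        return [
            AttributeChange(actor, name, actor_state[name], clamp(actor_state[name] + delta))
            for name, delta in deltas.items()
        ]


rules = MinimalMarketRules()
effects = rules.resolve_effects("company_b", {"cash": 100, "reputation": 50}, "promote")
capped = rules.resolve_effects("company_b", {"cash": 100, "reputation": 98}, "promote")
```

The second call shows clamping: a reputation of 98 plus 5 is capped at 100 rather than overshooting.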

2. three_party_negotiation: Architecture coverage scenario (3 entities, 8 ticks)

Three negotiators (Alice LLM, Bob rule, Charlie random) propose / accept / reject offers to each other. Trust relations evolve via update_value; reaching trust=80 triggers a breakpoint. Demonstrates: directed messages, relation dynamics, multi-effect actions, pluggable decision modes, breakpoint-driven pause.

python -m cli run scenarios/three_party_negotiation/scenario.yaml

→ See scenarios/three_party_negotiation/
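
The trust dynamics and the trust=80 breakpoint can be pictured with a small sketch. It assumes directed relations stored as (source, target) pairs, trust clamped to 0-100, and one scripted update per tick; the update_value name is taken from the description above, but its signature and the run_until_breakpoint helper are hypothetical.

```python
from typing import Optional


def update_value(relations: dict, source: str, target: str, delta: int) -> int:
    """Adjust a directed trust value and return the new level (clamped to 0..100)."""
    key = (source, target)  # directed: alice -> bob is distinct from bob -> alice
    relations[key] = max(0, min(100, relations.get(key, 0) + delta))
    return relations[key]


def run_until_breakpoint(
    relations: dict,
    scripted_updates: list[tuple[str, str, int]],
    threshold: int = 80,
) -> Optional[int]:
    """Apply one update per tick; return the tick where trust first reaches threshold."""
    for tick, (source, target, delta) in enumerate(scripted_updates, start=1):
        if update_value(relations, source, target, delta) >= threshold:
            return tick  # pause here; later updates are left unapplied
    return None


# Hypothetical starting trust and per-tick updates.
relations = {("alice", "bob"): 60}
updates = [
    ("alice", "bob", 10),   # tick 1: alice -> bob rises to 70
    ("bob", "alice", 5),    # tick 2: bob -> alice starts from 0 and rises to 5
    ("alice", "bob", 10),   # tick 3: alice -> bob reaches 80, breakpoint fires
    ("alice", "bob", 10),   # tick 4: never applied because the run paused
]
paused_at = run_until_breakpoint(relations, updates)
```

Stopping the loop at the breakpoint tick leaves the remaining updates untouched, which is what makes an intervention before resuming meaningful.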

Project structure

Polisim/
├── cli/                    CLI entry point (run / step / replay)
├── core/                   Orchestration layer
│   ├── runtime.py          Tick loop, activation, conflict resolution
│   ├── analysis.py         Phase A + Phase C analysis
│   ├── semantic_validator.py  D-013 cross-layer semantic checks
│   ├── definition_loader.py   World loading + schema validation
│   ├── scenario_loader.py     Scenario loading + cross-file refs
│   ├── rules_loader.py     Dynamic rules module loader (D-010)
│   ├── llm_policy.py       LLM protocol layer
│   ├── events.py           Append-only event log + snapshots
│   ├── errors.py           Unified SimEngineError hierarchy (D-011)
│   └── providers/          LLMProvider ABC + OpenAI / Mock impls
├── models/                 Pydantic models (world / scenario / runtime / config / analysis)
├── rules/                  Rules modules (BaseRules + 2 concrete impls)
├── schemas/                JSON schemas (single source of truth)
├── scenarios/              YAML scenarios (minimal_market + three_party_negotiation)
├── tests/                  670 tests across all layers
└── docs/                   Design docs / requirements / pitfalls / progress

Documentation

All design and process documentation is in docs/:

Document What's inside
AGENTS.md Entry point for AI coding assistantsโ€”MUST/MUST NOT rules + doc navigation
docs/00-overview/progress.md Session-to-session progress log; current state and next steps
docs/00-overview/ๅฆ‚ไฝ•ไฝฟ็”จ่ฟ™ๅฅ—ๆ–‡ๆกฃไธŽ้…็ฝฎไฝ“็ณป.md Human-facing intro to the doc system
docs/01-requirements/้ชŒๆ”ถๆ ‡ๅ‡†.md Acceptance criteriaโ€”source of truth when implementation diverges from docs
docs/01-requirements/ๆœ€ๅฐ็คบไพ‹Walkthrough.md Step-by-step walkthrough of the minimal_market scenario
docs/02-design/ 8 design docs covering each architectural layer
docs/03-implementation/pitfalls.md Known issues, edge cases, and pitfalls discovered during development
docs/00-overview/LLM่พ…ๅŠฉๅปบๆจกๆ–นๆกˆ.md Roadmap for future LLM-assisted modeling (Phase 2)

Testing

pytest tests/ -q

Currently 670 tests passing in ~12 seconds, covering:

  • All Pydantic models (world / scenario / runtime / config / analysis)
  • All loaders (definition / scenario / rules) with three-tier validation
  • Runtime full lifecycle (step / run_until / pause+intervene / breakpoints / snapshot modes)
  • Both rules modules (minimal_market + three_party_negotiation)
  • LLM protocol (mock + OpenAI provider with mocked HTTP)
  • Analysis (Phase A + Phase C with multi-language injection)
  • CLI end-to-end (run / step / replay across both scenarios)
  • Cross-layer semantic validation (D-013)

Plus a real-OpenAI smoke test at scripts/smoke_openai.py for end-to-end verification with live API calls.

Roadmap

v0.1.1 engine rigorization is complete (D-014 strong action params + D-015 (scoped) AttributeEffect.new_value + D-016 PromptContext / enrich_prompt hook + LLM analysis upgrade with world_overview + evidence citation). Next directions, by priority:

  • v0.2 web UI: single-repo monorepo additionโ€”FastAPI WebSocket backend + React real-time situation panel consuming decision_proposed.payload.prompt_context (D-016 enables this)
  • D-015 full: EntityCreate / EntityDestroy / ChainedAction effect types (deferred from v0.1.1)
  • Phase B.3: protocol-level retry with exponential backoff (currently relying on OpenAI SDK's built-in retries)
  • Phase 2 LLM-assisted modeling: core/modeling_loop.py + guided Q&A frontend (D-013 semantic validator already provides the "self-repair loop" infrastructure; see LLM่พ…ๅŠฉๅปบๆจกๆ–นๆกˆ.md)
  • More scenarios: information cascade, opinion dynamics, organizational decision-making

See progress.md section "下一步该做什么" ("What to do next") for the live priority list.
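
For the Phase B.3 item, a generic retry-with-exponential-backoff helper might look like the sketch below. It is not the planned implementation: the function name, the default delays, and the choice of which exceptions count as retryable are all assumptions, and the sleep function is injectable only so the example can run without real waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    factor: float = 2.0,
    retry_on: tuple = (ConnectionError, TimeoutError),
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, retrying on transient errors with delays base, base*factor, ..."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * factor ** attempt)
    raise AssertionError("unreachable")


# Demo: a call that fails twice, then succeeds; delays are recorded instead of slept.
delays: list[float] = []
attempts = {"count": 0}


def flaky() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
```

A production version would usually add jitter and a cap on the delay; both are left out here to keep the sketch short.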

Contributing

This is currently a single-developer project being shaped session-by-session. If you want to:

  • Report a bug or pitfall: open an issue with reproduction steps; pitfalls discovered during development are logged in pitfalls.md
  • Propose a new scenario: open a discussion describing the world / entities / actions / what dynamics you want to study
  • Contribute code: read AGENTS.md first. It lists the architectural rules that any contributor (human or AI) must follow

License

MIT

Acknowledgments

The architectural discipline of this project is heavily influenced by:

  • Layered orchestration patterns from compiler design and game engines
  • Append-only event logs from event sourcing and CQRS
  • Multi-agent decision protocols from contemporary LLM agent frameworks (AgentVerse, AutoGen, CrewAI)
  • Configuration validation from pydantic and JSON Schema
