About SysMoBench
SysMoBench is a comprehensive benchmark for assessing AI's capacity to create formal specifications of complex, real-world computing systems. While recent advances in generative AI show promise in generating certain forms of specifications, existing work mostly targets small code, not complete systems. SysMoBench bridges this gap by focusing on concurrent and distributed systems using TLA+, the predominant specification language for these domains.
Key Features
Automated Evaluation
Four-phase assessment pipeline: syntax validation, runtime analysis, behavioral conformance, and property verification.
Real-World Systems
Nine carefully selected artifacts spanning synchronization primitives, consensus protocols, and distributed replication.
Extensible
Add new systems with only system code, trace instrumentation, and invariant templates — no reference models needed.