swe-bench pro across open-weight models hasn't moved in 10 weeks
mar 27th - jun 1st: 5 major open-weight releases, net progress on swe-bench pro +0.3%. scores dipped 5% and recovered within that window
looks like open-weight labs are optimizing hard for agent tasks and multimodality, but no longer for coding

Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
— MiniMax (official) (@MiniMax_AI) June 1, 2026
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
-… pic.twitter.com/TF891iJukF
Addy Crezee