MiniMax Unveils M2.7 AI Model with Pioneering 'Self-Evolution' Capabilities
Chinese AI firm MiniMax has launched its new flagship Agent model, M2.7, introducing what it calls a 'self-evolution' path in which the model participates in its own training and optimization. MiniMax reports scores on software engineering and office productivity benchmarks that it says are competitive with leading models from other labs.
MiniMax Claims Breakthrough in AI Self-Improvement with M2.7 Release
AI company MiniMax has launched its latest flagship Agent model, M2.7, and with it what the company calls a "model self-evolution" pathway. The core of the approach is an "Agent Harness" system that lets the model participate in its own training and optimization loop, which the company frames as a shift toward more autonomous AI development.
The Self-Evolution Mechanism
According to MiniMax, M2.7 can construct complex Agent Harness frameworks itself. During its own development, the model was reportedly used to build and update dozens of complex skills within a reinforcement learning harness, drive its own learning processes, and then optimize those processes based on the results.
In internal research and development work, the company states that M2.7 can handle approximately 30% to 50% of the workflow. In one cited experiment, M2.7 autonomously ran more than 100 iteration cycles to optimize a software engineering scaffold. In each cycle it analyzed failures, planned changes, modified code, and evaluated the results. MiniMax claims the process yielded a 30% performance improvement on an internal benchmark.
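The analyze, plan, modify, evaluate cycle described above can be illustrated with a minimal sketch. This is not MiniMax's harness, whose internals the company has not detailed here. It is a toy hill-climbing loop under stated assumptions: the scaffold is reduced to a small settings dictionary, and `evaluate`, `propose_change`, the setting names, and the target values are all hypothetical stand-ins for a real benchmark run and a real model-driven code change.

```python
import random
from dataclasses import dataclass, field


@dataclass
class LoopResult:
    best_config: dict
    best_score: float
    history: list = field(default_factory=list)


def evaluate(config):
    """Hypothetical stand-in for a benchmark run: higher is better, 1.0 is the maximum."""
    target = {"retries": 3, "context_files": 8}  # made-up optimum for the toy
    penalty = sum(abs(config[k] - target[k]) for k in target)
    return 1.0 / (1.0 + penalty)


def propose_change(config, rng):
    """Hypothetical stand-in for 'plan and modify': nudge one setting by +/-1."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    candidate[key] = max(0, candidate[key] + rng.choice([-1, 1]))
    return candidate


def optimize(initial, iterations=100, seed=0):
    """Run a fixed number of cycles, keeping a change only if the score improves."""
    rng = random.Random(seed)
    best, best_score = dict(initial), evaluate(initial)
    history = []
    for _ in range(iterations):
        candidate = propose_change(best, rng)  # plan + modify
        score = evaluate(candidate)            # evaluate
        if score > best_score:                 # accept only improvements
            best, best_score = candidate, score
        history.append(best_score)
    return LoopResult(best, best_score, history)


initial_config = {"retries": 0, "context_files": 2}
result = optimize(initial_config)
print(result.best_config, round(result.best_score, 3))
```

Accepting only improving changes keeps the recorded best score from ever decreasing, which is the simplest way to make a long unattended run safe to leave alone. A real system would replace the random nudge with failure analysis and targeted edits.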
Benchmark Performance: Rivaling Top Models
MiniMax released a series of benchmark scores positioning M2.7 against leading global counterparts:
- SWE-Pro (Software Engineering): M2.7 achieved a 56.22% success rate, which MiniMax says is on par with GPT-5.3-Codex.
- VIBE-Pro (Repo-Level Code Generation): M2.7 scored 55.6%, which MiniMax says nearly matches Anthropic's Opus 4.6.
- Terminal Bench 2 (System Understanding): Achieved 57.0%.
- GDPval-AA (Professional Knowledge): Attained an Elo rating of 1495, which MiniMax claims is the highest among open-source models.
The model also performed strongly on MLE Bench Lite, averaging a 66.6% medal rate, reportedly tying Google's Gemini 3.1 and trailing Opus 4.6 and GPT-5.4.
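For readers unfamiliar with Elo-style scores such as the GDPval-AA figure above, the sketch below shows how a rating gap translates into an expected head-to-head win rate under the standard Elo formula. Two things here are assumptions and not from MiniMax's announcement: that GDPval-AA uses the conventional 400-point logistic scale, and the comparison rating of 1395, which is purely hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expected score of A against B (assumes the 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


m27_rating = 1495    # Elo figure reported by MiniMax for GDPval-AA
other_rating = 1395  # hypothetical comparison model, for illustration only

win_rate = expected_score(m27_rating, other_rating)
print(f"{win_rate:.2f}")  # about 0.64 for a 100-point gap
```

Under these assumptions, a 100-point lead corresponds to winning roughly 64% of pairwise comparisons, and equal ratings correspond to 50%.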
Expanded Applications: From Coding to Office Work
Beyond core coding, MiniMax highlighted M2.7's capabilities in professional office tasks and complex environment interaction:
- Software Engineering: The model is designed for real-world tasks like end-to-end project delivery, log analysis, bug troubleshooting, and code security. MiniMax claims it has helped reduce production system recovery times to under three minutes in some instances.
- Office Productivity: Enhancements were made for complex editing of Word, Excel, and PowerPoint files, supporting multi-round, high-fidelity modifications.
- Finance Analysis Example: In a demonstration, M2.7 autonomously read annual reports and analyst briefings, cross-referenced research, built revenue forecast models, and generated PowerPoint and Word reports. According to MiniMax, practitioners deemed the output usable as a first draft.
- Agent Interaction & Entertainment: The model features improved "identity preservation" and emotional intelligence for more natural interactions. MiniMax simultaneously open-sourced "OpenRoom," a prototype Web GUI framework for interactive AI agents, signaling a push beyond pure productivity into interactive entertainment scenarios.
Availability
The M2.7 model is now fully available on the MiniMax Agent platform and its open API service. The company is encouraging developers and users to explore its capabilities across the announced domains.