Meituan Open-Sources 560-Billion Parameter AI Model for Mathematical Proofs
Chinese tech giant Meituan has open-sourced 'LongCat-Flash-Prover,' a 567.7-billion parameter Mixture-of-Experts (MoE) model designed to solve complex mathematical proof problems. The model reportedly sets new state-of-the-art (SOTA) records on two key benchmarks.
Meituan has publicly released a new large language model specifically engineered for formal mathematical reasoning. The model, named LongCat-Flash-Prover, boasts 567.7 billion parameters and utilizes a Mixture-of-Experts (MoE) architecture.
Core Innovation: Hybrid-Experts Iteration Framework
The model's primary technical contribution is its hybrid-experts iteration framework. This framework is designed to generate large-scale, high-quality formal reasoning trajectories. To combat the common AI issue of "hallucination" (generating plausible but incorrect information), the model integrates Lean4 and employs a multi-stage, Abstract Syntax Tree (AST)-based strict verification process.
Advanced Training and Stability Mechanisms
Training the massive MoE model presented significant challenges. Meituan's team used the hybrid-experts framework to generate cold-start data. During the reinforcement learning phase, they introduced the HisPO algorithm to stabilize the model's training on long-range, complex tasks.
A critical addition was a theorem consistency and legality detection mechanism. This is intended to prevent "Reward Hacking," a phenomenon where an AI model learns to exploit flaws in its reward system to achieve high scores without genuinely solving the intended problem.
Benchmark Performance: New SOTA Achieved
Initial benchmark results are impressive, claiming new state-of-the-art (SOTA) levels:
- MiniF2F-Test: Achieved a 97.1% success rate, solving problems with an average of only 72 reasoning attempts.
- PutnamBench: Solved 41.5% of problems, using an average of 118 reasoning attempts.
Open-Source Availability
Meituan has made the model fully accessible to the research community. The complete code and model weights are available on:
- GitHub: https://github.com/meituan-longcat/LongCat-Flash-Prover
- Hugging Face: https://huggingface.co/meituan-longcat/LongCat-Flash-Prover
The release positions LongCat-Flash-Prover as a significant tool for advancing automated theorem proving and formal verification, key areas at the intersection of AI and mathematics.