P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published Nov 17, 2025 • 134
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning Paper • 2510.27492 • Published Oct 30, 2025 • 82
TrustJudge: Inconsistencies of LLM-as-a-Judge and How to Alleviate Them Paper • 2509.21117 • Published Sep 25, 2025 • 29
Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Delibration Paper • 2509.14760 • Published Sep 18, 2025 • 53
Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published Jun 30, 2025 • 89
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction Paper • 2507.02025 • Published Jul 2, 2025 • 35
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28, 2025 • 131
Scaling Reasoning, Losing Control: Evaluating Instruction Following in Large Reasoning Models Paper • 2505.14810 • Published May 20, 2025 • 62
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29, 2025 • 98 • 15