OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value Paper • 2512.14051 • Published 21 days ago • 40
REST: Stress Testing Large Reasoning Models by Asking Multiple Problems at Once Paper • 2507.10541 • Published Jul 14, 2025 • 29
InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning Paper • 2506.12411 • Published Jun 14, 2025
ScaleDiff: Scaling Difficult Problems for Advanced Mathematical Reasoning Paper • 2509.21070 • Published Sep 25, 2025 • 9
Scaling Code-Assisted Chain-of-Thoughts and Instructions for Model Reasoning Paper • 2510.04081 • Published Oct 5, 2025 • 23
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published Jul 23, 2025 • 36
CipherBank: Exploring the Boundary of LLM Reasoning Capabilities through Cryptography Challenges Paper • 2504.19093 • Published Apr 27, 2025 • 18
A Strategic Coordination Framework of Small LLMs Matches Large LLMs in Data Synthesis Paper • 2504.12322 • Published Apr 11, 2025 • 28
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs Paper • 2503.17439 • Published Mar 21, 2025 • 15
MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion Paper • 2503.16212 • Published Mar 20, 2025 • 25
MetaLadder: Ascending Mathematical Solution Quality via Analogical-Problem Reasoning Transfer Paper • 2503.14891 • Published Mar 19, 2025 • 22