MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization Paper • 2503.16874 • Published Mar 21, 2025
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning Paper • 2505.23380 • Published May 29, 2025
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning Paper • 2505.23754 • Published May 29, 2025
Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence Paper • 2505.20325 • Published May 23, 2025
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders Paper • 2503.18878 • Published Mar 24, 2025
Pre-trained Large Language Models Learn Hidden Markov Models In-context Paper • 2506.07298 • Published Jun 8, 2025
Give Me FP32 or Give Me Death? Challenges and Solutions for Reproducible Reasoning Paper • 2506.09501 • Published Jun 11, 2025
Evolving Prompts In-Context: An Open-ended, Self-replicating Perspective Paper • 2506.17930 • Published Jun 22, 2025
Dynamic Chunking for End-to-End Hierarchical Sequence Modeling Paper • 2507.07955 • Published Jul 10, 2025
Self-Tuning: Instructing LLMs to Effectively Acquire New Knowledge through Self-Teaching Paper • 2406.06326 • Published Jun 10, 2024
Enhancing Online Reinforcement Learning with Meta-Learned Objective from Offline Data Paper • 2501.07346 • Published Jan 13, 2025
Filtering Learning Histories Enhances In-Context Reinforcement Learning Paper • 2505.15143 • Published May 21, 2025
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published Jul 23, 2025
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models Paper • 2507.14241 • Published Jul 17, 2025
Implicit Actor Critic Coupling via a Supervised Learning Framework for RLVR Paper • 2509.02522 • Published Sep 2, 2025
VLA-Adapter: An Effective Paradigm for Tiny-Scale Vision-Language-Action Model Paper • 2509.09372 • Published Sep 11, 2025
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published Nov 19, 2025