Schoenfeld's Anatomy of Mathematical Reasoning by Language Models Paper • 2512.19995 • Published 15 days ago • 14
Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction Paper • 2512.18880 • Published 17 days ago • 24
V-REX: Benchmarking Exploratory Visual Reasoning via Chain-of-Questions Paper • 2512.11995 • Published 26 days ago • 9
Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs Paper • 2511.07419 • Published Nov 10, 2025 • 26
BLIP3o-NEXT: Next Frontier of Native Image Generation Paper • 2510.15857 • Published Oct 17, 2025 • 24
Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29, 2025 • 140
Understanding the Thinking Process of Reasoning Models: A Perspective from Schoenfeld's Episode Theory Paper • 2509.14662 • Published Sep 18, 2025 • 13
VisR-Bench: An Empirical Study on Visual Retrieval-Augmented Generation for Multilingual Long Document Understanding Paper • 2508.07493 • Published Aug 10, 2025 • 8
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs Paper • 2507.07996 • Published Jul 10, 2025 • 34
FaSTA^*: Fast-Slow Toolpath Agent with Subroutine Mining for Efficient Multi-turn Image Editing Paper • 2506.20911 • Published Jun 26, 2025 • 41
Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test Paper • 2506.21551 • Published Jun 26, 2025 • 28
Optimizing Length Compression in Large Reasoning Models Paper • 2506.14755 • Published Jun 17, 2025 • 10
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency Paper • 2506.08343 • Published Jun 10, 2025 • 54
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset Paper • 2505.09568 • Published May 14, 2025 • 98
GraphicBench: A Planning Benchmark for Graphic Design with Language Agents Paper • 2504.11571 • Published Apr 15, 2025 • 1
WALL-E 2.0: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents Paper • 2504.15785 • Published Apr 22, 2025 • 22
ColorBench: Can VLMs See and Understand the Colorful World? A Comprehensive Benchmark for Color Perception, Reasoning, and Robustness Paper • 2504.10514 • Published Apr 10, 2025 • 48
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning Paper • 2504.05520 • Published Apr 7, 2025 • 11