X-Stream: Exploring MLLMs as Multiplexers for Multi-Stream Understanding Paper • 2606.02482 • Published 8 days ago • 35
Crafter: A Multi-Agent Harness for Editable Scientific Figure Generation from Diverse Inputs Paper • 2605.30611 • Published 12 days ago • 192
Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models Paper • 2605.28132 • Published 13 days ago • 25
SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation Paper • 2605.22536 • Published 19 days ago • 28
WorldKV: Efficient World Memory with World Retrieval and Compression Paper • 2605.22718 • Published 19 days ago • 41
Stop When Reasoning Converges: Semantic-Preserving Early Exit for Reasoning Models Paper • 2605.17672 • Published 23 days ago • 22
Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Tool Use Paper • 2605.14038 • Published 27 days ago • 15
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 22 days ago • 78
KVPO: ODE-Native GRPO for Autoregressive Video Alignment via KV Semantic Exploration Paper • 2605.14278 • Published 26 days ago • 37
LongLive-2.0: An NVFP4 Parallel Infrastructure for Long Video Generation Paper • 2605.18739 • Published 22 days ago • 112
Edit-Compass & EditReward-Compass: A Unified Benchmark for Image Editing and Reward Modeling Paper • 2605.13062 • Published 27 days ago • 33
MinT: Managed Infrastructure for Training and Serving Millions of LLMs Paper • 2605.13779 • Published 27 days ago • 219
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation Paper • 2605.13724 • Published 27 days ago • 101
Qianfan-OCR: A Unified End-to-End Model for Document Intelligence Paper • 2603.13398 • Published Mar 11 • 155
Kinema4D: Kinematic 4D World Modeling for Spatiotemporal Embodied Simulation Paper • 2603.16669 • Published Mar 17 • 70
TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas Paper • 2603.16448 • Published Mar 17 • 58
Deep Forcing: Training-Free Long Video Generation with Deep Sink and Participative Compression Paper • 2512.05081 • Published Dec 4, 2025 • 33