A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published Oct 27 • 65
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published Oct 7 • 141
Thinking with Camera: A Unified Multimodal Model for Camera-Centric Understanding and Generation Paper • 2510.08673 • Published Oct 9 • 125
Lumine: An Open Recipe for Building Generalist Agents in 3D Open Worlds Paper • 2511.08892 • Published Nov 12 • 197
GeoVista: Web-Augmented Agentic Visual Reasoning for Geolocalization Paper • 2511.15705 • Published about 1 month ago • 92
SignRoundV2: Closing the Performance Gap in Extremely Low-Bit Post-Training Quantization for LLMs Paper • 2512.04746 • Published 16 days ago • 13
UniUGP: Unifying Understanding, Generation, and Planing For End-to-end Autonomous Driving Paper • 2512.09864 • Published 10 days ago • 10
Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning Paper • 2512.07461 • Published 12 days ago • 73
N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models Paper • 2512.16561 • Published 2 days ago • 17