ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration • Paper • 2511.21689 • Published Nov 26 • 109
Guided Self-Evolving LLMs with Minimal Human Supervision • Paper • 2512.02472 • Published 26 days ago • 50
The End of Manual Decoding: Towards Truly End-to-End Language Models • Paper • 2510.26697 • Published Oct 30 • 116
HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application • Paper • 2510.19631 • Published Oct 22 • 27
DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking • Paper • 2510.20168 • Published Oct 23 • 27
Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents • Paper • 2510.14438 • Published Oct 16 • 13
NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents • Paper • 2510.07172 • Published Oct 8 • 28
CLUE: Non-parametric Verification from Experience via Hidden-State Clustering • Paper • 2510.01591 • Published Oct 2 • 27
Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training • Paper • 2508.00414 • Published Aug 1 • 93
WebEvolver: Enhancing Web Agent Self-Improvement with Coevolving World Model • Paper • 2504.21024 • Published Apr 23 • 2
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression • Paper • 2412.17483 • Published Dec 23, 2024 • 34
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization • Paper • 2410.19609 • Published Oct 25, 2024 • 18