SkillMimic-V2: Learning Robust and Generalizable Interaction Skills from Sparse and Noisy Demonstrations Paper • 2505.02094 • Published May 4, 2025 • 19
Hugging Face Changelog Service Accounts for Enterprise organizations • 16 days ago • 131
PhysiFormer: Learning to Simulate Mechanics in World Space Paper • 2606.27364 • Published 4 days ago • 9
Article Introducing the FFASR Leaderboard: Benchmarking ASR in the Real World +3 daniel-treble, whojavumusic, alessia-treble, georg-goetz, bezzam • 5 days ago • 5
Beyond NL2Code: A Structured Survey of Multimodal Code Intelligence Paper • 2606.15932 • Published 13 days ago • 38
MemGUI-Agent: An End-to-End Long-Horizon Mobile GUI Agent with Proactive Context Management Paper • 2606.19926 • Published 11 days ago • 42
SWE-agent-LM Collection A collection of language models trained on SWE-smith + (mini-)SWE-agent for SWE-bench tasks • 3 items • Updated Dec 14, 2025 • 3
SWE-smith Collection SWE-smith datasets of task instances for different programming languages • 9 items • Updated Mar 9 • 4
SWE-bench Collection SWE-bench (Lite, Verified, Multimodal, Multilingual) all in one place! • 5 items • Updated Dec 14, 2025 • 10
EnterpriseClawBench: Benchmarking Agents from Real Workplace Sessions Paper • 2606.23654 • Published 7 days ago • 79
Qwen-AgentWorld: Language World Models for General Agents Paper • 2606.24597 • Published 6 days ago • 137
Multi-LCB: Extending LiveCodeBench to Multiple Programming Languages Paper • 2606.20517 • Published 11 days ago • 60
🍎 Qwopus3.6 Collection This collection features the advanced Qwopus3.6 series of multimodal large models, which are fine-tuned from the Qwen3.6 base models with a focus on e • 10 items • Updated May 23 • 70
🚀 Qwen-MTP Collection ⚡ MTP (Multi Token Prediction) speculative decoding enables models like Qwen3.6 to have ~1.4-2.2x faster generation with no change in accuracy. • 8 items • Updated 9 days ago • 29