**QVAC Genesis II: Expanding the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for LLM Pre-training** about 19 hours ago
Announcing LiteCoder-Terminal: Lightweight Terminal Agents with <1k Synthesized Trajectories 2 days ago • 9
Introducing AutoBench 2.0: Our New Benchmarking Platform is Out Just in Time to Evaluate GPT 5.2. 3 days ago • 1
cua-bench: A Framework for Benchmarking, Training Data, and RL Environments for Computer-Use Agents 4 days ago • 6