Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning Paper • 2512.05591 • Published Dec 5, 2025 • 16
Klear-AgentForge Collection Effective supervised fine-tuning (SFT) with synthetic data followed by multi-turn reinforcement learning (RL) for boosting agentic models. • 3 items • Updated Nov 13, 2025 • 3
mini-swe-agent-plus Collection A collection of mini-swe-agent-plus and corresponding rollout traces that drive Qwen3-8B to a 39% solve rate on SWE-bench Verified. Enjoy! • 2 items • Updated Nov 12, 2025
Kwai-Klear/SWE-smith-mini_swe_agent_plus-trajectories-66k Viewer • Updated Nov 6, 2025 • 66k • 678 • 9