Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models • Paper • 2510.10964 • Published Oct 13 • 3
Devstral 2 • Collection • A couple of agentic LLMs for software engineering tasks, excelling at using tools to explore codebases, edit multiple files, and power SWE Agents. • 3 items • Updated 16 days ago • 37
Nemotron-Post-Training-v3 • Collection • Collection of datasets used in the post-training phase of Nemotron Nano v3. • 7 items • Updated 2 days ago • 49