Fairy2i: Training Complex LLMs from Real LLMs with All Parameters in {±1, ±i} Paper • 2512.02901 • Published 25 days ago • 5
Every Token Counts: Generalizing 16M Ultra-Long Context in Large Language Models Paper • 2511.23319 • Published 29 days ago • 22
Metis: Training Large Language Models with Advanced Low-Bit Quantization Paper • 2509.00404 • Published Aug 30 • 6
Jamba 1.7 Collection The AI21 Jamba family of models is a set of hybrid SSM-Transformer foundation models, blending speed, efficient long-context processing, and accuracy. • 4 items • Updated Jul 2 • 12
BitVLA Collection 1-bit Vision-Language-Action Models for Robotics Manipulation • 9 items • Updated Jun 30 • 3