QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models Paper • 2310.08041 • Published Oct 12, 2023
Lossy and Lossless (L$^2$) Post-training Model Size Compression Paper • 2308.04269 • Published Aug 8, 2023
From Markov to Laplace: How Mamba In-Context Learns Markov Chains Paper • 2502.10178 • Published Feb 14
Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers Paper • 2406.16450 • Published Jun 24, 2024