lewtun (Lewis Tunstall)

upvoted 3 articles 1 day ago

Article

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

3 days ago

•

54

Article

Apriel-1.6-15b-Thinker: Cost-efficient Frontier Multimodal Performance

11 days ago

•

77

Article

Shadow AI - Where are the CIOs?

2 days ago

•

29

upvoted a collection 3 days ago

Nemotron-Cascade

Collection

Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 17 items • Updated 1 day ago • 32

upvoted an article 5 days ago

Article

Nemotron 3 Nano - A new Standard for Efficient, Open, and Intelligent Agentic Models

5 days ago

•

92

upvoted a paper 13 days ago

Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning

Paper • 2508.09726 • Published Aug 13 • 15

upvoted 2 articles 15 days ago

Article

Yay! Organizations can now publish blog Articles

Jan 20

•

53

Article

We Got Claude to Fine-Tune an Open Source LLM

17 days ago

•

520

upvoted 3 papers 18 days ago

Kimi K2: Open Agentic Intelligence

Paper • 2507.20534 • Published Jul 28 • 8

The BrowserGym Ecosystem for Web Agent Research

Paper • 2412.05467 • Published Dec 6, 2024 • 23

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Paper • 2511.07317 • Published Nov 10 • 13

upvoted an article 19 days ago

Article

Transformers v5: Simple model definitions powering the AI ecosystem

20 days ago

•

246

upvoted a paper 24 days ago

DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 138

upvoted an article 25 days ago

Article

Continuous batching from first principles

26 days ago

•

281

upvoted an article about 1 month ago

Article

Introducing Cogito v2.1

Nov 19

•

17

upvoted a collection about 1 month ago

Cogito v2.1

Collection

2 items • Updated Nov 19 • 14

upvoted an article about 2 months ago

Article

Ultra-Long Sequence Parallelism: Ulysses + Ring-Attention Technical Principles and Implementation

Sep 16

•

17

upvoted 2 papers about 2 months ago

An efficient probabilistic hardware architecture for diffusion-like models

Paper • 2510.23972 • Published Oct 28 • 3

Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning

Paper • 2510.25992 • Published Oct 29 • 45

upvoted an article about 2 months ago

Article

3+ Years of ML & Society at Hugging Face 🤗🤝🧑‍🤝‍🧑

Oct 29

•

13

Lewis Tunstall PRO

AI & ML interests

Organizations

