Models: 22,697

Active filters: grpo

emre/Qwen-0.5B-GRPO

Text Generation • Updated Feb 3, 2025 • 25

peulsilva/reasoning-qwen-epoch0

Text Generation • 0.5B • Updated Feb 3, 2025 • 2

peulsilva/reasoning-qwen-epoch1

Text Generation • 0.5B • Updated Feb 3, 2025 • 2

spinech/qwen2.5-3b-r1-arc-train-synthetic

Text Generation • 3B • Updated Feb 4, 2025 • 5

peulsilva/reasoning-qwen-epoch2

Text Generation • 0.5B • Updated Feb 3, 2025 • 3

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math

Text Generation • 8B • Updated Feb 4, 2025 • 7

Dongwei/Qwen-2.5-7B_Math

Text Generation • 8B • Updated Feb 4, 2025 • 2

Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math

Text Generation • 2B • Updated Feb 3, 2025 • 2

Dongwei/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math

Text Generation • 2B • Updated Feb 3, 2025 • 2

peulsilva/reasoning-qwen-epoch3

Text Generation • 0.5B • Updated Feb 3, 2025 • 2

mradermacher/DeepSeek-R1-Distill-Qwen-7B-GRPO-GGUF

8B • Updated Feb 4, 2025 • 198

skzxjus/Qwen2.5-7B-Open-R1-GRPO

Text Generation • 8B • Updated Feb 8, 2025 • 6

AndreasX1206/Qwen2-0.5B-countdown

Text Generation • 0.5B • Updated Feb 4, 2025 • 3

mradermacher/Qwen-0.5B-GRPO-GGUF

0.5B • Updated Feb 3, 2025 • 69

alicogniai/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • 2B • Updated Feb 16, 2025 • 3

ununtrium/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • 2B • Updated Feb 11, 2025 • 3

mradermacher/DeepSeek-R1-Distill-Qwen-7B-GRPO-i1-GGUF

8B • Updated Feb 4, 2025 • 324

yuta0x89/llmjp13b-numinacot-epoch2-GRPO

Text Generation • 14B • Updated Feb 11, 2025 • 7

yeshsurya/Qwen2.5-7B-Math-with_50stepGRPO

Text Generation • 8B • Updated Feb 12, 2025 • 9

mradermacher/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math-GGUF

2B • Updated Feb 4, 2025 • 107

mradermacher/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math-GGUF

8B • Updated Feb 4, 2025 • 127

mradermacher/DeepSeek-R1-Qwen-2.5-1.5b-Latest-Unstructured-To-Structured-GGUF

2B • Updated Feb 4, 2025 • 130 • 1

hyunw3/qwen-2.5-0.5b-r1-countdown_lr5e-6

Text Generation • 0.5B • Updated Jun 3, 2025 • 4

khuang2/qwen-2.5-3b-r1-countdown

Text Generation • 3B • Updated Feb 5, 2025 • 7 • 2

spinech/qwen2.5-3b-r1-arc-train-thinker

Text Generation • 3B • Updated Feb 5, 2025 • 5 • 1

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math_lowlr

Text Generation • 8B • Updated Feb 4, 2025 • 2

Dongwei/Qwen-2.5-7B_Math_smalllr

Text Generation • 8B • Updated Feb 4, 2025 • 6

Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math_smalllr

Text Generation • 2B • Updated Feb 4, 2025 • 3

Dongwei/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math_smalllr

Text Generation • 2B • Updated Feb 4, 2025 • 3

mradermacher/Qwen2.5-1.5B-Thinking-v1.1-GGUF

2B • Updated Feb 4, 2025 • 112 • 2