henryL7/tulu-qwen3-14b-pointwise-no-think-distilled-kl-0.0_tulu3-grpo_Qwen3-4B-Instruct-step1300 Text Generation • 4B • Updated 11 days ago • 19
henryL7/tulu-qwen3-14b-pointwise-no-think-distilled-kl-0.0_tulu3-grpo_Qwen2.5-7B-Instruct-step700 Text Generation • 8B • Updated 11 days ago • 15
henryL7/tulu-qwen3-14b-pointwise-no-think-distilled-kl-0.0_tulu3-grpo_Llama-3.1-8B-Instruct-step1000 Text Generation • 8B • Updated 11 days ago • 16
henryL7/tulu-qwen3-4b-pointwise-grpo-kl-0.0_tulu3-grpo_Qwen3-4B-Instruct-step1100 Text Generation • 4B • Updated 11 days ago • 16
henryL7/tulu-qwen3-4b-pointwise-grpo-kl-0.0_tulu3-grpo_Qwen2.5-7B-Instruct-step800 Text Generation • 8B • Updated 11 days ago • 19
henryL7/tulu-qwen3-4b-pointwise-grpo-kl-0.0_tulu3-grpo_Llama-3.1-8B-Instruct-step800 Text Generation • 8B • Updated 11 days ago • 17
henryL7/tulu-qwen3-4b-pointwise-distilled-grpo-kl-0.0_tulu3-grpo_Qwen3-4B-Instruct-step1300 Text Generation • 4B • Updated 11 days ago • 13
henryL7/tulu-qwen3-4b-pointwise-distilled-grpo-kl-0.0_tulu3-grpo_Qwen2.5-7B-Instruct-step1100 Text Generation • 8B • Updated 11 days ago • 13
henryL7/tulu-qwen3-4b-pointwise-distilled-grpo-kl-0.0_tulu3-grpo_Llama-3.1-8B-Instruct-step1100 Text Generation • 8B • Updated 11 days ago • 17
henryL7/gpt-oss-120b-qwen3-8b-pairwise-no-think-distill Text Generation • 8B • Updated 26 days ago • 27