SpectralPO
community
AI & ML interests
None defined yet.
Recent Activity
PeterLauLukCh authored a paper about 18 hours ago: Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
PeterLauLukCh authored a paper about 18 hours ago: GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators
PeterLauLukCh submitted a paper 7 days ago: Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward
Team members: 3
SpectralPO's models: 27
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ-Ablation • 8B • Updated Jul 19 • 10
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-GRPO • Updated Jul 19
SpectralPO/DeepSeek-R1-Distill-Qwen-32B-SPO • Updated Jul 19
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-Qwen3-235B • 8B • Updated Jul 18 • 8
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-QwQ • 8B • Updated Jul 18 • 8
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO-DeepSeek-V3 • 8B • Updated Jul 16 • 12 • 1
SpectralPO/DeepSeek-R1-Distill-Llama-8B-SPO • 8B • Updated May 18 • 5
SpectralPO/DeepSeek-R1-Distill-Llama-8B-GRPO • 8B • Updated May 18 • 9
SpectralPO/Qwen2.5-32B-Instruct-GRPO • 33B • Updated May 13 • 7
SpectralPO/Qwen2.5-32B-Instruct-SPO • 33B • Updated May 13 • 6
SpectralPO/32B-SPO-GRPO-mixed • 33B • Updated May 13 • 6
SpectralPO/DeepSeek-R1-Distill-Qwen-14B-GRPO • 15B • Updated May 12 • 9
SpectralPO/DeepSeek-R1-Distill-Qwen-SPO • 15B • Updated May 12 • 5
SpectralPO/Qwen2.5-14B-Instruct-SPO • 15B • Updated May 9 • 5
SpectralPO/Qwen2.5-14B-Instruct-GRPO • 15B • Updated May 9 • 5
SpectralPO/extraSPO • 8B • Updated May 2 • 5
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-SPO • 8B • Updated May 2 • 7
SpectralPO/DeepSeek-R1-Distill-Qwen-7B-GRPO • 8B • Updated May 2 • 7
SpectralPO/Qwen2.5-7B-Instruct-N1 • 8B • Updated Apr 27 • 8
SpectralPO/Qwen2.5-7B-Instruct-SPO • 8B • Updated Apr 27 • 5
SpectralPO/Qwen2.5-7B-Instruct-GRPO • 8B • Updated Apr 27 • 9
SpectralPO/Qwen2.5-14B-Instruct-pos • 15B • Updated Apr 18 • 8
SpectralPO/Qwen2.5-14B-Instruct-neg • 15B • Updated Apr 18 • 9
SpectralPO/Qwen2.5-32B-Instruct-pos • 33B • Updated Apr 16 • 10
SpectralPO/Qwen2.5-32B-Instruct-neg-2 • 33B • Updated Apr 15 • 8
SpectralPO/Qwen2.5-32B-Instruct-neg • 33B • Updated Apr 13 • 7
SpectralPO/s1K-7B-RSPO-neg • 8B • Updated Apr 12 • 6