rubricrm/rubric_rm_qwen2.5_7B_LR1.0e-6_filtered_sky_code_8k_math_10k_rubric_evidence_classify_4k4k_PPO Updated Apr 18, 2025
rubricrm/qwen2.5_7B_LR5.0e-6_evidence_rubric_4k4k_separate_reward_function_largeBz Updated Apr 9, 2025
Large Language Models on Graphs: A Comprehensive Survey Paper • 2312.02783 • Published Dec 5, 2023 • 2
Towards Unified Multi-Modal Personalization: Large Vision-Language Models for Generative Recommendation and Beyond Paper • 2403.10667 • Published Mar 15, 2024 • 1
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning Paper • 2503.09516 • Published Mar 12, 2025 • 36