Trust Region Q Adjoint Matching
Paper • 2605.27079 • Published • 2
TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration
Mitigating Perceptual Judgment Bias in Multimodal LLM-as-a-Judge via Perceptual Perturbation and Reward Modeling