Multimodal
MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization
Paper • 2510.08540 • Published • 109
Diffusion Transformers with Representation Autoencoders
Paper • 2510.11690 • Published • 165
Spotlight on Token Perception for Multimodal Reinforcement Learning
Paper • 2510.09285 • Published • 36
Towards Mixed-Modal Retrieval for Universal Retrieval-Augmented Generation
Paper • 2510.17354 • Published • 33
RL makes MLLMs see better than SFT
Paper • 2510.16333 • Published • 48
ThinkMorph: Emergent Properties in Multimodal Interleaved Chain-of-Thought Reasoning
Paper • 2510.27492 • Published • 82
Visual Representation Alignment for Multimodal Large Language Models
Paper • 2509.07979 • Published • 83
Kwai Keye-VL 1.5 Technical Report
Paper • 2509.01563 • Published • 37
SAM 3: Segment Anything with Concepts
Paper • 2511.16719 • Published • 125
Self-Improving VLM Judges Without Human Annotations
Paper • 2512.05145 • Published • 18