InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14, 2025 • 306
MMHU: A Massive-Scale Multimodal Benchmark for Human Behavior Understanding Paper • 2507.12463 • Published Jul 16, 2025 • 26
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents Paper • 2507.04009 • Published Jul 5, 2025 • 51
SpeedSearcher-aicrowd/Llama-3.2-11B-Vision-Instruct-direct-v1 Image-to-Text • 11B • Updated Jun 12, 2025
SpeedSearcher-aicrowd/crag-mm-single-turn-public-query-cls Viewer • Updated Jun 10, 2025 • 3.87k • 7