Pham Minh Tuan

1TuanPham

vTuanPham

AI & ML interests

None yet

Recent Activity

liked a model 4 days ago

allura-forge/Llama-3.3-8B-Instruct

liked a model 11 days ago

apple/Sharp

liked a model 11 days ago

facebook/sam-audio-large

upvoted a collection 16 days ago

T5Gemma 2

3 items • Updated 17 days ago • 57

upvoted a collection 18 days ago

sam-audio

11 items • Updated 19 days ago • 108

upvoted a paper 25 days ago

DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding

Paper • 2411.14347 • Published Nov 21, 2024 • 16

upvoted 2 collections about 1 month ago

Mistral Large 3

A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated Dec 2, 2025 • 81

Olmo 3

Artifacts for the Olmo 3 release. • 9 items • Updated 12 days ago • 156

upvoted a collection about 2 months ago

SAM3

5 items • Updated Nov 19, 2025 • 110

upvoted a collection 2 months ago

gpt-oss-safeguard

gpt-oss-safeguard-120b and gpt-oss-safeguard-20b are safety reasoning models built upon gpt-oss • 2 items • Updated Oct 29, 2025 • 58

upvoted a paper 2 months ago

Skyfall-GS: Synthesizing Immersive 3D Urban Scenes from Satellite Imagery

Paper • 2510.15869 • Published Oct 17, 2025 • 48

upvoted a collection 4 months ago

Qwen3-Next

4 items • Updated 4 days ago • 171

upvoted a collection 5 months ago

DINOv3

DINOv3: foundation models producing excellent dense features, outperforming SotA w/o fine-tuning - https://arxiv.org/abs/2508.10104 • 13 items • Updated Aug 21, 2025 • 436

upvoted a collection 6 months ago

Gemma 3n

4 items • Updated Jul 10, 2025 • 255

upvoted a collection 8 months ago

Perception Encoder

17 items • Updated Jul 11, 2025 • 73

upvoted 2 collections 9 months ago

Describe Anything

Multimodal Large Language Models for Detailed Localized Image and Video Captioning • 7 items • Updated 12 days ago • 61

HiDream-I1

A collection of HiDream-I1 models. • 4 items • Updated Apr 8, 2025 • 32

upvoted a paper 9 months ago

Gemini Robotics: Bringing AI into the Physical World

Paper • 2503.20020 • Published Mar 25, 2025 • 29

upvoted 2 collections 10 months ago

💫StarVector Models

StarVector is a multimodal LLM for Scalable Vector Graphics (SVG) generation, producing structured SVG code directly from images and text. • 2 items • Updated Mar 20, 2025 • 96

Gemma 3 Release

28 items • Updated Aug 11, 2025 • 577

upvoted a collection 11 months ago

PaliGemma 2 Mix

13 items • Updated Jul 10, 2025 • 64

upvoted a paper 11 months ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published Feb 14, 2025 • 55

upvoted a collection 11 months ago

Qwen2.5-1M

The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 4 days ago • 126