SmolVLM2-2.2B-Instruct GGUF

GGUF quantizations of HuggingFaceTB/SmolVLM2-2.2B-Instruct for use with llama.cpp and Ollama.

Model Description

SmolVLM2 is a compact 2.2B-parameter vision-language model from Hugging Face with video understanding capabilities. It is designed to run quickly and efficiently while maintaining strong performance on vision-language tasks.

Key Features

  • Compact & Fast - Only 2.2B parameters, runs efficiently on consumer hardware
  • Vision & Video - Understands both images and video frames
  • Instruction-tuned - Optimized for following user instructions
  • Apache 2.0 - Fully open source

Available Quantizations

Filename                              Quant    Size     Description
SmolVLM2-2.2B-Instruct-Q4_K_M.gguf    Q4_K_M   1.0 GB   Best balance of quality and speed (recommended)
SmolVLM2-2.2B-Instruct-Q8_0.gguf      Q8_0     1.8 GB   Higher quality
SmolVLM2-2.2B-Instruct.gguf           F16      3.4 GB   Unquantized 16-bit float weights

Usage

With Ollama

# Pull and run (Q4_K_M by default)
ollama run richardyoung/smolvlm2-2.2b-instruct

# Or specific quantization
ollama run richardyoung/smolvlm2-2.2b-instruct:q8_0
ollama run richardyoung/smolvlm2-2.2b-instruct:f16
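
For scripted use, the same model can be queried through Ollama's REST API instead of the interactive prompt. The following is a minimal sketch, not a definitive client: it assumes an Ollama server on the default port 11434, that the pulled model accepts image input, and that the prompt contains no double quotes or backslashes (the hypothetical helper `build_payload` does no JSON escaping).

```shell
# Sketch: build a JSON body for Ollama's /api/generate endpoint with one image.
# build_payload is an illustrative helper name, not part of Ollama.
build_payload() {
  local model="$1" prompt="$2" image_path="$3"
  local image_b64
  # Base64-encode the image and strip line wraps so it fits in one JSON string.
  image_b64="$(base64 < "$image_path" | tr -d '\n')"
  # Assumption: model and prompt need no JSON escaping.
  printf '{"model":"%s","prompt":"%s","images":["%s"],"stream":false}' \
    "$model" "$prompt" "$image_b64"
}

# Example call (requires a running Ollama server):
# build_payload richardyoung/smolvlm2-2.2b-instruct "Describe this image." your_image.jpg \
#   | curl -s http://localhost:11434/api/generate -d @-
```

Piping the payload into curl with `-d @-` avoids placing a large base64 string on the command line.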

With llama.cpp

# Download a quantization
wget https://huggingface.co/richardyoung/SmolVLM2-2.2B-Instruct-GGUF/resolve/main/SmolVLM2-2.2B-Instruct-Q4_K_M.gguf

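# Run with llama.cpp (image input needs a build with multimodal support; recent builds
# expose it through llama-mtmd-cli together with a --mmproj projector file)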
./llama-cli -m SmolVLM2-2.2B-Instruct-Q4_K_M.gguf -p "Describe this image:" --image your_image.jpg
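
The download step can be generalized to the other quantizations listed in the table. The sketch below uses a hypothetical helper, `gguf_url`, and assumes the Q8_0 and F16 files live under the same `resolve/main` path as the Q4_K_M file shown above; only the Q4_K_M URL is taken directly from this page.

```shell
# Sketch: map a quantization tag to its download URL.
# gguf_url is an illustrative helper; the Q8_0 and F16 URLs are assumed to
# follow the same pattern as the documented Q4_K_M URL.
gguf_url() {
  local quant="$1" base file
  base="https://huggingface.co/richardyoung/SmolVLM2-2.2B-Instruct-GGUF/resolve/main"
  case "$quant" in
    Q4_K_M|Q8_0) file="SmolVLM2-2.2B-Instruct-${quant}.gguf" ;;
    F16)         file="SmolVLM2-2.2B-Instruct.gguf" ;;  # F16 file has no quant suffix
    *) echo "unknown quantization: $quant" >&2; return 1 ;;
  esac
  printf '%s/%s\n' "$base" "$file"
}

# Example: wget "$(gguf_url Q8_0)"
```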

Technical Requirements

  • Minimum: 4 GB RAM, any modern CPU
  • Recommended: 8 GB RAM or Apple Silicon Mac

Chat Template

SmolVLM2 uses the ChatML format:

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{user_message}<|im_end|>
<|im_start|>assistant
{assistant_response}<|im_end|>
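
When a raw prompt has to be assembled by hand, the template above can be filled in with a small shell function. This is a sketch under the assumption that the template shown here is what the model expects; `format_chatml` is a hypothetical helper name. It stops after the opening assistant tag so the model generates the response itself.

```shell
# Sketch: print a ChatML prompt that ends at the start of the assistant turn.
# format_chatml is an illustrative helper, not part of llama.cpp or Ollama.
format_chatml() {
  local system_message="$1" user_message="$2"
  printf '<|im_start|>system\n%s<|im_end|>\n' "$system_message"
  printf '<|im_start|>user\n%s<|im_end|>\n' "$user_message"
  # Leave the assistant turn open for the model to complete.
  printf '<|im_start|>assistant\n'
}

# Example: format_chatml "You are a helpful assistant." "Describe this image."
```

Ollama and recent llama.cpp builds normally apply the chat template automatically, so manual formatting like this is mainly useful for raw completion endpoints or debugging.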

Credits

  • Original Model: Hugging Face (HuggingFaceTB/SmolVLM2-2.2B-Instruct)
  • Quantization: Richard Young (deepneuro.ai)

License

Apache 2.0
