roTripathi committed
Commit b5903c7 · verified · 1 Parent(s): 1e356ab

Switch name to Molmo2-O-7B

Files changed (1):
- README.md +7 -7
README.md CHANGED

@@ -36,7 +36,7 @@ You can find all models in the Molmo2 family [here](https://huggingface.co/colle
 
 **Learn more** about the Molmo2 family [in our announcement blog post](https://allenai.org/blog/molmo2).
 
-Molmo2 7B is based on [Olmo3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) and uses [SigLIP 2](https://huggingface.co/google/siglip-so400m-patch14-384) as vision backbone.
+Molmo2-O-7B is based on [Olmo3-7B-Instruct](https://huggingface.co/allenai/Olmo-3-7B-Instruct) and uses [SigLIP 2](https://huggingface.co/google/siglip-so400m-patch14-384) as vision backbone.
 It outperforms others in the class of open weight and data models on short videos, counting, and captioning, and is competitive on long-videos.
 
 Ai2 is commited to open science. The Molmo2 datasets are available [here](https://huggingface.co/collections/allenai/molmo2-data).
@@ -63,7 +63,7 @@ pip install torch pillow einops torchvision accelerate decord2 molmo_utils
 from transformers import AutoProcessor, AutoModelForImageTextToText
 import torch
 
-model_id="allenai/Molmo2-7B"
+model_id="allenai/Molmo2-O-7B"
 
 # load the processor
 processor = AutoProcessor.from_pretrained(
@@ -122,7 +122,7 @@ import torch
 from molmo_utils import process_vision_info
 import re
 
-model_id="allenai/Molmo2-7B"
+model_id="allenai/Molmo2-O-7B"
 
 # load the processor
 processor = AutoProcessor.from_pretrained(
@@ -219,7 +219,7 @@ import torch
 from molmo_utils import process_vision_info
 import re
 
-model_id="allenai/Molmo2-7B"
+model_id="allenai/Molmo2-O-7B"
 
 # load the processor
 processor = AutoProcessor.from_pretrained(
@@ -315,7 +315,7 @@ import torch
 import requests
 from PIL import Image
 
-model_id="allenai/Molmo2-7B"
+model_id="allenai/Molmo2-O-7B"
 
 # load the processor
 processor = AutoProcessor.from_pretrained(
@@ -377,7 +377,7 @@ import re
 from PIL import Image
 import requests
 
-model_id="allenai/Molmo2-7B"
+model_id="allenai/Molmo2-O-7B"
 
 # load the processor
 processor = AutoProcessor.from_pretrained(
@@ -500,7 +500,7 @@ For details on the evals, refer to the main video results table in our [technica
 | VideoChat-Flash-7B | 56.1 |
 | Molmo2-4B | 62.8 |
 | Molmo2-8B | 63.1 |
-| **Molmo2-7B (this model)** | 59.7 |
+| **Molmo2-O-7B (this model)** | 59.7 |
 
 ## License and Use