	---
	license: gemma
	license_link: https://ai.google.dev/gemma/terms
	library_name: transformers
	pipeline_tag: image-text-to-text
	extra_gated_heading: Access Gemma on Hugging Face
	extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
	agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
	Face and click below. Requests are processed immediately.
	extra_gated_button_content: Acknowledge license
	base_model: google/gemma-3-4b-it
	base_model_relation: quantized
	---
	# gemma-3-4b-it-int4-cw-ov
	* Model creator: [google](https://huggingface.co/google)
	* Original model: [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)

	## Description
	This is the [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2025/documentation/openvino-ir-format.html) (Intermediate Representation) format, with weights compressed to INT4 by [NNCF](https://github.com/openvinotoolkit/nncf).

	> [!NOTE]
	> The model is optimized for inference on NPU, following these [instructions](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai/inference-with-genai-on-npu.html#export-an-llm-model-via-hugging-face-optimum-intel).
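
An OpenVINO IR model is stored on disk as an `.xml` file (network topology) paired with a `.bin` file (weights) of the same base name. As a rough sanity check on a local copy of this repository, the sketch below lists which `.xml` files have their `.bin` partner. The helper name `find_ir_pairs` is hypothetical and not part of any OpenVINO API.

```python
from pathlib import Path


def find_ir_pairs(model_dir):
    """Return (complete, missing_weights) lists of .xml file names in model_dir.

    An IR model is an .xml file plus a .bin file sharing the same base name.
    """
    complete, missing_weights = [], []
    for xml_file in sorted(Path(model_dir).glob("*.xml")):
        if xml_file.with_suffix(".bin").exists():
            complete.append(xml_file.name)
        else:
            missing_weights.append(xml_file.name)
    return complete, missing_weights
```

Calling `find_ir_pairs("gemma-3-4b-it-int4-cw-ov")` on the downloaded folder should ideally return an empty second list.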


	## Quantization Parameters

	Weight compression was performed using `nncf.compress_weights` with the following parameters:

	* mode: INT4_SYM
	* ratio: 1.0
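
To give a feel for what symmetric INT4 weight compression does, here is a simplified NumPy illustration. It is not NNCF's implementation: the scale formula (`max_abs / 7`) and the per-row (channel-wise) granularity are assumptions made for this sketch. The `cw` in the model name suggests channel-wise scales, but the parameters above do not state a group size. In NNCF, a `ratio` of 1.0 means that all eligible layers use the primary INT4 mode rather than a higher-precision backup mode.

```python
import numpy as np


def quantize_int4_sym(weights):
    """Symmetric 4-bit quantization with one scale per output channel (row)."""
    max_abs = np.max(np.abs(weights), axis=1, keepdims=True)
    # Use a scale of 1.0 for all-zero rows to avoid dividing by zero.
    scale = np.where(max_abs == 0, 1.0, max_abs / 7.0)
    # Signed 4-bit integers cover [-8, 7]; there is no zero point.
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from integers and per-row scales."""
    return q.astype(np.float32) * scale


rng = np.random.default_rng(0)
w = rng.normal(size=(4, 16)).astype(np.float32)
q, scale = quantize_int4_sym(w)
w_hat = dequantize(q, scale)
print("max abs reconstruction error:", np.max(np.abs(w - w_hat)))
```

With this scheme the reconstruction error of each weight is bounded by half of its row's scale.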

	## Compatibility

	The provided OpenVINO™ IR model is compatible with:

	* OpenVINO version 2025.4.0 and higher
	* Optimum Intel 1.27.0 and higher
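
A minimal sketch for checking an environment against these minimums, using only the standard library. It assumes the PyPI distribution names `openvino` and `optimum-intel`; the helper names are hypothetical. Only the leading numeric part of a version string is compared, so pre-release suffixes are ignored.

```python
import re
from importlib import metadata

# Minimum versions taken from the compatibility list in this card.
MINIMUMS = {"openvino": "2025.4.0", "optimum-intel": "1.27.0"}


def version_tuple(version):
    """Extract leading numeric components, e.g. '2025.4.0-123' -> (2025, 4, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed, minimum):
    """Compare numeric components, padding the shorter version with zeros."""
    a, b = version_tuple(installed), version_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_installed(minimums=MINIMUMS):
    """Map each distribution to True/False, or None if it is not installed."""
    report = {}
    for name, minimum in minimums.items():
        try:
            report[name] = meets_minimum(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


print(check_installed())
```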

	## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)

	1. Install packages required for using OpenVINO GenAI:
	```
	pip install openvino openvino-tokenizers openvino-genai

	pip install huggingface_hub
	```

	2. Download the model from the Hugging Face Hub:

	```
	import huggingface_hub as hf_hub

	model_id = "OpenVINO/gemma-3-4b-it-int4-cw-ov"
	model_path = "gemma-3-4b-it-int4-cw-ov"

	hf_hub.snapshot_download(model_id, local_dir=model_path)

	```

	3. Run model inference:

	```
	import openvino_genai as ov_genai
	import requests
	from PIL import Image
	from io import BytesIO
	import numpy as np
	import openvino as ov

	device = "NPU"
	pipe = ov_genai.VLMPipeline(model_path, device)

	def load_image(image_file):
	    # Accept either an HTTP(S) URL or a local file path.
	    if isinstance(image_file, str) and image_file.startswith(("http://", "https://")):
	        response = requests.get(image_file)
	        image = Image.open(BytesIO(response.content)).convert("RGB")
	    else:
	        image = Image.open(image_file).convert("RGB")
	    # Pack the pixels into a uint8 tensor of shape (1, height, width, 3).
	    image_data = np.array(image.getdata()).reshape(1, image.size[1], image.size[0], 3).astype(np.uint8)
	    return ov.Tensor(image_data)

	prompt = "What is unusual in this picture?"

	url = "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/d5fbbd1a-d484-415c-88cb-9986625b7b11"
	image_tensor = load_image(url)

	def streamer(subword: str) -> bool:
	    # Print each decoded chunk as it arrives; returning False keeps generation going.
	    print(subword, end="", flush=True)
	    return False

	pipe.start_chat()
	output = pipe.generate(prompt, image=image_tensor, max_new_tokens=100, streamer=streamer)
	pipe.finish_chat()
	```
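
The example above hard-codes `device = "NPU"`. On machines without an NPU, one option is to pick a device from whatever the runtime reports. The sketch below is an assumption-laden illustration: `choose_device` is a hypothetical helper, the preference order is arbitrary, and this card makes no claims about how the model behaves on devices other than NPU.

```python
def choose_device(available, preferred=("NPU", "GPU", "CPU")):
    """Return the first preferred device type present in `available`.

    Entries such as "GPU.0" count as a match for "GPU".
    """
    for device in preferred:
        for name in available:
            if name == device or name.startswith(device + "."):
                return device
    raise RuntimeError(f"none of {preferred} found in {list(available)}")


print(choose_device(["CPU", "GPU.0"]))
```

With OpenVINO installed, the list of devices can be taken from `ov.Core().available_devices`, for example `device = choose_device(ov.Core().available_devices)`.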

	More GenAI usage examples can be found in the OpenVINO GenAI library [docs](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md) and [samples](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#openvino-genai-samples).

	## Limitations

	See the [original model card](https://huggingface.co/google/gemma-3-4b-it) for limitations.

	## Legal information

	The original Gemma Model and Gemma Model Derivatives are distributed under the [Gemma Terms of Use](https://ai.google.dev/gemma/terms). To the extent permissible under the Gemma Terms of Use, Intel’s modifications are distributed under Apache 2.0. Model details can be found in the [original model card](https://huggingface.co/google/gemma-3-4b-it).

	## Disclaimer

	Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.