A Hindi instruction-tuned version of Qwen3-4B, fine-tuned to follow instructions and respond naturally in Hindi. Built for developers, researchers, and builders who need a capable, openly licensed Hindi language model that runs on modest hardware.

Part of the Hindi LLM Series, a collection focused on bringing strong Indic-language models to local and edge deployment.

💡 Looking to run this locally on CPU? Use the GGUF version (Q4/Q5/Q8) with llama.cpp, Ollama, or LM Studio.

Highlights

  • Strong Hindi instruction-following: trained on 10K curated Hindi instruction–response pairs
  • Bilingual: handles both Hindi (Devanagari) and English
  • Compact: 4B parameters, runs comfortably on a single consumer GPU; quantizes well for CPU
  • Open license: Apache 2.0, commercial use allowed
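
As a rough check on the "modest hardware" claim, the size of the weights can be estimated as parameter count times bits per weight. The sketch below assumes exactly 4 billion parameters and nominal bit widths (16 for BF16, and 8, 5, and 4 for the Q8/Q5/Q4 quantizations). Real GGUF files store per-block scales and metadata and so come out somewhat larger, and the KV cache and activations add to the total at inference time.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (10**9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 4e9  # assumption: "~4B" taken as exactly 4 billion

# Nominal bit widths (assumption); actual quantized files are a bit larger.
NOMINAL_BITS = {"BF16": 16, "Q8": 8, "Q5": 5, "Q4": 4}

for name, bits in NOMINAL_BITS.items():
    print(f"{name}: ~{weight_footprint_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the BF16 weights come to about 8 GB and a 4-bit quantization to about 2 GB, which is consistent with single-GPU and CPU use respectively.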

Example

Prompt:

भारत की राजधानी क्या है? एक वाक्य में उत्तर दें।
(English: "What is the capital of India? Answer in one sentence.")

Response:

<!-- PASTE A REAL OUTPUT FROM YOUR TEST HERE -->

(Replace the block above with an actual response from your testing; authentic examples build far more trust than invented ones.)

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "pankajpandey-dev/Qwen3-4B-Hindi-Instruct-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, dtype="auto", device_map="auto")

# Hindi prompt: "Tell me three easy ways to stay healthy."
messages = [{"role": "user", "content": "मुझे स्वस्थ रहने के तीन आसान तरीके बताओ।"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

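# Enable sampling explicitly so the temperature setting applies regardless of the default generation config.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)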
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
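
For readers wondering what `apply_chat_template` hands to the tokenizer: Qwen models use a ChatML-style layout in which each turn is wrapped in `<|im_start|>` and `<|im_end|>` markers. The helper below (`render_chatml` is a hypothetical name, and the system prompt is an invented example) is a simplified approximation for illustration only. The authoritative template ships with the tokenizer and may handle details not covered here, so real code should keep using `apply_chat_template`.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Simplified ChatML-style rendering (illustrative approximation only)."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        # Open an assistant turn so the model continues from this point.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


# Hypothetical two-message conversation; the user turn says "Hello!" in Hindi.
messages = [
    {"role": "system", "content": "You are a helpful assistant. Reply in Hindi."},
    {"role": "user", "content": "नमस्ते!"},
]
print(render_chatml(messages))
```

Printing the rendered string this way can help when debugging prompts, for example to confirm that a system message ends up where it is expected.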

Model Details

| Property | Value |
|---|---|
| Base model | Qwen/Qwen3-4B-Instruct-2507 |
| Parameters | ~4B |
| Fine-tuning method | LoRA (r=32, α=32) via Unsloth |
| Training data | 10K Hindi instruction–response pairs |
| Languages | Hindi (hi), English (en) |
| Context length | Inherited from base |
| License | Apache 2.0 |
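
The LoRA settings listed above (r=32, α=32) determine how strongly the adapter update is applied. In the standard LoRA formulation the low-rank update is scaled by α/r, so these settings give a scale of 1, and merging adds the scaled product of the two adapter matrices to the frozen base weight. The toy example below uses tiny hand-picked matrices (rank 1, not the real rank of 32) in plain Python purely to show the arithmetic. It is not the Unsloth implementation, and it assumes standard rather than rank-stabilized scaling.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, r, alpha):
    """Return W + (alpha / r) * (B @ A), the standard LoRA merge."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# With r=32 and alpha=32 the scale factor is alpha / r = 1.0.
print(32 / 32)

# Toy 2x2 base weight with a rank-1 adapter (illustrative values only).
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]   # shape (2, 1)
A = [[0.5, -0.5]]    # shape (1, 2)
print(merge_lora(W, B, A, r=1, alpha=1))
```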

Training

Fine-tuned with Unsloth for efficient LoRA training. The dataset was filtered to keep only genuinely Hindi (Devanagari) responses, then formatted with the Qwen chat template and trained for one full epoch. The resulting LoRA was merged into 16-bit weights and exported.
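
One way to implement a "genuinely Hindi" filter like the one described above is to measure the share of script characters that fall in the Unicode Devanagari block (U+0900 to U+097F). The sketch below is an assumption about how such a filter could look, not the actual preprocessing used for this model; the function names and the 0.8 threshold are hypothetical.

```python
DEVANAGARI = range(0x0900, 0x0980)  # Unicode Devanagari block


def devanagari_ratio(text: str) -> float:
    """Share of script characters that are Devanagari (0.0 if none found)."""
    # Counts every code point in the block, including vowel signs and the danda.
    dev = sum(1 for ch in text if ord(ch) in DEVANAGARI)
    # Letters from any other script (for example Latin) count against the ratio.
    other = sum(1 for ch in text if ch.isalpha() and ord(ch) not in DEVANAGARI)
    total = dev + other
    return dev / total if total else 0.0


def is_hindi_response(text: str, threshold: float = 0.8) -> bool:
    """Keep a response only if it is predominantly written in Devanagari."""
    return devanagari_ratio(text) >= threshold


samples = [
    "भारत की राजधानी नई दिल्ली है।",        # Hindi in Devanagari
    "The capital of India is New Delhi.",  # English
    "Bharat ki rajdhani Nai Dilli hai.",   # romanized Hindi
]
kept = [s for s in samples if is_hindi_response(s)]
print(kept)
```

A ratio rather than an all-or-nothing check lets responses that contain a few English terms or code identifiers through while still rejecting romanized or English-only answers.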

Intended Use & Limitations

Intended for: Hindi chat and assistant applications, instruction-following, Indic-language experimentation, and local/edge deployment via GGUF.

Limitations: As a 4B model, it can make factual errors and may produce inconsistent results on complex reasoning or specialized domains. It inherits any biases present in the base model and training data. Validate outputs before production use.

Citation

If you use this model, a link back to this repository is appreciated.


Part of the 🇮🇳 Hindi LLM Series by pankajpandey-dev.
