Uploaded finetuned model
- Developed by: gateremark
- License: apache-2.0
- Finetuned from model: google/translategemma-4b-it
This Gemma3 / TranslateGemma model was trained with Unsloth and Hugging Face's TRL library.
Kikuyu TranslateGemma-4B V7
Fine-tuned English -> Kikuyu translation model based on Google's TranslateGemma-4B-it.
This is the current fast production model behind C-elo Translate. It was trained as a smaller, faster alternative to the earlier 12B model while improving automatic evaluation scores and manual translation quality.
Live demo: c-elo.com/c-elo-ai
Previous 12B model: gateremark/kikuyu_translategemma_12b_merged_V2
Model Details
| Attribute | Value |
|---|---|
| Base model | google/translategemma-4b-it |
| Model family | TranslateGemma / Gemma3 |
| Hub size | ~5B parameters, BF16 safetensors |
| Fine-tuning method | rsLoRA, high-rank LoRA |
| LoRA rank / alpha | r=256, alpha=256 |
| Training data | 30,430 English-Kikuyu pairs |
| Direction | English -> Kikuyu |
| BLEU | 21.93 |
| chrF++ | 42.87 |
| Eval loss | 0.7518 |
| Framework | Unsloth + TRL + Transformers |
| Training platform | Modal, NVIDIA H100 |
Why This Model
The earlier 12B Kikuyu TranslateGemma model reached 19.61 BLEU, but it was large and slow to cold-start in production. This V7 4B-family model is smaller, loads faster, and scores higher on the same held-out split:
| Model | BLEU | chrF++ | Notes |
|---|---|---|---|
| 12B LoRA V2 | 19.61 | - | Earlier production model |
| 4B V2 LoRA r256 | 17.76 | 38.31 | Strong first 4B run |
| 4B V3 DoRA r128 | 15.81 | 35.88 | DoRA did not improve this setup |
| 4B V6 LoRA r256, 4 epochs | 17.67 | 38.28 | Longer training did not improve V2 |
| 4B V7 rsLoRA r256 | 21.93 | 42.87 | Current champion |
Usage
Recommended: Unsloth / Gemma3Processor Path
This model uses the TranslateGemma/Gemma3 chat template. For reliable generation, use the processor for apply_chat_template() and the underlying text tokenizer for tokenization/decoding.
import torch
from unsloth import FastLanguageModel

model_id = "gateremark/kikuyu_translategemma_4b_v7_highrank_rslora"

model, processor = FastLanguageModel.from_pretrained(
    model_name=model_id,
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=False,  # Set True if you need lower VRAM and accept possible quality changes.
)

# The processor may wrap the text tokenizer; fall back to the processor itself.
text_tokenizer = (
    getattr(processor, "tokenizer", None)
    or getattr(processor, "text_tokenizer", None)
    or processor
)
if text_tokenizer.pad_token_id is None:
    text_tokenizer.pad_token = text_tokenizer.eos_token
model.config.pad_token_id = text_tokenizer.pad_token_id
text_tokenizer.padding_side = "left"

FastLanguageModel.for_inference(model)

# Collect valid, de-duplicated stop token ids.
terminators = []
for token_id in [
    text_tokenizer.eos_token_id,
    text_tokenizer.convert_tokens_to_ids("<end_of_turn>"),
    text_tokenizer.convert_tokens_to_ids("<eos>"),
]:
    if (
        isinstance(token_id, int)
        and token_id >= 0
        and token_id != getattr(text_tokenizer, "unk_token_id", None)
        and token_id not in terminators
    ):
        terminators.append(token_id)


def translate_to_kikuyu(text: str) -> str:
    messages = [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "source_lang_code": "en",
                    "target_lang_code": "ki",
                    "text": text,
                }
            ],
        }
    ]
    formatted_text = processor.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = text_tokenizer(
        [formatted_text],
        return_tensors="pt",
        padding=True,
    )
    inputs = {key: value.to(model.device) for key, value in inputs.items()}
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_new_tokens=128,
            do_sample=False,
            eos_token_id=terminators,
            pad_token_id=text_tokenizer.pad_token_id,
        )
    # Decode only the newly generated tokens, not the prompt.
    input_len = inputs["input_ids"].shape[1]
    response = text_tokenizer.decode(
        outputs[0][input_len:],
        skip_special_tokens=True,
    )
    return response.strip()


print(translate_to_kikuyu("Hello, how are you?"))
# Example output: Ndũmĩrĩrie, ũraigua atĩa?
Minimal Inference Notes
- Use target_lang_code="ki" for Kikuyu.
- Use left padding for batched generation with decoder-only models.
- Deterministic decoding (do_sample=False) is recommended for translation.
- The model is trained for English -> Kikuyu. Reverse Kikuyu -> English was not part of this run.
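The left-padding note matters mainly when several sentences are translated in one call. The following is a minimal sketch of a batched variant of translate_to_kikuyu, not part of the original card. The helper names build_messages and translate_batch are hypothetical, and the sketch assumes that model, processor, text_tokenizer, and terminators were prepared as in the usage code.

```python
from typing import Any


def build_messages(text: str, source: str = "en", target: str = "ki") -> list[dict[str, Any]]:
    """Wrap one sentence in the TranslateGemma-style message format used in the usage code."""
    return [
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "source_lang_code": source,
                    "target_lang_code": target,
                    "text": text,
                }
            ],
        }
    ]


def translate_batch(texts, model, processor, text_tokenizer, terminators, max_new_tokens=128):
    """Hypothetical batched helper; all objects are assumed to be set up as in the usage code."""
    prompts = [
        processor.apply_chat_template(
            build_messages(text), tokenize=False, add_generation_prompt=True
        )
        for text in texts
    ]
    # Left padding keeps every prompt flush against its generated tokens.
    text_tokenizer.padding_side = "left"
    inputs = text_tokenizer(prompts, return_tensors="pt", padding=True)
    inputs = {key: value.to(model.device) for key, value in inputs.items()}
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=False,
        eos_token_id=terminators,
        pad_token_id=text_tokenizer.pad_token_id,
    )
    # With left padding, all prompts share one padded length, so a single slice works.
    prompt_len = inputs["input_ids"].shape[1]
    decoded = text_tokenizer.batch_decode(outputs[:, prompt_len:], skip_special_tokens=True)
    return [item.strip() for item in decoded]
```

A call such as translate_batch(["Good morning.", "Thank you."], model, processor, text_tokenizer, terminators) would then return one Kikuyu string per input, in the same order.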
Training Details
Dataset
- Dataset: gateremark/english-kikuyu-translations
- Size: 30,430 parallel English-Kikuyu sentence pairs
- Split: 95% train / 5% eval
- Train examples: 28,908
- Eval examples: 1,522
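The split counts can be checked with a few lines of arithmetic. This sketch assumes the eval share is rounded up to a whole example, a convention that reproduces the listed numbers. The card does not show the actual split code, so treat the rounding rule as an assumption.

```python
TOTAL_PAIRS = 30_430
EVAL_PERCENT = 5

# Ceiling division in integer arithmetic: 30,430 * 5 / 100 = 1521.5, rounded up.
eval_examples = -(-TOTAL_PAIRS * EVAL_PERCENT // 100)
train_examples = TOTAL_PAIRS - eval_examples

print(train_examples, eval_examples)  # 28908 1522
```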
V7 Hyperparameters
| Parameter | Value |
|---|---|
| Method | rsLoRA |
| LoRA rank | 256 |
| LoRA alpha | 256 |
| LoRA dropout | 0 |
| DoRA | False |
| Epochs | 3 |
| Learning rate | 1e-4 |
| Effective batch size | 32 |
| Optimizer | AdamW 8-bit |
| Weight decay | 0.01 |
| Precision | BF16 |
| Max sequence length | 4096 |
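Rank-stabilized LoRA (rsLoRA) scales the adapter update by alpha / sqrt(r) instead of the standard alpha / r. The sketch below works through what that means for the r=256, alpha=256 setting in the table. The function name lora_scaling is hypothetical and only illustrates the two formulas.

```python
import math


def lora_scaling(alpha: float, rank: int, rank_stabilized: bool) -> float:
    """Scaling factor applied to the low-rank update B @ A."""
    if rank_stabilized:
        return alpha / math.sqrt(rank)  # rsLoRA
    return alpha / rank  # standard LoRA


rank, alpha = 256, 256
standard = lora_scaling(alpha, rank, rank_stabilized=False)
stabilized = lora_scaling(alpha, rank, rank_stabilized=True)

print(standard, stabilized)  # 1.0 16.0
```

At this rank, the same alpha gives an adapter scaling 16 times larger under rsLoRA than under standard LoRA.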
Target Modules
[
    "q_proj",
    "k_proj",
    "v_proj",
    "o_proj",
    "gate_proj",
    "up_proj",
    "down_proj",
]
Evaluation
Evaluation was run on the held-out 5% split from the same dataset using BLEU and chrF++.
| Metric | Score |
|---|---|
| BLEU | 21.93 |
| chrF++ | 42.87 |
| Eval loss | 0.7518 |
Automatic metrics are useful for regression testing, but Kikuyu quality should also be checked with native-speaker review because morphology, idiom, tone, and dialect variation are not fully captured by BLEU.
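For regression testing, one possible setup is to score a new checkpoint with sacrebleu and compare the result against the V7 numbers. This is a sketch under assumptions: the card does not say which tool or tokenization settings produced the reported scores, so values from this code may not match them exactly. The function names and the 0.5-point tolerance are hypothetical.

```python
BASELINE = {"bleu": 21.93, "chrf++": 42.87}  # V7 scores reported in this card


def score_translations(hypotheses, references):
    """Corpus BLEU and chrF++ via sacrebleu (assumed installed: pip install sacrebleu)."""
    # Imported lazily so the regression check below has no third-party dependency.
    from sacrebleu.metrics import BLEU, CHRF

    bleu = BLEU().corpus_score(hypotheses, [references])
    # word_order=2 adds word bigrams to character n-grams, which is chrF++.
    chrf_pp = CHRF(word_order=2).corpus_score(hypotheses, [references])
    return {"bleu": bleu.score, "chrf++": chrf_pp.score}


def regressions(scores, baseline=BASELINE, tolerance=0.5):
    """Return the metrics that fall more than `tolerance` points below the baseline."""
    return [name for name, ref in baseline.items() if scores[name] < ref - tolerance]
```

A new run would pass this check when regressions(score_translations(hypotheses, references)) returns an empty list. Native-speaker review still applies either way.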
Sample Translations
| English | Kikuyu |
|---|---|
| Hello, how are you? | Hihi, ũrĩ atĩa? |
| The weather is beautiful today. | Rĩera nĩ rĩega mũno ũmũthĩ |
| I love learning new languages. | Nĩ nyendete kwĩruta thiomi njerũ |
Intended Use
- English -> Kikuyu translation tools
- Kikuyu language learning applications
- Low-resource African language NLP research
- Cultural and linguistic preservation projects
- Prototyping multilingual AI interfaces for Kikuyu speakers
Limitations
- Direction: English -> Kikuyu only. Kikuyu -> English was not trained in this run.
- Language coverage: Optimized for Kikuyu (ki), not other Gikuyu-related dialects or neighboring Bantu languages.
- Domain: Best for general text. Technical, legal, medical, poetic, or highly idiomatic content may need human review.
- Evaluation: BLEU and chrF++ do not fully measure naturalness, dialect fit, or cultural nuance.
- Production use: Review outputs before high-stakes use.
Citation
@misc{gatere2026kikuyutranslategemma4bv7,
  author = {Mark Gatere},
  title = {Kikuyu TranslateGemma-4B V7: rsLoRA Fine-tuning for English to Kikuyu Translation},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/gateremark/kikuyu_translategemma_4b_v7_highrank_rslora}}
}
Acknowledgments
- Google for TranslateGemma-4B-it
- Unsloth for efficient fine-tuning
- Hugging Face for model and dataset hosting
- Modal for GPU training and deployment infrastructure
- Kikuyu speakers and reviewers supporting C-elo's translation work
