# Sarvam-Translate
Sarvam-Translate is an advanced translation model built by Sarvam AI in partnership with AI4Bharat. Built on Gemma3-4B-IT, it is designed for comprehensive, document-level translation across the 22 official Indian languages. It addresses modern translation needs by moving beyond isolated sentences to handle long-context inputs, diverse content types, and various formats. Sarvam-Translate aims to provide high-quality, contextually aware translations for Indian languages, which have traditionally lagged behind high-resource languages in LLM performance. Learn more about Sarvam-Translate in our detailed blog post.
## Key Features

- Document-level translation across the 22 official Indian languages
- Handles long-context inputs rather than isolated sentences
- Supports diverse content types and formats
- Built on Gemma3-4B-IT
## Supported Languages

Assamese, Bengali, Bodo, Dogri, Gujarati, English, Hindi, Kannada, Kashmiri, Konkani, Maithili, Malayalam, Manipuri, Marathi, Nepali, Odia, Punjabi, Sanskrit, Santali, Sindhi, Tamil, Telugu, Urdu

## Quickstart
The following code snippet demonstrates how to use Sarvam-Translate with Transformers.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sarvamai/sarvam-translate"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).to('cuda:0')

# Translation task
tgt_lang = "Hindi"
input_txt = "Be the change you wish to see in the world."

# Chat-style message prompt
messages = [
    {"role": "system", "content": f"Translate the text below to {tgt_lang}."},
    {"role": "user", "content": input_txt},
]

# Apply chat template to structure the conversation
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

# Tokenize and move input to the model device
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate the output
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.01,
    num_return_sequences=1,
)

# Strip the prompt tokens and decode only the newly generated text
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
output_text = tokenizer.decode(output_ids, skip_special_tokens=True)

print("Input:", input_txt)
print("Translation:", output_text)
```
## vLLM Deployment

Server:

```bash
vllm serve sarvamai/sarvam-translate --port 8000 --dtype bfloat16 --max-model-len 8192
```
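Once the server is up, you can verify it is serving the model by querying the OpenAI-compatible models endpoint (a quick sanity check; the port matches the serve command above):

```bash
# Should list "sarvamai/sarvam-translate" among the served models
curl http://localhost:8000/v1/models
```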
Client:

```python
from openai import OpenAI

# Point the OpenAI client at vLLM's OpenAI-compatible API server.
openai_api_key = "EMPTY"
openai_api_base = "http://localhost:8000/v1"

client = OpenAI(
    api_key=openai_api_key,
    base_url=openai_api_base,
)

models = client.models.list()
model = models.data[0].id

tgt_lang = "Hindi"
input_txt = "Be the change you wish to see in the world."

messages = [
    {"role": "system", "content": f"Translate the text below to {tgt_lang}."},
    {"role": "user", "content": input_txt},
]

response = client.chat.completions.create(model=model, messages=messages, temperature=0.01)
output_text = response.choices[0].message.content

print("Input:", input_txt)
print("Translation:", output_text)
```
## With Sarvam APIs

Refer to our Python client documentation.

Sample code:

```python
from sarvamai import SarvamAI

client = SarvamAI()

response = client.text.translate(
    input="Be the change you wish to see in the world.",
    source_language_code="en-IN",
    target_language_code="hi-IN",
    speaker_gender="Male",
    model="sarvam-translate:v1",
)
```
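To run the sample end to end, pass your API key explicitly and read the translation off the response. A hedged sketch: the `api_subscription_key` parameter and the `translated_text` field are assumptions based on the Sarvam API docs, so verify them against the client documentation linked above.

```python
from sarvamai import SarvamAI

# Explicit key instead of relying on environment configuration
# (assumption: the constructor accepts api_subscription_key)
client = SarvamAI(api_subscription_key="YOUR_SARVAM_API_KEY")

response = client.text.translate(
    input="Be the change you wish to see in the world.",
    source_language_code="en-IN",
    target_language_code="hi-IN",
    speaker_gender="Male",
    model="sarvam-translate:v1",
)

# Assumption: the translate endpoint returns the result in `translated_text`
print(response.translated_text)
```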
Files & Weights
| Filename | Size | Action |
|---|---|---|
| model-00001-of-00002.safetensors | 4.62 GB | |
| model-00002-of-00002.safetensors | 3.39 GB |
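The two safetensors shards total roughly 8 GB. A minimal sketch for loading the checkpoint directly in bfloat16 (matching the `--dtype bfloat16` flag in the vLLM command above) so memory use stays close to the on-disk size; `torch_dtype` is a standard `from_pretrained` argument:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sarvamai/sarvam-translate"

tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the ~8 GB of sharded weights in bfloat16 instead of float32
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
).to("cuda:0")
```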