Sarvam AI, an emerging force in India’s generative AI ecosystem, has launched Sarvam-1, a groundbreaking language model tailored specifically for Indian languages. This open-source model supports 10 Indian languages, including Bengali, Hindi, and Tamil, as well as English. Launched in October 2024, Sarvam-1 builds on the company’s previous release, Sarvam 2B, which was introduced in August 2024.
What Makes Sarvam-1 Special?
Built with 2 billion parameters, Sarvam-1 delivers strong natural language processing capabilities. Parameter count is a rough measure of a model’s capacity; more parameters often mean better performance. For perspective, Microsoft’s Phi-3 Mini features 3.8 billion parameters. With fewer than 10 billion parameters, Sarvam-1 is classified as a Small Language Model (SLM), in contrast to Large Language Models (LLMs) like OpenAI’s GPT-4, which is widely reported to have well over a trillion parameters.
Technical Overview
Sarvam-1 operates on the NeMo framework developed by NVIDIA, with training powered by 1,024 GPUs from Yotta. A key challenge in creating this model was the scarcity of high-quality datasets for Indian languages. To address this, Sarvam AI built its own comprehensive training dataset—Sarvam-2T.
The Sarvam-2T Dataset
- 2 trillion tokens spanning the 10 supported Indian languages.
- Uses synthetic data generation to improve dataset quality and coverage.
- Approximately 20% of the corpus is in Hindi, with significant portions in English and programming languages.
This multilingual diversity ensures that Sarvam-1 excels at both monolingual and multilingual tasks, providing robust language capabilities across a wide spectrum.
Performance and Benchmarks
Sarvam-1 stands out in handling Indic scripts through an optimized tokenizer that represents Indic text with fewer tokens than general-purpose tokenizers, improving both training and inference efficiency. Despite having fewer parameters, Sarvam-1 has outperformed several larger models, such as Meta’s Llama-3 and Google’s Gemma-2, on benchmarks including MMLU and ARC-Challenge.
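Tokenizer efficiency of this kind is often quantified as “fertility”: the average number of tokens produced per word, where lower is better for Indic text. A minimal sketch of how one might measure it with the Hugging Face `transformers` library (the repository id in the example comment is an assumption; check the model card for the exact name):

```python
# Sketch: measure tokenizer "fertility" (average tokens per word).
# Lower fertility means a script is represented more compactly,
# which translates to faster and cheaper inference.

def fertility(model_id: str, text: str) -> float:
    """Average tokens per whitespace-separated word for a given tokenizer."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    n_tokens = len(tokenizer.encode(text, add_special_tokens=False))
    n_words = len(text.split())
    return n_tokens / n_words

# Example usage (requires network access and the transformers library);
# "sarvamai/sarvam-1" is an assumed repository id:
# fertility("sarvamai/sarvam-1", "भारत एक विशाल देश है")
```

Comparing this number across tokenizers on the same Indic sentence makes the efficiency gap concrete.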
Key Achievements
- TriviaQA Benchmark:
  - Accuracy on Indic languages: 86.11%
  - Outperforms Meta’s Llama-3.1 8B, which scored 61.47%
- Inference Speed:
  - 4–6 times faster than larger models like Gemma-2-9B and Llama-3.1-8B
These metrics highlight Sarvam-1’s computational efficiency and ability to deliver high performance even with fewer resources.
Real-World Applications
With its speed and efficiency, Sarvam-1 is well-suited for edge deployment—critical for use cases in resource-limited environments like rural areas. This makes it a practical tool for various applications, including:
- Chatbots and virtual assistants
- Voice recognition systems in regional languages
- Translation tools for multilingual communication
Open Access for Developers
Sarvam-1 is available for download on Hugging Face, a leading platform for open-source AI models. This accessibility empowers developers, researchers, and businesses to leverage the model for projects that require high-quality Indian language processing.
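As a sketch of what getting started might look like, the snippet below loads the model with the `transformers` library. The repository id `sarvamai/sarvam-1` is an assumption, so verify the exact name on the Hugging Face model card before use:

```python
# Minimal sketch: load Sarvam-1 from Hugging Face and generate text.
# The repository id below is an assumption -- confirm it on the model card.

MODEL_ID = "sarvamai/sarvam-1"  # assumed repo id

def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a completion for `prompt` with Sarvam-1."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    # Hindi prompt: "The capital of India"
    print(generate("भारत की राजधानी"))
```

Because the model is small, this kind of script can run on modest hardware, which is part of what makes edge and regional-language deployments practical.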