Sarvam-1: AI for Multilingual India

Sarvam-1 is an open-source AI language model from Sarvam AI, designed for Indian languages and notable for its efficiency, broad multilingual support, and fast inference.

Sarvam AI, an emerging force in India’s generative AI ecosystem, launched Sarvam-1 in October 2024: a groundbreaking language model tailored specifically for Indian languages. The open-source model supports 10 Indian languages, including Bengali, Hindi, and Tamil, alongside English, and builds upon the company’s previous release, Sarvam 2B, introduced in August 2024.

What Makes Sarvam-1 Special?

Built with 2 billion parameters, Sarvam-1 delivers powerful natural language processing capabilities. Parameters are a measure of a model’s complexity: more parameters often mean better performance. For perspective, Microsoft’s Phi-3 Mini features 3.8 billion parameters. With fewer than 10 billion parameters, Sarvam-1 is classified as a Small Language Model (SLM), in contrast to Large Language Models (LLMs) like OpenAI’s GPT-4, which is widely estimated to have upwards of a trillion parameters.
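For readers who want to check the scale themselves, here is a minimal sketch that counts a model’s parameters with the Hugging Face transformers library. The repo ID sarvamai/sarvam-1 is an assumption; confirm the exact identifier on the model card.

    # Count the parameters of a downloaded model with transformers.
    # Assumes the repo ID "sarvamai/sarvam-1"; confirm it on the model card.
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained("sarvamai/sarvam-1")

    # numel() gives the number of scalars in each weight tensor; summing
    # over all tensors yields the total parameter count (~2 billion here).
    total = sum(p.numel() for p in model.parameters())
    print(f"{total / 1e9:.2f}B parameters")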

Technical Overview

Sarvam-1 was trained with NVIDIA’s NeMo framework on 1,024 GPUs provided by Yotta. A key challenge in creating the model was the scarcity of high-quality training data for Indian languages; to address this, Sarvam AI built its own comprehensive training corpus, Sarvam-2T.

The Sarvam-2T Dataset

  • 2 trillion tokens covering all 10 supported languages.
  • Hindi accounts for roughly 20% of the corpus, with significant portions in English and programming code.
  • Synthetic data generation is used to improve quality and coverage where high-quality web text is scarce.

This multilingual diversity ensures that Sarvam-1 excels at both monolingual and multilingual tasks, providing robust language capabilities across a wide spectrum.

Performance and Benchmarks

Sarvam-1 stands out in handling Indic scripts, which English-centric tokenizers tend to fragment into many tokens; its tokenizer represents these scripts far more compactly, improving both cost and latency. Despite having fewer parameters, Sarvam-1 has outperformed several larger models, such as Meta’s Llama-3 and Google’s Gemma-2, on benchmarks including MMLU and ARC-Challenge.
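One way to see this token efficiency is to encode the same Indic sentence with different tokenizers and compare counts. A minimal sketch, assuming the repo IDs sarvamai/sarvam-1 and meta-llama/Meta-Llama-3.1-8B (the latter is gated and requires a Hugging Face access token):

    # Compare how many tokens each tokenizer needs for the same Hindi text.
    # Fewer tokens per sentence generally means cheaper, faster inference.
    from transformers import AutoTokenizer

    text = "भारत एक विविधताओं से भरा हुआ देश है।"  # "India is a country full of diversity."

    for repo_id in ("sarvamai/sarvam-1", "meta-llama/Meta-Llama-3.1-8B"):
        tokenizer = AutoTokenizer.from_pretrained(repo_id)
        print(f"{repo_id}: {len(tokenizer.encode(text))} tokens")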

Key Achievements

  • TriviaQA benchmark:
    • Accuracy on Indic languages: 86.11
    • Outperforms Meta’s Llama-3.1 8B, which scored 61.47
  • Inference speed:
    • 4–6 times faster than larger models such as Gemma-2-9B and Llama-3.1-8B

These metrics highlight Sarvam-1’s computational efficiency and its ability to deliver high performance with fewer resources; a rough way to measure generation speed on your own hardware is sketched below.
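Published speedups always depend on hardware, precision, and batch size, so treat the following as an illustrative sketch only (repo ID assumed as before):

    # Time a fixed generation and report tokens per second. Results vary
    # widely with hardware and settings; this is illustrative only.
    import time

    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "sarvamai/sarvam-1"  # assumed repo ID
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer("भारत की राजधानी", return_tensors="pt")  # "The capital of India"
    start = time.perf_counter()
    output = model.generate(**inputs, max_new_tokens=64)
    elapsed = time.perf_counter() - start

    new_tokens = output.shape[1] - inputs["input_ids"].shape[1]
    print(f"{new_tokens / elapsed:.1f} tokens/sec")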

Real-World Applications

With its speed and efficiency, Sarvam-1 is well-suited for edge deployment—critical for use cases in resource-limited environments like rural areas. This makes it a practical tool for various applications, including:

  • Chatbots and virtual assistants
  • Voice recognition systems in regional languages
  • Translation tools for multilingual communication

Open Access for Developers

Sarvam-1 is available for download on Hugging Face, a leading platform for open-source AI models. This accessibility empowers developers, researchers, and businesses to leverage the model for projects that require high-quality Indian language processing.
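A minimal sketch of pulling the model and generating a completion, assuming the repo ID sarvamai/sarvam-1 and an installed transformers-plus-torch stack. Sarvam-1 was released as a pretrained base model, so a completion-style prompt is used here rather than an instruction:

    # Load Sarvam-1 from Hugging Face and complete a Hindi prompt.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "sarvamai/sarvam-1"  # assumed; confirm on the model card
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    prompt = "भारत एक विशाल देश है जहाँ"  # "India is a vast country where"
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
    print(tokenizer.decode(output[0], skip_special_tokens=True))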
