sarvam-ai

Indian AI toolkit powered by Sarvam AI — text-to-speech, speech-to-text, document intelligence, translation, transliteration, language detection, and chat completion across 23 Indian languages. Use when working with Indian languages, Hindi/Tamil/Bengali text, Sarvam AI, or when the user needs translation, transcription, or TTS for South Asian languages.

Install command

npx skills add https://github.com/ankitjh4/indic-ai-skills --skill sarvam-ai

Copy and paste this command into Claude Code to install the skill

Source: ankitjh4/indic-ai-skills
Stars: 63
Forks: 10
Updated: March 25, 2026 at 11:17

name	sarvam-ai
description	Indian AI toolkit powered by Sarvam AI — text-to-speech, speech-to-text, document intelligence, translation, transliteration, language detection, and chat completion across 23 Indian languages. Use when working with Indian languages, Hindi/Tamil/Bengali text, Sarvam AI, or when the user needs translation, transcription, or TTS for South Asian languages.
metadata	{"author":"ankitjh4","category":"External","display-name":"Indian AI Toolkit (Sarvam)"}

Sarvam AI — Indian Language Toolkit

Comprehensive AI toolkit for 23 Indian languages: TTS, STT, Document Intelligence, Translation, Transliteration, Language Detection, and Chat.

Setup

Get a free API key at https://dashboard.sarvam.ai
Set environment variable: export SARVAM_API_KEY="your-api-key"
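
A minimal sketch of reading the key in your own Python code. `load_api_key` is a hypothetical helper, not part of the skill's scripts; it only assumes the `SARVAM_API_KEY` variable named above.

```python
import os


def load_api_key(env=None):
    """Return the Sarvam API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; the skill's own scripts may read
    the variable differently.
    """
    env = os.environ if env is None else env
    key = env.get("SARVAM_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SARVAM_API_KEY is not set; get a key at "
            "https://dashboard.sarvam.ai and export it first"
        )
    return key
```

Failing early keeps a missing key from surfacing later as an opaque authentication error.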

Supported Languages

hi-IN Hindi, en-IN English, bn-IN Bengali, gu-IN Gujarati, kn-IN Kannada, ml-IN Malayalam, mr-IN Marathi, or-IN/od-IN Odia, pa-IN Punjabi, ta-IN Tamil, te-IN Telugu, ur-IN Urdu, as-IN Assamese, bodo-IN/brx-IN Bodo, doi-IN Dogri, ks-IN Kashmiri, kok-IN Konkani, mai-IN Maithili, mni-IN Manipuri, ne-IN Nepali, sa-IN Sanskrit, sat-IN Santali, sd-IN Sindhi
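
The list above can be kept as a lookup table so that codes are checked before any request is sent. This is a sketch: the codes and names are copied from the list, while `normalize_language` and the choice of `or-IN` and `bodo-IN` as the canonical spellings are assumptions (the source does not say which spelling each endpoint expects).

```python
# Codes and names copied from the Supported Languages list.
LANGUAGES = {
    "hi-IN": "Hindi", "en-IN": "English", "bn-IN": "Bengali",
    "gu-IN": "Gujarati", "kn-IN": "Kannada", "ml-IN": "Malayalam",
    "mr-IN": "Marathi", "or-IN": "Odia", "pa-IN": "Punjabi",
    "ta-IN": "Tamil", "te-IN": "Telugu", "ur-IN": "Urdu",
    "as-IN": "Assamese", "bodo-IN": "Bodo", "doi-IN": "Dogri",
    "ks-IN": "Kashmiri", "kok-IN": "Konkani", "mai-IN": "Maithili",
    "mni-IN": "Manipuri", "ne-IN": "Nepali", "sa-IN": "Sanskrit",
    "sat-IN": "Santali", "sd-IN": "Sindhi",
}

# Alternate spellings from the list; mapping them to the first spelling
# is an assumption made for this example.
ALIASES = {"od-IN": "or-IN", "brx-IN": "bodo-IN"}


def normalize_language(code):
    """Return a canonical language code, or raise ValueError if unknown."""
    prefix, sep, region = code.strip().partition("-")
    candidate = f"{prefix.lower()}{sep}{region.upper()}"
    candidate = ALIASES.get(candidate, candidate)
    if candidate not in LANGUAGES:
        raise ValueError(f"unsupported language code: {code!r}")
    return candidate
```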

Text-to-Speech

python3 scripts/tts.py "नमस्ते, आप कैसे हैं?" --language hi-IN --speaker meera

Parameter	Default	Description
`text`	—	Text to convert (max 2500 chars)
`--language`	hi-IN	Language code
`--speaker`	meera	Voice name
`--output`	output.wav	Output file
`--sample-rate`	24000	Audio sample rate in Hz

Speakers — Female: Meera, Priya, Neha, Simran, Kavya, Ishita, Shreya, and more. Male: Shubh, Aditya, Rahul, Amit, Dev, Arjun, and more.
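
Text longer than the 2500-character limit has to be split before it is passed to `scripts/tts.py`. The sketch below is one way to do that, preferring sentence boundaries. `split_for_tts` is a hypothetical helper, and it assumes the limit counts Python characters (code points) rather than bytes.

```python
import re

MAX_TTS_CHARS = 2500  # limit stated in the parameter table


def split_for_tts(text, limit=MAX_TTS_CHARS):
    """Split text into chunks of at most `limit` characters."""
    # Break after sentence-ending punctuation, including the Devanagari danda.
    sentences = re.split(r"(?<=[.!?।])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-wrap a single sentence that is itself longer than the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        joined = f"{current} {sentence}" if current else sentence
        if len(joined) <= limit:
            current = joined
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized with its own `--output` file and the audio concatenated afterwards.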

Speech-to-Text

Three modes: REST (quick, for audio under 30 seconds), WebSocket (real-time streaming), and Batch (long audio, with optional speaker diarization).

# REST — quick transcription
python3 scripts/speech_to_text.py rest audio.mp3

# WebSocket — real-time streaming
python3 scripts/speech_to_text.py websocket audio.wav

# Batch — multiple files with speaker diarization
python3 scripts/speech_to_text.py batch audio1.mp3 audio2.mp3 --diarization --num-speakers 3 --output-dir ./transcripts/
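
Choosing between the three modes can be expressed as a small rule. This is a sketch of the guidance above, not logic taken from the script: `choose_stt_mode` is a hypothetical helper, and routing every diarization request to Batch is an assumption based on diarization being listed only under that mode.

```python
def choose_stt_mode(duration_seconds, streaming=False, diarization=False):
    """Return 'rest', 'websocket', or 'batch' for a transcription request."""
    if diarization:
        # Diarization is described only for the batch mode.
        return "batch"
    if streaming:
        return "websocket"
    # REST is described as the quick path for audio under 30 seconds.
    return "rest" if duration_seconds < 30 else "batch"
```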

Batch workflow: create job → upload files → start → poll status (Accepted → Pending → Running → Completed) → download results.
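
The polling step of this workflow can be sketched as a loop over the four states named above. `wait_for_completion` and its `get_status` callback are hypothetical; the source does not name any failure state, so any status outside the four listed ones is treated as an error here.

```python
import time

# States named in the batch workflow, in order.
STATES = ("Accepted", "Pending", "Running", "Completed")


def wait_for_completion(get_status, interval=5.0, timeout=3600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns 'Completed'.

    get_status is any zero-argument callable returning the job's current
    status string; how it talks to the service is left to the caller.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "Completed":
            return status
        if status not in STATES:
            raise RuntimeError(f"unexpected job status: {status!r}")
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout} seconds")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters makes the loop easy to test without waiting in real time.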

Formats: WAV, MP3, AAC, AIFF, OGG, OPUS, FLAC, MP4/M4A, AMR, WMA, WebM, PCM

Document Intelligence

Extract text from PDFs and images (JPEG/PNG).

python3 scripts/document_intelligence.py document.pdf --language hi-IN --format md
python3 scripts/document_intelligence.py --job-id <id> --download -o ./output/

Output formats: md (default), html, json. Input limits: 200 MB, 500 pages.
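
A local pre-flight check can catch oversized or unsupported files before a job is created. This sketch covers the file type, output format, and size limit stated above; it does not check the 500-page limit, which would need a PDF parser. `check_document` is a hypothetical helper, and reading "200 MB" as 200 × 1024 × 1024 bytes is an assumption.

```python
from pathlib import Path

MAX_BYTES = 200 * 1024 * 1024  # "200 MB", interpreted as MiB (assumption)
INPUT_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}
OUTPUT_FORMATS = {"md", "html", "json"}


def check_document(path, output_format="md"):
    """Return a list of problems found; an empty list means the file looks OK."""
    path = Path(path)
    problems = []
    if path.suffix.lower() not in INPUT_SUFFIXES:
        problems.append(f"unsupported file type: {path.suffix or '(none)'}")
    if output_format not in OUTPUT_FORMATS:
        problems.append(f"unsupported output format: {output_format}")
    if not path.is_file():
        problems.append(f"file not found: {path}")
    elif path.stat().st_size > MAX_BYTES:
        problems.append(f"file exceeds {MAX_BYTES} bytes")
    return problems
```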

Translation

# Auto-detect source, translate to Hindi
python3 scripts/text_processing.py translate "Hello, how are you?" --target hi-IN

# Mayura model with colloquial mode
python3 scripts/text_processing.py translate "What's up?" --target hi-IN --model mayura:v1 --mode modern-colloquial

Models: sarvam-translate:v1 (23 languages), mayura:v1 (12 languages, supports modes and transliteration)

Modes (mayura only): formal, modern-colloquial, classic-colloquial, code-mixed
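
When calling the script from Python, the model and mode rule above can be enforced before the command runs. The sketch uses only the flags shown in the examples; `build_translate_command` is a hypothetical helper, and requiring an explicit `--model mayura:v1` whenever a mode is given is an assumption, since the script's default model is not stated.

```python
MODELS = {"sarvam-translate:v1", "mayura:v1"}
MAYURA_MODES = {"formal", "modern-colloquial", "classic-colloquial", "code-mixed"}


def build_translate_command(text, target, model=None, mode=None):
    """Build the argument list for scripts/text_processing.py translate."""
    if model is not None and model not in MODELS:
        raise ValueError(f"unknown model: {model}")
    if mode is not None:
        if model != "mayura:v1":
            raise ValueError("--mode is only supported with --model mayura:v1")
        if mode not in MAYURA_MODES:
            raise ValueError(f"unknown mode: {mode}")
    cmd = ["python3", "scripts/text_processing.py", "translate", text,
           "--target", target]
    if model is not None:
        cmd += ["--model", model]
    if mode is not None:
        cmd += ["--mode", mode]
    return cmd
```

The list can be passed to `subprocess.run(cmd, check=True)`. Because no shell is involved, text such as "What's up?" needs no extra quoting.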

Transliteration

python3 scripts/text_processing.py transliterate "नमस्ते" --source hi-IN --target en-IN
python3 scripts/text_processing.py transliterate "namaste" --source en-IN --target hi-IN --spoken-form

Language Detection

python3 scripts/text_processing.py detect "नमस्ते दुनिया"
# Output: Language: hi-IN, Script: Deva
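
If another program consumes the result, the printed line can be parsed into fields. This sketch assumes the script prints exactly the `Language: ..., Script: ...` form shown in the comment above; `parse_detection` is a hypothetical helper.

```python
import re

# Matches a line such as "Language: hi-IN, Script: Deva".
_DETECT_LINE = re.compile(
    r"Language:\s*(?P<language>[A-Za-z]+-[A-Z]{2}),\s*Script:\s*(?P<script>\w+)"
)


def parse_detection(output):
    """Return {'language': ..., 'script': ...}, or None if no match is found."""
    match = _DETECT_LINE.search(output)
    return match.groupdict() if match else None
```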

Chat Completion

Two models: sarvam-105b (flagship, complex reasoning) and sarvam-m (efficient, general chat).

python3 scripts/text_processing.py chat "Explain quantum computing" --model sarvam-105b
python3 scripts/text_processing.py chat "What is the capital of India?" --model sarvam-m --temperature 0.8

Resources

Dashboard: https://dashboard.sarvam.ai
Docs: https://docs.sarvam.ai
Cookbook: https://github.com/sarvamai/sarvam-ai-cookbook

