SARVAM AI just launched

Sarvam AI is a Bengaluru-based artificial intelligence startup building foundation models. Its early model, Sarvam-1 (2B parameters), focused on multilingual Indic text, followed by larger models such as Sarvam-30B and Sarvam-105B for advanced reasoning and long-context tasks.

The company also develops speech-to-text, text-to-speech, and vision systems to support enterprise and public-sector use cases. Sarvam AI has released several models with open weights, suggesting they are leaning towards an open-source approach.

3 Likes

apparently their app has a waiting list, and anyone with an invite code can get access. please get me one. :face_holding_back_tears::backhand_index_pointing_right::backhand_index_pointing_left:

1 Like

Pricing

they did a kickass demo at the AI summit, esp. the multilingual translation capabilities… sadly all such good events were crowded out by the media coverage of golgappa university :sweat_smile:

btw are these available for download? would like to try the models and see.

5 Likes

yes, there were some good showcases by fractal too. sadly the limelight was taken away from sarvam by those attention seekers.
btw here’s the app link: https://play.google.com/store/apps/details?id=ai.sarvam.indus

you can try their OSS models, they are available on Hugging Face.

1 Like

are the 30B and 105B models available as OSS? was not able to find them in LM Studio currently. will keep an eye out though.

So what's the verdict, is this any good?

All their OSS Models: sarvamai (Sarvam AI)

I am guessing the 30B model is a finetune of Qwen 3 30B A3b and the 105B model is likely a finetune of a GLM.

No, they are not finetunes. both LLMs were built from the ground up by sarvam.

I highly highly doubt that. Considering all their previous models were finetunes as well.

I am 99% sure the 30b is a Qwen 3 fine-tune. Everyone does this, even Microsoft.
The 105b model is likely a downsized version of a GLM.

Also, making a model from scratch with a completely new architecture is generally a bad idea, because you will have to build your own inference tooling and debug problems that have already been solved elsewhere.

I doubt Sarvam AI has the budget and talent to compete with Chinese and American labs to come up with entirely new architectures which might not even work as well as established ones.

They have categorically stressed that they built the two models from scratch. It would be a massive PR blunder on their part if these were finetunes. They would get roasted on social media 10x more than golgappa univ then.

These architectures are not particularly exotic in the LLM space; it is mostly a matter of training data and compute. They claim to outperform the Chinese models, but I guess that is mostly for Indian-context usage, and in practice they are probably at the level where Chinese AI labs were in 2025.

1 Like

If they keep them closed, there won't be any way to verify it. Besides, finetuning a base model is arguably building it from scratch (if you stretch the definition), since the data is yours.

Unless I see proof and they explicitly talk about architecture etc., I am 90% sure these are finetunes.

It's not hard to outperform the base models if you feed them good data, especially on Indian languages, which the base models were barely trained on.

Edit: They are nowhere near where Chinese labs were in 2025. In 2025, Chinese labs were releasing the best OSS models by far.
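Fwiw, for the open-weight checkpoints you can do a crude check yourself: a finetune keeps the base model's architecture-defining hyperparameters in config.json bit-for-bit identical, since finetuning doesn't change the model's shape. A rough sketch with made-up config values (not the real numbers for Qwen or Sarvam):

```python
# Heuristic finetune check: if every architecture-defining hyperparameter
# matches a known base model's config.json, it is very likely a finetune.
ARCH_KEYS = [
    "architectures", "hidden_size", "num_hidden_layers",
    "num_attention_heads", "num_key_value_heads",
    "intermediate_size", "vocab_size",
]

def likely_same_architecture(config_a: dict, config_b: dict) -> bool:
    """True if every architecture key present in both configs matches."""
    shared = [k for k in ARCH_KEYS if k in config_a and k in config_b]
    return bool(shared) and all(config_a[k] == config_b[k] for k in shared)

# Hypothetical config.json excerpts, purely illustrative values.
base_model = {"hidden_size": 2048, "num_hidden_layers": 48, "vocab_size": 151936}
suspect    = {"hidden_size": 2048, "num_hidden_layers": 48, "vocab_size": 151936}
print(likely_same_architecture(base_model, suspect))  # True -> smells like a finetune
```

It's not proof either way (a from-scratch model could copy hyperparameters too), but a mismatch in vocab size or layer count would rule a finetune out quickly.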

2 Likes

Do we have their research papers? Or anything talking about the model architecture?

1 Like

Not at all. anyone who calls a finetuned model "built from scratch" is outright lying.

in that case they will face well-deserved flak. just wait a while and see.

A few people who tried it posted on X saying as much. also we have to compare these two models' performance against models of similar parameter count, not 500B or 1T parameter models.

Lots of buzzwords in the article, but it kinda confirms 2 things:

The 30b model is likely based on Nvidia Nemotron 3 Nano 30b A3b. More than a finetune, but less than a new architecture from scratch. The Nvidia models themselves feel heavily inspired by Qwen 3 30b A3b.

The 105b model is likely based on Nvidia’s Nemotron Super/Ultra.

Edit: I would say this is similar to how companies say they made their own website when in reality they used Squarespace or Wix or a similar website builder.

Using the Nemotron framework is only one step above using raw CUDA.

No. this is more like companies building their websites using React.

I still maintain that the datasets are the most important thing here.

It doesn't take that much effort once you have the data. It's all about training, whether through this Nemotron framework or by finetuning and upscaling an existing base model.

You can rent compute if you have even a decent sized budget.

Data is what will make or break your model.

I'm looking for something like this: https://arxiv.org/pdf/2505.09388

The architecture itself might not be new but it seems performant enough.

It’ll be nice if they publish the proper details rather than just benchmarks.

Good to see homegrown LLMs.
They have made tall claims, and hopefully they're able to back them up.

Lmao I hope they at least release proper benchmarks.