Open Research Lab for Indic AI

Accelerating the growth of community-driven initiatives in Indic AI research

About The Program

We are thrilled to introduce the Open Research Lab for Indic AI, a pioneering program dedicated to accelerating the growth of community-driven initiatives in Indic AI research. At Callchimp, we are solving problems for India, with a focus on conversational AI for Indian languages.

The program is currently open and accepting applications.

We aim to further this mission with the help of the community, while giving back to it every resource that we build together along the way!

What we offer

Program Highlights

Networking Opportunities

Connect with fellow researchers, industry experts, and thought leaders in the field of Indic AI.

Cloud Compute Resources

Gain access to powerful cloud computing resources to support your research and experimentation needs. All costs on us!

Expert Guidance

Receive mentorship and guidance from seasoned AI professionals and academics.

Real-World Data Access

Use real-world data collected by Callchimp to enhance the relevance and impact of your research.

Interactive Webinars

Webinars and Q&A sessions for interested researchers to learn more about the program.

Collaborative Projects

Take part in collaborative projects and partnerships that arise from joining the Open Research Lab.

Indic AI Resources

Check out our datasets and models


OpenOrca-Top5percent dataset

It contains data instances (conversations) that use only the top 5% most frequently used words from the original OpenOrca dataset, keeping the focus on high-frequency vocabulary.

The data fields are consistent with the original OpenOrca dataset (id, system_prompt, question, response) to ensure compatibility with existing models and tools.
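As a rough illustration of this kind of vocabulary filter, here is a minimal Python sketch. It is not the procedure used to build the dataset: the whitespace tokenizer, the reading of "top 5%" as a fraction of distinct words, and the helper names (tokenize, top_fraction_vocab, filter_records) are all assumptions made for this example. Only the field names come from the description above.

```python
from collections import Counter

# Field names taken from the dataset description; everything else is illustrative.
FIELDS = ("id", "system_prompt", "question", "response")
TEXT_FIELDS = ("system_prompt", "question", "response")


def tokenize(text):
    """Assumed tokenizer: lowercase, split on whitespace."""
    return text.lower().split()


def top_fraction_vocab(records, fraction=0.05):
    """Return the most frequent `fraction` of distinct words across all text fields."""
    counts = Counter()
    for rec in records:
        for field in TEXT_FIELDS:
            counts.update(tokenize(rec[field]))
    keep = max(1, int(len(counts) * fraction))
    return {word for word, _ in counts.most_common(keep)}


def filter_records(records, vocab):
    """Keep only records whose every word is in the high-frequency vocabulary."""
    return [
        rec
        for rec in records
        if all(word in vocab for field in TEXT_FIELDS for word in tokenize(rec[field]))
    ]
```

Because the filtered records keep the same four fields, anything that already reads OpenOrca-style rows should be able to read the output unchanged.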

IndicVarna-100k dataset

Contains labeled text data in 10 major Indian languages translated from an English emotion dataset, providing a multilingual resource for sentiment analysis.

Each language sample is divided into 3 sentiment categories (Negative, Neutral, Positive) with an equal number of samples in each, providing a balanced classification task.

The consistent three-column format of uuid, text, label makes it easy to use out-of-the-box with HuggingFace pipelines for text classification and other downstream NLP tasks.
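To make the format and balance claims concrete, here is a small Python sketch that checks a list of rows for the three expected columns and an equal number of samples per label. The function name check_rows and the assumption that labels are stored as the strings Negative, Neutral and Positive are illustrative only. The published files may encode labels differently (for example, as integers).

```python
from collections import Counter

# Column names from the dataset description; label strings are an assumption.
EXPECTED_COLUMNS = {"uuid", "text", "label"}
EXPECTED_LABELS = {"Negative", "Neutral", "Positive"}


def check_rows(rows):
    """Validate the row format and report (label_counts, is_balanced)."""
    counts = Counter()
    for row in rows:
        if set(row) != EXPECTED_COLUMNS:
            raise ValueError(f"unexpected columns: {sorted(row)}")
        if row["label"] not in EXPECTED_LABELS:
            raise ValueError(f"unexpected label: {row['label']!r}")
        counts[row["label"]] += 1
    # Balanced means every expected label occurs, and all occur equally often.
    is_balanced = (
        set(counts) == EXPECTED_LABELS and len(set(counts.values())) == 1
    )
    return counts, is_balanced
```

Running a check like this once per language split is a cheap way to confirm the balance before training a classifier on it.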

IndicVarna dataset

Muril Indicvarna Tiny Sentiment model

A sample sentiment analysis model built using Callchimp's IndicVarna dataset and Google's MuRIL encoder.

The model supports the 10 most used Indian languages and achieves over 94% training accuracy.

Frequently Asked Questions

Applications Now Open

Whether you are an experienced researcher or just starting your journey in AI, the Open Research Lab for Indic AI offers a platform for collaboration, innovation, and growth. Join us in shaping the future of Indic AI research.

Need more convincing?

Read our blog to understand how Callchimp AI can revolutionize your business and take your numbers higher!


Take control of business

With Callchimp, steer your business communication the way you want with our trend-setting AI bulk calling system.
