India’s AI still doesn’t speak India. Can it?

Episode 676 Published 1 month ago
Description

ChatGPT butchers Punjabi with spelling errors and Bollywood-style Hindi bleeding through. Hindi bots trained on newspapers miss dialects like Awadhi and Bhojpuri entirely, while Tamil AI ignores the rich variations between Kongu and Madurai speech.

Sure, Gurugram collected ₹200 crore in taxes using Hindi AI calls, but that's because Hindi dominates datasets. Most other languages remain stuck in translation hell. Private companies optimize for speed over nuance, government corpora like Bhashini sit underused, and multimodal data that captures tone and emotion is too expensive to build.

The result? AI is flattening India's 780 languages into sanitized, standardized versions that erase the very dialects it claims to serve.

Read the newsletter here. Find the Duolingo article here.

Daybreak is produced from the newsroom of The Ken, India’s first subscriber-only business news platform. Subscribe for more exclusive, deeply-reported, and analytical business stories.
