Captions
Captions are time-synchronised text displayed on video to represent spoken dialogue, sound effects, or speaker identification.
Captions are time-synchronised on-screen text that represents spoken dialogue and non-speech audio information in a video. They differ from subtitles in an important way: subtitles translate dialogue for viewers who don't understand the spoken language and typically represent only dialogue, whereas captions are designed for deaf and hard-of-hearing viewers and include additional information such as speaker identification, sound effects ([music], [applause], [door slams]), tone indicators ([sarcastic], [whispered]), and off-screen sound cues. In practice, many creators use "subtitles" and "captions" interchangeably, especially on platforms like YouTube, which labels everything "captions" even though most tracks are dialogue-only.
Captions come in two delivery formats: open captions (burned into the video, always visible, can't be turned off) and closed captions (stored as a separate track, rendered by the viewer's player, and toggleable by the viewer). Common caption file formats include SRT (SubRip, simple timed text), WebVTT (.vtt, the web standard), SCC (broadcast), and TTML (XML-based).
Captions are legally required on much television and online video content in many jurisdictions. In the US, broadcast captioning is mandated by FCC rules under the Telecommunications Act of 1996 and extended to some online video by the CVAA, while the ADA has been invoked for online video accessibility. In the UK, the Equality Act applies alongside broadcast regulation by Ofcom.
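To make the file formats concrete, here is a minimal sketch of converting SRT text to WebVTT. It assumes plain cues only (no styling or positioning), and the function name `srt_to_vtt` and the sample cues are illustrative rather than taken from any particular tool. The two formats are close: WebVTT adds a `WEBVTT` header, makes the numeric cue counter optional, and uses a dot instead of a comma before the milliseconds.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
SRT_TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3}) --> (\d{2}:\d{2}:\d{2}),(\d{3})"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert SRT caption text to WebVTT (basic cues only)."""
    normalised = srt_text.replace("\r\n", "\n").strip()
    blocks = re.split(r"\n\s*\n", normalised)
    out = ["WEBVTT", ""]
    for block in blocks:
        lines = block.split("\n")
        # Drop the numeric cue counter that SRT puts on the first line.
        if lines and lines[0].strip().isdigit():
            lines = lines[1:]
        if not lines:
            continue
        # WebVTT uses "." instead of "," before the milliseconds.
        lines[0] = SRT_TIMING.sub(r"\1.\2 --> \3.\4", lines[0])
        out.extend(lines)
        out.append("")  # blank line terminates the cue
    return "\n".join(out)


# Hypothetical sample: one sound cue and one speaker-identified line.
SAMPLE_SRT = """1
00:00:01,000 --> 00:00:03,500
[door slams]

2
00:00:04,000 --> 00:00:06,000
MEERA: Who's there?
"""

vtt = srt_to_vtt(SAMPLE_SRT)
print(vtt)
```

The sample deliberately includes a sound cue and a speaker label, the two kinds of information that separate captions from dialogue-only subtitles.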
How it works
Caption production workflows fall into three categories: human-transcribed (professional captioners produce captions from scratch; high accuracy but expensive, at roughly $1-3 per minute of video), automated (speech recognition produces captions from audio; instant, but accuracy varies with accent, language, and audio quality), and hybrid (an AI draft followed by human cleanup, balancing cost and quality; the 2026 default for professional creators).
For Indian-language captions specifically, automated transcription quality varies sharply by language: Hindi and Bengali ASR is competent; Tamil, Telugu, and Malayalam ASR is less consistent; less common Indian languages often require human transcription.
Caption styling rules differ across platforms: YouTube allows standard fonts and colours in closed captions, Netflix has strict style guides, and TV broadcast captions use white-on-black for maximum readability. For karaoke-style social media captions the rules are looser, and creators experiment with bold fonts, saturated colours, and aggressive animation.
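The hybrid workflow can be sketched as a simple triage step: keep the automated draft, and send only the low-confidence segments to a human captioner. The example below is a sketch under stated assumptions. The `Segment` type, the `confidence` field, and the 0.85 threshold are all hypothetical, and real speech recognition systems expose confidence in different ways, if at all.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float       # cue start time in seconds
    end: float         # cue end time in seconds
    text: str          # draft caption text from the ASR pass
    confidence: float  # hypothetical ASR score in [0, 1]


def needs_review(segments, threshold=0.85):
    """Return the segments a human captioner should check."""
    return [s for s in segments if s.confidence < threshold]


def review_fraction(segments, threshold=0.85):
    """Fraction of total caption duration flagged for human cleanup."""
    total = sum(s.end - s.start for s in segments)
    flagged = sum(s.end - s.start for s in needs_review(segments, threshold))
    return flagged / total if total else 0.0


# Hypothetical draft: the non-speech cue scores lowest.
draft = [
    Segment(0.0, 4.0, "Welcome back to the channel.", 0.97),
    Segment(4.0, 6.0, "[music]", 0.60),
    Segment(6.0, 10.0, "Today we are talking about captions.", 0.91),
]

flagged = needs_review(draft)
share = review_fraction(draft)
print(len(flagged), share)
```

The flagged share gives a rough sense of how much human time a video needs, which is where the cost saving over fully human transcription would come from.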
Examples
Deaf accessibility
A YouTube video uses closed captions that include speaker names, sound cues ([birdsong], [glass breaking]), and music descriptions, enabling a deaf viewer to experience the full content.
Social media reach
A widely cited industry figure, often attributed to Instagram Reels, is that around 85% of social video is watched muted, so captions are effectively required for content to perform on the platform.
Legal compliance
US broadcast TV is required to carry captions under FCC rules (the ADA is more often cited for online video and other public-accommodation contexts); major streamers (Netflix, Disney+) are subject to similar requirements in multiple markets.
Why this matters for Indian-language TTS
Indian-language captioning is a fast-growing need — 500M+ Indian internet users prefer Indian-language content, and mute scrolling is standard on Indian Instagram and YouTube. Quality Indian-language captions require native-script rendering (not transliteration) and correct word-boundary handling. VoisLabs' audio-to-video pipeline produces Indian-language captions natively, solving the rendering gap that Western tools leave.
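Word-boundary handling is easiest to see in line wrapping. The sketch below breaks caption text only at whitespace, and it measures width by skipping combining marks (vowel signs, virama, anusvara) so that a Devanagari word is not treated as wider than it looks. This is an illustrative approximation, not the VoisLabs pipeline and not full Unicode grapheme-cluster segmentation; the function names and the 12-unit limit are assumptions.

```python
import unicodedata


def display_units(word: str) -> int:
    """Approximate visible length: count characters that are not combining marks."""
    return sum(1 for ch in word if not unicodedata.category(ch).startswith("M"))


def wrap_caption(text: str, max_units: int = 12):
    """Greedy line wrap that breaks only at whitespace, never inside a word."""
    lines, current, used = [], [], 0
    for word in text.split():
        width = display_units(word)
        extra = width + (1 if current else 0)  # +1 for the joining space
        if current and used + extra > max_units:
            # Line is full: close it and start a new one with this word.
            lines.append(" ".join(current))
            current, used = [word], width
        else:
            current.append(word)
            used += extra
    if current:
        lines.append(" ".join(current))
    return lines


text = "नमस्ते दोस्तों आज का वीडियो"
lines = wrap_caption(text, max_units=12)
for line in lines:
    print(line)
```

A tool that counts raw code points, or that cuts at a fixed character offset, can split a consonant from its vowel sign and produce broken glyphs. Breaking only at whitespace avoids that for space-delimited scripts.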
Related terms
Closed Captions
Closed captions are subtitles stored in a separate track that viewers can toggle on or off, supporti…
Karaoke Subtitles
Karaoke subtitles highlight each word or syllable as it is spoken, similar to how song lyrics appear…
SRT File (SubRip Subtitle)
An SRT file is a simple text format for time-coded subtitles, widely supported across video editors,…
Burned-in Subtitles
Burned-in subtitles are permanently rendered into the video image — always visible, can't be toggled…
Frequently Asked Questions
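What is the difference between captions and subtitles?
Subtitles translate dialogue for viewers who don't understand the spoken language and usually carry dialogue only. Captions are designed for deaf and hard-of-hearing viewers and add speaker identification, sound effects, tone indicators, and off-screen sound cues.
Are captions legally required?
In many jurisdictions, yes, for much television and online video content; the exact obligations depend on the country and the type of content.
Can I have captions in Hindi for a Hindi video?
Yes. Captions can be in the same language as the audio, and quality Hindi captions use native Devanagari script rather than transliteration.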
Last verified: 2026-04-21