Sample Rate
Sample rate is how many times per second audio is measured: 44.1 kHz is the CD standard, and 48 kHz is the video-production standard.
Sample rate is the number of times per second an audio signal is measured (sampled) during digital recording, expressed in hertz (Hz) or kilohertz (kHz). According to the Nyquist theorem, the sample rate must be more than twice the highest frequency you want to capture. Since human hearing extends to ~20 kHz, sample rates of 44.1 kHz (CD quality) or 48 kHz (video/broadcast standard) capture the full audible range.
Common sample rates: 8 kHz (telephone quality, voice-only), 16 kHz (radio-like voice), 22.05 kHz (half the CD rate), 44.1 kHz (CD Audio standard), 48 kHz (video, DVD, broadcast standard), 96 kHz (high-resolution audio, professional studios), 192 kHz (ultra-high-resolution, specialist use).
For voice-only content, a sample rate of 22.05 to 24 kHz is often sufficient; for music and professional production, 44.1 or 48 kHz is standard; for mastering and archival, audiophile workflows use 96 kHz. Sample rate conversion (for example, downsampling from 48 kHz to 44.1 kHz, or upsampling from 22.05 kHz to 44.1 kHz) is lossy to some degree, so record or synthesise at the target sample rate when possible.
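As a rough illustration of the Nyquist limit and of sample rate conversion, the Python sketch below computes the highest frequency each common rate can represent and the exact ratio between two rates. The helper names `nyquist_hz` and `resample_ratio` are hypothetical, written only for this example; they do not come from any audio library.

```python
from fractions import Fraction

# Common sample rates from the list above, in Hz.
COMMON_RATES_HZ = [8_000, 16_000, 22_050, 44_100, 48_000, 96_000, 192_000]


def nyquist_hz(sample_rate_hz: int) -> float:
    """Upper bound on representable frequency: half the sample rate."""
    return sample_rate_hz / 2


def resample_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Exact target/source ratio, reduced to lowest terms."""
    return Fraction(target_hz, source_hz)


if __name__ == "__main__":
    for rate in COMMON_RATES_HZ:
        limit_khz = nyquist_hz(rate) / 1000
        print(f"{rate / 1000:g} kHz sampling -> content below {limit_khz:g} kHz")
    print("48 kHz -> 44.1 kHz ratio:", resample_ratio(48_000, 44_100))
    print("22.05 kHz -> 44.1 kHz ratio:", resample_ratio(22_050, 44_100))
```

The 48 kHz to 44.1 kHz ratio reduces to 147/160, which is not a whole number, so a resampler has to filter and interpolate rather than simply drop or repeat samples. The 22.05 kHz to 44.1 kHz ratio is exactly 2, but upsampling still cannot restore frequencies that were never captured.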
How it works
Sample rate and bit depth together determine the raw audio data rate (before compression): 44.1 kHz × 16 bits × 2 channels = 1,411.2 kbps, or roughly 10.6 MB per minute. Sample rates above 44.1 kHz capture frequencies above 22.05 kHz; this is theoretically useful for mastering headroom but of questionable value for final delivery, since human hearing rarely reaches those frequencies.
For TTS output, synthesis happens at a native sample rate (typically 24 kHz or 48 kHz depending on the model) and is resampled as needed for output. Modern TTS platforms like VoisLabs output at standard sample rates appropriate for the export format, such as 44.1 or 48 kHz for video-embedded audio.
Video production uses 48 kHz as standard because video timecodes align better at 48 kHz than at 44.1 kHz, so 48 kHz is generally preferred for audio-to-video workflows. Podcast distribution typically uses 44.1 kHz (the legacy audio-CD standard inherited from music distribution).
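The data-rate arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes uncompressed PCM and decimal megabytes (10^6 bytes); the function names are hypothetical helpers for this example.

```python
def raw_bitrate_bps(sample_rate_hz: int, bit_depth: int, channels: int) -> int:
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bit_depth * channels


def megabytes_per_minute(bitrate_bps: int) -> float:
    """Convert bits per second to decimal megabytes (10**6 bytes) per minute."""
    return bitrate_bps / 8 * 60 / 1_000_000


if __name__ == "__main__":
    cd = raw_bitrate_bps(44_100, 16, 2)      # CD audio
    video = raw_bitrate_bps(48_000, 24, 2)   # typical video-production audio
    print(f"CD audio: {cd / 1000:.1f} kbps, {megabytes_per_minute(cd):.2f} MB/min")
    print(f"Video audio: {video / 1000:.1f} kbps, {megabytes_per_minute(video):.2f} MB/min")
```

CD audio works out to 1,411.2 kbps and about 10.6 MB per minute; 48 kHz × 24-bit stereo comes to 2,304 kbps and about 17.3 MB per minute.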
Examples
Telephone voice
An 8 kHz sample rate captures frequencies up to 4 kHz. That is enough for intelligible voice, but it loses high-frequency detail (sibilance, fricatives), producing the distinctive "telephone sound".
CD music
44.1 kHz × 16-bit × 2 channels is the Red Book audio CD format, in commercial use since 1982. It captures frequencies up to 22.05 kHz, covering the full range of human hearing.
Video production
48 kHz × 24-bit × 2 channels is the standard for professional video audio. Timecode alignment and editing precision are better at 48 kHz than 44.1 kHz.
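One way to read the timecode-alignment point is to count how many audio samples fall within a single video frame. The sketch below does this with exact fractions. The frame rates are chosen for illustration and the helper name `samples_per_frame` is hypothetical.

```python
from fractions import Fraction

# Frame rates chosen for illustration; 30000/1001 is the NTSC "29.97" rate.
FRAME_RATES = {
    "24": Fraction(24),
    "25": Fraction(25),
    "29.97": Fraction(30_000, 1_001),
    "30": Fraction(30),
}


def samples_per_frame(sample_rate_hz: int, frame_rate: Fraction) -> Fraction:
    """Exact number of audio samples spanning one video frame."""
    return Fraction(sample_rate_hz) / frame_rate


if __name__ == "__main__":
    for label, fps in FRAME_RATES.items():
        for rate in (44_100, 48_000):
            spf = samples_per_frame(rate, fps)
            kind = "whole" if spf.denominator == 1 else "fractional"
            print(f"{rate} Hz at {label} fps: {float(spf):g} samples/frame ({kind})")
```

At 24 fps, 48 kHz gives exactly 2,000 samples per frame while 44.1 kHz gives 1,837.5. At 25 and 30 fps both rates divide evenly, and at 29.97 fps neither does. This is only one plausible reading of the alignment claim, not a full account of why 48 kHz became the video standard.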
Why this matters for Indian-language TTS
Sample rate affects voice clarity particularly at the sibilant end: the /s/, /ʃ/, and /ʂ/ sounds (common in Hindi, Tamil, and Malayalam words) are high-frequency and degrade at low sample rates. For Indian-language TTS, 22.05 kHz is the minimum for clean voice, and 44.1 or 48 kHz is appropriate for professional output. VoisLabs synthesises voices at native 48 kHz for video pipeline compatibility.
Related terms
Bitrate
Bitrate is the amount of data used per second of audio, measured in kbps — higher bitrate means bett…
MP3
MP3 is a lossy audio compression format that produces small files with good audio quality — the de f…
WAV
WAV (Waveform Audio File Format) is an uncompressed audio container developed by Microsoft and IBM, …
AAC (Advanced Audio Coding)
AAC is a lossy audio codec that produces better audio quality than MP3 at the same bitrate — the def…
Mono vs Stereo
Mono is single-channel audio; stereo is two-channel (left + right) audio with directional informatio…
Learn more
Frequently Asked Questions
What sample rate should I record voice at?
Is 96 kHz better than 48 kHz?
Does sample rate affect file size?
Last verified: 2026-04-21