Extract audio from video — extract vs convert, bitrates, and multi-track MP4s
"Extract audio from video" and "convert MP4 to MP3" sound like the same thing. They're not — and picking the right one decides whether your output is lossless and instant, or a small quality compromise that takes longer. Here's the honest guide, with bitrate math, multi-track handling, and podcast-vs-music workflows.
Extraction vs conversion — the distinction that matters
Every MP4 (and MOV, MKV, WEBM) is a container. It holds a video stream, an audio stream, sometimes subtitles, sometimes multiple of each. When you "get the audio out," there are two totally different operations you might mean:
- Extract — pull the existing audio stream out of the container and save it as a standalone file, without re-encoding. The bytes of the audio are identical to what was inside the MP4. Lossless. Near-instant (seconds, not minutes). Output format is determined by the source codec: AAC becomes .m4a or .aac, Opus becomes .opus, MP3 becomes .mp3.
- Convert — decode the source audio and re-encode it as a different format, typically MP3. Small quality loss (imperceptible in most cases, but it stacks if you convert repeatedly). Slower, because every sample gets processed.
If your source MP4 has AAC audio — and the vast majority do, because AAC is what YouTube, iPhone cameras, screen recorders, and most streaming services ship — and you're happy with an .m4a file, extraction is the right answer. It's faster and genuinely lossless. The only reason to convert to MP3 instead is compatibility: some older car stereos, legacy portable players, and weird corporate upload fields still require MP3.
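For readers who live in a terminal, the two operations map onto two different ffmpeg invocations. The sketch below only builds the argument lists and prints them; it does not run ffmpeg. The helper names and file names are illustrative assumptions, and the convert path assumes your ffmpeg build includes the libmp3lame encoder.

```python
def extract_cmd(src: str, dst: str) -> list[str]:
    """Stream-copy the audio: no decode, no re-encode."""
    # -vn drops the video stream; -c:a copy writes the audio packets unchanged.
    return ["ffmpeg", "-i", src, "-vn", "-c:a", "copy", dst]


def convert_to_mp3_cmd(src: str, dst: str, kbps: int = 192) -> list[str]:
    """Decode the source audio and re-encode it as MP3 at the given bitrate."""
    return ["ffmpeg", "-i", src, "-vn",
            "-c:a", "libmp3lame", "-b:a", f"{kbps}k", dst]


print(" ".join(extract_cmd("input.mp4", "output.m4a")))
print(" ".join(convert_to_mp3_cmd("input.mp4", "output.mp3")))
```

The only difference between the two is the codec argument: copy leaves the stream alone, while naming an encoder forces a full decode and re-encode.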
How to decide: keep the codec or convert to MP3
Two questions, in this order:
- Where will this file play? If it's going to a phone, a computer, a modern browser, VLC, Spotify-like apps, Audacity, or Apple Music — anything post-2012 plays .m4a/AAC natively. Keep the codec. If it's going into a decade-old car stereo or a form that only accepts ".mp3," convert.
- Will you edit it further? If you're going to trim, normalize, or mix in another tool, extract losslessly first — then export from that tool to whatever final format you need. Don't do two lossy conversions.
A useful shorthand: if the sentence in your head is "I need the audio from this video," you want extract. If the sentence is "I need an MP3," you want convert.
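The two questions above can be written down as a tiny decision helper. This is a sketch of the reasoning in this section, not the logic our tool ships; the function name and parameters are made up for illustration.

```python
def choose_operation(destination_needs_mp3: bool, will_edit_later: bool,
                     source_is_mp3: bool = False) -> str:
    """Return "extract" or "convert" following the two questions above."""
    if source_is_mp3:
        # Already MP3: a stream copy satisfies an MP3-only destination too.
        return "extract"
    if will_edit_later:
        # Extract losslessly now, export the final format from the editor.
        return "extract"
    return "convert" if destination_needs_mp3 else "extract"


print(choose_operation(destination_needs_mp3=False, will_edit_later=False))
print(choose_operation(destination_needs_mp3=True, will_edit_later=False))
print(choose_operation(destination_needs_mp3=True, will_edit_later=True))
```

The third call returns "extract" even though the destination wants MP3, because the single lossy step should be the final export from the editing tool.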
The short version (both paths)
- Open Extract audio.
- Drop your MP4 / MOV / MKV / WEBM.
- Pick Keep original codec (lossless, fast) or Convert to MP3 (pick bitrate below).
- If the file has multiple tracks, pick which one (dialog, commentary, other language).
- Click Download.
Bitrate choices — what the numbers actually mean
Only relevant if you're converting to MP3. Bitrate is how many kilobits of audio the MP3 stores per second of playback. Higher number = more data = better fidelity + bigger file.
- 128 kbps — about 1 MB per minute. Fine for spoken word (lectures, audiobooks, podcasts, interviews). Noticeably compressed on music — cymbals sound papery, reverb tails collapse.
- 192 kbps — our default, about 1.4 MB per minute. Transparent for most listeners on most music. Blind A/B tests consistently show most people can't reliably tell 192 from lossless on a typical playlist.
- 256 kbps — about 1.9 MB per minute. The audible-improvement zone for trained ears on high-dynamic-range material: classical, well-mixed rock, jazz with lots of brush cymbal. Casual listening on earbuds: indistinguishable from 192.
- 320 kbps — the MP3 ceiling, about 2.4 MB per minute. Objectively bigger; subjectively usually indistinguishable from 256. Use if you're archiving or if you'll re-encode later and want the maximum headroom.
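The per-minute sizes above follow directly from the bitrate. Here is the arithmetic as a short sketch, assuming constant-bitrate audio, 1 MB = 1,000,000 bytes, and ignoring the small overhead of tags and headers; the function name is illustrative.

```python
def mp3_size_mb(kbps: int, minutes: float) -> float:
    """Approximate size of constant-bitrate audio in megabytes."""
    bytes_per_second = kbps * 1000 / 8      # kilobits -> bits -> bytes
    return bytes_per_second * 60 * minutes / 1_000_000


for kbps in (128, 192, 256, 320):
    print(f"{kbps} kbps: {mp3_size_mb(kbps, 1):.2f} MB per minute, "
          f"{mp3_size_mb(kbps, 60):.0f} MB per hour")
```

This gives 0.96, 1.44, 1.92, and 2.40 MB per minute, which is where the rounded figures in the list come from.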
Honesty corner: you cannot add fidelity that wasn't in the source. YouTube ships AAC at roughly 128 kbps. A phone's built-in mic records AAC at 64-128 kbps. Re-encoding that to MP3 320 doesn't recover anything; it just bloats the file while keeping the original's ceiling. Match the bitrate to the source, or go one notch higher if you plan to edit.
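One way to read "match the bitrate to the source, or go one notch higher" is as a lookup over the four options listed above. The helper below is a rough heuristic sketch with an invented name. It compares bitrate numbers only, even though AAC and MP3 are not equally efficient at the same bitrate.

```python
MP3_STEPS = (128, 192, 256, 320)  # the four options discussed above


def suggest_mp3_kbps(source_kbps: int, will_edit: bool = False) -> int:
    """Smallest listed step >= the source bitrate, plus one notch if editing."""
    last = len(MP3_STEPS) - 1
    # Index of the first step that covers the source; cap at the MP3 ceiling.
    idx = next((i for i, step in enumerate(MP3_STEPS) if step >= source_kbps), last)
    if will_edit:
        idx = min(idx + 1, last)
    return MP3_STEPS[idx]


print(suggest_mp3_kbps(128))                   # source roughly like a YouTube download
print(suggest_mp3_kbps(128, will_edit=True))   # same source, but editing first
```

Under these assumptions a 128 kbps source maps to 128 kbps, or to 192 kbps if you plan to edit, and nothing ever maps above 320.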
Multiple audio tracks — how to pick the right one
This is where consumer converters quietly fail. Movies, TV shows, commentary podcasts, and a lot of streaming rips carry more than one audio track. Common setups:
- Track 1: original-language dialog (English 5.1)
- Track 2: director's commentary (English stereo)
- Track 3: dubbed audio (Spanish 5.1)
- Track 4: descriptive audio for accessibility
Most online converters take track 1 silently and don't even surface the others. If you wanted the commentary and you're staring at the main audio, the error is invisible. Our tool enumerates every audio stream with its language tag and duration, and lets you pick. If you're unsure which is which, extract each losslessly (seconds each) and preview.
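On the desktop, the same enumeration can be done with ffprobe. The sketch below shows an ffprobe command that lists audio streams as JSON, then parses a hypothetical sample of that output (the three tracks and their tags are invented for illustration) and builds a stream-copy command for one track. Nothing here runs ffprobe or ffmpeg, and this is not necessarily how our tool does it internally.

```python
import json

# Lists only audio streams, with codec, channel count, and language/title tags.
FFPROBE_CMD = ["ffprobe", "-v", "error", "-select_streams", "a",
               "-show_entries",
               "stream=index,codec_name,channels:stream_tags=language,title",
               "-of", "json", "movie.mkv"]

# Hypothetical output for a multi-track file (illustrative data only).
SAMPLE_JSON = """
{"streams": [
  {"index": 1, "codec_name": "ac3", "channels": 6, "tags": {"language": "eng"}},
  {"index": 2, "codec_name": "aac", "channels": 2,
   "tags": {"language": "eng", "title": "Director's commentary"}},
  {"index": 3, "codec_name": "ac3", "channels": 6, "tags": {"language": "spa"}}
]}
"""


def list_audio_tracks(ffprobe_json: str) -> list[dict]:
    """Turn ffprobe JSON into one small dict per audio track."""
    tracks = []
    for pos, stream in enumerate(json.loads(ffprobe_json).get("streams", [])):
        tags = stream.get("tags", {})
        tracks.append({
            "audio_pos": pos,                       # position among audio streams
            "codec": stream.get("codec_name", "?"),
            "channels": stream.get("channels"),
            "language": tags.get("language", "und"),
            "title": tags.get("title", ""),
        })
    return tracks


def extract_track_cmd(src: str, audio_pos: int, dst: str) -> list[str]:
    """Stream-copy a single audio track, selected by its audio-relative position."""
    return ["ffmpeg", "-i", src, "-map", f"0:a:{audio_pos}", "-c:a", "copy", dst]


for track in list_audio_tracks(SAMPLE_JSON):
    print(track)
print(" ".join(extract_track_cmd("movie.mkv", 1, "commentary.m4a")))
```

Note the difference between ffprobe's index (which counts every stream in the file, video included) and the audio-relative position used by -map 0:a:N; mixing the two up is an easy way to grab the wrong track.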
Workflow: podcast / interview / lecture
If what you're pulling is mostly speech, the priorities flip away from fidelity and toward clarity and file size:
- Keep the codec if you can. AAC handles speech extremely efficiently at low bitrates — 96 kbps AAC sounds as clean as 192 kbps MP3 on voice.
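- If converting, drop to mono. Speech is effectively mono. Encoding it as stereo spends bits on a second channel that carries nothing new, so a mono file can use about half the bitrate for the same perceived quality. Our tool exposes a mono toggle in Advanced.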
- Normalize loudness. Podcast platforms publish loudness targets: Apple Podcasts recommends around -16 LUFS and Spotify around -14 LUFS. If your source was recorded quiet, normalize before uploading so your listeners don't have to yank the volume up.
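- 128 kbps MP3 mono is a safe default for a distributable podcast file: about 960 KB per minute, universally compatible, sounds clean. If size matters, 64 kbps mono halves that to roughly 480 KB per minute and still holds up on voice.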
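Put together, the speech workflow above corresponds roughly to one ffmpeg command. The sketch below builds that command without running it. The function name is illustrative; the true-peak (TP) and loudness-range (LRA) values passed to ffmpeg's loudnorm filter are common choices rather than anything this guide prescribes, and the 48 kHz output rate is an assumption.

```python
def podcast_mp3_cmd(src: str, dst: str, kbps: int = 128,
                    target_lufs: float = -16.0) -> list[str]:
    """Mono, loudness-normalized MP3 for spoken-word distribution."""
    return [
        "ffmpeg", "-i", src, "-vn",
        "-ac", "1",                                        # downmix to mono
        "-af", f"loudnorm=I={target_lufs}:TP=-1.5:LRA=11", # single-pass loudness normalization
        "-ar", "48000",   # loudnorm may resample internally, so pin the output rate
        "-c:a", "libmp3lame", "-b:a", f"{kbps}k",
        dst,
    ]


print(" ".join(podcast_mp3_cmd("interview.mp4", "interview.mp3")))
```

Single-pass loudnorm adjusts gain dynamically. For a tighter match to the target, ffmpeg's documentation describes a two-pass mode that measures first and applies the measured values on the second run.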
Workflow: music / music video / concert rip
Opposite priorities: you're preserving stereo separation, high-frequency detail, and dynamic range.
- Extract losslessly whenever possible. If the source is AAC, keep it as .m4a. If it's Opus (common on WEBM from YouTube), keep it as .opus. Both codecs are stereo-faithful and widely supported.
- If you must have MP3, use 256 or 320 kbps stereo. Don't drop below 192 on music unless file size is an emergency.
- Don't normalize hard. Music masters already encode deliberate loudness choices. Aggressive normalization flattens the mix. If you need matched loudness across a playlist, use ReplayGain-style tagging instead of destructive normalization.
- Preserve sample rate. Most video ships at 48 kHz. Most music services distribute at 44.1 kHz. If you resample, do it once with a good algorithm — not three times through three tools.
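To make "ReplayGain-style tagging" concrete: instead of rewriting the audio, you store a gain value in a tag and let the player apply it at playback time. The sketch below computes that value from a loudness measurement you already have (for example from a loudness meter). It assumes the ReplayGain 2.0 reference level of -18 LUFS, and the function name is illustrative.

```python
REFERENCE_LUFS = -18.0  # ReplayGain 2.0 reference level (assumed in this sketch)


def replaygain_track_gain(measured_lufs: float) -> str:
    """Value for a REPLAYGAIN_TRACK_GAIN tag; the audio itself is untouched."""
    gain_db = REFERENCE_LUFS - measured_lufs
    return f"{gain_db:+.2f} dB"


print(replaygain_track_gain(-9.5))   # a loud modern master gets turned down
print(replaygain_track_gain(-23.0))  # a quiet recording gets turned up
```

Because the gain lives in metadata, removing the tag restores the original playback level exactly, which is what makes this approach non-destructive.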
Other inputs we handle the same way
Same tool, different source formats:
- WAV → MP3 if you have uncompressed audio from a DAW or recorder
- FLAC → MP3 for lossless music archives going to a portable device
- AAC → MP3 when you've already extracted AAC and need MP3 for compatibility
- MP4 → MP3 direct, single-click
Honest comparison — desktop and online alternatives
Audacity (free, desktop)
Fantastic for edit-and-export workflows. If you need to trim, denoise, compress, or mix, Audacity is the right tool. Overkill and slow for "I just want the audio out of this one file." Also doesn't stream-copy — it always decodes and re-encodes.
ffmpeg CLI (free, desktop, power tool)
The engine under the hood of almost every converter on the internet, including ours. If you're comfortable in a terminal, ffmpeg -i input.mp4 -vn -acodec copy output.m4a is the canonical lossless-extract command and takes two seconds on a 1 GB file. Not friendly for people who don't want to type.
VLC (free, desktop)
Can convert via Media → Convert / Save, but the UI is famously confusing, the defaults re-encode even when copy would work, and batch operations require command-line flags anyway. Good playback tool, mediocre converter.
HandBrake (free, desktop)
Can pass audio through without re-encoding it, but the tool is built around video transcoding: its output always includes a video track, so it is not a way to get an audio-only file. Powerful, but the wrong path for this specific task.
Online-Audio-Converter and similar sites
Upload your file to their server, server runs ffmpeg, you download. Works fine until you consider: your 500 MB concert rip is now on someone else's disk with your IP in their logs. Also: almost none of them expose the lossless-copy option — they convert everything to MP3 by default, which is actively the wrong answer half the time.
Our tool
Runs ffmpeg in your browser via WebAssembly. Exposes the extract-vs-convert choice explicitly. Lists every audio track. No upload, no watermark, no sign-up. Trade-off: first visit downloads the ffmpeg wasm (~25 MB, cached forever after), and it's single-threaded, so it's 2-3× slower than a desktop ffmpeg on the same hardware. For extraction that's irrelevant (extract is near-instant anyway). For full re-encode of a two-hour movie, consider a desktop tool.
Common questions
Is the lossless extract really lossless?
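Yes. When you pick "Keep original codec," we use ffmpeg's stream-copy mode (-acodec copy), which writes the audio bytes verbatim to the new container. No decode, no re-encode, no quality loss. You can verify by comparing checksums of the audio stream in the source and in the output. Checksums of the two files as a whole will differ, because the container around the audio has changed.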
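If you have desktop ffmpeg installed, one way to run that check is its md5 output format, which hashes a stream's packets without decoding them. The sketch below builds the command for each file and compares the resulting lines. The helper names are illustrative, nothing is executed here, and it assumes both containers hand ffmpeg byte-identical packets, which is the normal case for AAC copied from MP4 to M4A.

```python
def audio_md5_cmd(path: str, audio_pos: int = 0) -> list[str]:
    """ffmpeg command that prints 'MD5=...' for one audio stream, undecoded."""
    return ["ffmpeg", "-v", "error", "-i", path,
            "-map", f"0:a:{audio_pos}", "-c:a", "copy", "-f", "md5", "-"]


def same_audio(md5_line_a: str, md5_line_b: str) -> bool:
    """Compare two 'MD5=...' lines as printed by the commands above."""
    return md5_line_a.strip().lower() == md5_line_b.strip().lower()


print(" ".join(audio_md5_cmd("input.mp4")))
print(" ".join(audio_md5_cmd("output.m4a")))
```

Run both printed commands and pass the two output lines to same_audio; matching hashes mean the audio payload was copied byte for byte.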
My output is .m4a, not .mp3 — can my phone play that?
Every iPhone and every Android phone from the last 10+ years plays .m4a natively. So does every modern car infotainment system, Windows Media Player, VLC, Spotify, Apple Music, and every browser. The only places .m4a fails are very old car stereos and some industrial/automotive upload forms that literally reject anything that isn't named .mp3.
Will this pull audio from a YouTube link?
No. We don't scrape streaming services — that's against their terms of service and most use cases violate copyright. You need a video file you already have on disk. Download the video legitimately (YouTube Premium's download feature, a purchased download, a local recording), then drop it into our tool.
How do I pick the right audio track without guessing?
Our tool shows every audio stream's language tag (e.g. eng, spa) and track title (e.g. "Commentary") when the source file embeds them. Most movie rips do. Home recordings usually only have one track, so there's nothing to pick. When tags are missing, extract each one losslessly (seconds each) and preview.
Does this work for podcasts with chapter markers?
If the source file has chapter metadata and you keep the original codec, chapters are preserved. If you convert to MP3, chapters are dropped: MP3 carries them only through ID3v2 chapter frames, an add-on to the tag format that many players ignore. Another reason to extract rather than convert when the codec is already suitable.
Does this work offline?
Yes, after first load. Visit the tool once to let your browser cache the ffmpeg wasm binary, then you can disconnect from the internet and drop files — extraction and conversion both still run. Genuinely local.
How big a file can I process?
Limited by your browser's RAM. Typical desktops handle 2 GB MP4s comfortably. For extraction (stream-copy) the memory footprint is tiny even on huge files — we've tested 8 GB 4K movie rips. For full re-encode, 2-3 GB is a safer ceiling; beyond that, reach for a desktop tool.
Ready?
Extract audio →. Pick extract for speed and fidelity, convert when you need MP3 specifically, and match the bitrate to what's actually in your source.