Record audio through your browser microphone and save as MP3. No sign-up. No uploads. 100% private.
Recording audio directly in a web browser has become one of the most practical capabilities available to modern internet users. Whether you are a student capturing lecture notes, a journalist recording a quick voice memo, a podcast host testing microphone levels, or a remote worker saving meeting notes, having instant, private, in-browser voice recording eliminates the need to install any software. This tool gives you that capability with zero friction: click record, speak, stop, and download a universally compatible MP3 file - all without sending a single byte of your voice data to any server.
Understanding how browser audio recording works behind the scenes can help you get better results and make more informed decisions about the tools you use. The sections below explain the core technologies - the MediaDevices API, the Web Audio API, PCM audio data, and MP3 encoding - in plain language, so you can confidently use this tool and understand exactly what is happening to your voice data at every step.
When you record audio in a browser using its built-in recording API (MediaRecorder), the output is typically saved in WebM format (using the Opus or Vorbis codec) or as an OGG file. While these formats are technically efficient, they have a significant practical problem: they are not universally supported. Windows Media Player, many professional audio editors, older smartphones, and many online platforms struggle with these files or refuse to open them at all.
MP3 (MPEG-1 Audio Layer III) is among the most widely supported audio formats in the world. Virtually every device, operating system, media player, podcast platform, video editor, and transcription service accepts MP3 without any conversion step. This is why this tool encodes your recording directly into MP3 using the lamejs library before you download it.
What about WAV files? WAV stores audio as raw, uncompressed PCM (Pulse-Code Modulation) data - the digital representation of a sound wave captured as a sequence of numerical amplitude values thousands of times per second. WAV files are excellent for professional audio editing because they are lossless, but a one-minute WAV recording at CD quality (44.1 kHz, 16-bit, stereo) is about 10 MB. The same recording encoded as a 128 kbps MP3 is roughly 1 MB - about a tenth of the size - with little or no perceptible quality difference for voice audio. For voice memos and dictation, MP3 at 128 kbps or higher is the clear practical choice.
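The size comparison above can be checked with a little arithmetic. The sketch below assumes "standard quality" means CD-quality stereo (44,100 samples per second, 16 bits per sample, two channels) and a constant-bitrate MP3; the function names are illustrative and are not part of this tool.

```typescript
// Size of uncompressed PCM audio (the payload of a WAV file), in bytes.
function pcmBytes(seconds: number, sampleRate: number, channels: number, bitsPerSample: number): number {
  return seconds * sampleRate * channels * (bitsPerSample / 8);
}

// Approximate size of a constant-bitrate MP3, in bytes (ignores tags and padding).
function mp3Bytes(seconds: number, kbps: number): number {
  return (seconds * kbps * 1000) / 8;
}

const wav = pcmBytes(60, 44100, 2, 16); // one minute of CD-quality stereo
const mp3 = mp3Bytes(60, 128);          // one minute at 128 kbps
const ratio = wav / mp3;

console.log(`WAV ${(wav / 1e6).toFixed(1)} MB, MP3 ${(mp3 / 1e6).toFixed(2)} MB, ratio ${ratio.toFixed(1)}:1`);
// WAV 10.6 MB, MP3 0.96 MB, ratio 11.0:1
```

A mono recording would halve the WAV figure, so the exact ratio depends on how the audio is captured.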
This tool uses two browser-native JavaScript APIs to access and process your microphone entirely within your browser tab - no network requests are ever made for your audio data.
First, the MediaDevices API (specifically navigator.mediaDevices.getUserMedia()) requests access to your microphone hardware and provides a live audio stream directly into the JavaScript environment. This stream never leaves your device.
Second, the Web Audio API (using an AudioContext and a ScriptProcessorNode) intercepts the raw audio samples from that stream in real time. The raw samples - called PCM data - are accumulated in memory as floating-point numbers, nominally between -1.0 and 1.0, representing the amplitude of the audio at each moment in time. At a sample rate of 44,100 Hz, every second of audio is captured as 44,100 individual numbers per channel.
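To make the "accumulated in memory" step concrete, here is a minimal sketch of a buffer that collects chunks of samples and joins them when recording stops. The class name PcmAccumulator and the wiring shown in the comment are hypothetical; how this tool actually stores its samples is not described above.

```typescript
// Collects PCM chunks as they arrive and joins them when recording stops.
class PcmAccumulator {
  private chunks: Float32Array[] = [];
  private total = 0;

  // Copy each chunk: the audio buffer handed to a processing callback
  // should only be relied on while that callback is running.
  push(chunk: Float32Array): void {
    this.chunks.push(new Float32Array(chunk));
    this.total += chunk.length;
  }

  // Return one contiguous array holding every sample received so far.
  concat(): Float32Array {
    const out = new Float32Array(this.total);
    let offset = 0;
    this.chunks.forEach((c) => {
      out.set(c, offset);
      offset += c.length;
    });
    return out;
  }
}

// Hypothetical wiring in a browser (not executed here):
//   processor.onaudioprocess = (e) => acc.push(e.inputBuffer.getChannelData(0));

const acc = new PcmAccumulator();
acc.push(new Float32Array([0.1, 0.2]));
acc.push(new Float32Array([-0.3]));
const joined = acc.concat();
console.log(joined.length); // 3
```

Copying each chunk costs a little memory, but it means later audio cannot overwrite samples that were already collected.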
When you click Stop, those accumulated PCM samples are passed to the lamejs MP3 encoder, which is a pure JavaScript port of the industry-standard LAME encoder. lamejs converts the PCM data into a compressed MP3 binary file inside your browser. The result is handed to you as a Blob download - a local file transfer from JavaScript memory to your downloads folder - with zero server involvement at any stage.
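One detail sits between those two steps: the Web Audio API produces floating-point samples, while LAME-style encoders generally work on 16-bit integer samples. A conversion along the following lines is a common approach; whether this tool converts in exactly this way is an assumption, and the function name is illustrative.

```typescript
// Convert Web Audio float samples (nominally -1.0..1.0) to signed 16-bit PCM.
function floatTo16BitPcm(input: Float32Array): Int16Array {
  const out = new Int16Array(input.length);
  input.forEach((sample, i) => {
    // Clamp anything outside the nominal range so it cannot wrap around.
    const s = Math.max(-1, Math.min(1, sample));
    // Negative values scale toward -32768, positive values toward 32767.
    out[i] = Math.round(s < 0 ? s * 0x8000 : s * 0x7fff);
  });
  return out;
}

const pcm16 = floatTo16BitPcm(new Float32Array([0, 1, -1, 0.5, 2]));
console.log(pcm16.join(",")); // 0,32767,-32768,16384,32767
```

The last input value (2) is outside the valid range and is clamped to full scale, which is the digital equivalent of clipping.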
The MediaDevices API is a standardized web platform interface that gives JavaScript code controlled access to media input devices - primarily your camera and microphone. It is defined in the W3C Media Capture and Streams specification and is implemented in all major browsers including Chrome, Firefox, Safari, and Edge.
Browsers enforce a mandatory permission prompt before any website can access your microphone. This is a critical privacy protection: without it, any webpage you visit could secretly activate your microphone. When you first click the Record button on this page, your browser will show a popup asking whether to allow this site to use your microphone. You must click "Allow" to proceed. If you accidentally click "Block," you can re-enable access through your browser settings - in Chrome, click the site information icon (a padlock or sliders icon, depending on the version) to the left of the address bar, then set Microphone to "Allow."
How long this permission lasts depends on your browser and its settings: some browsers remember the choice for this site until you revoke it, while others ask again in a later session unless you choose to remember the decision. In every case, it applies only to this specific page origin. Granting microphone access here does not give any other website access to your microphone. Additionally, because this tool processes everything locally, granting microphone access here does not result in any audio being transmitted anywhere - you are simply allowing the JavaScript on this page to read audio samples from your hardware, which it then uses only to build your local MP3 file.
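For readers curious what the permission request looks like in code, the sketch below calls getUserMedia and translates the most common failures into plain messages. The error names (NotAllowedError, NotFoundError, NotReadableError) come from the browser standard; the function names, message wording, and the small stand-in interfaces are illustrative assumptions, not this tool's actual source.

```typescript
// Minimal stand-ins for the browser types used here, so the sketch also
// type-checks outside a browser. In a page, pass navigator.mediaDevices.
interface AudioStreamLike { getTracks(): { stop(): void }[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean }): Promise<AudioStreamLike>;
}

type MicResult =
  | { ok: true; stream: AudioStreamLike }
  | { ok: false; reason: string };

// Turn a getUserMedia error name into a message a user can act on.
function explainMicError(errorName: string): string {
  switch (errorName) {
    case "NotAllowedError":
      return "Microphone access was blocked. Allow it in the site settings and try again.";
    case "NotFoundError":
      return "No microphone was found on this device.";
    case "NotReadableError":
      return "The microphone is in use by another application or could not be started.";
    default:
      return "The microphone could not be opened.";
  }
}

// Ask for the microphone; the browser shows its permission prompt if needed.
async function requestMicrophone(devices: MediaDevicesLike): Promise<MicResult> {
  try {
    const stream = await devices.getUserMedia({ audio: true });
    return { ok: true, stream };
  } catch (err) {
    const errorName =
      typeof err === "object" && err !== null && "name" in err
        ? String((err as { name: unknown }).name)
        : "";
    return { ok: false, reason: explainMicError(errorName) };
  }
}
```

In a real page the call would be requestMicrophone(navigator.mediaDevices), and stopping every track on the returned stream releases the microphone when recording ends.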
PCM (Pulse-Code Modulation) is the fundamental digital representation of sound. When a microphone captures audio, the continuous air pressure wave of sound is converted into a sequence of numbers. This happens by sampling the wave at a precise, constant rate - typically 44,100 times per second (known as a 44.1 kHz sample rate, the same rate used on audio CDs). Each sample is a number representing the amplitude (height) of the audio wave at that instant. The result is a stream of raw numbers: precise and complete, but very large in file size.
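As an illustration of what "a stream of raw numbers" means, the sketch below generates one second of a pure tone the way a sampler would see it: one amplitude value per sampling instant. The 441 Hz frequency is an arbitrary choice that makes the numbers easy to check, and the function name is illustrative.

```typescript
// Sample a sine tone: one amplitude value per sampling instant.
function sampleSine(frequencyHz: number, sampleRate: number, seconds: number): Float32Array {
  const count = Math.round(sampleRate * seconds);
  const out = new Float32Array(count);
  for (let n = 0; n < count; n++) {
    out[n] = Math.sin((2 * Math.PI * frequencyHz * n) / sampleRate);
  }
  return out;
}

// One second at 44.1 kHz yields 44,100 numbers.
const tone = sampleSine(441, 44100, 1);
console.log(tone.length); // 44100
```

At 441 Hz each cycle of the wave spans exactly 100 samples, so the values rise from 0 to a peak of 1.0 at sample 25 and return to 0 at sample 50.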
MP3 encoding dramatically reduces this file size using a technique called perceptual audio coding. The encoder analyzes the PCM data and identifies audio information that human ears are unlikely to notice - sounds masked by louder sounds, frequencies above the threshold of human hearing, and quiet sounds that occur immediately after very loud sounds (a phenomenon called temporal masking). It discards this inaudible data, then compresses what remains. The aggressiveness of this process is controlled by the bitrate, measured in kilobits per second (kbps). At 128 kbps, the encoder allocates 128,000 bits per second to represent the audio - a roughly 11:1 compression ratio compared to uncompressed CD-quality stereo audio, which needs about 1,411 kbps - while preserving virtually all perceptible audio quality for voice recordings.
Environment: The single biggest factor in voice recording quality is the room you record in. Hard reflective surfaces (bare walls, tile floors, glass) create echo and reverberation that muddy your voice. If possible, record in a small room with soft furnishings - carpet, curtains, and upholstered furniture absorb reflections. Closets full of clothing are surprisingly effective makeshift recording booths.
Microphone placement: Position your microphone 4 to 8 inches (about 10 to 20 cm) from your mouth, slightly off-axis (angled about 15-30 degrees rather than pointed directly at your lips). This reduces plosive sounds - the harsh bursts of air that occur on words starting with "P" and "B" - and mouth noise. If you are using a laptop's built-in microphone, sit as close to the microphone grille as is comfortable and speak at a normal volume rather than raising your voice.
Background noise: Close windows and doors, turn off fans and air conditioning if possible, and silence any notifications on nearby devices. Even a distant air vent or refrigerator hum can be picked up by sensitive microphones and become distracting in recordings.
Bitrate selection: For voice memos, podcast recordings, and dictation, 128 kbps provides excellent quality with small file sizes. For interviews or content you plan to edit and publish, choose 192 kbps or 320 kbps to give yourself more headroom in post-production. The 64 kbps option is suitable for simple voice notes where file size is the top priority.
Use the visualizer: Watch the waveform on this page before you start your real recording. Speak at your intended volume and check that the waveform bars are active but not clipping - consistently maxing out at full height indicates your input volume is too high and distortion may occur. Adjust your microphone's input volume in your operating system's sound settings if needed.
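The clipping check described above can also be expressed numerically. The sketch below finds the loudest sample in a buffer and flags it when it reaches full scale; the 0.99 threshold and the function names are assumptions for illustration, not values taken from this tool's visualizer.

```typescript
// Largest absolute sample value in a buffer (0 for silence, 1.0 at full scale).
function peakLevel(samples: Float32Array): number {
  let peak = 0;
  samples.forEach((s) => {
    const a = Math.abs(s);
    if (a > peak) peak = a;
  });
  return peak;
}

// Treat a buffer as clipping when its peak reaches the threshold (assumed 0.99).
function isClipping(samples: Float32Array, threshold = 0.99): boolean {
  return peakLevel(samples) >= threshold;
}

console.log(isClipping(new Float32Array([0.2, -0.5, 0.3]))); // false
console.log(isClipping(new Float32Array([0.2, -1.0, 0.3]))); // true
```

A healthy voice recording usually peaks well below full scale, leaving room for the occasional louder syllable.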