Dennis Hackethal’s Blog
My blog about philosophy, coding, and anything else that interests me.
Recording, Sending, and Receiving Audio Using JavaScript
Say you're working on a web-based chat app. You wish to let users record and send audio messages, similar to the feature the iOS Messages app offers. Luckily, modern browsers provide APIs that make implementing this feature surprisingly easy.
Recording audio
let recording;
navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
  // Declare the recorder locally instead of creating an implicit global.
  const mediaRecorder = new MediaRecorder(stream);
  mediaRecorder.addEventListener('dataavailable', event => {
    // event.data is a blob.
    // It looks something like: Blob {size: 26515, type: 'audio/webm;codecs=opus'}
    recording = event.data;
  });
  mediaRecorder.addEventListener('stop', () => {
    // Do something with the `recording` variable.
  });
  mediaRecorder.start();
  // Stop after five seconds, purely for illustration.
  setTimeout(() => mediaRecorder.stop(), 5000);
});
Since we only want to record audio, not video, we pass { audio: true } to getUserMedia. Keep in mind that calling this function for the first time will cause the browser to ask the user's permission to record. In this example, we run it immediately for the purpose of illustration, but in real life you may want to wait until it's actually time to record – for example, until the user clicks a record button.
Likewise, we stop the recording arbitrarily after five seconds, but you could easily imagine other triggers such as the user clicking something or letting go of a record button.
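As a rough sketch of the button-driven triggers mentioned above, the following wires press-and-hold recording to a button. It assumes the recorder has already been created. The helper name wireHoldToRecord and the button argument are hypothetical, not part of any browser API.

```javascript
// Hypothetical helper: start recording while the button is held down and
// stop when it is released. `recorder` is expected to behave like a
// MediaRecorder (a `state` property plus start() and stop() methods).
function wireHoldToRecord(button, recorder) {
  button.addEventListener('pointerdown', () => {
    // Only start if we aren't already recording.
    if (recorder.state === 'inactive') recorder.start();
  });
  const stopIfRecording = () => {
    if (recorder.state === 'recording') recorder.stop();
  };
  button.addEventListener('pointerup', stopIfRecording);
  // Also stop if the browser cancels the pointer (e.g., an interrupted touch).
  button.addEventListener('pointercancel', stopIfRecording);
}

// Browser usage (the '#record' selector is an assumption):
// wireHoldToRecord(document.querySelector('#record'), mediaRecorder);
```

Checking the recorder's state before calling start or stop keeps stray events, such as a release without a preceding press, from having any effect.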
Since we call mediaRecorder.start with no arguments, the dataavailable event will fire only once: when the recording is stopped. You can optionally pass an integer representing milliseconds to the start function, in which case dataavailable will fire each time that many milliseconds have passed. In that scenario, instead of a simple recording variable, you'd want something like a recordings array and push event.data onto it. Otherwise, you'd only ever hold on to the most recent part of the recording.
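Here is a minimal sketch of that array-based variant. The helper name collectChunks and its onComplete callback are hypothetical. It assumes the recorder dispatches dataavailable and stop events the way MediaRecorder does, and it merges the collected chunks into a single blob once recording stops.

```javascript
// Hypothetical helper: gather every chunk the recorder emits and hand the
// combined recording to `onComplete` when the recorder stops.
function collectChunks(recorder, onComplete) {
  const recordings = [];
  recorder.addEventListener('dataavailable', event => {
    // Skip empty chunks so they don't end up in the array.
    if (event.data && event.data.size > 0) recordings.push(event.data);
  });
  recorder.addEventListener('stop', () => {
    // Reuse the first chunk's MIME type for the combined blob.
    const type = recordings.length > 0 ? recordings[0].type : '';
    onComplete(new Blob(recordings, { type }));
  });
}

// Browser usage:
// collectChunks(mediaRecorder, recording => { /* process the blob */ });
// mediaRecorder.start(1000); // emit a chunk roughly every second
```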
event.data is a blob – a binary large object. We'll get into how to process it below.
It's important to process the blob in the stop event handler. If you do it anywhere else – e.g., within the call to setTimeout itself, right after calling stop – you can run into a race condition where your recording variable is undefined because the final dataavailable event hasn't fired yet. (I ran into this issue myself.)
Sending audio
Now that we have our audio blob, we need to send it over the wire. To do this, we first have to serialize the blob somehow. If we try to turn a blob directly into, say, JSON, the result is always a stringified empty object, even if the blob isn't empty:
JSON.stringify(new Blob(['hello world'], {type: 'text/plain'}))
// => '{}'
Luckily, modern browsers' built-in FileReader API solves this problem easily. We can turn the blob into, say, a base-64 string. Let's modify our stop event listener:
mediaRecorder.addEventListener('stop', () => {
  let reader = new FileReader();
  reader.onloadend = e => {
    // loadend fires on success and on failure, so also check for an error.
    if (e.target.readyState !== FileReader.DONE || e.target.error) {
      throw new Error('Something went wrong trying to serialize the audio recording.');
    }
    // e.target.result evaluates to a base-64 string representing the recording.
    // It looks something like: data:audio/webm;codecs=opus;base64,GkXfo59ChoEBQveBA...
    // Now, we can send it over the wire:
    someSocket.send(e.target.result);
  };
  reader.readAsDataURL(recording);
});
The reason e.target.result is a base-64 string (more precisely, a data URL with a base-64 payload) is that we call readAsDataURL on the reader. Other methods, such as readAsText or readAsArrayBuffer, produce the result in other formats.
Receiving audio
On the receiving end, you'll be expecting a base-64 string representing the audio recording. If all you want to do is play the audio, you don't need to convert it back to a blob. Just pass the base-64 string as an audio element's src:
someSocket.on('receiveAudio', base64String => {
  let audio = document.createElement('audio');
  audio.controls = true;
  audio.src = base64String;
  document.body.appendChild(audio);
});
Once the user hits the play button on the audio element, he'll be able to hear the recording. If you don't want to deal with audio elements, you can simply do:
let audio = new Audio(base64String);
audio.play();
Just note that, for the latter approach, browsers may not actually play the audio unless the user has interacted with the page.
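One way to cope with blocked playback is sketched below. It relies on the fact that, in modern browsers, play returns a promise that rejects when playback isn't allowed. The helper name playWithFallback and its onBlocked callback are hypothetical, and what you do in the fallback (here, showing controls) is only a suggestion.

```javascript
// Hypothetical helper: try to play right away; if the browser refuses
// (for example, because the user hasn't interacted with the page yet),
// pass the audio object and the error to `onBlocked`.
function playWithFallback(audio, onBlocked) {
  // Promise.resolve also covers older browsers where play() returns undefined.
  return Promise.resolve(audio.play()).catch(error => {
    onBlocked(audio, error);
  });
}

// Browser usage:
// playWithFallback(new Audio(base64String), audio => {
//   audio.controls = true;              // let the user start playback manually
//   document.body.appendChild(audio);
// });
```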
That's it!
PS: If, for whatever reason, you do need to turn the base-64 string back into a blob, a common trick is to use fetch:
let response = await fetch(base64String);
// response.blob() returns a promise, so await it as well.
let blob = await response.blob();
// blob => Blob {size: ..., type: '...'}
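If you'd rather decode the string synchronously, without fetch, something along these lines should also work. It is only a sketch: dataUrlToBlob is a hypothetical helper name, and it assumes the string has the data:<type>;base64,<payload> shape that readAsDataURL produces.

```javascript
// Hypothetical helper: decode a base-64 data URL into a Blob.
function dataUrlToBlob(dataUrl) {
  const commaIndex = dataUrl.indexOf(',');
  // The header looks like 'data:audio/webm;codecs=opus;base64'.
  const header = dataUrl.slice(0, commaIndex);
  if (commaIndex === -1 || !header.startsWith('data:') || !header.endsWith(';base64')) {
    throw new Error('Expected a base-64 data URL.');
  }
  const type = header.slice('data:'.length, -';base64'.length);
  // atob yields a string in which each character holds one byte.
  const binary = atob(dataUrl.slice(commaIndex + 1));
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return new Blob([bytes], { type });
}

// Usage:
// const blob = dataUrlToBlob(base64String);
```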
PPS: You may have noticed that browsers display an icon to the user as you record. For example, at the time of writing, Chrome displays a red circle in the tab header. Note that simply stopping the recording does not make this indicator go away. You need to stop the stream's audio tracks. To do so, at the beginning of the stop handler, write:
stream.getTracks().forEach(t => t.stop());
Then the recording indicator should disappear.