Dennis Hackethal’s Blog

My blog about philosophy, coding, and anything else that interests me.


Recording, Sending, and Receiving Audio Using JavaScript

Say you’re working on a web-based chat app. You wish to let users record and send audio messages, similar to the feature the iOS Messages app offers. Luckily, modern browsers provide APIs that make implementing this feature surprisingly easy.

Recording audio

let recording;

navigator.mediaDevices.getUserMedia({ audio: true }).then(stream => {
  const mediaRecorder = new MediaRecorder(stream);

  mediaRecorder.addEventListener('dataavailable', event => {
    // event.data is a blob.
    // It looks something like: Blob {size: 26515, type: 'audio/webm;codecs=opus'}
    recording = event.data;
  });

  mediaRecorder.addEventListener('stop', () => {
    // Do something with the `recording` variable.
  });

  mediaRecorder.start();

  setTimeout(() => mediaRecorder.stop(), 5000);
});

Since we only want to record audio, not video, we pass { audio: true } to getUserMedia. Keep in mind that calling this function for the first time will cause the browser to ask the user’s permission to record. In this example, we run it immediately for the purpose of illustration, but in real life you may want to wait until it’s actually time to record – for example, until the user clicks a record button.

Likewise, we stop the recording arbitrarily after five seconds, but you could easily imagine other triggers such as the user clicking something or letting go of a record button.

Since we call mediaRecorder.start with no arguments, the dataavailable event will trigger only once: when the recording is stopped. You can optionally pass an integer representing milliseconds to the start function, in which case dataavailable will trigger each time that many milliseconds have passed. In that scenario, instead of a simple recording variable, you’d want something like a recordings array and push event.data onto it. Otherwise, you’d only ever hold on to the most recent part of the recording.
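Here is a minimal sketch of the recordings-array approach. The helper name recordInChunks and its timesliceMs parameter are hypothetical, and the sketch assumes the caller stops the recorder elsewhere (for example, on a button click). It relies on the fact that the Blob constructor accepts an array of blobs and concatenates them.

```javascript
// Hypothetical helper: keeps every chunk the recorder emits and
// resolves with one combined Blob once the recording is stopped.
function recordInChunks(mediaRecorder, timesliceMs) {
  return new Promise(resolve => {
    const recordings = [];

    mediaRecorder.addEventListener('dataavailable', event => {
      // With a timeslice, this fires repeatedly, so keep every non-empty chunk.
      if (event.data.size > 0) {
        recordings.push(event.data);
      }
    });

    mediaRecorder.addEventListener('stop', () => {
      // Passing the chunks to the Blob constructor concatenates them in order.
      resolve(new Blob(recordings, { type: mediaRecorder.mimeType }));
    });

    mediaRecorder.start(timesliceMs);
  });
}
```

You could then write something like recordInChunks(mediaRecorder, 1000).then(blob => ...) and process the combined blob the same way as the single recording above.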

event.data is a blob – a binary large object. We’ll get into how to process it below.

It’s important to process the blob in the stop event handler. If you do it anywhere else – e.g., within the call to setTimeout itself – you can run into a race condition where your recording variable is undefined because it hasn’t been set yet. (I ran into this issue myself.)

Sending audio

Now that we have our audio blob, we need to send it over the wire. To do this, we first have to serialize the blob somehow. If we try to turn a blob directly into, say, JSON, we always get a stringified empty object, even if the blob isn’t empty:

JSON.stringify(new Blob(['hello world'], {type: 'text/plain'}))
// => '{}'

Luckily, modern browsers’ built-in FileReader API solves this problem easily. We can turn the blob into, say, a base-64 string. Let’s modify our stop event listener:

mediaRecorder.addEventListener('stop', () => {
  let reader = new FileReader();

  reader.onloadend = e => {
    if (e.target.readyState !== FileReader.DONE) {
      throw new Error('Something went wrong trying to serialize the audio recording.');
    }

    // e.target.result evaluates to a base-64 string representing the recording.
    // It looks something like: data:audio/webm;codecs=opus;base64,GkXfo59ChoEBQveBA...
    // Now, we can send it over the wire:
    someSocket.send(e.target.result);
  };

  reader.readAsDataURL(recording);
});

The reason e.target.result is a base-64 string (more precisely, a data URL with a base-64 payload) is that we call readAsDataURL on the reader. Other methods, such as readAsText, return other data.
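If you prefer promises over event handlers, the same kind of string can be built without FileReader. This is a sketch of a different technique, not the approach used above: it assumes the environment supports blob.arrayBuffer() and btoa, and the function name blobToDataURL is hypothetical.

```javascript
// Hypothetical helper: serializes a blob into a base-64 data URL
// using blob.arrayBuffer() and btoa instead of FileReader.
async function blobToDataURL(blob) {
  const bytes = new Uint8Array(await blob.arrayBuffer());

  // btoa expects a "binary string" in which each character is one byte.
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }

  return `data:${blob.type};base64,${btoa(binary)}`;
}
```

For the text blob from earlier, blobToDataURL(new Blob(['hello world'], {type: 'text/plain'})) resolves to 'data:text/plain;base64,aGVsbG8gd29ybGQ='.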

Receiving audio

On the receiving end, you’ll be expecting a base-64 string representing the audio recording. If all you want to do is play the audio, you don’t need to convert it back to a blob. Just pass the base-64 string as an audio element’s src:

someSocket.on('receiveAudio', base64String => {
  let audio = document.createElement('audio');

  audio.controls = true;
  audio.src = base64String;

  document.body.appendChild(audio);
});

Once the user hits the play button on the audio element, they’ll be able to hear the recording. If you don’t want to deal with audio elements, you can simply do:

let audio = new Audio(base64String);
audio.play();

Just note that, for the latter approach, browsers may not actually play the audio unless the user has interacted with the page.
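One way to handle that case: play() returns a promise, and browsers reject it with a NotAllowedError when autoplay is blocked. The sketch below wraps that check in a hypothetical helper called tryPlay. What you do when it reports false (for example, showing a play button) is up to you.

```javascript
// Hypothetical helper: resolves to true if playback started and to
// false if the browser blocked it for lack of user interaction.
async function tryPlay(audio) {
  try {
    await audio.play();
    return true;
  } catch (error) {
    if (error.name === 'NotAllowedError') {
      return false;
    }
    // Anything else (e.g. an unsupported format) is a real problem.
    throw error;
  }
}
```

A call such as tryPlay(new Audio(base64String)) then lets you fall back to an audio element with controls when the result is false.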

That’s it!

PS: If, for whatever reason, you do need to turn the base-64 string back into a blob, a common trick is to use fetch:

let response = await fetch(base64String);
let blob = await response.blob();
// blob => Blob {size: ..., type: '...'}
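If fetch isn’t an option, the string can also be decoded by hand. The following is a rough sketch that assumes the input is a base-64 data URL of the shape shown earlier (data:<type>;base64,<payload>). The function name dataURLToBlob is hypothetical.

```javascript
// Hypothetical helper: decodes a base-64 data URL back into a blob.
// Assumes the input looks like 'data:<type>;base64,<payload>'.
function dataURLToBlob(dataURL) {
  const commaIndex = dataURL.indexOf(',');
  const header = dataURL.slice(0, commaIndex);
  const payload = dataURL.slice(commaIndex + 1);

  // Strip the 'data:' prefix and the ';base64' suffix to get the MIME type.
  const type = header.slice('data:'.length).replace(/;base64$/, '');

  // atob yields a "binary string"; copy its character codes into bytes.
  const binary = atob(payload);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }

  return new Blob([bytes], { type });
}
```

Unlike the fetch version, this one is synchronous, which can be handy outside of async functions.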

PPS: You may have noticed that browsers display an icon to the user as you record. For example, at the time of writing, Chrome displays a red circle in the tab header. Note that simply stopping the recording does not make this indicator go away. You also need to stop the stream’s audio tracks. To do so, at the beginning of the stop handler, write:

stream.getTracks().forEach(t => t.stop());

Then the recording indicator should disappear.

