Dear Apple users everywhere! It's a huge honour to announce that our development team at mix:analog has finally succeeded: we have implemented lossless real-time streaming in the Apple Safari browser. It took us a few months to get there, but it's finally here for all of you to enjoy.
What's Wrong With Regular Streaming?
Safari already supports FLAC when streaming over a regular HTTP link. You can experience this by clicking a link to a FLAC file and hearing playback begin before the download is complete. However, our app needs a lot of additional information to arrive together with the audio, which we call sideband data.
Sideband data feeds the various metering tools, which need to be updated at exactly the same time as the user hears the matching sound. This information is read from hardware sensors while the audio is being encoded, so it can't be recalculated in the browser, and its accuracy is very important to our users.
With regular streaming, this information is lost by the time it reaches the browser, or can't even be sent along with the audio correctly. The playback position information is too coarse to synchronise accurately with the metering data, and Safari in particular does not update it frequently enough.
Additionally, regular HTTP streaming in web browsers involves a large amount of buffering that the programmer can't easily adjust. This means that latency is usually very high (several seconds) and varies a lot from browser to browser. That is fine for internet radio, but not acceptable for real-time use like in our app.
Media Source Extensions (MSE)
Playback of pre-recorded content, as on YouTube and Netflix, often relies on a technique called Media Source Extensions (MSE). It was developed to address some of the issues of regular streaming. The main selling point of MSE is the ability to assemble a custom multimedia stream from various parts.
In the context of a Netflix stream, for instance, it could combine 720p video, Italian subtitles and English audio. Then the user changes a setting and the stream is updated to Italian audio without subtitles. This can happen while the stream is running.
Regarding latency, MSE hands control over buffering depth to the application. If the application is aggressive enough and the network is fast enough, you can get really low latency with MSE, even below 50 ms, which is great for our app!
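To make the idea concrete, here is a minimal sketch of application-controlled buffering depth (the names are illustrative, not from our codebase). The pure helper decides whether playback has fallen too far behind the freshest buffered audio and, if so, where to jump forward to restore low latency:

```javascript
// Returns the position (in seconds) to seek to, or null if latency is acceptable.
function latencyCorrection(bufferedEnd, currentTime, maxLatency, targetLatency) {
  const latency = bufferedEnd - currentTime;
  if (latency > maxLatency) {
    // Skip ahead so only `targetLatency` seconds of audio remain buffered.
    return bufferedEnd - targetLatency;
  }
  return null;
}

// In the browser this would drive a MediaSource-backed <audio> element, e.g.:
//
//   const b = audio.buffered;
//   const seekTo = latencyCorrection(
//     b.end(b.length - 1),  // end of the freshest buffered range
//     audio.currentTime,
//     0.75,                 // tolerate up to 750 ms of lag...
//     0.25                  // ...then snap back to ~250 ms
//   );
//   if (seekTo !== null) audio.currentTime = seekTo;
```

How aggressive you make the two thresholds is a trade-off: smaller values mean lower latency but more audible skips on a jittery network.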
MSE Issues for Lossless Streaming
Sounds too good to be true? In many cases it does work out great, but be aware there are caveats. For example, not all codecs are supported equally across browsers: the lossless audio codec FLAC is supported for HTTP streaming in Safari, but not for MSE.
To use FLAC with MSE you need to wrap it in an MP4 container rather than a vanilla FLAC container. This makes sense, because sub-streams such as different audio tracks in a video can come and go during playback, based on user interaction and availability. This is a feature the MP4 container supports and plain FLAC doesn't.
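In practice this means feature-detecting at startup. A hedged sketch of that decision (the function name and return values are ours, purely for illustration); it takes the browser's capability check as a predicate so the logic stays testable outside a browser:

```javascript
// FLAC wrapped in an (fragmented) MP4 container, as required by MSE.
const FLAC_IN_MP4 = 'audio/mp4; codecs="flac"';

// Decide which streaming path to use, given a MIME-type capability predicate.
function pickStreamingPath(isTypeSupported) {
  return isTypeSupported(FLAC_IN_MP4) ? 'mse-flac-mp4' : 'fallback';
}

// In the browser you would pass the real check:
//   pickStreamingPath(t => 'MediaSource' in window && MediaSource.isTypeSupported(t));
```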
However, while you can play and pause MSE streams, they suffer from the same playhead position resolution problem as normal HTTP streaming. Every browser has its own update frequency, resolution and behaviour. This makes it really hard to accurately sync animations with the sound coming out of the speakers.
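One common mitigation (sketched here as a hypothetical helper, not a documented component of any browser or of our product) is to interpolate the playhead between the browser's sparse currentTime updates using a monotonic wall clock:

```javascript
// Interpolates the playhead between sparse currentTime updates,
// assuming a playback rate of 1.
class PlayheadEstimator {
  constructor() { this.mediaTime = 0; this.wallTime = 0; }

  // Call whenever the media element reports a fresh currentTime (seconds);
  // `nowMs` is performance.now() at that instant.
  update(mediaTimeSec, nowMs) {
    this.mediaTime = mediaTimeSec;
    this.wallTime = nowMs;
  }

  // Estimated playhead position (seconds) at wall-clock time `nowMs`.
  estimate(nowMs) {
    return this.mediaTime + (nowMs - this.wallTime) / 1000;
  }
}

// Browser usage sketch:
//   audio.addEventListener('timeupdate',
//     () => est.update(audio.currentTime, performance.now()));
//   // in the animation loop:  const playhead = est.estimate(performance.now());
```

This smooths the animation, but it only papers over the resolution problem; it does not fix the underlying accuracy of the browser's reported position.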
Implementation Details (including Safari path)
To make this possible, we needed to split the work across many different pieces of code. Here is a handy diagram:
The API client code is in charge of talking to a server on the internet via WebSockets. It handles requests and responses, and raises events on both the client and server sides. When audio is ready to be processed, an event is triggered and its data is handed over to a high-performance decompression Web Worker.
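The event side of that pattern can be sketched in a few lines. All names below are illustrative, not the actual mix:analog protocol:

```javascript
// Minimal event dispatcher for messages arriving over a WebSocket.
class ApiClient {
  constructor() { this.handlers = new Map(); }

  // Register a handler for a named server event.
  on(event, fn) { this.handlers.set(event, fn); }

  // Feed raw socket messages here, e.g.:
  //   ws.onmessage = m => client.dispatch(m.data);
  dispatch(raw) {
    const msg = JSON.parse(raw);   // assumed JSON envelope: { event, payload }
    const fn = this.handlers.get(msg.event);
    if (fn) fn(msg.payload);
  }
}

// Usage sketch: hand incoming audio off to the decompression worker.
//   client.on('audio-ready', payload => worker.postMessage(payload));
```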
This worker first splits the information into sideband and audio data. It then decompresses the audio into variably sized chunks, which are pushed into a queue. The sideband data is collected and sent to the UI renderer, where it will be drawn to the screen at just the right time.
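The split step might look like this. We assume a hypothetical frame layout here (a 4-byte little-endian sideband length, then the sideband bytes, then the compressed audio); the real wire format is not shown in this article:

```javascript
// Split one binary frame into sideband bytes and compressed audio bytes.
function splitFrame(buffer) {
  const view = new DataView(buffer);
  const sidebandLen = view.getUint32(0, /* littleEndian */ true);
  return {
    sideband: new Uint8Array(buffer, 4, sidebandLen),
    audio: new Uint8Array(buffer, 4 + sidebandLen),
  };
}
```

Both views share the underlying buffer, so the split itself copies nothing; the audio bytes can be transferred straight on to the decoder.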
This queue is then picked up by another component that handles the actual streaming of data into the audio interface (finally!).
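The consumer side has an impedance mismatch to solve: the audio engine asks for fixed-size blocks (Web Audio renders in 128-frame quanta) while the decoder produced variably sized chunks. A sketch of the draining helper, under the assumption that chunks are mono Float32Array sample buffers:

```javascript
// Drain a queue of variably sized Float32Array chunks into one fixed-size
// output block, zero-filling on underrun instead of blocking.
// Returns the number of real samples written.
function fillBlock(queue, out) {
  let written = 0;
  while (written < out.length && queue.length > 0) {
    const chunk = queue[0];
    const n = Math.min(chunk.length, out.length - written);
    out.set(chunk.subarray(0, n), written);
    written += n;
    if (n === chunk.length) queue.shift();    // chunk fully consumed
    else queue[0] = chunk.subarray(n);        // keep the leftover samples
  }
  out.fill(0, written);   // underrun: pad the rest with silence
  return written;
}
```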
Audio is an interesting topic on the internet. We developers want a lot of control over how audio is played back in the browser. However, too much of that control could mean every web page out there playing audio ads without recourse. So there are rules and bodies that govern this sort of thing, and sometimes progress is slow.
Still, a big breakthrough happened when Google Chrome started supporting the latest addition to the Web Audio standard: Audio Worklets. These are isolated pieces of code that can send audio almost directly to the computer's audio interface. They must satisfy certain conditions, but those are not too hard to get right.
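A hedged sketch of such a processor, much simplified compared to what a real player needs. The base-class fallback lets the same file load outside an AudioWorkletGlobalScope (e.g. for tests); in the browser it extends the real AudioWorkletProcessor:

```javascript
// Outside the worklet scope, fall back to a plain base class.
const Base = globalThis.AudioWorkletProcessor ?? class {};

class PlayerProcessor extends Base {
  constructor() {
    super();
    this.queue = [];   // Float32Array chunks, fed e.g. via this.port.onmessage
  }

  // Called by the audio engine for every 128-frame render quantum.
  process(inputs, outputs) {
    const out = outputs[0][0];   // first output, first channel
    let i = 0;
    while (i < out.length && this.queue.length > 0) {
      const chunk = this.queue[0];
      const n = Math.min(chunk.length, out.length - i);
      out.set(chunk.subarray(0, n), i);
      i += n;
      if (n === chunk.length) this.queue.shift();
      else this.queue[0] = chunk.subarray(n);
    }
    out.fill(0, i);   // silence on underrun
    return true;      // keep the node alive
  }
}

// Only meaningful inside an AudioWorkletGlobalScope:
if (typeof registerProcessor === 'function') {
  registerProcessor('player-processor', PlayerProcessor);
}
```

One of the conditions mentioned above: process() runs on the real-time audio thread, so it must not allocate or block, which is why it only drains a pre-filled queue.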
As a double whammy, Google Chrome Labs has released a piece of code that emulates this functionality in other or older browsers such as Mozilla Firefox, Microsoft Edge and Apple Safari (commonly known as a polyfill).
At mix:analog we took the opportunity and re-engineered the playback code in our product around Audio Worklets, Web Workers and WebAssembly. These are all performance-unlocking technologies that mesh well with our goal of real-time lossless audio streaming across continents.
How Do I Take Advantage Of That?
Actually, you need to do nothing. The new code is on by default in recent builds of mix:analog (as of January 2019) and can be further tweaked by selecting the amount of buffering from the dropdown menu.
This works in Safari, Firefox, Microsoft Edge and of course the fastest and most responsive browser that supports mix:analog: Google Chrome.
If you are a developer here are a few links to open source libraries that you can glue together to get the desired result: