WebAssembly and SIMD: A match made in the browser

How SIMD makes compute-intensive tools practical on the web

If you’re new to WebAssembly, check out this article first.

WebAssembly can be a powerful way to speed up web apps, and it’s about to get a whole lot better thanks to the upcoming WebAssembly SIMD feature. SIMD is a technique to speed up calculations (more on that later), but browsers don’t yet support it by default. As a result, SIMD-based utilities that are ported to the web can’t use SIMD instructions, and therefore run much slower in the browser than they do on the command line.

To illustrate how powerful WebAssembly SIMD can be, here I compile a SIMD-based tool from C to WebAssembly and show that SIMD is essential for high performance in the browser.

What is SIMD anyway?

Used by tools such as NumPy, TensorFlow, and PyTorch, SIMD is an approach for parallelizing code. Whereas multithreading works on multiple pieces of data in different threads, SIMD works on multiple pieces of data within a single CPU instruction, hence the name Single Instruction, Multiple Data.

For example, say you’re calculating the Manhattan distance between two points (x1, y1) and (x2, y2): |x2 – x1| + |y2 – y1|. Instead of calculating the two subtractions one at a time, SIMD lets you perform them in parallel, within a single CPU instruction! This assumes 64-bit numbers and a CPU with 128-bit registers, so that 128 / 64 = 2 numbers fit in one register.
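To make the Manhattan distance example concrete, here is a minimal sketch in C. It assumes a GCC or Clang compiler, because it relies on their vector extension (the vector_size attribute) rather than on any particular CPU's intrinsics. The names f64x2, manhattan_scalar and manhattan_vector are made up for illustration. Whether the compiler really emits a single SIMD subtraction depends on the target and the optimization flags.

```c
/* Two 64-bit doubles packed into one 128-bit value.
   This typedef relies on a GCC/Clang extension (assumption: one of
   those compilers is used). */
typedef double f64x2 __attribute__((vector_size(16)));

/* Absolute value helper, written by hand to avoid depending on libm. */
static double abs_f64(double v) { return v < 0 ? -v : v; }

/* Scalar version: the two subtractions are separate operations. */
double manhattan_scalar(double x1, double y1, double x2, double y2) {
    return abs_f64(x2 - x1) + abs_f64(y2 - y1);
}

/* Vector version: both subtractions are written as one vector operation,
   which the compiler can map to a single SIMD instruction. */
double manhattan_vector(double x1, double y1, double x2, double y2) {
    f64x2 p1 = {x1, y1};
    f64x2 p2 = {x2, y2};
    f64x2 d = p2 - p1;                  /* {x2 - x1, y2 - y1} */
    return abs_f64(d[0]) + abs_f64(d[1]);
}
```

Both functions return the same value; for the points (1, 2) and (4, -2), that is |4 - 1| + |-2 - 2| = 7.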

In my field of genomics, SIMD is often used in tools that align DNA sequences to a genome, including SSW, bowtie2 and minimap2.

Test driving WebAssembly SIMD

As an experiment, I set out to compile the genomics tool SSW from C to WebAssembly, so it can run in the browser. SSW implements a SIMD version of Smith-Waterman, a fundamental algorithm in bioinformatics for aligning sequences to each other.
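For readers who haven't met Smith-Waterman before, here is a plain scalar sketch of its scoring step in C. This is not SSW's code: SSW's SIMD version is considerably more involved, and this sketch uses a simple linear gap penalty to stay short. The function name sw_score and the scoring values in the usage note are assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <string.h>

static int max2(int a, int b) { return a > b ? a : b; }

/* Returns the best local alignment score between sequences a and b.
   match is added for equal characters, mismatch for unequal ones, and
   gap for each inserted or deleted character (mismatch and gap are
   expected to be negative). Returns -1 if memory allocation fails. */
int sw_score(const char *a, const char *b, int match, int mismatch, int gap) {
    size_t n = strlen(a), m = strlen(b);
    /* Only two rows of the scoring matrix are needed at any time. */
    int *prev = calloc(m + 1, sizeof *prev);
    int *curr = calloc(m + 1, sizeof *curr);
    if (!prev || !curr) { free(prev); free(curr); return -1; }

    int best = 0;
    for (size_t i = 1; i <= n; i++) {
        curr[0] = 0;
        for (size_t j = 1; j <= m; j++) {
            int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            int up   = prev[j] + gap;       /* gap in b */
            int left = curr[j - 1] + gap;   /* gap in a */
            /* Local alignment: a cell's score never drops below zero. */
            int h = max2(0, max2(diag, max2(up, left)));
            curr[j] = h;
            best = max2(best, h);
        }
        int *tmp = prev; prev = curr; curr = tmp;   /* reuse the rows */
    }
    free(prev);
    free(curr);
    return best;
}
```

With match = 2, mismatch = -1 and gap = -1, calling sw_score("GGACGTGG", "ACGT", 2, -1, -1) returns 8, the score of the four matching bases ACGT. Every cell requires a handful of comparisons and additions, which is why the algorithm benefits so much from computing several cells per instruction.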

To that end, I used Emscripten to compile SSW twice from C to WebAssembly: once with SIMD enabled, and another time using SIMDe so the code compiles in the absence of SIMD (details here). As a benchmark, I measured the runtime of SSW when aligning short DNA sequences to the reference genome of the Lambda phage (not the AWS kind of Lambda 😉). The benchmarks ran on Chrome 87.0.4280.67 with the flag #enable-webassembly-simd, on a MacBook Pro 2.3GHz i7, 8 GB RAM.

The results are striking: without WebAssembly SIMD, the code is hundreds of times slower to run in the browser than it is natively on the command line:

When running the Smith-Waterman algorithm in the browser, enabling SIMD gets us much closer to native performance! Note that I stopped running the benchmark on “WebAssembly without SIMD” past x = 100 because it took ~10 minutes to run each iteration 😱

Interestingly enough, native runtime without SIMD is “only” 20X slower than native runtime with SIMD; if anyone has ideas on why that is, I would love to hear them.

📕 📗 📘 If you’re new to WebAssembly and wondering how to get started, check out my book Level up with WebAssembly, a practical and approachable guide to using WebAssembly in your own applications.

Can I use WebAssembly SIMD today?

Mostly no, but also yes. WebAssembly SIMD is not yet supported in any browser by default, but Firefox and Chrome have flags to enable SIMD (javascript.options.wasm_simd and enable-webassembly-simd, respectively). Chrome also has Origin Trials, which allow developers to enable SIMD in their app without requiring users to enable the flags explicitly.

Note, however, that if a browser doesn’t have SIMD enabled, a WebAssembly module that uses SIMD instructions will fail to load and your app will simply crash. To address that, you can compile two versions of your app: one with SIMD and one without. Then you can use wasm-feature-detect to load the correct version. Just make sure you don’t use too many such new features or you’ll end up with a lot of permutations of your app 😬.
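One way to get both versions out of a single C source file is a compile-time guard, sketched below. It assumes you build with Emscripten, once with the -msimd128 flag and once without, and that the compiler defines the __wasm_simd128__ macro when SIMD is enabled. The function add_i32 is a made-up example and has nothing to do with SSW.

```c
#include <stddef.h>
#include <stdint.h>

#ifdef __wasm_simd128__
#include <wasm_simd128.h>   /* WebAssembly SIMD intrinsics */
#endif

/* Adds b into a, element by element. */
void add_i32(int32_t *a, const int32_t *b, size_t n) {
    size_t i = 0;
#ifdef __wasm_simd128__
    /* SIMD build: four 32-bit lanes per 128-bit operation. */
    for (; i + 4 <= n; i += 4) {
        v128_t va = wasm_v128_load(a + i);
        v128_t vb = wasm_v128_load(b + i);
        wasm_v128_store(a + i, wasm_i32x4_add(va, vb));
    }
#endif
    /* Non-SIMD build, and the leftover tail of the SIMD build. */
    for (; i < n; i++) {
        a[i] += b[i];
    }
}
```

Both builds produce the same results, so the page only has to decide at run time which of the two .wasm files to fetch.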

Should I use WebAssembly SIMD?

Just because you use WebAssembly does not mean your app will be faster, and the same is true of SIMD. Whether you decide to use it depends on what your application does. While a to-do list app might not be the best use case, apps with compute-intensive needs such as image processing, gaming, simulations, or bioinformatics are particularly well suited for WebAssembly and SIMD.

Closing remarks

As we’ve seen here, WebAssembly made it possible to run an existing genomics tool in the browser, whereas SIMD’s performance made it practical to do so. For computations that are “embarrassingly parallel” (often the case in bioinformatics), using SIMD alongside multithreading is a powerful combination — dare I say synergy? — for speeding up web applications.

Finally, a big thank you to the developers who are getting SIMD working on the web, and I look forward to seeing it enabled in all browsers!

Thanks to the Emscripten and SIMDe teams for their fantastic tools, Zhao et al. for their SIMD implementation of Smith-Waterman, and to Maria Nattestad and Thomas Lively for discussions.


Bioinformatics Software Engineer at Invitae, Author of “Level up with WebAssembly”.