Using WebAssembly and Cloudflare Workers to power genomics analysis

It sometimes feels like WebAssembly was made for genomics: A lot of bioinformatics tools are written in C/C++/Rust, and it is increasingly common for web apps to perform more of the analysis in the browser.

While it makes sense to run WebAssembly in the browser to analyze local data, what if that data is in the cloud? We don’t want web apps to download multi-gigabyte files just to analyze them in the browser, but at the same time, common cloud architectures for data analysis (Spot VMs, AWS Batch, etc.) get expensive real fast for tool builders.

Enter Serverless Functions

For some applications, using…

How SIMD makes compute-intensive tools practical on the web

If you’re new to WebAssembly, check out this article first.

WebAssembly can be a powerful way to speed up web apps, and it’s about to get a whole lot better thanks to the upcoming WebAssembly SIMD feature. SIMD is a technique to speed up calculations (more on that later), but it’s not yet supported in browsers. As a result, SIMD utilities that are ported to the web can’t use SIMD instructions, and therefore run much slower than they do on the command line.

To illustrate how powerful WebAssembly SIMD can be, here I compile a SIMD-based tool from C to…

Why the current definition could hurt WebAssembly adoption

I still remember my first encounter with WebAssembly a few years ago. Some called it a low-level bytecode for the web, others a sandboxed environment for execution, and others yet an instruction format for a stack-based virtual machine.

At the time, I had mostly done web development so I was quite puzzled by those definitions. In fact, here is actual footage of me hearing about WebAssembly for the first time:

Me hearing about WebAssembly for the first time

The following quote about teaching new concepts perfectly captures why I felt that way:

“Most beginners aren’t in a place where thorough…

A tutorial on using WebAssembly with Emscripten and C/C++ (even if you don’t know any C/C++)

JavaScript — which was famously designed in 10 days way back in 1995 — has until very recently remained the only language you could use natively on the web. It wasn’t until 2017 that WebAssembly 1.0 was released, making it the second language supported by all major web browsers.

What is WebAssembly?

As the name implies, you can think of WebAssembly as an Assembly language for the Web (though we’ll see later why it’s also useful outside the browser). As with other Assembly languages you’ve heard about in CS101, no one actually writes code directly in WebAssembly — except Ben Smith 😉. Instead…

How to port an Asteroids game from C to WebAssembly 🎮

WebAssembly is a new language for the web. Much like low-level assembly languages, however, very few people write WebAssembly by hand; instead, you can compile code written in other languages (e.g. C, C++ and Rust) to WebAssembly and run that code in the browser.

Why would you ever want to do that, you ask? One reason is portability: WebAssembly makes it easier to port existing games, desktop applications, and command-line tools to the web. Another reason is the potential for speeding up web apps by replacing slow JavaScript calculations with compiled WebAssembly.

In this article, we’ll focus on how to…

A behind-the-scenes look at how I self-published “Level up with WebAssembly” as a side-project in 4 months.

Earlier this month, I published the book Level up with WebAssembly 🎉. It aims to be a practical guide to using WebAssembly, a new programming language that helps you port existing C/C++/Rust code to the web (click here to learn more). I wrote the book because I found that while WebAssembly had a lot of potential, the learning curve was quite steep.

The last time I wrote a book (Adventures in Data Science with Bash), the most common question I got was: “How long did it take you?”. While I didn’t have a clear answer then, I do this time!

A Benchmark of VM Boot Times on the Google Cloud

For an overview of Google Cloud vs. AWS in terms of pricing and features, see my previous article, A Tale of Two Clouds.

A good rule of thumb on The Cloud™ is that VMs typically boot in 1 to 2 minutes. In this article, we’ll test this hypothesis on the Google Cloud by launching ~5,000 VMs and timing how long it takes for each one to boot.

To that end, the metric we’ll use is time-to-SSH, i.e. the number of seconds from the time we request a new VM to the time we can SSH into it. …

We all know arrays are useful, but how are they useful on the command line?

Although software engineers regularly use the command line for many aspects of development, arrays are likely one of the more obscure features of the command line (although not as obscure as the regex operator =~). But obscurity and questionable syntax aside, Bash arrays can be very powerful.

Wait, but why?

Writing about Bash is challenging because it’s remarkably easy for an article to devolve into a manual that focuses on syntax oddities. Rest assured, however: the intent of this article is to avoid having you RTFM.

A real (and actually useful) example

To that end, let’s consider a real-world scenario and how Bash can help: You take on a…

Although Excel can be a powerful tool for biologists, it also converts some gene names into dates. Here’s an app to solve that issue.

UPDATE (June 2020): There’s now a web app version of Oct4th available, and a command-line tool for larger datasets.

Microsoft Excel is likely one of the most widely used data analysis tools in the field of biology. And with reason. It enables almost anyone, with just a few clicks, to manage their data, quickly generate plots and calculate simple statistics.

In its infinite wisdom, however, Excel also interprets many gene names as dates, as was previously reported elsewhere. For example, the tumor-suppressor gene DEC1 becomes December 1st, while the transcription factor OCT4 becomes October 4th:

Genes that look like dates are automatically converted to dates by Excel.
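
To make the problem concrete, here is a rough Python sketch that flags gene names likely to be mistaken for dates. The regular expression and the function names (`looks_like_date`, `flag_date_like`) are assumptions made for illustration: they approximate the "month prefix followed by a day number" shape of names like DEC1 and OCT4, and are neither Excel's actual parsing rules nor how the Oct4th app works.

```python
import re

# Assumed heuristic: a month name or abbreviation, an optional hyphen,
# then one or two digits. This is NOT Excel's real date-detection logic.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR(?:CH)?|APR|MAY|JUN|JUL|AUG|SEPT?|OCT|NOV|DEC)-?(\d{1,2})$",
    re.IGNORECASE,
)


def looks_like_date(gene_name: str) -> bool:
    """Return True if the gene name has a month-plus-day shape."""
    match = DATE_LIKE.match(gene_name.strip())
    if match is None:
        return False
    day = int(match.group(2))
    # Only day numbers that can occur in a calendar month are flagged.
    return 1 <= day <= 31


def flag_date_like(gene_names):
    """Keep only the gene names that a spreadsheet might turn into dates."""
    return [name for name in gene_names if looks_like_date(name)]
```

For instance, `flag_date_like(["DEC1", "OCT4", "TP53"])` keeps DEC1 and OCT4 and drops TP53, since TP53 does not start with a month-like prefix.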

This is an…

In recent years, AWS has become the de facto standard cloud provider. As we’ll see in this article, it may be worth jumping off the bandwagon and taking a serious look at the Google Cloud.

Last updated on August 20, 2018.

Having used both Amazon Web Services (AWS) and Google Cloud Platform (GCP) for several projects, here I’ll highlight the differences between the two solutions as they relate to pricing, cloud products, instance configurations, and free trials.

Google Cloud wins on pricing

Google’s Cloud is the clear winner when it comes to compute and storage costs. For example, a 2 CPUs/8GB RAM instance will cost $69/month with AWS, compared to only $52/month with GCP (25% cheaper). As for cloud storage costs, GCP’s regional storage costs are only 2 cents/GB/month vs 2.3 cents/GB/month for AWS. Additionally, GCP offers a “multi-regional” cloud…

Robert Aboukhalil

Bioinformatics Software Engineer at Invitae, Author of “Level up with WebAssembly”.
