On Using Rust in Parcel and Vitest

By Radosław Miernik

While working on my PhD, I had to create a rather sizable set of tools for a new general game playing language. It includes a parser, a bunch of optimizing and validating passes, transformations to different formats, and a tree-walking interpreter. All of this was done in my primary language of choice – TypeScript.

Such a project takes a lot of experimenting, and having a fully-fledged IDE with a way to visualize the games is basically a must. I’m not there yet, but a basic syntax highlighter, a type checker with somewhat helpful error messages, and a visual representation are enough to work with. All of that is in your browser and accessible on all devices with no setup!

Then, as the games got more and more complex, it started to take a while to run all of the tests. While the small ones take less than 50ms to analyze, the more complex ones are at 3s and counting. I decided to do something about it and…

Started rewriting it part by part in Rust. That’s what people do, right?

Starting out

Let’s sum up what the project looked like before. The core of it is located in the lib directory, and it’s not tied to the UI or CLI – it exports a couple of functions, and that’s it. It’s highly modular by design and has a lot of pluggable parts, e.g., the optimization passes are configured with flags.

I use Vitest for testing. Nothing crazy; actually, 99% of tests are snapshot tests, i.e., call a function and check if the result matches a given value. Basically, a === test with a smart diff and automatic regeneration of the expected value.
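Conceptually, such a snapshot assertion boils down to something like this (a minimal sketch of the mechanism, not Vitest’s actual implementation):

```typescript
// A stripped-down model of a snapshot test: serialize the actual
// value, compare it to the stored snapshot, and (re)generate the
// snapshot when it's missing or an update was requested.
const snapshots = new Map<string, string>();

function toMatchSnapshot(name: string, actual: unknown, update = false): boolean {
  const serialized = JSON.stringify(actual, null, 2);
  const stored = snapshots.get(name);

  if (stored === undefined || update) {
    // First run (or the `--update` flag): store the expected value.
    snapshots.set(name, serialized);
    return true;
  }

  // The `===` test; Vitest adds the smart diff on top of it.
  return stored === serialized;
}
```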

The CLI uses commander to parse arguments, reads files from the disk, and calls the lib functions. The UI is a React app using the Blueprint toolkit. Both are built using Parcel, including the local server with hot reloading.

To summarize, you can npm start to run a local server, npm test to run the tests, and npm run build to build the CLI and UI apps. I wanted to keep these three both because it’s easy to pick up for other people and because it keeps the CI simple. (Yes, there’s a CI deploying the UI to GitHub Pages.)
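In package.json, those three boil down to something like this (the entry paths are assumptions):

```json
{
  "scripts": {
    "start": "parcel src/ui/index.html",
    "test": "vitest",
    "build": "parcel build src/cli/index.ts src/ui/index.html"
  }
}
```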

Adding some Rust

I decided to start with the interpreter – it had virtually no dependencies and required a little bit of everything: passing things between TypeScript and Rust, a complex data structure, and a potentially CPU-heavy operation.

The first version was even smaller – run the CLI to output something that the Rust program can consume and run. On the TypeScript side, it was as easy as one JSON.stringify call. On the Rust side, I replicated the type definitions and used serde to automatically build them from JSON. So far, so good.

I can’t share any of the code yet, but I can say it has something to do with automata, and the Rust interpreter was a lot faster. Like, three-fold faster. Sure, it was a manual process to run two programs instead of one, but it’s not something a small Bash script wouldn’t help with.

WebAssembly to the rescue

I had some prior WebAssembly experience; I even wrote a .wat program or two back in 2018 (note to self: never use iframes in slides). While looking for the state of the art of compiling Rust to WASM, I found this guide on MDN and decided to go with wasm-pack, which uses wasm-bindgen underneath. It took me less than an hour to get up and running with some example calls.

The idea was to reuse the existing Rust code as much as possible, including the serde-based parser. It required zero extra code, and the entire communication was based on JSON strings. Here’s the gist of it:

// Expose this function in WASM.
#[wasm_bindgen(js_name = "optimizeSomehow")]
pub fn optimize_somehow(input: &str) -> String {
  // Deserialize.
  let mut game = serde_json::from_str::<Game<&str>>(input)
    .expect("deserialization failed");

  // Perform some operation.

  // Serialize.
  serde_json::to_string(&game)
    .expect("serialization failed")
}

# Build the WASM module and place it directly in the node project.
wasm-pack build \
  --out-dir ../node/src/wasm/module \
  --out-name index \
  --target web
import { readFileSync } from 'fs';

// If you see an error here, make sure to build the Rust module first!
import init from './wasm/module';

// This function is synchronous, but it won't work before the module
// is initialized (i.e., `initPromise` resolves).
export { optimizeSomehow } from './wasm/module';

// Parcel inlines the WASM module in the browser and references it from
// the disk in the Node.js bundle.
const buffer = readFileSync(__dirname + '/wasm/module/index_bg.wasm');
export const initPromise = init(buffer);

Such a setup works well with the default Parcel config, as it simply inlines the .wasm module for the browser but leaves the readFileSync in the Node.js bundle. Because we build it into src, it automatically refreshes the local server too. If you see an error regarding the crypto module in Node.js, try the following:

// Node.js requires a crypto polyfill. Importing it directly inlines it
// in the browser too, but we don't need it there. Yep, this is a nasty
// `eval` trick. Sorry for that!
if (typeof crypto === 'undefined') {
  eval("globalThis.crypto = require('crypto').webcrypto;");
}

Now the last part is to make it work with Vitest. It’s easy to wait explicitly for the initPromise in both the CLI and React app somewhere, but we don’t want to add an explicit await in every single test. Thankfully, we can configure a globalSetup with the following contents:

import * as wasm from './src/wasm';

export function setup() {
  return wasm.initPromise;
}

export function teardown() {}
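Registering it is a one-liner in the Vitest config (the file names here are assumptions):

```typescript
// vitest.config.ts
import { defineConfig } from 'vitest/config';

export default defineConfig({
  test: {
    // Runs once before the whole suite; Vitest awaits the promise
    // returned from `setup`, so all tests see an initialized module.
    globalSetup: './globalSetup.ts',
  },
});
```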

Actually, it’s possible to initialize the module fully synchronously too, using the initSync function generated by wasm-pack. It’ll work in Safari and Node.js but will throw an error in Chrome:

RangeError: WebAssembly.Instance is disallowed on the main thread, if the buffer size is larger than 4KB. Use WebAssembly.instantiate.

Unblocking the main thread

The above setup requires an asynchronous module initialization, but once it’s done, all of the operations remain synchronous. This is great because it requires no changes in the existing code using it. However, if the operation takes more time, it blocks the UI – just like the TypeScript version.

To solve this problem, we can create a Web Worker and execute the WASM module there. The communication will be fully asynchronous, based on the postMessage API. Let’s create a basic, RPC-like worker:

import { readFileSync } from 'fs';

import { initSync, optimizeSomehow } from './module';

// Node.js requires a crypto polyfill. Importing it directly inlines it
// in the browser too, but we don't need it there. Yep, this is a nasty
// `eval` trick. Sorry for that!
if (typeof crypto === 'undefined') {
  eval("globalThis.crypto = require('crypto').webcrypto;");
}

// Parcel inlines the WASM module in the browser and references it from
// the disk in the Node.js bundle.
initSync(readFileSync(__dirname + '/module/index_bg.wasm'));

const methods = { optimizeSomehow };
self.addEventListener('message', ({ data }) => {
  try {
    // Call the function synchronously. If it returns, reply with value.
    self.postMessage({ value: methods[data.fn](...data.args) });
  } catch (error) {
    // If it throws, reply with error. Keep in mind that the error
    // instance itself is NOT transferable!
    self.postMessage({ error: error.message });
  }
});

The worker was easy; now we have to communicate with it. To keep it simple, there’s an execution queue with a limit of 1, i.e., only one call happens at a time. And because we love TypeScript, everything is typed as it should be.

import pLimit from 'p-limit';

import { Game } from '../game';

// Node.js requires a Worker polyfill.
if (typeof Worker === 'undefined') {
  eval("globalThis.Worker = require('web-worker');");
}

// Parcel will bundle the `worker.ts` file.
const worker = new Worker(new URL('worker.ts', import.meta.url), {
  type: 'module',
});
// A queue to call at most one function at a time.
const queue = pLimit(1);

type WASM = typeof import('./module');
function workerMethod<Name extends keyof WASM>(
  fn: Name,
  args: Parameters<WASM[Name]>,
) {
  return queue(
    () =>
      // Wrap the worker call in a promise.
      new Promise<ReturnType<WASM[Name]>>((resolve, reject) => {
        function onError({ error }: ErrorEvent) {
          worker.removeEventListener('error', onError);
          worker.removeEventListener('message', onMessage);
          reject(error);
        }

        function onMessage({
          data,
        }: MessageEvent<
          | { value: ReturnType<WASM[Name]> }
          | { error: unknown }
        >) {
          if ('error' in data) {
            reject(data.error);
          } else {
            resolve(data.value);
          }

          worker.removeEventListener('error', onError);
          worker.removeEventListener('message', onMessage);
        }

        // Register event handlers and call the worker.
        worker.addEventListener('error', onError);
        worker.addEventListener('message', onMessage);
        worker.postMessage({ fn, args });
      }),
  );
}

// Expose a nice async interface on top of the queue.
export async function optimizeSomehow(game: Game) {
  const input = JSON.stringify(game);
  const result = await workerMethod('optimizeSomehow', [input]);
  return JSON.parse(result) as Game;
}

This approach executes all of the operations in a separate thread, keeping the UI responsive at all times. You can send additional messages in between to indicate the operation progress, e.g., the number of processed entries.
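For example, the reply protocol could be extended with hypothetical progress messages; here’s a sketch of how the listener side would distinguish them (the `progress` shape is an assumption, not part of the protocol above):

```typescript
// A sketch of telling progress messages apart from the final reply.
// `progress` is a hypothetical extension of the `value`/`error` protocol.
type WorkerReply<T> =
  | { progress: number }
  | { value: T }
  | { error: string };

function handleReply<T>(
  reply: WorkerReply<T>,
  onProgress: (done: number) => void,
): { done: boolean; value?: T } {
  if ('progress' in reply) {
    // An intermediate message, e.g., the number of processed entries.
    onProgress(reply.progress);
    return { done: false }; // Keep listening for more messages.
  }

  if ('error' in reply) {
    throw new Error(reply.error);
  }

  // The final reply resolves the pending call.
  return { done: true, value: reply.value };
}
```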

Keep in mind that the entire API is now asynchronous, and it may require a lot of work to handle it everywhere. In my case, the CLI was easy, but the UI took some extra work (and loaders).

Finally, we have to take care of the tests. Vitest has problems with my Web Worker polyfill, but adding the @vitest/web-worker package just worked – no configuration needed. Of course, we can remove the globalSetup too.

Rust all the way

It’s been a couple of weeks now, and I really like the entire setup. Sure, Rust compilation takes significantly longer than reloading a TypeScript module (1s vs 200ms), but the performance gains are massive.

Also, because the entire TypeScript setup was based on discriminated unions, i.e., all of the types had a unique kind field, it was extremely easy to migrate the code – replace all switch statements with match and call it a day.
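As an illustration, here’s a hypothetical expression type: a TypeScript switch over the kind field maps almost one-to-one onto a Rust enum and match (both types are made up for this sketch):

```rust
// The TypeScript version would be a discriminated union, e.g.:
//
//   type Expr =
//     | { kind: 'constant'; value: number }
//     | { kind: 'negate'; inner: Expr };
//
// ...walked with `switch (expr.kind) { ... }`.
enum Expr {
    Constant(i64),
    Negate(Box<Expr>),
}

fn eval(expr: &Expr) -> i64 {
    // Each `case` of the old `switch` becomes a `match` arm.
    match expr {
        Expr::Constant(value) => *value,
        Expr::Negate(inner) => -eval(inner),
    }
}

fn main() {
    let expr = Expr::Negate(Box::new(Expr::Constant(7)));
    println!("{}", eval(&expr)); // prints -7
}
```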

There are also downsides. As I moved more and more functionality, I had to use more and more crates (Rust packages). It increased the compilation times even further (2-5s incremental builds) and made the .wasm bundle rather big (>1MB). I decided to implement some logic myself, saving roughly 45% of it.

At first, I was a little worried that some of the crates I was using wouldn’t work with wasm-bindgen. It turned out almost everything I used so far just worked: nom, rand, regex, serde, and serde_json. std::time::Instant didn’t, but I didn’t want to install an additional crate to support it, so I got rid of it. The only thing I had to do was enable the js feature in the getrandom crate.
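For reference, that’s a single line in Cargo.toml (the version is illustrative):

```toml
[dependencies]
# The `js` feature makes `getrandom` use the JavaScript crypto APIs
# when compiled to `wasm32-unknown-unknown`.
getrandom = { version = "0.2", features = ["js"] }
```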

Closing thoughts

Would I do it again? Sure! I even plan to use such a setup in some commercial projects too. It took me some time to figure things out, especially to make it work in the browser, CLI, and tests at the same time, but it was worth it.

I wrote On Rust in Webdev more than a year ago, and it aged surprisingly well; the ecosystem is solid, the community vibrant, and the tooling works like a charm. If you haven’t tried Rust yet and you work with TypeScript a lot, give it a try!

I’m heading back to work now – there’s still some code to move…