@radekmie

On Fast Paths

By Radosław Miernik · Published on · Comment on Reddit

Table of contents

Intro

Planning, testing, peer-reviewing, or even formal proofs; in a way, it all boils down to making sure whether the piece of code we just created covers all the cases we need. We often start with a simpler alternative, which covers most of them (or even only some), as it allows us to bring something to our customers earlier. Huh, that’s basically how startups work.

But then a flood of edge cases come from all directions – these pesky inputs that make our well-designed and efficient solution… The complete opposite. “It’s just one more case, it’ll take a couple of hours, like the previous one, right?” Well, not this time – this one was one too many.

But let’s pull through this hot mess and assume you covered all of them. This unveils a new kind of issues: in some cases, this entire machinery runs forever. “Why is it taking ten seconds to load this one document? It’s just one document!” In such a case, “Because it can load ten thousand documents at all.” is a reasonable, but not an expected answer – it makes sense for someone technical, but it isn’t the best slogan for boosting sales.

Did you really think you were the only one who did that?

In the wild

If we search for "fast path" in the Node.js repo, we’ll get almost 200 code results. One of them is an “almost entirely C++” case for writeFileSync using the UTF-8 encoding, added in nodejs/node#49884. These 96 lines of code (and 60 lines of tests) make the arguably most common case for writeFileSync up to 2.5x faster! In fact, it was so beneficial it made it to the v21.3.0 changelog.

A similar search in the .NET runtime results in nearly 190 hits. A rather recent change, dotnet/runtime#103733 (landed in .NET 9.01) was about reworking the JsonNode and JsonValue, so that the most common case is not slowed down by some niche feature. As a result, DeepEquals got over 10x faster and is no longer allocating any memory. However, this time the PR was much larger.

Of course, fast paths are not limited to language runtimes – MongoDB has nearly 50 of them. The most popular one, IDHACK, is a dedicated query plan for _id queries. It’s hard to estimate the performance benefit of it, but looking at the general IXSCAN implementation, IDHACK has much less work to do.

One more benefit of some fast paths could be a simpler mental model. This piece of React’s reconciler (a core part of the virtual DOM implementation) is very similar to its much simpler alternative, domc. I never seen the latter library before, but it feels natural to me to have a dedicated path in such a code. Even Svelte does the same for removal.

And at home

I spent some time optimizing rendering for low-end devices, like that cheap Android tablet from your local supermarket2. It wasn’t easy, as everything was terribly slow in there, and nothing stood out in particular. Among the smaller things, I found this small function:

export function formatTime(millis: number, format = 'HH:mm') {
  const hour = Math.floor((millis / 3600000) % 24);
  const minute = Math.floor((millis % 3600000) / 60000);
  return moment({ hour, minute }).format(format);
}

It’s perfectly valid, though if you ever worked with Moment.js (yes, I know it’s no longer maintained), you’re most likely aware of its performance. It is an overkill, but it’s easy to understand and covers all interesting cases – 12/24h with(out) leading zeroes. And don’t worry about timezones, we only call it with millis representing “time since midnight”.

Now, calling it 10k times with no specified format (so using the default HH:mm) takes 31ms in Safari, and 16ms in Chrome, on my M2 MacBook Air. Is anyone using a different format? From what I checked, it’s less than 0.1% of all users… So let’s add a fast path:

 export function formatTime(millis: number, format = 'HH:mm') {
   const hour = Math.floor((millis / 3600000) % 24);
   const minute = Math.floor((millis % 3600000) / 60000);
+  // Fast-path for the most common format.
+  if (format === 'HH:mm') {
+    return `${hour < 10 ? '0' : ''}${hour}:${minute < 10 ? '0' : ''}${minute}`;
+  }
   return moment({ hour, minute }).format(format);
 }

Is it better? 0.23ms in Safari (134x faster) and 0.38ms in Chrome (42x faster). I’d go for it any day – it’s a trivial change that doesn’t make the code far more complex. Of course, you should add tests for that as well!

Closing thoughts

While the performance improvements are at least tempting, are there any downsides? Increased complexity is for sure one. It varies from one if to a whole redesign just to make it possible. In some cases it’s simply not worth it.

One function done, twenty more to go…

1

If you’re even remotely interested in programming language implementation, JIT compilation, or simply enjoy reading about performance optimizations, I highly recommend reading the Performance improvements in .NET 9 as well as the posts about the previous versions. They’re massive and take forever to go through, but the inspiration they fill you with is worth it.

2

Isn’t it crazy that you can buy bread and hardware at the same store? Like, I’m putting a bag of onions to my cart next to someone deciding on their printer.