@radekmie

On Optimizing Meteor Publications

By Radosław Miernik · Published on · Comment on Meteor Forum, Reddit

Intro

Making a server do less work is often a good thing. That’s why most, if not all, large-scale Meteor deployments use cultofcoders:redis-oplog. Today, we’ll focus on fine-tuning this package. If you want to know why, go ahead and read On Oplog Replacement in Meteor first.

Our goal is to reduce the number of events a Meteor instance has to process. It can be done with custom channels and namespaces (which are based on custom channels anyway). The idea is simple: instead of getting notified on every change in the collection, we’d like to narrow it down to the published documents only.

It’s perfectly doable with in-app changes only, though you’d need to configure the namespace in every single publication, which is manageable… But you’d need to configure it in every single insert/remove/update, too.

And as I’m rather lazy and a fan of automated solutions, it’s now implemented in changestream-to-redis via the NAMESPACES option.

How to use it

As this solution is applicable to practically all of your publications, let’s start with the in-app changes first. We’ll configure a single namespace based on the field we’re querying the documents with. Here’s an example:

 Meteor.publish('myTasks', function () {
   // Some authorization logic here...
-  return Tasks.find({ userId: this.userId });
+  return Tasks.find({ userId: this.userId }, {
+    namespace: `userId::${this.userId}`,
+  });
 });

Without any further changes, this publication will only send the initial state and nothing else. Now, we could change all mutating operations as follows:

-Tasks.insert(task);
+Tasks.insert(task, { namespace: `userId::${task.userId}` });
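If you do go the manual route, it may help to keep the namespace format in a single place, so publications and mutations can’t drift apart. A minimal sketch (the `taskNamespace` helper is hypothetical, not part of redis-oplog):

```javascript
// Hypothetical helper: one place that defines the namespace format,
// shared by the publication and all mutating operations.
const taskNamespace = (userId) => `userId::${userId}`;

// In the publication:
//   Tasks.find({ userId: this.userId }, { namespace: taskNamespace(this.userId) });
// In mutations:
//   Tasks.insert(task, { namespace: taskNamespace(task.userId) });
```

This way, a typo in the namespace string can’t silently disconnect a publication from its mutations.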

Instead, we could configure changestream-to-redis to generate all these namespaced messages automatically; it’s as easy as adding an environment variable NAMESPACES=tasks.userId (where tasks is the collection name and userId is the field name).
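For reference, the full changestream-to-redis configuration might then look like this (the connection strings are placeholders; only the NAMESPACES line is new here):

```sh
# Hypothetical deployment configuration for changestream-to-redis.
# Connection strings are placeholders; adjust them to your environment.
export MONGO_URL='mongodb://localhost:27017/meteor'
export REDIS_URL='redis://localhost:6379'

# Generate namespaced messages for the userId field of tasks automatically.
export NAMESPACES='tasks.userId'
```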

It also works if the field you’re querying by is an array or object! Arrays publish one namespaced message for each element, and objects publish one for each key. That means the following queries also work:

 Meteor.publish('projectsByScope', function (scope) {
   // Some authorization logic here...
-  return Projects.find({ scopes: scope });
+  return Projects.find({ scopes: scope }, {
+    namespace: `scopes::${scope}`
+  });
 });

 Meteor.publish('groupsByRole', function (role) {
   // Some authorization logic here...
-  return Groups.find({ [`roles.${role}`]: { $exists: true } });
+  return Groups.find({ [`roles.${role}`]: { $exists: true } }, {
+    namespace: `roles::${role}`
+  });
 });

Configuration remains simple: NAMESPACES=groups.roles,projects.scopes.

And finally, if you’re querying by multiple fields, it’s up to you to decide which one to use. As a rule of thumb, it’s better to use the more selective one, i.e., the one that results in fewer documents when queried alone.
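To make “more selective” concrete: if you can estimate how many documents each candidate field matches on its own, the namespace field is simply the one with the smallest estimate. A tiny illustrative helper (not part of any package; the counts are made up):

```javascript
// Illustrative helper: given estimated document counts per candidate field,
// pick the most selective one (fewest matching documents) as the namespace.
const pickNamespaceField = (estimates) =>
  Object.entries(estimates).reduce((best, entry) =>
    entry[1] < best[1] ? entry : best,
  )[0];

// E.g., if a user has ~30 tasks but a project has ~500,
// pickNamespaceField({ userId: 30, projectId: 500 }) picks 'userId'.
```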

Implementation

Under the hood, changestream-to-redis will request the full documents of the selected collections to generate the namespaced messages based on their fields’ values (source). However, as the delete change events don’t include full documents, it also requests the pre-images.
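The message generation itself can be sketched as a pure function over a changed document. This is not the actual Rust implementation, just a JavaScript illustration of the rules described above (scalars yield one namespace, arrays one per element, objects one per key; the exact channel format is simplified):

```javascript
// Illustrative sketch: derive the namespaces for one document and one
// configured field, following the rules described in the text.
// (The real implementation lives in changestream-to-redis, in Rust.)
const namespacesFor = (document, field) => {
  const value = document[field];
  if (Array.isArray(value)) {
    // One namespaced message per array element.
    return value.map((element) => `${field}::${element}`);
  }
  if (value !== null && typeof value === 'object') {
    // One namespaced message per object key.
    return Object.keys(value).map((key) => `${field}::${key}`);
  }
  // Scalar fields yield a single namespaced message.
  return [`${field}::${value}`];
};
```

For example, `namespacesFor({ scopes: ['a', 'b'] }, 'scopes')` yields `['scopes::a', 'scopes::b']`, matching the projectsByScope publication above.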

Now, as it may significantly increase the number of messages pushed to Redis, consider increasing REDIS_BATCH_SIZE. We’re at 100 now, and the average latency remains at a stable 10 milliseconds. (You can track the latency using the Prometheus metrics exposed with METRICS_ADDRESS.) For the same reason, make sure to monitor the RAM of your Redis instance. In our case, it increased by around 20 megabytes (definitely not a lot, but not negligible).

Results

We rolled out all the publication changes over one week, releasing a small batch every day. In total, we changed roughly 40 publications and 200 lines, including the necessary infrastructure configuration.

And the results? I think they’re crazily good! Across three days, the number of fetched documents dropped from over 453 million to roughly 35 million – almost 93% less. But more importantly, the number of oplog notifications dropped from over 306 million to less than 1.5 million – that’s over 99.5% less!

Publication metrics before

Publication metrics after

You may have noticed that the number of observer changes (i.e., changes pushed to the client) also increased slightly (by roughly 28%). In theory, since we only reduced the number of server notifications and the traffic remained stable, the number of changes should have stayed the same.

This increase is actually desirable! Due to a very high number of writes in some collections, we previously had to switch some publications to polling (disableOplog). With namespaced publications, we were able to make them real-time again.

Closing thoughts

While such optimizations are rare, they’re a blessing. Really, how often can you spend less than a day and cut a database-related metric by 200x? Sure, reviewing indexes and tweaking the configuration are always an option, but that’s about it.

Let me know how much it helped your Meteor app!