gRPC
20/08/2020
A project I was working on required that the client send ~300,000 ints to a server. This RPC was invoked 25 times with different values, which meant the network cost quickly became a bottleneck for a resource-constrained client. It was infeasible to expect the client to incur a 300K * 25 * 4 bytes ≈ 28.6 MB cost.
The key property that I ended up exploiting was that we weren't concerned about the ordering; we just wanted this set of ints to go through. At a high level, if we sort and delta encode the ints, the resulting diff elements will be smaller than the original elements. Our bane, having to send ~300K ints, became a boon: the more elements you have to pack within 2^32, the smaller the difference between consecutive elements, which allows more compact packing.
Concretely, assuming a uniform distribution in the original list, each diff element will be 2^32 / 300K ≈ 14316 on average, which fits in 14 bits. Of course, we can't use 14 bits as a hard upper bound to fit our elements, and will probably need to build a packer which can exploit the expected lower bit cost per element but still encode elements that require more bits than average.
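To make the delta encoding step concrete, here's a minimal sketch (the helper names and the uniform test data are my own, not from the project):

```python
import random

def delta_encode(values):
    """Sort, then replace each element with its gap from the previous one."""
    ordered = sorted(values)
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]

def delta_decode(deltas):
    """Rebuild the sorted set by running a cumulative sum over the gaps."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

data = random.sample(range(2**32), 300_000)  # ~300K distinct uint32s
deltas = delta_encode(data)
assert delta_decode(deltas) == sorted(data)
print(sum(deltas) / len(deltas))  # gaps average around 2^32 / 300K ≈ 14316
```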
Preliminary results showed that we probably don't need to handwrite this tricky piece of code ourselves: simple gRPC with gzip enabled gave us the benefits we were looking for.
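For reference, if the client is written in Python, enabling gzip with grpcio is a one-liner at channel creation. The service, messages, and generated modules below are placeholders, not the actual proto from the project:

```python
import grpc

import numbers_pb2        # hypothetical generated module
import numbers_pb2_grpc   # hypothetical generated module

# Compress every request sent over this channel with gzip.
channel = grpc.insecure_channel(
    "localhost:50051",
    compression=grpc.Compression.Gzip,
)
stub = numbers_pb2_grpc.NumbersStub(channel)
stub.UploadInts(numbers_pb2.IntList(values=[3, 7, 12]))  # stand-in payload
```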
Running a quick-and-dirty experiment lets us ascertain that this result isn't a fluke. Since there is no clear way to approximate a realistic workload for all systems, I tested with two common scenarios: the list of 300K ints is sampled from either a uniform distribution or a normal distribution. Across these two scenarios, I evaluated the upload cost with and without delta encoding.
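A rough way to reproduce the flavour of this experiment without a server is to gzip fixed-width packed ints as a proxy for the upload cost; the sketch below does just that (it's only a proxy, not the actual gRPC measurement):

```python
import gzip
import random
import struct

def delta_encode(values):
    ordered = sorted(values)
    return [ordered[0]] + [b - a for a, b in zip(ordered, ordered[1:])]

def gzipped_size(ints):
    # Pack as raw 4-byte little-endian uints, then gzip: a stand-in for the
    # gzip'ed payload going over the wire.
    return len(gzip.compress(struct.pack(f"<{len(ints)}I", *ints)))

N = 300_000
uniform = random.sample(range(2**32), N)
normal = [min(2**32 - 1, max(0, int(random.gauss(2**31, 2**29)))) for _ in range(N)]

for name, data in (("uniform", uniform), ("normal", normal)):
    print(name, gzipped_size(data), gzipped_size(delta_encode(data)))
```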
TL;DR: If you want to send a set of integers (or a list of small-valued integers) over gRPC, you can easily save bandwidth by delta encoding your dataset, with negligible compute overhead.
Pretty late. My suggestion: if you're sending more than 10K ints, the benefits quickly become apparent.
Across the trials of the previous experiment, the system config remained constant; specifically, the client's requests were always gzip'ed in flight. The improvement could be due to either gRPC or gzip, and it'd be better to reduce the number of variables in our experiment. Since it's easy to toggle compression usage, I tested the data cost with and without compression.
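In grpcio this toggle is a per-call keyword, so the two configurations differ by a single argument (same placeholder stub and messages as in the earlier sketch):

```python
import grpc

import numbers_pb2        # hypothetical generated module, as before
import numbers_pb2_grpc   # hypothetical generated module, as before

channel = grpc.insecure_channel("localhost:50051")  # no channel-level default
stub = numbers_pb2_grpc.NumbersStub(channel)
request = numbers_pb2.IntList(values=[3, 7, 12])    # stand-in payload

# Same RPC twice, once per compression setting, to isolate gzip's contribution.
stub.UploadInts(request, compression=grpc.Compression.NoCompression)
stub.UploadInts(request, compression=grpc.Compression.Gzip)
```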
The gap between an uncompressed raw stream and an uncompressed delta-encoded stream makes it clear that while gzip is doing useful work, it's not the one giving us the bulk of the savings. What we're searching for now is what magic gRPC, or rather protobuf, employs.
A quick search yields the answer: Protobuf encodes ints as varints, which are serialized in a variable number of bytes. It uses LEB128, which has an overhead of 1 bit per byte. Smaller-valued ints take fewer bytes, so a delta-encoded set occupies less space — this is exactly what we were looking for.
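In other words, each varint byte carries 7 payload bits plus a continuation bit, so a non-negative value costs about ceil(bit_length / 7) bytes. A tiny length calculator (my own approximation, not protobuf's API) shows why small deltas win:

```python
def varint_len(value: int) -> int:
    """Bytes needed to encode a non-negative int as a base-128 varint."""
    return max(1, -(-value.bit_length() // 7))  # ceil(bit_length / 7), minimum 1 byte

print(varint_len(14316))      # 2 bytes: a typical delta from the earlier estimate
print(varint_len(2**31))      # 5 bytes: a large raw uint32
print(varint_len(2**28 - 1))  # 4 bytes: the largest value that still fits in 4
print(varint_len(2**28))      # 5 bytes: past 2^28 you pay the extra byte
```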
We do stand on the shoulders of giants.
/fin
Aside: On the flip side, that means it takes 5 bytes to serialize any int of 2^(32-4) = 2^28 or more. It'd be good to keep in mind to use fixed32 instead of int32 if your dataset contains predominantly large-valued ints (no, I won't run experiments for that).