Binary Encodings for JSON and Variant

Discussion (2 Comments)Read Original on HackerNews

boricj•12 minutes ago

At work, I wrote a C++20 data binding library. It works by running visitors over a data model that binds to the application state. My comment comes from a different set of trade-offs driven by memory constraints.

I've implemented a bunch of serialization visitors. For the structured formats, most (JSON, YAML, CBOR with indefinite lengths) use an output iterator and can stream out one character/byte at a time, which is useful when your target is a MCU with 640 KiB of SRAM and you need to reply large REST API responses.

And there's the BSON serializer, which writes to a byte buffer because it uses tag-length-value and I need to backtrack in order to patch in the lengths after serializing the values. This means that the entire document needs to be written upfront before I can do something with it. It also has some annoying quirks, like array indices being strings in base 10.

There are also other trade-offs when dealing with JSON vs. its binary encodings. Strings in JSON may have escape characters that require parsing, if it has them then you can't return a view into the document, you need to allocate a string to hold the decoded value. Whereas in BSON or CBOR (excluding indefinite-length strings) the strings are not escaped and you can return a std::string_view straight from the document (and even a const char* for BSON, as it embeds a NUL character).

Some encodings like CBOR are also more expressive than JSON, allowing for example any value type to be used for map keys and not just strings.

kstenerud•40 minutes ago

As the author stated, it really depends on what you intend to use it for.

Fast internal scanning isn't free, because now you need pre-indexing, which is more data, and loses the incremental buildability on the encoding end.

Small transfer size and fast (full) decoding is possible with a single binary format, but unfortunately designers keep falling into the trap of adding extra things that make them incompatible with JSON. It's why I wrote https://github.com/kstenerud/bonjson/

Binary Encodings for JSON and Variant

⚡ Community Insights

Discussion (2 Comments)Read Original on HackerNews