

Discussion (2 Comments), originally posted on HackerNews

boricj, 12 minutes ago
At work, I wrote a C++20 data binding library. It works by running visitors over a data model that binds to the application state. My comment comes from a different set of trade-offs driven by memory constraints.

I've implemented a bunch of serialization visitors. For the structured formats, most (JSON, YAML, CBOR with indefinite lengths) use an output iterator and can stream out one character/byte at a time, which is useful when your target is an MCU with 640 KiB of SRAM and you need to reply with large REST API responses.

And there's the BSON serializer, which writes to a byte buffer because it uses tag-length-value and I need to backtrack in order to patch in the lengths after serializing the values. This means that the entire document must be serialized in full before I can do anything with it. It also has some annoying quirks, like array indices being strings in base 10.
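The backpatching can be sketched roughly as follows. This is an assumption-laden illustration, not the library's code: the helper names (`begin_document`, `end_document`, `put_int32_array`) are made up, and only int32 elements and arrays are covered. It reserves four bytes for the length, writes the contents, then goes back and fills the length in:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

using Buffer = std::vector<std::uint8_t>;

// Append a little-endian int32, as BSON requires.
void put_int32(Buffer& buf, std::int32_t v) {
    const auto u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i) buf.push_back(static_cast<std::uint8_t>(u >> (8 * i)));
}

// Overwrite four already-written bytes at `pos` (the backpatch step).
void patch_int32(Buffer& buf, std::size_t pos, std::int32_t v) {
    assert(pos + 4 <= buf.size());
    const auto u = static_cast<std::uint32_t>(v);
    for (int i = 0; i < 4; ++i) buf[pos + i] = static_cast<std::uint8_t>(u >> (8 * i));
}

// Keys are NUL-terminated strings.
void put_cstring(Buffer& buf, std::string_view s) {
    buf.insert(buf.end(), s.begin(), s.end());
    buf.push_back(0x00);
}

// Reserve room for the length, which is unknown until the contents are written.
std::size_t begin_document(Buffer& buf) {
    const std::size_t start = buf.size();
    put_int32(buf, 0);  // placeholder
    return start;
}

// Terminate the document and patch in its total size, length field included.
void end_document(Buffer& buf, std::size_t start) {
    buf.push_back(0x00);
    patch_int32(buf, start, static_cast<std::int32_t>(buf.size() - start));
}

void put_int32_element(Buffer& buf, std::string_view key, std::int32_t v) {
    buf.push_back(0x10);  // element type: int32
    put_cstring(buf, key);
    put_int32(buf, v);
}

// An array is an embedded document whose keys are "0", "1", "2", ...
void put_int32_array(Buffer& buf, std::string_view key,
                     const std::vector<std::int32_t>& values) {
    buf.push_back(0x04);  // element type: array
    put_cstring(buf, key);
    const std::size_t start = begin_document(buf);
    for (std::size_t i = 0; i < values.size(); ++i)
        put_int32_element(buf, std::to_string(i), values[i]);
    end_document(buf, start);
}
```

Since `patch_int32` needs random access to bytes that were already emitted, a pure output iterator is not enough here, which is why this serializer needs a buffer.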

There are also other trade-offs when dealing with JSON vs. its binary encodings. Strings in JSON may contain escape sequences that require parsing. If a string has them, you can't return a view into the document; you need to allocate a string to hold the decoded value. In BSON or CBOR (excluding indefinite-length strings), by contrast, strings are not escaped, so you can return a std::string_view straight from the document (and even a const char* for BSON, as it embeds a NUL character).
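A small sketch of that JSON trade-off, under some assumptions: `decode_json_string` and `DecodedString` are hypothetical names, the input is the raw text between the quotes, and `\uXXXX` escapes are deliberately rejected to keep the example short. The point is that the zero-copy path is only available when no backslash is present:

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <utility>
#include <variant>

// Either a borrowed view into the document or an owned, decoded copy.
using DecodedString = std::variant<std::string_view, std::string>;

// `raw` is the text between the quotes of a JSON string.
// Returns std::nullopt for escapes this sketch does not handle (e.g. \uXXXX).
std::optional<DecodedString> decode_json_string(std::string_view raw) {
    if (raw.find('\\') == std::string_view::npos) {
        // No escapes: hand back a view, no allocation needed.
        return DecodedString{std::in_place_type<std::string_view>, raw};
    }
    std::string out;
    out.reserve(raw.size());
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] != '\\') { out.push_back(raw[i]); continue; }
        if (++i == raw.size()) return std::nullopt;  // dangling backslash
        switch (raw[i]) {
            case '"':  out.push_back('"');  break;
            case '\\': out.push_back('\\'); break;
            case '/':  out.push_back('/');  break;
            case 'b':  out.push_back('\b'); break;
            case 'f':  out.push_back('\f'); break;
            case 'n':  out.push_back('\n'); break;
            case 'r':  out.push_back('\r'); break;
            case 't':  out.push_back('\t'); break;
            default:   return std::nullopt;  // unsupported in this sketch
        }
    }
    return DecodedString{std::in_place_type<std::string>, std::move(out)};
}

// Usage: the escaped input forces an owned copy.
void example() {
    const auto decoded = decode_json_string("a\\nb");
    assert(decoded && std::holds_alternative<std::string>(*decoded));
}
```

With a length-prefixed binary string, the first branch would be the only one needed.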

Some encodings like CBOR are also more expressive than JSON, allowing for example any value type to be used for map keys and not just strings.
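As a rough illustration of non-string map keys, the sketch below encodes the CBOR map {1: "a", 2: "b"} with definite lengths. The helper names (`put_head`, `put_uint`, `put_text`, `begin_map`) are invented for this example, and only unsigned integers, text strings and maps are covered:

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Write a CBOR head: 3-bit major type plus an argument in the shortest form.
void put_head(Bytes& out, std::uint8_t major, std::uint64_t value) {
    assert(major < 8);
    const auto mt = static_cast<std::uint8_t>(major << 5);
    if (value < 24) {
        out.push_back(static_cast<std::uint8_t>(mt | value));
        return;
    }
    std::uint8_t info = 27;  // argument follows in 8 bytes
    int extra = 8;
    if (value <= 0xFF)            { info = 24; extra = 1; }
    else if (value <= 0xFFFF)     { info = 25; extra = 2; }
    else if (value <= 0xFFFFFFFF) { info = 26; extra = 4; }
    out.push_back(static_cast<std::uint8_t>(mt | info));
    for (int i = extra - 1; i >= 0; --i)  // big-endian argument bytes
        out.push_back(static_cast<std::uint8_t>(value >> (8 * i)));
}

void put_uint(Bytes& out, std::uint64_t v) { put_head(out, 0, v); }  // major type 0

void put_text(Bytes& out, std::string_view s) {                      // major type 3
    put_head(out, 3, s.size());
    out.insert(out.end(), s.begin(), s.end());
}

void begin_map(Bytes& out, std::uint64_t pairs) { put_head(out, 5, pairs); }  // major type 5

// Usage: a map keyed by integers, which JSON objects cannot express directly.
Bytes example() {
    Bytes out;
    begin_map(out, 2);
    put_uint(out, 1); put_text(out, "a");
    put_uint(out, 2); put_text(out, "b");
    return out;  // bytes: A2 01 61 61 02 61 62
}
```

A JSON encoder would have to turn those keys into "1" and "2", so the integer type would not survive a round trip.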

kstenerud, 40 minutes ago
As the author stated, it really depends on what you intend to use it for.

Fast internal scanning isn't free: it requires pre-indexing, which adds more data and costs you incremental buildability on the encoding end.

Small transfer size and fast (full) decoding are both possible with a single binary format, but unfortunately designers keep falling into the trap of adding extra things that make their formats incompatible with JSON. It's why I wrote https://github.com/kstenerud/bonjson/