Zifre,
This is the first time I've heard of BSON. I see the mongodb project came up with it.
http://bsonspec.org/#/specification
I read the spec, and to be honest I don't like some of the design choices they made. Some of the datatypes are fixed size, some strings are zero terminated, others have a length prefix, and yet another has both a length prefix and a zero terminator (I haven't the foggiest idea why).
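To make the string complaint concrete, here is a minimal Python sketch of the two layouts as I read them from the spec: element names are plain zero-terminated cstrings, while string values carry a little-endian int32 length (which counts the trailing zero byte) and are zero terminated as well. The function names are my own for illustration and do not come from any BSON library.

```python
import struct


def encode_cstring(text: str) -> bytes:
    """Zero-terminated UTF-8 string (the layout used for element names)."""
    raw = text.encode("utf-8")
    if b"\x00" in raw:
        # A cstring cannot contain the terminator byte itself.
        raise ValueError("cstring cannot contain a NUL byte")
    return raw + b"\x00"


def encode_string(text: str) -> bytes:
    """Length-prefixed AND zero-terminated UTF-8 string (string values)."""
    raw = text.encode("utf-8")
    # The int32 length is little-endian and includes the trailing NUL.
    return struct.pack("<i", len(raw) + 1) + raw + b"\x00"
```

So the same six characters cost 7 bytes as a name and 11 bytes as a value, with the terminator being redundant in the second case.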
Whereas JSON encapsulates only primary generic datatypes, BSON has started enumerating application specific datatypes like MD5, RegEx, and UUID.
{"Name": "Alfman", "MD5": (string)"md5bytes"}
{"Name": "Alfman", "MD5": (Generic)"md5bytes"}
{"Name": "Alfman", "MD5": (User)"md5bytes"}
{"Name": "Alfman", "MD5": (MD5)"md5bytes"}
{"Name": "Alfman", "MD5": (RegEx)"md5bytes"}
{"Name": "Alfman", "MD5": (UUID)"md5bytes"}
In my opinion this is fundamentally flawed:
Problem 1: Application specific datatypes don't deserve special treatment in a generic transport protocol. Why is SHA not present? What about RSA keys? What about a JPEG datatype?
Problem 2: The *meaning* of a variable is implicitly known by the code which uses it. If it's *expecting* the MD5 field to contain an MD5 hash, there's no need for the transport protocol to tell it that the raw bytes are an MD5 datatype. JSON/XML work fine this way. I can't think of many applications that would benefit from overloading the MD5 field with different datatypes; doing so is probably always going to be an error.
So in my opinion, BSON is not a good model for a binary HTTP protocol. But I do think a simple binary name/value collection could work nicely.
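As a rough sketch of what I mean by a simple binary name/value collection, assume a made-up layout where each entry is a 2-byte big-endian name length, the UTF-8 name, a 4-byte big-endian value length, and then the raw value bytes. The layout and the names `encode_pairs` and `decode_pairs` are hypothetical, not taken from any existing protocol.

```python
import struct


def encode_pairs(pairs):
    """Serialize (name, raw_bytes) pairs using length prefixes only."""
    out = bytearray()
    for name, value in pairs:
        name_bytes = name.encode("utf-8")
        out += struct.pack(">H", len(name_bytes)) + name_bytes
        out += struct.pack(">I", len(value)) + value
    return bytes(out)


def decode_pairs(data):
    """Parse the layout produced by encode_pairs back into a list."""
    pairs = []
    pos = 0
    while pos < len(data):
        (name_len,) = struct.unpack_from(">H", data, pos)
        pos += 2
        if pos + name_len > len(data):
            raise ValueError("truncated name")
        name = data[pos:pos + name_len].decode("utf-8")
        pos += name_len
        (value_len,) = struct.unpack_from(">I", data, pos)
        pos += 4
        if pos + value_len > len(data):
            raise ValueError("truncated value")
        # The value is opaque bytes; the application knows what it means.
        pairs.append((name, data[pos:pos + value_len]))
        pos += value_len
    return pairs
```

The container never says whether a value is an MD5 hash or a JPEG; the receiving code looks up the field it expects and interprets the bytes itself.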
Edit: it has occurred to me that you didn't actually propose BSON would be suitable for HTTP, but merely used it as an example of a binary container. So maybe my reply is out of context to what you were thinking, but I'll leave my comments anyway.
Edited 2012-01-27 03:27 UTC
Just to make it clear: what I meant was not that it is easy to write a text parser, but that thanks to the UNIX world, there are several quality general-purpose text parsers and parsing algorithms around the web, ready to be tuned for specific uses. I am not sure that the same can be said of binary parsers, where it seems to me that one is more likely to find one parser for each specific protocol/file format.
It is arguable that UTF-8, like ASCII, is more of the minimal binary support that any text-based protocol needs, though. This protocol is pretty much only good at transmitting text.
Edited 2012-01-27 05:59 UTC
Neolander,
"It is arguable that UTF-8, like ASCII, is more of the minimal binary support that any text-based protocol needs, though. This protocol is pretty much only good at transmitting text."
Yes, I would argue that UTF-8 is merely a text variant and not binary by any metric which matters here. You still have to create text structures/delimiters to separate fields, etc.




Have you ever written parsers? It is vastly easier to parse binary protocols than text. Even JSON, with a relatively simple grammar, is much harder to parse than BSON, for example.
Endianness is not really an issue, if the protocol simply defines it to be one way or the other. Also, some binary protocols are byte-based, like UTF-8.
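To illustrate the endianness point with a small Python sketch: once the protocol fixes a byte order, the reader just names that order when decoding, and the host CPU's native order never enters into it. The helper names below are made up for the example.

```python
def read_u32_le(buf: bytes, offset: int = 0) -> int:
    """Read an unsigned 32-bit integer from a protocol defined as little-endian."""
    return int.from_bytes(buf[offset:offset + 4], "little")


def read_u32_be(buf: bytes, offset: int = 0) -> int:
    """Read an unsigned 32-bit integer from a protocol defined as big-endian."""
    return int.from_bytes(buf[offset:offset + 4], "big")
```

Either convention works; the only requirement is that the spec picks one and every implementation follows it.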