RejectedSoftware Forums

BSONobj size too large

Just hit this error message when executing a MongoDB insert.

BSONObj size: 18850036 (0x11FA0F4) is invalid. Size must be between 0 and 16793600(16MB)

There seems to be a 16MB limit on the size of a BSONObj passed to MongoDB. For now I'd like to increase this limit and carry on. Is there a simple way to achieve this?

Re: BSONobj size too large

On Sat, 15 Apr 2017 23:08:19 GMT, Carl Sturtivant wrote:

Just hit this error message when executing a MongoDB insert.

BSONObj size: 18850036 (0x11FA0F4) is invalid. Size must be between 0 and 16793600(16MB)

There seems to be a 16MB limit on the size of a BSONObj passed to MongoDB. For now I'd like to increase this limit and carry on. Is there a simple way to achieve this?

I'm pretty sure that this is unfortunately a fixed limit. See also https://docs.mongodb.com/manual/reference/limits/#bson-documents

Re: BSONobj size too large

On Sun, 16 Apr 2017 18:17:21 GMT, Sönke Ludwig wrote:

There seems to be a 16MB limit on the size of a BSONObj passed to MongoDB. For now I'd like to increase this limit and carry on. Is there a simple way to achieve this?

I'm pretty sure that this is unfortunately a fixed limit. See also https://docs.mongodb.com/manual/reference/limits/#bson-documents

Thanks for pointing out the unfortunate source of that unexpected limitation. OK, so my document contains dozens of very long strings (XML), so I decided to compress them. To that end I defined a compressedString type that automatically converts to and from string, and used that type instead of string in the struct used by the MongoDB insert. Here's the essential idea.

struct compressedString
{
    import std.zlib;
    ubyte[] data;
    this( string s)
    {
        this = s;
    }
    string cnvrt() @property
    {
        if( data.length == 0) return "";
        return cast(string)uncompress(data);
    }
    void cnvrt(string value) @property
    {
        data = compress(value);
    }
    alias cnvrt this;
}
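
For reference, a quick round-trip of the idea using the struct above (the sample string is just a stand-in):

unittest
{
    auto cs = compressedString("<root>lots of long XML here</root>");
    assert(cs.data.length > 0);                 // stored compressed
    string back = cs;                           // alias this -> cnvrt getter decompresses
    assert(back == "<root>lots of long XML here</root>");
}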

However, the vibe.d serialization to BSON that the insert query engages apparently detects the ability to convert struct fields of type compressedString to/from string, and uses that conversion to string as part of serialization, thus undoing the compression.

That's easy enough to work around by making the conversion to string manual. But does vibe.d provide a way to do this with both conversions present?

Re: BSONobj size too large

On Mon, 17 Apr 2017 23:04:15 GMT, Carl Sturtivant wrote:

On Sun, 16 Apr 2017 18:17:21 GMT, Sönke Ludwig wrote:

There seems to be a 16MB limit on the size of a BSONObj passed to MongoDB. For now I'd like to increase this limit and carry on. Is there a simple way to achieve this?

I'm pretty sure that this is unfortunately a fixed limit. See also https://docs.mongodb.com/manual/reference/limits/#bson-documents

Thanks for pointing out the unfortunate source of that unexpected limitation. OK, so my document contains dozens of very long strings (XML), so I decided to compress them. To that end I defined a compressedString type that automatically converts to and from string, and used that type instead of string in the struct used by the MongoDB insert. Here's the essential idea.

struct compressedString
{
    import std.zlib;
    ubyte[] data;
    this( string s)
    {
        this = s;
    }
    string cnvrt() @property
    {
        if( data.length == 0) return "";
        return cast(string)uncompress(data);
    }
    void cnvrt(string value) @property
    {
        data = compress(value);
    }
    alias cnvrt this;
}

However, the vibe.d serialization to BSON that the insert query engages apparently detects the ability to convert struct fields of type compressedString to/from string, and uses that conversion to string as part of serialization, thus undoing the compression.

That's easy enough to work around by making the conversion to string manual. But does vibe.d provide a way to do this with both conversions present?

I got a bit confused, thinking that alias this is indeed used by the serializer by accident (it is not explicitly supported at the moment) and had to postpone the analysis. What I think happens instead is that the serialized output contains both the data array and a cnvrt field, because read/write properties are serialized like ordinary fields.
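
This is easy to check by serializing a value directly; a minimal sketch, assuming the compressedString struct from the previous post:

import std.stdio : writeln;
import vibe.data.bson : Bson, serializeToBson;

void main()
{
    Bson bson = serializeToBson(compressedString("<root>some long XML</root>"));

    // With plain read/write properties both fields are present, i.e. the
    // compressed bytes *and* the uncompressed string end up in the document.
    writeln("data  field present: ", !bson["data"].isNull);
    writeln("cnvrt field present: ", !bson["cnvrt"].isNull);
}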

The easiest fix for this would be to annotate the cnvrt properties with @ignore. To make things a bit prettier, the struct could alternatively be represented as a single binary blob by defining toRepresentation and fromRepresentation methods:

struct compressedString
{
    // ...

    ubyte[] toRepresentation() { return data; }
    static compressedString fromRepresentation(ubyte[] data)
    {
        compressedString ret;
        ret.data = data;
        return ret;
    }
}
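
For completeness, the @ignore variant would look roughly like this (conversion behaviour unchanged, the properties are just hidden from the serializer):

import std.zlib : compress, uncompress;
import vibe.data.serialization : ignore;

struct compressedString
{
    ubyte[] data;   // the only thing that ends up in the BSON document

    this(string s)
    {
        this = s;
    }

    // still usable through alias this, but skipped by the serializer
    @ignore string cnvrt() @property
    {
        if (data.length == 0) return "";
        return cast(string)uncompress(data);
    }

    @ignore void cnvrt(string value) @property
    {
        data = compress(value);
    }

    alias cnvrt this;
}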

Re: BSONobj size too large

On Sun, 23 Apr 2017 11:29:33 GMT, Sönke Ludwig wrote:

I got a bit confused, thinking that alias this is indeed used by the serializer by accident (it is not explicitly supported at the moment) and had to postpone the analysis. What I think happens instead is that the serialized output contains both the data array and a cnvrt field, because read/write properties are serialized like ordinary fields.

The easiest fix for this would be to annotate the cnvrt properties with @ignore. To make things a bit prettier, the struct could alternatively be represented as a single binary blob by defining toRepresentation and fromRepresentation methods:

struct compressedString
{
    // ...

    ubyte[] toRepresentation() { return data; }
    static compressedString fromRepresentation(ubyte[] data)
    {
        compressedString ret;
        ret.data = data;
        return ret;
    }
}

Ah, that's nice, decoupling serialization from everything else. Thank you.

I can confirm that your working hypothesis is correct, i.e. that the serialized output contains both the compressed data and a cnvrt field.