Operation Timeout in MongoDB

February 23, 2018

Today I'd like to talk about a problem every MongoDB developer should be aware of - operation timeout. I have surely raised a few eyebrows and drawn a few snide remarks, but let me reassure you, it's worth reading.

Connection vs Operation Timeout

So where do we start? The main problem with operation timeout in any database, not specifically MongoDB, is that developers confuse connection timeout with operation timeout. So let's clear the air right away by clarifying the difference. Connection timeout is the maximum time you wait while connecting to the database, whereas operation timeout is the maximum time you wait for a certain operation, usually CRUD, to complete. The latter applies after you're already connected to the database.

Post MongoDB 2.6

If you've just started using MongoDB or had the luck to upgrade your existing instance to the newest version, that being 2.6 at the moment of writing, then you should know there is built-in support for operation timeout. You set it on individual queries with the maxTimeMS cursor method, which adds the $maxTimeMS operator to the query.

 
db.collection.find().maxTimeMS(100)

Awkward? Surely, but it does the job pretty well.
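When the limit is exceeded, the server aborts the operation and reports an error with code 50. Below is a minimal sketch of telling that error apart from other failures. It assumes the driver exposes the server's numeric code as err.code; the names isMaxTimeExpired and handleResult are hypothetical and serve only as illustration.

```javascript
// Server error code reported when an operation exceeds its maxTimeMS budget.
var MAX_TIME_EXPIRED_CODE = 50;

// Hypothetical helper: returns true when `err` looks like a maxTimeMS expiry.
// Assumption: the driver exposes the server's numeric code as `err.code`.
function isMaxTimeExpired(err) {
    return Boolean(err) && err.code === MAX_TIME_EXPIRED_CODE;
}

// Hypothetical callback for a query that was issued with maxTimeMS.
function handleResult(err, items) {
    if (isMaxTimeExpired(err)) {
        return "timeout";
    }
    if (err) {
        return "error";
    }
    return "ok: " + items.length + " item(s)";
}
```

Separating the timeout case from other errors lets you log it differently or retry with a larger budget.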

Pre MongoDB 2.6

But what happens if you don't have the luxury of upgrading your database instance, because of either IT or project constraints? In the pre-2.6 world, things get ugly. Naturally we want our operations to complete within a limited time frame, so that we can properly write error logs and take effective measures. So how do we do this?

MongoDbManager

I've written a MongoDB wrapper library which uses JavaScript's setTimeout mechanism to tackle the issue. The full code can be found on GitHub. Let's look through the main ideas of the library in depth.

find = function find(obj, callback, logger) {
    var filter = obj.filter, name = obj.name, isOne = obj.isOne,
        isRetrieveId = obj.isRetrieveId, limit = obj.limit,
        projection = obj.projection || {};
    // Hide _id unless the caller explicitly asked for it.
    if (!isRetrieveId) {
        projection._id = 0;
    }
    connect(function (err1, db) {
        if (err1) {
            callback(err1);
            return;
        }
        var start = logger.start("get " + name), isSent = false, timer, cursor,
            findCallback = function (err, items) {
                logger.end(start);
                // The query has finished, so the pending timeout is not needed.
                clearTimeout(timer);
                // The timeout handler has already answered the caller.
                if (isSent) {
                    return;
                }
                isSent = true;
                if (err) {
                    callback(err);
                } else {
                    callback(null, items);
                }
            };
        // Queue the timeout handler before the query is issued.
        timer = setTimeout(function findTimeoutHanlder() {
            if (isSent) {
                return;
            }
            isSent = true;
            callback(ERRORS.TIMEOUT);
        }, SETTINGS.TIMEOUT);
        // The query is the same whether or not _id is retrieved;
        // only the projection prepared above differs.
        if (isOne) {
            db.collection(name).findOne(filter, projection, findCallback);
        } else {
            cursor = db.collection(name).find(filter, projection);
            if (limit) {
                cursor = cursor.limit(limit);
            }
            cursor.toArray(findCallback);
        }
    }, logger);
}

A lot of code :( Let's take it step by step or, in our case, line by line. First, we connect to the database by calling the connect method. It checks whether there is an open connection and opens one if there isn't. Then we create a timeout callback, findTimeoutHanlder, and queue its invocation after SETTINGS.TIMEOUT milliseconds. Right after this, we query the database with the find method. Once the data is retrieved, our flag, isSent, is set to true, indicating the response was sent. When the timeout callback fires, it checks the value of the flag, and if the flag isn't set to true, it sets the flag and returns a timeout error.

Why is that? Activation of the timeout callback means we have reached the predefined timeout. If the flag is still false, then we still haven't received the data from the database, so we stop waiting and return the error. When the data is finally retrieved, the find callback checks the flag again. If it was set by the timeout callback, then we don't need to do a thing, since the error was already returned.
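The flag logic can be looked at in isolation, without a database or a timer. The sketch below is only an illustration: respondOnce is a hypothetical helper, not part of the library, and the two calls to respond stand in for the timeout handler and the late find callback.

```javascript
// Hypothetical helper, not part of the library: wraps `callback` so that only
// the first call gets through, which is what the isSent flag achieves.
function respondOnce(callback) {
    var isSent = false;
    return function () {
        if (isSent) {
            return; // the caller has already been answered
        }
        isSent = true;
        callback.apply(null, arguments);
    };
}

// Case 1: the timeout fires first, so the late data is dropped.
var timeoutFirst = [];
var respond = respondOnce(function (err, items) {
    timeoutFirst.push(err ? "error: " + err : "items: " + items.length);
});
respond("TIMEOUT");         // what the timeout handler would do
respond(null, [1, 2, 3]);   // what the late find callback would do

// Case 2: the data arrives first, so the timeout is ignored.
var dataFirst = [];
respond = respondOnce(function (err, items) {
    dataFirst.push(err ? "error: " + err : "items: " + items.length);
});
respond(null, [1, 2, 3]);
respond("TIMEOUT");

console.log(timeoutFirst);  // [ 'error: TIMEOUT' ]
console.log(dataFirst);     // [ 'items: 3' ]
```

Whichever side answers first wins, and the caller's callback is never invoked twice.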

This simple yet powerful technique is used throughout the library, wrapping other operations like update and insert as well. The code is fully documented and has a few examples, which should help you understand it within an hour.
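To show how the same idea carries over to any callback-style operation, here is a minimal sketch of a generic wrapper. It is not the library's actual API: withTimeout and fakeOperation are hypothetical names, and fakeOperation only simulates a database call that answers after a delay.

```javascript
// Hypothetical generic wrapper, not the library's actual API.
// `operation` is any function that accepts a Node-style callback.
function withTimeout(operation, timeoutMs, callback) {
    var isSent = false;
    var timer = setTimeout(function () {
        if (isSent) {
            return;
        }
        isSent = true;
        callback(new Error("operation timed out"));
    }, timeoutMs);

    operation(function (err, result) {
        clearTimeout(timer); // the operation answered, drop the pending timer
        if (isSent) {
            return; // the timeout error was already returned
        }
        isSent = true;
        callback(err, result);
    });
}

// Stand-in for an insert or update: answers with `result` after `delayMs`.
function fakeOperation(result, delayMs) {
    return function (done) {
        setTimeout(function () {
            done(null, result);
        }, delayMs);
    };
}

// A fast "insert" finishes within its 50 ms budget.
withTimeout(fakeOperation({ inserted: 1 }, 5), 50, function (err, result) {
    console.log(err ? err.message : result);
});

// A slow "update" exceeds its 20 ms budget and yields the timeout error.
withTimeout(fakeOperation({ updated: 1 }, 100), 20, function (err, result) {
    console.log(err ? err.message : result);
});
```

Keeping the timer and the flag inside one helper means each operation only has to supply its own database call.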

If you have any questions or suggestions, please don't hesitate to comment below.

Deep Research and Development