Drop Redis/Improve MongoDB/Support SQL?



  • I just browsed the MongoDB database of my tiny test forum and it's quite apparent that dropping Redis support could mean a HUGE win for the cleanness and (I think) performance of MongoDB. There is a lot of redundancy at the moment, there are lots of strangely linked items. Small example:

    { 
        "_id" : ObjectId("5989ba5986d1235ea0696381"), 
        "_key" : "event:15", 
        "type" : "plugin-activate", 
        "text" : "nodebb-plugin-calendar", 
        "timestamp" : 1502198361021.0, 
        "eid" : NumberInt(15)
    }
    

    is linked to:

    { 
        "_id" : ObjectId("5989ba5986d1235ea0696380"), 
        "_key" : "events:time", 
        "value" : "15", 
        "score" : 1502198361021.0
    }
    

    That doesn't really make sense to me :)

    I'm not exactly an expert on NoSQL databases, but it's hard for me to see how MongoDB can perform well with a non-trivial amount of data. What would happen if I dumped 100 million posts in there?

    The current use of the DB also looks a lot like a relational database:

    A post:

    { 
        "_id" : ObjectId("598999cf86d1235ea06962af"), 
        "_key" : "post:1", 
        "pid" : NumberInt(1), 
        "uid" : NumberInt(1), 
        "tid" : NumberInt(1), 
        "content" : "# Welcome to your brand new NodeBB forum!\n\n[blabla]", 
        "timestamp" : 1502190031774.0, 
        "deleted" : NumberInt(0)
    }
    

    A topic:

    { 
        "_id" : ObjectId("598999cf86d1235ea06962a9"), 
        "_key" : "topic:1", 
        "tid" : NumberInt(1), 
        "uid" : NumberInt(1), 
        "cid" : NumberInt(2), 
        "mainPid" : NumberInt(1), 
        "title" : "Welcome to your NodeBB!", 
        "slug" : "1/welcome-to-your-nodebb", 
        "timestamp" : 1502190031763.0, 
        "lastposttime" : 1502199185012.0, 
        "postcount" : NumberInt(4), 
        "viewcount" : NumberInt(30), 
        "locked" : NumberInt(0), 
        "deleted" : NumberInt(0), 
        "pinned" : NumberInt(0), 
        "teaserPid" : "4"
    }
    

    These two are linked through 'tid', which makes a lot of sense in an RDBMS context. In MongoDB I would have expected one document per topic or something like that?

    In the thread "mongodb single collection?" a 'delete' example is given as a reason for going with NoSQL. But you can't just do: db.delete('topic:1'); because that would miss all posts, read indicators, subscriptions, etc. (if I'm not missing something).

    All this:
    https://github.com/NodeBB/NodeBB/blob/master/src/topics/delete.js
    is needed to delete a topic.

    But, as I said, my NoSQL experience is really very limited. Could be that my expectations of NoSQL are too high (or too low when it comes to performance).

    I'm curious though what the plans are for the NodeBB datastore(s) for the future. Is Redis support going to be dropped? Have there been tests regarding the scalability of the current schema(lessness)? Are there plans to support RDBMSes?

    Also curious why NoSQL was picked as the datastore. A forum doesn't seem like a very natural fit for NoSQL? The schema is fixed, there are lots and lots of relationships between tables/collections/objects.

    In short:

    • Is Redis going to be dropped?
    • Is MongoDB able to handle more than a few posts (without a lot more hardware than an RDBMS)?
    • Are there plans to support RDBMSes?

  • Admin

    They're not strangely linked... events:time is a sorted set that stores event IDs scored by their timestamp, and event:15 is the hash containing the actual data for the entry with value "15".

    But that said, yes, a forum is a good use case for a relational database, though we wanted to try something new with a NoSQL solution. Redis worked well and still does, though there are areas of improvement to be made in MongoDB, and most involve dropping Redis support at some point down the line.
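
    The two-step lookup described above can be sketched with plain in-memory objects. This is only an illustration, not NodeBB's actual code: the helper names (`getSortedSetRevRange`, `getObjects`, `getLatestEvents`) are hypothetical here, and the `event:14` entry is made up so there is something to sort against.

    ```javascript
    // In-memory stand-ins for the documents shown earlier in the thread.
    // Sorted set: one entry per event, value = event id, score = timestamp.
    const sortedSets = {
      'events:time': [
        { value: '14', score: 1502198300000 }, // invented entry, for illustration only
        { value: '15', score: 1502198361021 },
      ],
    };

    // Hashes: the actual event data, one object per key.
    const hashes = {
      'event:14': { eid: 14, type: 'plugin-deactivate', text: 'example-plugin', timestamp: 1502198300000 },
      'event:15': { eid: 15, type: 'plugin-activate', text: 'nodebb-plugin-calendar', timestamp: 1502198361021 },
    };

    // Hypothetical helper: return member values ordered by score, highest first.
    function getSortedSetRevRange(key, start, stop) {
      const members = (sortedSets[key] || []).slice().sort((a, b) => b.score - a.score);
      return members.slice(start, stop + 1).map((m) => m.value);
    }

    // Hypothetical helper: fetch several hashes by key (null when a key is missing).
    function getObjects(keys) {
      return keys.map((key) => hashes[key] || null);
    }

    // Step 1: ask the sorted set for the newest event ids.
    // Step 2: load the matching event:<eid> hashes.
    function getLatestEvents(count) {
      const eids = getSortedSetRevRange('events:time', 0, count - 1);
      return getObjects(eids.map((eid) => `event:${eid}`));
    }

    console.log(getLatestEvents(1));
    ```

    The sorted set answers "which events, in what order" without touching the event data itself, so the hashes only need to be fetched for the handful of ids that will actually be shown.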

    SQL support is always a consideration and I do know there are interested parties working on such an adapter at this moment, though I don't think it is ready yet :smile:



  • I noticed that what.thedailywtf.com is working on a Postgres implementation, but they are just storing JSON objects in one huge table. I was talking about a properly normalized implementation of the datastore :)

    Or is there another effort in the Postgres department?

    And thanks for the explanation about the sorted sets. I really need to read up on NoSQL again. Although in this case the events:time objects seem redundant, since the timestamp is already available in event:15?

    Do you have an idea what to expect of the current MongoDB implementation when the database gets to a nontrivial size? Is there a fixtures script somewhere?


  • Admin

    Your assumption that SQL is faster than MongoDB simply because it is SQL is erroneous. We have structured the database in such a way that large volumes of data do not slow down data access (and why should they?)

    event:15 also contains additional data related to the event in question. That data is not stored in the sorted set.



  • event:15 does, but the events:time entry for 15 doesn't, as far as I can see?

    SQL is not faster because it's SQL, far from it. SQL can suck hard from a performance point of view.

    A large part of my worries stems from the fact that I'm not that familiar with the strengths and weaknesses of MongoDB, while I am really familiar with the challenges that MySQL can pose when it comes to forums and big boards.

    In the MySQL case the DB can be really slow if proper indices are missing. And that's with database tables which are highly structured where it's trivial for the DB to do a table scan on a certain field.

    I view MongoDB as one large bucket of JSON objects that are hardly related or structured. It's possible to index them to some extent but as far as I know that's not really efficient when everything is part of one large collection.

    But I guess most of it is 'fear of the unknown' :) I guess I should just fire up a conversion or fixtures script and see what happens...


  • Admin

    @BartVB take a look at https://github.com/NodeBB/NodeBB/wiki/Database-Structure. events:time is a sorted set that is used to get objects from the event:<eid> hashes.

    The only indices we have on mongodb are _key, value and _key, score
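
    For readers unfamiliar with compound indexes, the two indices mentioned above could be created roughly like this. It is a sketch under assumptions, not NodeBB's setup code: `ensureIndexes` is a hypothetical helper, `collection` is assumed to be a collection handle from the Node.js MongoDB driver (which provides `createIndex`), and the ascending sort directions are a guess.

    ```javascript
    // The two compound indexes named in the post. Directions (1 = ascending)
    // are an assumption; the post only names the fields.
    const indexSpecs = [
      { _key: 1, value: 1 }, // look up a specific member of a set or sorted set
      { _key: 1, score: 1 }, // range and ordering queries within one sorted set
    ];

    // Hypothetical helper: `collection` must expose createIndex(keys),
    // as collection handles from the Node.js MongoDB driver do.
    function ensureIndexes(collection) {
      return Promise.all(indexSpecs.map((spec) => collection.createIndex(spec)));
    }
    ```

    With an index on (_key, score), a query for one sorted set ordered by score can be answered from the index alone, which is why every object living in a single collection does not by itself force a full scan.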



  • @julian said in Drop Redis/Improve MongoDB/Support SQL?:

    But that said, yes, a forum is a good use case for a relational database, though we wanted to try something new with a NoSQL solution. Redis worked well and still does, though there are areas of improvement to be made in MongoDB, and most involve dropping Redis support at some point down the line.

    Dropping Redis will make me sad since I set up my forum with the Redis recommendation from the documentation :(

    You'll have to publicize the secret sauce you use to convert from Redis to Mongo.



  • @teh_g said in Drop Redis/Improve MongoDB/Support SQL?:

    @julian said in Drop Redis/Improve MongoDB/Support SQL?:

    But that said, yes, a forum is a good use case for a relational database, though we wanted to try something new with a NoSQL solution. Redis worked well and still does, though there are areas of improvement to be made in MongoDB, and most involve dropping Redis support at some point down the line.

    Dropping Redis will make me sad since I set up my forum with the Redis recommendation from the documentation :(

    You'll have to publicize the secret sauce you use to convert from Redis to Mongo.

    Yes, if they would release that, it would be great and then they can drop Redis support at will.


  • Admin

    Yes, let it be known that if we do decide to drop support, the Redis to Mongo migration script will be immediately published :smile:



  • @julian said in Drop Redis/Improve MongoDB/Support SQL?:

    Yes, let it be known that if we do decide to drop support, the Redis to Mongo migration script will be immediately published :smile:

    I figured as much, you guys are too cool to be jerks :D

