Memory-efficient archiving
Several years ago, we started our board on MyBB. We migrated to NodeBB v0.3 as it started to get crowded, and have been happy ever since. The db server its Redis instance is running on currently has 4GB of RAM, of which around 70% is in constant use.
As the consumption continues to grow, I know we'll have to upgrade the server sooner rather than later, and that made me wonder how to solve this without continually throwing memory upgrades at it.
My idea would be to swap older content (configurable threshold) from Redis into a MongoDB instance, for example - it shouldn't be a problem to create different adapters, let's say MySQL or even JSON for a pure disk solution.
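A minimal sketch of that idea, purely for illustration: a plain dict stands in for the Redis store, and `ArchiveAdapter`, `JsonFileArchive`, and `archive_older_than` are hypothetical names, not anything NodeBB provides. It assumes every stored object carries a `timestamp` field in seconds.

```python
import json
import os
import tempfile
import time
from typing import Optional, Protocol


class ArchiveAdapter(Protocol):
    """Hypothetical interface each archive backend (MongoDB, MySQL, JSON) would implement."""

    def put(self, key: str, value: dict) -> None: ...
    def get(self, key: str) -> Optional[dict]: ...


class JsonFileArchive:
    """Pure-disk adapter: one JSON file per archived key."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        os.makedirs(directory, exist_ok=True)

    def _path(self, key: str) -> str:
        # ":" is awkward in file names; a real adapter would need a
        # collision-free encoding instead of this simple replacement.
        return os.path.join(self.directory, key.replace(":", "_") + ".json")

    def put(self, key: str, value: dict) -> None:
        with open(self._path(key), "w", encoding="utf-8") as fh:
            json.dump(value, fh)

    def get(self, key: str) -> Optional[dict]:
        try:
            with open(self._path(key), encoding="utf-8") as fh:
                return json.load(fh)
        except FileNotFoundError:
            return None


def archive_older_than(hot: dict, archive: ArchiveAdapter,
                       max_age_seconds: float, now: Optional[float] = None) -> list:
    """Move every entry older than the threshold from `hot` into `archive`."""
    now = time.time() if now is None else now
    moved = []
    for key in list(hot):  # copy the keys, since we delete while iterating
        if now - hot[key]["timestamp"] > max_age_seconds:
            archive.put(key, hot[key])  # write to the archive first, then drop
            del hot[key]
            moved.append(key)
    return moved


# Example run: archive everything older than 500 seconds.
hot = {
    "post:1": {"timestamp": 0, "content": "old"},
    "post:2": {"timestamp": 1000, "content": "new"},
}
archive = JsonFileArchive(tempfile.mkdtemp())
moved = archive_older_than(hot, archive, max_age_seconds=500, now=1000)
print(moved)
```

Writing to the archive before deleting from the hot store means a crash in between leaves a duplicate rather than a lost post.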
If there were an algorithm checking the popularity of content, it would also be possible to archive not just old content, but content that is accessed less frequently.
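One way such a popularity check could look, as a rough sketch only: each view counts less the older it is (exponential decay), and topics whose score drops below a threshold become archive candidates. The half-life value, the function names, and the input shape (topic id mapped to a list of view ages in days) are all assumptions made for this example.

```python
HALF_LIFE_DAYS = 30.0  # assumed value; a real setup would make this configurable


def decayed_score(view_ages_days, half_life_days=HALF_LIFE_DAYS):
    """Sum of views, each weighted by 0.5 ** (age / half_life)."""
    return sum(0.5 ** (age / half_life_days) for age in view_ages_days)


def archive_candidates(topics, min_score):
    """Return ids of topics scoring below min_score, least popular first."""
    scored = {tid: decayed_score(ages) for tid, ages in topics.items()}
    below = [tid for tid, score in scored.items() if score < min_score]
    return sorted(below, key=scored.get)


# Example: ages (in days) of the views each topic has received.
topics = {
    "still_popular": [1, 2, 3, 5],   # several recent views
    "barely_viewed": [20],           # a single view three weeks ago
    "long_dead": [400],              # last viewed over a year ago
}
print(archive_candidates(topics, min_score=1.0))
```

With this scoring, a topic from years ago that still gets regular views stays in Redis, while a recent topic nobody reads can be archived early.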
To be clear: I'm not talking about caching here, since that would imply keeping the content in the Redis store.
Since this would require setting up multiple database systems, it shouldn't be mandatory (there surely are many users who can just use powerful enough machines and keep scaling them up, but that doesn't apply to everyone).
Would this be feasible? Is it on some roadmap already, maybe? What's your opinion on this?
@Moritz-Friedrich I would suggest just moving all of your data from Redis to MongoDB. It isn't significantly slower, and disk space is much, much cheaper than memory. If you want to increase the responsiveness of your forum, I'd suggest using more CPU cores with multiple NodeBB threads clustered behind Nginx.
I heavily discourage moving away from Redis. Anything else will make your performance suffer. In my opinion you can solve this in 2 ways.
Option 1 - Change provider:
RAM is cheap as hell. I have a 32GB DDR4 RAM server for less than 25€ a month. Forget about DigitalOcean or any other bullshit cloud providers.
If you want I can provide you with a list of providers.
Option 2 - Clean up your DB/Switch to SSDB:
A while ago I worked on getting SSDB.io working as a drop-in replacement for Redis. Furthermore, you could just clean your board once in a while. Often you will find spam users, posts, topics, or whatever else is useless.
Can you provide us with some numbers by the way, e.g. number of posts, threads, users?
I'd like to stay with Redis - it's a battle-tested, hugely scalable and ridiculously performant key-value storage. No matter what happens in the future, Redis will grow with it.
The responsiveness is perfectly fine; we've got four cores dedicated to nginx and NodeBB, and that works well. But I think it would be a nice addition to NodeBB to be able to keep less frequently accessed content out of memory and load it on demand. I'd imagine that would also be interesting for huge-scale instances.
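To illustrate what "load on demand" could mean here, a small sketch with two plain dicts standing in for Redis (hot) and the archive (cold). `TieredStore` and its `promote` flag are hypothetical and not part of NodeBB.

```python
class TieredStore:
    """Look in the hot store first and fall back to the archive on a miss."""

    def __init__(self, hot, archive, promote=False):
        self.hot = hot
        self.archive = archive
        # With promote=False, archived content is served but stays out of
        # memory; with promote=True, a hit moves it back into the hot store.
        self.promote = promote

    def get(self, key):
        if key in self.hot:
            return self.hot[key]
        value = self.archive.get(key)  # returns None if not archived either
        if value is not None and self.promote:
            self.hot[key] = self.archive.pop(key)
        return value


# Example: "post:1" has been archived, "post:2" is still in memory.
hot = {"post:2": "recent post"}
archive = {"post:1": "old post"}
store = TieredStore(hot, archive)
print(store.get("post:1"), "|", store.get("post:2"))
```

Leaving `promote` off matches the archiving idea (old content never re-enters Redis); turning it on makes the hot store behave more like a cache in front of the archive.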
I'm currently using DigitalOcean, yes, because I like VPS servers. I'm a full-time sysadmin, and many of my work clients have their own datacenter servers, so I'm aware there are cheaper possibilities. Though I think the huge advantage of a virtual box is the speed at which it can be scaled, and the direct control over its power supply and console.
I've just looked at SSDB, but I wonder what the main advantage is? The stats seem to indicate it is a little bit slower than Redis, but funnily enough the page doesn't list any selling points... Less memory consumption?
I could clean the database up, but to be honest, while I'm the admin and responsible for the server, I'm not really involved with the content or its users.
For the stats, here you go:
Keys in Redis: 2,402,609
Page views per week: 549,187
Uploaded files: ~4GB
@Moritz-Friedrich, well, that's true, but DigitalOcean is not worth it. From my experience their overall service is average and sometimes more hype than substance. With a proper VPS provider you won't miss features like KVM. Maybe send me a PM with your location and I will see if there is any alternative for you.
And yes, SSDB is way more efficient than Redis regarding memory usage, though it's indeed a bit slower.
Now about your stats:
I have more traffic, users and topics, but fewer posts. Overall I have 1,605,290 keys in my NodeBB database; however, before I cleaned the sessions manually it was around 2,100,000. My memory usage never exceeded 2GB for Redis. Currently I need less than 1.5GB of RAM.
@AOKP You might be right about that - we went from a managed hosting provider to DO when the market was small, and I've never bothered to compare again since then. My location is no secret, I'm from Germany, the board's here. I'd be thankful if you could share some good providers with data centers in or near Germany!
I'll set up a testing environment with NodeBB and SSDB, thanks for that one!
Redis currently allocates around 2.6GB of real memory, hinting at NodeBB consuming most of the memory for posts (or I'm doing something horribly wrong). @PitaJ, on a general note, does the way NodeBB handles keys in the db have potential for optimization? Because most applications do
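For a rough comparison of the figures quoted in this thread, the average memory per key can be worked out as below. This is back-of-the-envelope arithmetic only: it assumes decimal gigabytes (the posts do not say GB or GiB), and the two AOKP values are upper bounds, since those posts say "less than" and "never exceeded".

```python
GB = 10**9  # assumption: decimal gigabytes


def bytes_per_key(memory_gb, keys):
    """Average memory per key, ignoring any overhead not tied to keys."""
    return memory_gb * GB / keys


moritz = bytes_per_key(2.6, 2_402_609)        # 2.6GB allocated, 2,402,609 keys
aokp_now = bytes_per_key(1.5, 1_605_290)      # "less than 1.5GB", 1,605,290 keys
aokp_before = bytes_per_key(2.0, 2_100_000)   # "never exceeded 2GB", ~2,100,000 keys

for name, value in [("moritz", moritz), ("aokp_now", aokp_now), ("aokp_before", aokp_before)]:
    print(f"{name}: {value:.0f} bytes per key")
```

All three come out on the order of 1 KB per key, so on these numbers the two boards differ mainly in how many keys they hold rather than in how much each key costs.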
Linode is similar to DigitalOcean and they provide double the RAM for the same price right now.
@baris it's still not something I'd recommend. Maybe it's just because I am used to German providers.
Making a blanket assumption like that seems a little rash, don't you think?
Now, if you made statements about data ownership and "five eyes" countries, then perhaps an argument can be made...
I highly appreciate the valuable input about hosting from all of you, but I feel this discussion is quickly drifting away from the original question.
Is archiving content so far out of scope or dispensable for the project?
The fact that Redis stores everything in memory is the key to its speed. However, it's also a huge crutch in that yes... the less-often used data is still in memory when it could rightly just be put on the hard drive and queued up when needed.
That's why we suggested migrating to Mongo, because, well... it does just that.