Redis advice on low memory system

  • I'm getting the infamous csrf error on NodeBB that indicates that Redis has died (or maybe is just unavailable?).

    I initially used Redis due to speed on a 2GB system. With the database sizes I had seen as well as my limited number of active forum users (~35), I didn't think it'd be a problem.

    However, Redis is still dying once per day. Thus far, I have:

    1. Set vm.overcommit_memory=1 in /etc/sysconfig
    2. Set maxclients 4064 in my /etc/redis.conf as my ulimit -n == 1024.

    I'm still getting these errors in my log:

    [23425] 20 May 20:12:46.979 * Saving the final RDB snapshot before exiting.
    [23425] 20 May 20:12:47.469 * DB saved on disk
    [23425] 20 May 20:12:47.469 * Removing the pid file.
    [23425] 20 May 20:12:47.470 # Redis is now ready to exit, bye bye...
    [24099] 20 May 20:12:48.197 # You requested maxclients of 10000 requiring at least 10032 max file descriptors.
    [24099] 20 May 20:12:48.198 # Redis can't set maximum open files to 10032 because of OS error: Operation not permitted.
    [24099] 20 May 20:12:48.198 # Current maximum open files is 1024. maxclients has been reduced to 4064 to compensate for low ulimit. If you need higher maxclients increase 'ulimit -n'.
    

    In hindsight I probably should have gone with MongoDB given my memory issue, but I'm unaware of a way to migrate between the two. Perhaps someone could help me write a JSON export/import function (see here and here).

    Currently, my Redis DB only takes ~18MB but I do notice large jumps in size after it dies each time (e.g. 8 to 12 to 18). Plus, after switching to nginx and varnish, I have about 958MB free.

  • What operating system is Redis installed on? CentOS? Debian?

    you might want to follow http://redis4you.com/articles.php?id=014&name=Redis+too+many+open+files+error+on+high+traffic+sites

    ulimit -n 4096

    That works if Redis is running as the root user, I believe, but from that message it looks like Redis wants at least 10032 file descriptors.

    If it's running as the redis user on CentOS, you might need to edit /etc/security/limits.conf and specifically raise the open file descriptor limit for the redis user.

    For example, I usually set the open file limits to somewhere between 65536 and 262144 in /etc/security/limits.conf:

    * soft nofile 65536
    * hard nofile 65536
    redis soft nofile 65536
    redis hard nofile 65536
    nginx soft nofile 262144
    nginx hard nofile 262144
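
To see which open-file limit a process actually inherits, here is a minimal Python sketch, assuming a Linux or other Unix host. The helper names `nofile_limits` and `fmt` are made up for illustration. It reads the same soft and hard values that `ulimit -n` and `ulimit -Hn` report:

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft, hard


def fmt(value):
    """Render RLIM_INFINITY as 'unlimited', the way ulimit prints it."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile: {fmt(soft)}")
    print(f"hard nofile: {fmt(hard)}")
```

Note that this reports the limits of whichever user runs the script, so it would need to be run as the redis user to show what that account receives.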
    

    It looks like /etc/redis.conf for Redis 2.8.9 ships with the following settings commented out by default:

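    # Set the max number of connected clients at the same time. By default
    # this limit is set to 10000 clients, however if the Redis server is not
    # able to configure the process file limit to allow for the specified limit
    # the max number of allowed clients is set to the current file limit
    # minus 32 (as Redis reserves a few file descriptors for internal uses).
    #
    # Once the limit is reached Redis will close all the new connections sending
    # an error 'max number of clients reached'.
    #
    # maxclients 10000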
    

    Hope that helps

  • @Guiri The symptoms you post don't seem to point to your system running out of memory... and Redis itself doesn't seem to be the cause, I think...

    Try running free -m. Take note of the free column in the buffers/cache row; this is your actual amount of free memory. Unless it's near zero, you should be fine.
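
The same figure can be derived from /proc/meminfo. This is a minimal Python sketch, assuming a Linux host; `parse_meminfo` and `reclaimable_free_mb` are hypothetical helper names. It adds MemFree, Buffers, and Cached, which is what the free column of the buffers/cache row reflects:

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def reclaimable_free_mb(meminfo):
    """Free memory in MB, counting buffers and page cache as reclaimable."""
    kb = meminfo["MemFree"] + meminfo.get("Buffers", 0) + meminfo.get("Cached", 0)
    return kb // 1024


if __name__ == "__main__":
    with open("/proc/meminfo") as fh:
        info = parse_meminfo(fh.read())
    print(f"free incl. buffers/cache: {reclaimable_free_mb(info)} MB")
```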

    NodeBB on Redis really doesn't take much space either. As a point of reference, this forum used to run on a 512MB cloud server and currently has a database size of 44.12M. We still have some room to grow.

    Can you run ./nodebb log in your nodebb dir and see what the error is when it crashes? Provide a full stack trace if possible (it should be in the log, if an error did occur)

  • I whipped up a shell script at https://gist.github.com/centminmod/7d3e562fb87fa8ef263a to gather redis and system memory usage info. Might be helpful 🙂

  • Thanks everyone. I'll try this out later. I did increase the ulimit to unlimited for redis last night when I posted and it still crashed this morning. The problem is that although it's the csrf error, both NodeBB and Redis are still running when I log in. And a restart of either fixes it.

    @eva2000 RHEL 6.5

  • Use my script to check real and system reported redis memory usage and not just redis db size itself.

    I re-read your first post about the jumps in memory usage/size each time it dies. Maybe that's related to some automated backup policy you have in place, or to Redis' persistence snapshot settings, as I mentioned at https://community.nodebb.org/topic/932/redis-useful-info#6668? It matches the errors in your log, particularly:

    [23425] 20 May 20:12:46.979 * Saving the final RDB snapshot before exiting.
    [23425] 20 May 20:12:47.469 * DB saved on disk
    [23425] 20 May 20:12:47.469 * Removing the pid file.
    [23425] 20 May 20:12:47.470 # Redis is now ready to exit, bye bye...
    

    there's a 0.727 second gap between that and the maxclients error that follows
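
That gap can be computed from the log lines themselves. A minimal Python sketch follows, assuming the Redis 2.x log prefix format seen in this thread; `parse_line` and `gap_seconds` are hypothetical helper names, and the year is supplied by hand because the log omits it:

```python
import re
from datetime import datetime

# Matches the "[pid] DD Mon HH:MM:SS.mmm" prefix of the Redis log lines above.
STAMP = re.compile(r"^\[(\d+)\] (\d{1,2} \w{3} \d{2}:\d{2}:\d{2}\.\d{3})")


def parse_line(line, year=2014):
    """Return (pid, datetime) for a Redis log line, or None if it doesn't match."""
    match = STAMP.match(line.strip())
    if not match:
        return None
    # The log has no year, so one is prepended before parsing.
    when = datetime.strptime(f"{year} {match.group(2)}", "%Y %d %b %H:%M:%S.%f")
    return int(match.group(1)), when


def gap_seconds(first, second):
    """Seconds elapsed between two log lines."""
    return (parse_line(second)[1] - parse_line(first)[1]).total_seconds()


if __name__ == "__main__":
    exit_line = "[23425] 20 May 20:12:47.470 # Redis is now ready to exit, bye bye..."
    start_line = "[24099] 20 May 20:12:48.197 # You requested maxclients of 10000 requiring at least 10032 max file descriptors."
    print(f"old pid {parse_line(exit_line)[0]}, new pid {parse_line(start_line)[0]}")
    print(f"gap: {gap_seconds(exit_line, start_line):.3f} s")
```

The pid changes between the two lines, so the second line comes from a freshly started Redis process rather than the one that just exited.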

  • oh and only just noticed

    vm.overcommit_memory=1 should be placed in /etc/sysctl.conf and needs the sysctl -p command for changes to take effect. Your first post listed the wrong file to place it in. I'd double check you have the settings in the right file 😉

    you can check with the command below over SSH

    sysctl -a | grep 'vm.overcommit_memory'
    

    should give

    vm.overcommit_memory = 1
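
To compare what /etc/sysctl.conf says against what the kernel is actually using, here is a minimal Python sketch, assuming a Linux host where sysctl keys map to paths under /proc/sys. `parse_sysctl_conf` and `read_live_sysctl` are hypothetical helper names:

```python
from pathlib import Path


def parse_sysctl_conf(text):
    """Return {key: value} for 'key = value' lines, skipping comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def read_live_sysctl(name, root="/proc/sys"):
    """Read the value the kernel is using, e.g. for 'vm.overcommit_memory'."""
    return Path(root, *name.split(".")).read_text().strip()


if __name__ == "__main__":
    key = "vm.overcommit_memory"
    configured = parse_sysctl_conf(Path("/etc/sysctl.conf").read_text()).get(key)
    live = read_live_sysctl(key)
    print(f"{key}: configured={configured} live={live}")
    if configured is not None and configured != live:
        print("values differ; 'sysctl -p' has probably not been run yet")
```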
    
  • @eva2000 Thanks. I mistyped; it was /etc/sysctl.conf.

    I've uploaded the output from the script here: http://fpaste.org/103837/14006865/

    I'll read through your post on Redis persistence and perhaps upload my config. I agree that the timing between saving and NodeBB being unable to connect is interesting. Even more interesting is that Redis is still running according to ps -ax | grep redis.

  • ah i need to update my script to take into account redis set passwords heh

    but no memory/swap issues i see from your output

    what's your free disk space like? and how large is your /tmp directory?

    df -hT

  • df -hT
    Filesystem           Type   Size  Used Avail Use% Mounted on
    /dev/mapper/vg_pmc-lv_root
                         ext4    50G   11G   36G  24% /
    tmpfs                tmpfs  924M   72K  923M   1% /dev/shm
    /dev/sda1            ext4   485M   83M  378M  18% /boot
    /dev/mapper/vg_pmc-lv_home
                         ext4    94G  2.6G   86G   3% /home
    

    All is good. I was running tmp on tmpfs a while ago, but given the memory issues I stopped doing that a month ago.

  • @eva2000 The bad news is that I modified the nofile parameters, restarted, and it still died...

    The good news is that NodeBB started leaving errors in the error log! The "Error: Ready check failed: NOAUTH Authentication required." error is puzzling since my config.json and Redis have an identical password. However, how important is the secret parameter? I've set up new NodeBBs 3 times trying to fix this error. They all connect to the same Redis database, but maybe the different secret parameters are problematic?

    {"level":"error","message":"EACCES, open '/var/www/nodebb/socket.log'","timestamp":"2014-05-18T23:02:59.642Z"}
    {"level":"error","message":"[[error:too-many-posts, 10]]","timestamp":"2014-05-19T00:00:00.746Z"}
    {"level":"error","message":"EACCES, open '/var/www/nodebb/socket.log'","timestamp":"2014-05-19T03:53:33.291Z"}
    {"level":"error","message":"[[error:too-many-posts, 10]]","timestamp":"2014-05-19T11:00:00.564Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T06:34:07.650Z"}
    {"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n    at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n    at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n    at Socket.EventEmitter.emit (events.js:95:17)\n    at net.js:440:14\n    at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-20T06:35:36.711Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T12:09:24.813Z"}
    {"level":"error","message":"Error: Ready check failed: NOAUTH Authentication required.\n    at RedisClient.on_info_cmd (/home/user/nodebb/node_modules/redis/index.js:368:35)\n    at Command.callback (/home/user/nodebb/node_modules/redis/index.js:418:14)\n    at RedisClient.return_error (/home/user/nodebb/node_modules/redis/index.js:558:25)\n    at ReplyParser.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:305:18)\n    at ReplyParser.EventEmitter.emit (events.js:95:17)\n    at ReplyParser.send_error (/home/user/nodebb/node_modules/redis/lib/parser/javascript.js:296:10)\n    at ReplyParser.execute (/home/user/nodebb/node_modules/redis/lib/parser/javascript.js:181:22)\n    at RedisClient.on_data (/home/user/nodebb/node_modules/redis/index.js:534:27)\n    at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:91:14)\n    at Socket.EventEmitter.emit (events.js:95:17)","timestamp":"2014-05-20T12:32:21.109Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T18:28:59.064Z"}
    {"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n    at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n    at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n    at Socket.EventEmitter.emit (events.js:95:17)\n    at net.js:440:14\n    at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-20T20:10:51.862Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-21T00:08:22.352Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-21T06:13:14.637Z"}
    {"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n    at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n    at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n    at Socket.EventEmitter.emit (events.js:95:17)\n    at net.js:440:14\n    at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-21T13:23:22.852Z"}
    {"level":"error","message":"TypeError: Cannot read property 'name' of null\n    at /home/user/nodebb/src/categories.js:256:46\n    at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n    at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n    at Array.forEach (native)\n    at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n    at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n    at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n    at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n    at /home/user/nodebb/src/categories.js:255:10\n    at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-22T00:12:40.546Z"}
    {"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n    at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n    at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n    at Socket.EventEmitter.emit (events.js:95:17)\n    at net.js:440:14\n    at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-22T00:42:07.066Z"}
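
A log like the one above is easier to read once the entries are tallied by type. Here is a minimal Python sketch, assuming one JSON object per line as shown; `summarize_errors` is a hypothetical helper name and the log path is passed on the command line because its location is not stated here:

```python
import json
import sys
from collections import Counter


def summarize_errors(lines):
    """Count error entries by the first line of their message."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict) or entry.get("level") != "error":
            continue
        # Stack traces follow the first newline; keep only the headline.
        headline = str(entry.get("message", "")).split("\n", 1)[0]
        counts[headline] += 1
    return counts


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as fh:
        for headline, count in summarize_errors(fh).most_common():
            print(f"{count:4d}  {headline}")
```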
    
  • This if I'm honest is way over my head (the experts are @julian and @baris) - what I can tell you for sure is that it might be something other than you running out of memory. You said:

    I initially used Redis due to speed on a 2GB system. With the database sizes I had seen as well as my limited number of active forum users (~35), I didn't think it'd be a problem.
    However, Redis is still dying once per day.

    This forum is pretty active and we were running it on a 512MB VPS with a few other NodeBBs (a couple of my own personal forums), WP blogs, etc. all on one system. We never ran into this, so I'm wondering how you'd be running into memory issues with so few users. You're right, though, that the csrf error is indicative of the database crashing or the connection dropping - see gh#1554

  • @psychobunny Thanks. I've switched the loglevel of Redis to debug and I'll parse through the output once it dies. That your forum runs so well on 512MB is good to know. I believe I could use one of the free Amazon AWS micro-tiers with 512MB to run mine if I can't fix the problem with my configuration. I'll probably stick with CentOS 6 unless you guys, as the devs, have any opinions one way or another.

  • So far going strong. I did find a few warnings in my /var/log/nginx/error.log though:

    2014/05/22 11:59:50 [warn] 2639#0: *115790 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/0/05/0000000050 while reading upstream, client: 165.124.*.186, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
    2014/05/22 12:03:27 [warn] 2639#0: *116210 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/1/05/0000000051 while reading upstream, client: 165.124.*.*, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
    2014/05/22 12:05:59 [warn] 2639#0: *116461 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/2/05/0000000052 while reading upstream, client: *.249.88.32, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
    2014/05/22 12:06:00 [warn] 2639#0: *116463 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/3/05/0000000053 while reading upstream, client: *.249.88.32, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
    

    It's been up and running for 6 hours now. I think that's a record. I added a 60s timeout in redis.conf and adjusted the TCP keepalive, as well as increasing the maximum number of connections in /etc/sysctl.conf.

  • Guiri's Guide to Fixing Redis Problems

    I'm including in this post the various fixes and optimizations that have made Redis stable on an RHEL 6.5 system with 2GB of memory. I currently run NodeBB/Redis behind a Varnish Cache + nginx combo. Switching to nginx was easy and significantly cut my memory usage.

    I am merely combining the advice given to me by other experts. The credit is shared by @julian, @psychobunny, @eva2000, @a_5mith, and others.

    /etc/sysctl.conf

    1. net.core.somaxconn = 1024
    2. vm.overcommit_memory = 1

    /etc/redis.conf

    Close dead and idle peers to free up connections. This was my biggest problem; it would eventually cause NodeBB to fail to connect and elicit the _csrf error. On RHEL/CentOS, Redis does not seem to be configured to close idle connections by default. Thus, the time until your forum crashes is predictable: idle connections pile up at a roughly linear rate, where the slope (beta) is your average number of visitors.

    1. timeout 120
    2. tcp-keepalive 60
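
The linear build-up described above can be illustrated with a back-of-the-envelope calculation. This is only a sketch: `hours_until_exhaustion` is a hypothetical helper, the 4064 client slots come from the Redis log earlier in the thread, and the leak rate of 170 connections per hour is an invented number chosen purely for illustration.

```python
def hours_until_exhaustion(maxclients, open_now, leaked_per_hour):
    """Hours until idle connections fill every client slot, assuming linear growth."""
    if leaked_per_hour <= 0:
        return float("inf")  # nothing leaks, so the slots never run out
    remaining = max(maxclients - open_now, 0)
    return remaining / leaked_per_hour


if __name__ == "__main__":
    # 4064 slots (from the log); 170 leaked connections per hour is made up.
    hours = hours_until_exhaustion(maxclients=4064, open_now=0, leaked_per_hour=170)
    print(f"slots exhausted after roughly {hours:.1f} hours")
```

With a timeout set, idle connections are closed instead of accumulating, so the leak rate in this model drops toward zero.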

    /etc/security/limits.conf

    Increase the maximum number of open files for Redis and your web server.

    redis soft nofile 65536
    redis hard nofile 65536
    nginx soft nofile 262144
    nginx hard nofile 262144
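
Entries in limits.conf are applied at login through PAM, so a daemon started from an init script may not pick them up. To check what a running process really got, here is a minimal Python sketch, assuming Linux's /proc/<pid>/limits layout; `max_open_files` is a hypothetical helper name and the pid is passed on the command line:

```python
import sys


def max_open_files(limits_text):
    """Extract (soft, hard) 'Max open files' values from /proc/<pid>/limits text."""
    label = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(label):
            soft, hard = line[len(label):].split()[:2]
            return soft, hard  # kept as strings; either may be 'unlimited'
    return None


if __name__ == "__main__" and len(sys.argv) > 1:
    pid = sys.argv[1]  # e.g. the pid shown by: ps -ax | grep redis
    with open(f"/proc/{pid}/limits") as fh:
        print(max_open_files(fh.read()))
```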

    
    Using Varnish with nginx
    --------------------------------
    
    There is some debate about whether it is more efficient to use nginx's caching capabilities.  The additional memory overhead of using Varnish with nginx is minimal and the scaling performance on my old Core2Duo Mac Mini is incredible as measured by `ab` and `httperf`. My limited understanding is that it recycles old connections, which in turn reduces the load on NodeBB and Redis.  Enabling this doubled the time it took to crash Redis, but the eventual fix was modifying `redis.conf` to time out old connections.
    
    **Varnish**
    The default [Varnish instructions](https://docs.nodebb.org/en/latest/configuring/proxies/varnish.html) work well, but still need to be updated to 4.0 standards.  Since RHEL/CentOS ships an older version anyway, install the 3.0 branch from [here](http://repo.varnish-cache.org/redhat/varnish-3.0/el6/), as the syntax is compatible. Varnish listens on port `6081`, so we just need to slightly modify the nginx default.conf to make it work.
    
    Enable Varnish to start on boot: `sudo chkconfig varnish on`
    
    **nginx**
    Once again, you'll first need to install the latest nginx from [their repo](http://wiki.nginx.org/Install). The only line I believe you need to modify at this point is the proxy_pass port:
    
    

    server {
        listen 80;

        server_name domain.com *.domain.com;

        location / {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_set_header X-NginX-Proxy true;

            # Send requests to Varnish (port 6081) instead of straight to NodeBB (port 4567)
            proxy_pass http://127.0.0.1:6081/;
            #proxy_pass http://127.0.0.1:4567/;
            proxy_redirect off;

            # Socket.IO Support
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }

    
    Last, make sure you enable nginx to start on boot: `sudo chkconfig nginx on`

