Redis advice on low memory system
-
@Guiri The symptoms you posted don't seem to point to your system running out of memory, and I don't think Redis itself is the cause.
Try running `free -m`. Take note of the `free` column in the `buffers/cache` row; this is your actual amount of free memory. Unless it's near zero, you should be fine.
NodeBB in Redis really doesn't take too much space either. As a point of reference, this forum used to run on a 512MB cloud server and currently has a database size of 44.12M. We still have some room to grow.
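A minimal sketch of pulling that number out of `free -m` in a script. The function name `real_free_mb` is hypothetical. It assumes the older procps layout with a `-/+ buffers/cache` row (as on RHEL 6); newer versions of `free` drop that row, so the sketch falls back to the `available` column of the `Mem:` row there.

```shell
# real_free_mb (hypothetical helper): reads `free -m` output on stdin and
# prints the memory that is actually free once buffers/cache are discounted.
real_free_mb() {
    awk '
        # Older procps: the "free" column of the "-/+ buffers/cache" row.
        $1 == "-/+" && $2 == "buffers/cache:" { print $4; found = 1; exit }
        # Newer procps: remember the 7th field of the "Mem:" row ("available").
        $1 == "Mem:" { available = $7 }
        END { if (!found && available != "") print available }
    '
}
```

On the server itself this would be used as `free -m | real_free_mb`, which prints a single number in megabytes.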
Can you run `./nodebb log` in your NodeBB directory and see what the error is when it crashes? Provide a full stack trace if possible (it should be in the log if an error did occur).
-
I whipped up a shell script at https://gist.github.com/centminmod/7d3e562fb87fa8ef263a to gather Redis and system memory usage info. It might be helpful.
-
Thanks, everyone. I'll try this out later. I did increase the ulimit to unlimited for Redis last night when I posted, and it still crashed this morning. The problem is that although it's the `csrf` error, both NodeBB and Redis are still running when I log in, and restarting either one fixes it.
@eva2000 RHEL 6.5
-
Use my script to check the real and system-reported Redis memory usage, not just the Redis DB size itself.
I re-read your first post about the jumps in memory usage/size each time it dies. Maybe they're related to some automated backup policy you have in place, or to Redis' persistence snapshot settings, as I mentioned at https://community.nodebb.org/topic/932/redis-useful-info#6668 ? That matches the errors in your log, particularly:
[23425] 20 May 20:12:46.979 * Saving the final RDB snapshot before exiting.
[23425] 20 May 20:12:47.469 * DB saved on disk
[23425] 20 May 20:12:47.469 * Removing the pid file.
[23425] 20 May 20:12:47.470 # Redis is now ready to exit, bye bye...
There's a 0.727-second gap between that and the maxclients error that follows.
-
Oh, and I only just noticed: `vm.overcommit_memory=1` should be placed in `/etc/sysctl.conf`, and you need to run `sysctl -p` for the change to take effect. Your first post listed the wrong file to place it in. I'd double-check that you have the setting in the right file. You can check over SSH with the command below:
sysctl -a | grep 'vm.overcommit_memory'
should give
vm.overcommit_memory = 1
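The same check can be scripted so that the value in the file and the value the kernel is actually using are compared side by side. This is only a sketch: `conf_value` is a hypothetical helper, and it assumes the plain `key = value` layout of `/etc/sysctl.conf`.

```shell
# conf_value KEY FILE (hypothetical helper): print the last value assigned
# to KEY in a sysctl.conf-style file, ignoring comment lines.
conf_value() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }                 # skip comment lines
        {
            k = $1; gsub(/[ \t]/, "", k)       # strip whitespace around the key
            if (k == key) { v = $2; gsub(/[ \t]/, "", v); val = v }
        }
        END { if (val != "") print val }
    ' "$2"
}

# Example comparison (run on the server itself):
#   configured=$(conf_value vm.overcommit_memory /etc/sysctl.conf)
#   running=$(sysctl -n vm.overcommit_memory)
#   [ "$configured" = "$running" ] || echo "file says $configured, kernel says $running"
```

If the two values differ, the file has been edited but `sysctl -p` has not been run yet (or the setting lives in a different file).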
-
@eva2000 said:
Thanks. I mistyped, and it was `/etc/sysctl.conf`.
I've uploaded the output from the script here: http://fpaste.org/103837/14006865/
I'll read through your post on Redis persistence and perhaps upload my config. I agree that the timing between saving and NodeBB being unable to connect is interesting. Even more interesting is that it's still running:
ps -ax | grep redis
-
df -hT
Filesystem                  Type   Size  Used Avail Use% Mounted on
/dev/mapper/vg_pmc-lv_root  ext4    50G   11G   36G  24% /
tmpfs                       tmpfs  924M   72K  923M   1% /dev/shm
/dev/sda1                   ext4   485M   83M  378M  18% /boot
/dev/mapper/vg_pmc-lv_home  ext4    94G  2.6G   86G   3% /home
All is good. I was running tmp on `tmpfs` a while ago, but given the memory issues I stopped doing that a month ago.
-
@eva2000 The bad news is that I modified the `nofile` parameters, restarted, and it still died... The good news is that NodeBB started leaving errors in the error log! The `Error: Ready check failed: NOAUTH Authentication required.` error is puzzling, since my config.json and Redis have an identical password. However, how important is the `secret` parameter? I've set up new NodeBB installs three times trying to fix this error. They all connect to the same Redis database, but maybe the different secret parameters are problematic?
{"level":"error","message":"EACCES, open '/var/www/nodebb/socket.log'","timestamp":"2014-05-18T23:02:59.642Z"}
{"level":"error","message":"[[error:too-many-posts, 10]]","timestamp":"2014-05-19T00:00:00.746Z"}
{"level":"error","message":"EACCES, open '/var/www/nodebb/socket.log'","timestamp":"2014-05-19T03:53:33.291Z"}
{"level":"error","message":"[[error:too-many-posts, 10]]","timestamp":"2014-05-19T11:00:00.564Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T06:34:07.650Z"}
{"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n at Socket.EventEmitter.emit (events.js:95:17)\n at net.js:440:14\n at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-20T06:35:36.711Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T12:09:24.813Z"}
{"level":"error","message":"Error: Ready check failed: NOAUTH Authentication required.\n at RedisClient.on_info_cmd (/home/user/nodebb/node_modules/redis/index.js:368:35)\n at Command.callback (/home/user/nodebb/node_modules/redis/index.js:418:14)\n at RedisClient.return_error (/home/user/nodebb/node_modules/redis/index.js:558:25)\n at ReplyParser.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:305:18)\n at ReplyParser.EventEmitter.emit (events.js:95:17)\n at ReplyParser.send_error (/home/user/nodebb/node_modules/redis/lib/parser/javascript.js:296:10)\n at ReplyParser.execute (/home/user/nodebb/node_modules/redis/lib/parser/javascript.js:181:22)\n at RedisClient.on_data (/home/user/nodebb/node_modules/redis/index.js:534:27)\n at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:91:14)\n at Socket.EventEmitter.emit (events.js:95:17)","timestamp":"2014-05-20T12:32:21.109Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-20T18:28:59.064Z"}
{"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n at Socket.EventEmitter.emit (events.js:95:17)\n at net.js:440:14\n at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-20T20:10:51.862Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-21T00:08:22.352Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-21T06:13:14.637Z"}
{"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n at Socket.EventEmitter.emit (events.js:95:17)\n at net.js:440:14\n at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-21T13:23:22.852Z"}
{"level":"error","message":"TypeError: Cannot read property 'name' of null\n at /home/user/nodebb/src/categories.js:256:46\n at /home/user/nodebb/node_modules/async/lib/async.js:227:13\n at /home/user/nodebb/node_modules/async/lib/async.js:111:13\n at Array.forEach (native)\n at _each (/home/user/nodebb/node_modules/async/lib/async.js:32:24)\n at async.each (/home/user/nodebb/node_modules/async/lib/async.js:110:9)\n at _asyncMap (/home/user/nodebb/node_modules/async/lib/async.js:226:9)\n at Object.map (/home/user/nodebb/node_modules/async/lib/async.js:204:23)\n at /home/user/nodebb/src/categories.js:255:10\n at /home/user/nodebb/node_modules/redis/index.js:1138:13","timestamp":"2014-05-22T00:12:40.546Z"}
{"level":"error","message":"Error: Redis connection to 127.0.0.1:6379 failed - connect ECONNREFUSED\n at RedisClient.on_error (/home/user/nodebb/node_modules/redis/index.js:185:24)\n at Socket.<anonymous> (/home/user/nodebb/node_modules/redis/index.js:95:14)\n at Socket.EventEmitter.emit (events.js:95:17)\n at net.js:440:14\n at process._tickDomainCallback (node.js:463:13)","timestamp":"2014-05-22T00:42:07.066Z"}
-
To be honest, this is way over my head (the experts are @julian and @baris), but what I can tell you is that it might be something other than you running out of memory. You said:
I initially used Redis due to speed on a 2GB system. With the database sizes I had seen as well as my limited number of active forum users (~35), I didn't think it'd be a problem. However, Redis is still dying once per day.
This forum is pretty active, and we were running it on a 512MB VPS alongside a few other NodeBBs (a couple of my own personal forums), WP blogs, etc., all on one system. We never ran into this, so I'm wondering how you'd be running into memory issues with so few users. You're right, though: the `csrf` error is indicative of the database crashing or the connection dropping; see gh#1554.
-
@psychobunny Thanks. I've switched the `loglevel` of Redis to debug, and I'll parse through the output once it dies. That your forum runs so well on 512MB is good to know. I believe I could use one of the free Amazon AWS micro-tiers with 512MB to run mine if I can't fix the problem with my configuration. I'll probably stick with CentOS 6 unless you guys, as the devs, have any opinions one way or another.
-
So far going strong. I did find a few warnings in my `/var/log/nginx/error.log`, though:
2014/05/22 11:59:50 [warn] 2639#0: *115790 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/0/05/0000000050 while reading upstream, client: 165.124.*.186, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
2014/05/22 12:03:27 [warn] 2639#0: *116210 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/1/05/0000000051 while reading upstream, client: 165.124.*.*, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
2014/05/22 12:05:59 [warn] 2639#0: *116461 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/2/05/0000000052 while reading upstream, client: *.249.88.32, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
2014/05/22 12:06:00 [warn] 2639#0: *116463 an upstream response is buffered to a temporary file /var/cache/nginx/proxy_temp/3/05/0000000053 while reading upstream, client: *.249.88.32, server: domain.com, request: "GET /nodebb.min.js?v0.3.2-1701-g05872ad HTTP/1.1", upstream: "http://127.0.0.1:6081/nodebb.min.js?v0.3.2-1701-g05872ad", host: "domain.com", referrer: "http://domain.com/"
It's been up and running for 6 hours now. I think that's a record. I added a 60s timeout in `redis.conf` and adjusted the `tcp-keepalive` setting, as well as increasing the maximum number of connections in /etc/sysctl.conf.
-
Guiri's Guide to Fixing Redis Problems
I'm including in this post the various fixes and optimizations that have made Redis stable on an RHEL 6.5 system with 2GB of memory. I currently run NodeBB/Redis behind a Varnish Cache + nginx combo. Switching to nginx was easy and significantly cut my memory usage.
I am merely combining the advice given to me by other experts. The credit is shared by @julian, @psychobunny, @eva2000, @a_5mith, and others.
/etc/sysctl.conf
net.core.somaxconn = 1024
vm.overcommit_memory = 1
/etc/redis.conf
Close dead and idle peers to free up connections. This was my biggest problem, and it would eventually cause NodeBB to fail to connect and elicit the `_csrf` error. On RHEL/CentOS, Redis does not seem to be configured by default to close idle connections. Thus, the time until your forum crashes is predictable: open connections accumulate at a linear rate, where the beta is your average number of visitors.
timeout 120
tcp-keepalive 60
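To see whether idle connections are still piling up after this change, the client count can be sampled over time. A small sketch, assuming `redis-cli` is on the path and can reach the server (add `-a` with your password if `requirepass` is set); the helper name `connected_clients` is made up for illustration.

```shell
# connected_clients (hypothetical helper): reads the output of
# `redis-cli info clients` on stdin and prints the connected_clients count.
connected_clients() {
    # INFO replies use CRLF line endings, so strip any carriage returns first.
    tr -d '\r' | awk -F: '$1 == "connected_clients" { print $2 }'
}

# Example (run on the server itself):
#   redis-cli info clients | connected_clients
#   redis-cli config get timeout      # the idle timeout Redis is actually using
#   redis-cli config get maxclients   # the ceiling the count must stay under
```

If the count keeps climbing toward `maxclients` between samples, idle connections are still not being closed.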
/etc/security/limits.conf
Increase the maximum number of open files for Redis and your web server.
redis soft nofile 65536
redis hard nofile 65536
nginx soft nofile 262144
nginx hard nofile 262144

Using Varnish with nginx
--------------------------------
There is some debate about whether it is more efficient to use nginx's caching capabilities instead. The additional memory overhead of using Varnish with nginx is minimal, and the scaling performance on my old Core2Duo Mac Mini is incredible as measured by `ab` and `httperf`. My limited understanding is that it recycles old connections, which in turn reduces the load on NodeBB and Redis. Enabling this doubled the time it took to crash Redis, but the eventual fix was modifying `redis.conf` to time out old connections.

**Varnish**
The default [Varnish instructions](https://docs.nodebb.org/en/latest/configuring/proxies/varnish.html) work well, but still need to be updated to 4.0 standards. Since RHEL/CentOS has an older version, install the 3.0 branch from [here](http://repo.varnish-cache.org/redhat/varnish-3.0/el6/), as the syntax is compatible. Varnish outputs on port `6081`, so we just need to slightly modify the nginx default.conf to make it work. Enable Varnish to start on boot: `sudo chkconfig varnish on`

**nginx**
Once again, you'll first need to install the latest nginx from [their repo](http://wiki.nginx.org/Install). The only line I believe you need to modify at this point is the proxy_pass port:
server {
    listen 80;
    server_name domain.com *.domain.com;

    location / {
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header Host $http_host;
        proxy_set_header X-NginX-Proxy true;

        # Send traffic to Varnish (port 6081) rather than straight to NodeBB (port 4567)
        proxy_pass http://127.0.0.1:6081/;
        #proxy_pass http://127.0.0.1:4567/;
        proxy_redirect off;

        # Socket.IO Support
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Last, make sure you enable nginx to start on boot: `sudo chkconfig nginx on`
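Once everything has been restarted, it is worth confirming that the `nofile` limits from `/etc/security/limits.conf` actually reached the running processes, since the limits a process ends up with depend on how it was started. A sketch that reads `/proc/<pid>/limits`; the function name `max_open_files` is hypothetical, and the process name `redis-server` in the example is an assumption about your install.

```shell
# max_open_files (hypothetical helper): reads a /proc/<pid>/limits file on
# stdin and prints the soft and hard "Max open files" limits.
max_open_files() {
    awk '$1 == "Max" && $2 == "open" && $3 == "files" { print $4, $5 }'
}

# Example (run on the server itself; -s makes pidof return a single PID):
#   max_open_files < "/proc/$(pidof -s redis-server)/limits"
```

The two numbers printed are the soft and hard limits; if they don't match what is in `limits.conf`, the process was started without those limits applied.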