NodeBB 1.11.2 Error Connect EMFILE
-
We've been getting this quite a bit recently. We are on 1.11.2. We get scores of these errors as the site gets flaky and eventually crashes completely. A restart brings it back up fine and it runs for a while without an issue. It's been happening for a few weeks now.
2019-03-01T17:06:04.467Z [4567/4257] - error: connect EMFILE 35.172.4.119:465 - Local (undefined:undefined)
Error: connect EMFILE 35.172.4.119:465 - Local (undefined:undefined)
    at internalConnect (net.js:932:16)
    at defaultTriggerAsyncIdScope (internal/async_hooks.js:294:19)
    at defaultTriggerAsyncIdScope (net.js:1024:9)
    at process._tickCallback (internal/process/next_tick.js:61:11)
{"errno":"EMFILE","code":"ECONNECTION","syscall":"connect","address":"35.172.4.119","port":465,"command":"CONN"}
-
@scottalanmiller @baris, too many open files?
Have you tried raising ulimit?
-
No, you think that we might be short on file handles due to volume?
-
@julian said in NodeBB 1.11.2 Error Connect EMFILE:
@scottalanmiller @baris, too many open files?
Have you tried raising ulimit?
Raised it now. We will see...
-
@julian said in NodeBB 1.11.2 Error Connect EMFILE:
@scottalanmiller @baris, too many open files?
Have you tried raising ulimit?
Raised it dramatically. Seemed to help for a little while. But also updated to 1.12.0 so that might have had an effect.
But now getting this...
2019-03-13T15:42:05.728Z [4567/31594] - error: connect EMFILE 52.10.168.253:465 - Local (undefined:undefined) {"errno":"EMFILE","code":"ECONNECTION","syscall":"connect","address":"52.10.168.253","port":465,"command":"CONN"}
2019-03-13T15:42:05.728Z [4567/31594] - error: connect EMFILE 52.10.168.253:465 - Local (undefined:undefined) {"errno":"EMFILE","code":"ECONNECTION","syscall":"connect","address":"52.10.168.253","port":465,"command":"CONN"}
2019-03-13T15:42:05.728Z [4567/31594] - error: connect EMFILE 52.10.168.253:465 - Local (undefined:undefined) {"errno":"EMFILE","code":"ECONNECTION","syscall":"connect","address":"52.10.168.253","port":465,"command":"CONN"}
-
Could you check the open file descriptors with
lsof -p <PID>
periodically? My guess is that you're running into limitations because you have so many open websockets. Perhaps running more NodeBB processes would help.
-
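For the periodic check suggested above, something like the following could work. It is only a minimal sketch, assuming Linux with /proc mounted; count_fds and watch_fds are made-up helper names, not part of NodeBB or lsof. It counts the entries under /proc/<PID>/fd, which are the same descriptors lsof -p <PID> reports (lsof also lists entries such as the working directory and memory-mapped files, so its line count will be somewhat higher).

```shell
# Count the open file descriptors of a process (Linux only, reads /proc).
count_fds() {
  ls "/proc/$1/fd" | wc -l
}

# Print a timestamped descriptor count every INTERVAL seconds, COUNT times.
# Usage: watch_fds PID [INTERVAL] [COUNT]
watch_fds() {
  pid=$1
  interval=${2:-60}
  remaining=${3:-10}
  while [ "$remaining" -gt 0 ]; do
    echo "$(date -u +%Y-%m-%dT%H:%M:%SZ) pid=$pid open_fds=$(count_fds "$pid")"
    remaining=$((remaining - 1))
    # Skip the sleep after the final sample.
    if [ "$remaining" -gt 0 ]; then
      sleep "$interval"
    fi
  done
}
```

Calling watch_fds <PID> 60 10 with the NodeBB process ID would print ten samples a minute apart. A count that keeps climbing toward the open-file limit points at a leak or at sheer connection volume, while a flat count well below the limit suggests the problem is elsewhere.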
Wait, I think the shell was holding onto the old ulimit. Testing again.
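Since ulimit -n only changes the limit for the current shell and whatever is started from it, it can be worth confirming what limit the running process actually got. A rough sketch, again assuming Linux; nofile_limit is a hypothetical helper name and it simply reads the "Max open files" row of /proc/<PID>/limits:

```shell
# Print the soft "Max open files" limit of a running process.
# Linux only: parses /proc/<PID>/limits, where the 4th field of that row is
# the soft limit and the 5th is the hard limit.
nofile_limit() {
  awk '/^Max open files/ { print $4 }' "/proc/$1/limits"
}

# Soft limit of this shell, which processes started from here will inherit.
ulimit -n

# The same value as the kernel reports it for this shell's PID.
nofile_limit $$
```

Running nofile_limit <PID> against the NodeBB process shows whether the raised value took effect. If it still shows the old number, the process was most likely started from a session or service that never picked up the new limit and needs a restart from one that has it.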
-
@scottalanmiller Fluentd hosts some nice configs (distilled from a presentation from a Netflix engineer about tuning their servers) https://docs.fluentd.org/v1.0/articles/before-install
Perhaps this could help? Also check out ss -s; what is TIMEWAIT?
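If ss is not handy, the TIME_WAIT count can also be read from /proc/net/sockstat on Linux, where it appears as the "tw" value on the TCP line. This is just a sketch under that assumption; tw_count is a made-up helper name and the optional file argument exists only to make it easy to try against a saved copy.

```shell
# Print the number of TCP sockets in TIME_WAIT.
# Reads the "tw" value from the "TCP:" line of /proc/net/sockstat (Linux),
# or from the file given as the first argument.
tw_count() {
  awk '/^TCP:/ {
    # After the "TCP:" label the line is a series of name/value pairs.
    for (i = 2; i < NF; i += 2)
      if ($i == "tw") print $(i + 1)
  }' "${1:-/proc/net/sockstat}"
}

# Usage:
#   tw_count                     # live value on this machine
#   tw_count saved_sockstat.txt  # value from a saved copy
```

A very large TIME_WAIT number alongside the EMFILE errors would suggest lots of short-lived outbound connections being opened and closed, which the kernel tuning in the Fluentd article is meant to address.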