[Solved] Socket.io 400/500 Errors

  • After spending a few days debugging a problem with our forum and nginx, I've finally found a solution to the massive number of HTTP 400 errors we've seen when loading the NodeBB forum. These problems normally only manifest themselves under high load; in our case, more than 500 connected clients.

    In the browser console you will see responses like this (for the failed connections):
    {"code":1,"message":"Session ID unknown"}

    Background

    • When it receives a 4xx error, the nginx proxy will by default take the errant upstream out of rotation for 10 seconds
    • While upstream-A is unavailable, ip_hash routes all of A's requests to upstream-B instead
    • Unfortunately, when upstream-B gets the new requests, it (correctly) responds with 4xx errors because the SID is not found in this.clients
    • That gets upstream-B taken out of rotation as well, and its requests are routed to upstream-C
    • and so on...
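
    To make the starting point concrete, here is a sketch of the same upstream with nginx's documented per-server defaults (max_fails=1, fail_timeout=10s) written out explicitly. It should behave the same as listing the three servers with no parameters at all, which is the setup in which we saw the cascade. The upstream name and ports are the ones used in this post.

    ```nginx
    upstream io_nodes {
       ip_hash;
       # max_fails=1 and fail_timeout=10s are the defaults: one failed
       # attempt marks the server unavailable for 10 seconds.
       server 127.0.0.1:4567 max_fails=1 fail_timeout=10s;
       server 127.0.0.1:4568 max_fails=1 fail_timeout=10s;
       server 127.0.0.1:4569 max_fails=1 fail_timeout=10s;
    }
    ```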

    The Solution

    Set max_fails on each upstream server to something higher than the default (1).

    Example:

    upstream io_nodes {
       ip_hash;
       server 127.0.0.1:4567 max_fails=50;
       server 127.0.0.1:4568 max_fails=50;
       server 127.0.0.1:4569 max_fails=50;
    }
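
    For context, a minimal sketch of how a server block might reference this upstream. The listen port, the server_name forum.example.com, and the choice to proxy everything under / are assumptions for illustration, not our actual configuration; adjust them to your own setup.

    ```nginx
    server {
       listen 80;                        # assumed port
       server_name forum.example.com;    # hypothetical hostname

       location / {
          proxy_set_header Host $http_host;
          proxy_set_header X-Real-IP $remote_addr;
          proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

          # Send requests to the upstream group defined above.
          proxy_pass http://io_nodes;

          # Needed so websocket upgrade requests pass through the proxy.
          proxy_http_version 1.1;
          proxy_set_header Upgrade $http_upgrade;
          proxy_set_header Connection "upgrade";
       }
    }
    ```

    Because ip_hash picks the upstream from the client address, a client keeps hitting the same NodeBB process as long as that server stays in rotation, which is why the higher max_fails matters.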
    

    I suggest that someone update the NodeBB documentation to include this in the nginx examples.

  • @hek do you mind reposting this on the issue tracker? I'm going to make a new issue tag for documentation-related problems so I can more easily track them. Thanks.

