NodeBB periodically hangs



  • NodeBB hangs every few hours, forcing me to restart the server. Neither the NodeBB log nor the MongoDB log shows any error messages, and I can't reproduce the hang on demand.

    What causes this? Is there a tool one can use to detect this and automatically restart the server?



  • You aren't alone here. I've been dealing with the same thing. It's weird because the supervisor service still sees it as online, but nginx returns a 502. I have yet to find a good way to deal with this. It seems to have started on 1.0.3, so I've been considering going back down to 1.0.2.



    • What's your server's hardware?
    • What operating system runs on your server?
    • What's your Redis version?
    • What's your NGINX version?
    • What's your NGINX config?
    • What else is running on the server which might interfere with NodeBB / Redis (especially I/O-intensive tasks like backups)?
    • Is your server using swap at all / a lot of swap during the times of lag?
    • Have you tried running NodeBB in verbose mode?

    Best regards

    Bent



    • What's your server's hardware?
      Core™ i3-2130/3240 3.4 GHz+ | 8 GB Ram
    • What operating system runs on your server?
      CentOS 7.2
    • What's your Redis/Mongo version?
      mongodb 3.2.7
    • What's your NGINX version?
      nginx 1.6.3
    • What's your NGINX config?
    server {
        listen         80;
        server_name     domain.com www.domain.com;
        return         301 https://www.domain.com/$request_uri;
    }
    server {
        ssl_certificate /etc/letsencrypt/live/www.domain.com/fullchain.pem;
        ssl_certificate_key /etc/letsencrypt/live/www.domain.com/privkey.pem;
    
        listen 443 ssl;
    
        add_header Strict-Transport-Security "max-age=31536000";
        server_name www.domain.com;
    
        location / {
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header Host $http_host;
            proxy_set_header X-NginX-Proxy true;
    
            proxy_pass http://127.0.0.1:4567/;
            proxy_ssl_session_reuse off;
            proxy_redirect off;
    
            # Socket.IO Support
            proxy_http_version 1.1;
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
        }
    }
    
    • What else is running on the server which might interfere with NodeBB / Redis (especially I/O-intensive tasks like backups)?
      Confluence. The server doesn't even hiccup right now though. Not much traffic.
    • Is your server using swap at all / a lot of swap during the times of lag?
      Negative. It's not lag. The server just stops, but the daemon stays up. The only fix is to kill the service and start it up again.
    • Have you tried running NodeBB in verbose mode?
      Not recently. Running in dev and logging to a file now. I'll provide relevant stuff the next time it crashes.

  • Community Rep

    Are you using the stable v1.0.3 or master?



  • @yariplus Good question. I'm honestly not sure. I just followed the install directions. I think I'm on the stable. :3 Is there an easy way to tell? My logs just say "Initializing NodeBB v1.0.3".


  • Plugin & Theme Dev

    @L33t git rev-parse HEAD



  • @pichalite That'd work if I didn't just scp it to another server. :)


  • Community Rep

    Try grep -c filter:hotswap src/webserver.js

    If it prints 0, you are on stable.


  • GNU/Linux

    We see similar issues. Some days we have to restart many times; other days it keeps going. We've been tracking it here:

    We're running stuff under docker, which makes things more complicated, but I've had luck in a local dev/test environment with using Linux perf tools and building flamegraphs. For raisins, we haven't done this yet in docker, though you can see how to make that work here.

    However, if you're not running docker, it's a little bit easier. Follow the instructions here:

    He doesn't go into it there, but if you're on Ubuntu you'll need to install:

    • linux-tools-common
    • linux-cloud-tools-common

    ...and then kernel specific versions of those based on what you're running in order to use perf.

    If you get there before we do, I'm eager to hear what anyone discovers with this.



  • @yariplus I'm on stable.

    @boomzilla Thanks for the info. I'm on CentOS and not using docker. I'll take a look at that sometime soon. It's late, though, and it's been a long day.



  • This is pretty hacky but it's worked beautifully so far: I wrote a Ruby script that tries to load my page. If it fails, it restarts the server. This is run in a cron job every minute. Here's my code:

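    require 'net/http'
    
    begin
    	print "Downloading the page..."
    	# Timeouts catch a hung server. For HTTPS, use port 443 and add :use_ssl => true.
    	res = Net::HTTP.start("yourdomainhere.com", 80, :open_timeout => 5, :read_timeout => 5) do |http|
    		http.get('/')
    	end
    	# Behind nginx, a dead upstream can surface as a 5xx such as 502 rather than a timeout.
    	raise "HTTP #{res.code}" if res.is_a?(Net::HTTPServerError)
    	puts " success!"
    rescue Net::OpenTimeout, Net::ReadTimeout, SystemCallError, RuntimeError => e
    	print " failed (#{e.message}).\n\nRestarting the server..."
    	# Assumes the script is run from the NodeBB directory.
    	puts(system('./nodebb restart') ? " success!" : " restart failed!")
    end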
    

    Until the cause of this is figured out, this or something like it might be a good enough workaround. It sure beats me manually restarting it after my users throw a fit.



  • @LukeLaupheimer That's beautifully terrible. Once it crashes again and I (maybe) get some useful info.. I'll do something like that.

    I don't ruby. Is that a loop? Or do I need to set a cron to run it every x minutes? Also.. my http is a 301 to https. Will that work with https also?



  • @L33t I set up a cron job so that it wouldn't be subject to the server resetting. I wanted it to be somewhat resilient in spite of its hackiness.

    That said, this is just treating a symptom. I absolutely do not advocate for this approach on a permanent basis, only until the true cause has been revealed and addressed. It's the software engineers' equivalent of filling a hole in a dam with a big wad of gum. It buys you time. It doesn't solve the real issue.
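
    On the HTTPS question: below is a minimal sketch of the same check that talks TLS and follows the 301 from http to https. It is not a tested production script. The names `healthy?` and `check_and_restart`, the file name healthcheck.rb, and the URL and NodeBB directory passed on the command line are all assumptions made for illustration.

```ruby
require 'net/http'
require 'openssl'
require 'uri'

# Hypothetical helper: true if `url` answers with a 2xx status within
# `timeout` seconds, following up to `redirects` redirects
# (for example the 301 from http to https).
def healthy?(url, timeout: 5, redirects: 3)
  uri = URI(url)
  res = Net::HTTP.start(uri.host, uri.port,
                        use_ssl: uri.scheme == 'https',
                        open_timeout: timeout,
                        read_timeout: timeout) do |http|
    http.get(uri.request_uri)
  end
  if res.is_a?(Net::HTTPRedirection) && res['location'] && redirects > 0
    target = URI.join(url, res['location']).to_s
    return healthy?(target, timeout: timeout, redirects: redirects - 1)
  end
  res.is_a?(Net::HTTPSuccess)
rescue Timeout::Error, SystemCallError, SocketError, EOFError,
       OpenSSL::SSL::SSLError
  # Timeouts, refused connections, DNS and TLS failures all count as "down".
  false
end

# Hypothetical helper: restart NodeBB from its own directory if the check fails.
def check_and_restart(url, nodebb_dir)
  return true if healthy?(url)
  # chdir because cron does not start in the NodeBB directory.
  Dir.chdir(nodebb_dir) { system('./nodebb', 'restart') }
end

# Usage: ruby healthcheck.rb https://www.yourdomainhere.com/ /path/to/nodebb
check_and_restart(ARGV[0], ARGV[1]) if ARGV.length == 2
```

    It is not a loop; it runs once and exits, so it still needs a cron entry, something like `* * * * * ruby /path/to/healthcheck.rb https://www.yourdomainhere.com/ /path/to/nodebb` (both paths are placeholders). A 502 from nginx counts as a failure here because only 2xx responses are treated as healthy.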


  • Swedes

    In my case the issue was an old version of Node.js. Try updating it.



  • After updating to the latest Node.js version, everything works fine. I recommend using nvm to switch between Node.js versions.


  • Swedes

    Cool NVM, something learned every day :) Good to know for dev sites!



  • The bad news is that it crashed with absolutely no errors a few minutes ago. The good news is that it actually recovered on its own in dev mode. I'll try updating Node.js as recommended.



  • Looks like moving to the latest Node.js did not help. It just happened again.


  • GNU/Linux

    We're running our instance with --perf-basic-prof-only-functions now and are able to generate flamegraphs. Just waiting for the servercooties to strike now...

    Example flamegraph: [image: nodebb flamegraph]

