Is there a simple way to analyze long running queries?
-
In NodeBB 1.10.2 I had many 503 errors and therefore decided to increase the event loop lag threshold to 200ms and set the check interval to 1000ms. This made most of the 503 errors disappear. After upgrading to NodeBB 1.11.2, there is suddenly a 503 error every 1-5 seconds again.
CPU load is not an issue, and there is enough memory available. But I notice that the NodeBB and Node.js event loop lag spikes at times, and I would like to find out why, or which queries cause those spikes. Is there a way to get this information from NodeBB? Are there suitable hooks that could be used for such monitoring?
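For context, this is roughly how I imagine the sampling could work, a minimal sketch using Node's built-in perf_hooks API (the 20 ms resolution and 15 s logging interval are arbitrary choices on my part, not NodeBB settings):

```js
// Minimal sketch: sample the Node.js event loop delay with perf_hooks
// (available since Node 11.10). Values are reported in nanoseconds.
const { monitorEventLoopDelay } = require('perf_hooks');

const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

setInterval(() => {
	console.log(
		`event loop delay: mean=${(histogram.mean / 1e6).toFixed(1)}ms ` +
		`p99=${(histogram.percentile(99) / 1e6).toFixed(1)}ms ` +
		`max=${(histogram.max / 1e6).toFixed(1)}ms`
	);
	histogram.reset();
}, 15000);
```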
-
@baris said in Is there a simple way to analyze long running queries?:
If you are running mongod, you can try running
db.currentOp()
to see the running queries during the 503.
Oh, sorry, I might have phrased this a bit misleadingly. I was thinking more of the HTTP requests, i.e. finding out which NodeBB requests run for a long time. I'd like to find out what causes the spikes. A long-running DB query shouldn't cause a spike, since that is I/O and shouldn't affect the Node.js event loop lag, right? If there is additional lag, there has to be a longer-running computation inside NodeBB. Or am I wrong with this assumption?
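What I have in mind is something like the following rough sketch, an Express-style middleware that logs slow requests. As far as I know this is not an existing NodeBB feature; the 200 ms threshold just mirrors my lag setting, and it would have to be wired into NodeBB's Express app (e.g. from a plugin) somehow:

```js
// Rough sketch: log any HTTP request that takes longer than a threshold.
// Note that a slow request is not necessarily blocking the event loop,
// since the time measured here also includes I/O waits.
const SLOW_REQUEST_MS = 200;

function slowRequestLogger(req, res, next) {
	const start = process.hrtime.bigint();
	res.on('finish', () => {
		const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
		if (durationMs > SLOW_REQUEST_MS) {
			console.warn(
				`[slow] ${req.method} ${req.originalUrl} ` +
				`took ${durationMs.toFixed(0)}ms (status ${res.statusCode})`
			);
		}
	});
	next();
}

// usage: app.use(slowRequestLogger); before the routes to be measured
```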
-
I found something new today. First, it seems to have gotten quite bad at times, with lag spikes of up to a second. The second interesting thing is that it happens on a schedule, every 25 minutes. Is there anything in NodeBB that runs every 25 minutes?
-
@baris Yes, they are. But further analysis seems to reveal that it is happening much more often and builds up in spikes before dropping again. These are the measurements for an hour:
The spikes building up are about 1.5-2 minutes apart. I missed them previously because I wasn't zoomed in enough. The measurement is done every 15 seconds. I'm pretty sure it has something to do with the upgrade to NodeBB 1.11.x, since I didn't have such problems in NodeBB 1.10.x. But I still have absolutely no clue what it is, and a very hard time finding ways to analyse this better.
-
@baris One NodeBB process is running.
@julian So far I haven't noticed it outside of production, but my development machine is quite different. Also, I usually don't have Prometheus attached to my development environment, so it's possible I simply don't notice it there. I'll have to look into setting up a second environment without any load but otherwise similar to the production environment. That might take me some time.
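If I do set up that second environment, attaching Prometheus there should be cheap. A minimal sketch of exposing event loop lag with the prom-client package (the port and endpoint are arbitrary; the nodejs_eventloop_lag_seconds metric comes from prom-client's default collectors):

```js
// Minimal sketch: expose Node.js process metrics (including event loop lag)
// on an HTTP endpoint that Prometheus can scrape, e.g. every 15 seconds.
const express = require('express');
const client = require('prom-client');

// registers default metrics such as nodejs_eventloop_lag_seconds
client.collectDefaultMetrics();

const app = express();
app.get('/metrics', async (req, res) => {
	res.set('Content-Type', client.register.contentType);
	res.end(await client.register.metrics());
});
app.listen(9100);
```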
-
@baris said in Is there a simple way to analyze long running queries?:
You can temporarily disable all cron jobs and see if the issue goes away, then we can pinpoint which one causes it. If it is still happening then it's not the cronjobs.
Do you mean in code? Or is there actually a setting to disable them all?
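If it has to be done in code, I guess something like the following hack could work. It suppresses every job scheduled through the cron npm package, assuming NodeBB core uses require('cron').CronJob for its jobs and that this snippet is loaded early in the NodeBB process; tasks scheduled via plain setInterval/setTimeout would not be affected:

```js
// Hack sketch: prevent all cron-package jobs from ever starting.
// Load this as early as possible in the NodeBB process.
const cron = require('cron');

const originalStart = cron.CronJob.prototype.start; // kept in case it should be restored later
cron.CronJob.prototype.start = function () {
	const pattern = this.cronTime && this.cronTime.source;
	console.log('[cron] start() suppressed for job with pattern', pattern);
	// intentionally not calling originalStart, so the job never fires
};
```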