For the first time the #CoSocialCa Mastodon server has started to struggle just a little bit to keep up with the flow of the Fediverse.
-
@rndeon @virtuous_sloth The sidekiq stats here are coming from this Prometheus exporter: https://github.com/Strech/sidekiq-prometheus-exporter
The graph is showing the “sidekiq_queue_latency_seconds” gauge, which is “The number of seconds between the oldest job being pushed to the queue and the current time.”
Prometheus is configured to scrape the data from the exporter every 15 seconds, and Grafana is sampling on a 15s interval as well.
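To make the gauge definition concrete, here is a minimal sketch of the calculation it describes. This is not the exporter's actual code; the function name and the timestamps are made up for illustration.

```python
import time


def queue_latency_seconds(enqueued_at_times, now=None):
    """Seconds between the oldest queued job's enqueue time and now.

    Returns 0.0 for an empty queue. `enqueued_at_times` is a hypothetical
    list of Unix timestamps, one per job still waiting in the queue.
    """
    if now is None:
        now = time.time()
    if not enqueued_at_times:
        return 0.0
    # The oldest job is the one with the smallest enqueue timestamp.
    return max(0.0, now - min(enqueued_at_times))


# Hypothetical queue: three jobs enqueued 12s, 3s and 1s before "now".
now = 1_000.0
print(queue_latency_seconds([now - 12, now - 3, now - 1], now=now))  # 12.0
```

With a 15 second scrape interval, each point on the graph is one such snapshot, so a spike that appears and clears between two scrapes may not show up at all.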
-
@mick @rndeon
It looks to me like the number of queued jobs is fairly low (1-4 queued jobs), and the jobs sometimes arrive as pairs or triples with very little time in between (followed by a long wait for the next bunch, such that the average is still below the average job run time). The second job of a pair has to wait behind the first, so it will necessarily push the latency up. Then, when the first job of the next pair is completed, the latency drops.
-
@mick is it my fault? I think I shared a cat picture on Caturday. (Jk, keep up the great work)
-
@OldManToast the smoking gun I’ve been looking for!
-
This strikes me as an issue.
We have the capacity to run 40 workers (following the change I made last week, documented earlier in this thread.)
We have a fairly large backlog of pull queue jobs.
Why aren’t we running every available worker to clear this backlog?
It might be necessary to designate some threads specifically for the pull queue in order to keep up with whatever is going on here, but I am open to suggestions.
-
@mick I haven’t read the entirety of the thread, so forgive me if that’s already been covered, but have you tried defining your workers with different sequences of queues?
So you could have one service defined as
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q pull -q default -q ingress -q push -q mailers
Another as
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q default -q ingress -q push -q mailers -q pull
Etc.
That way you would have 10 workers prioritising the pull queue, but picking up other queues when capacity is available. And another 10 workers prioritising the default queue, but picking up other queues (including pull) when capacity is available.
You could permute this for some different combination of queue priorities.
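One way to generate such orderings is to rotate the queue list, so each process leads with a different queue. The following is only a sketch: the helper names are made up, and the command string simply reuses the shape of the two examples above.

```python
def rotations(queues):
    """Return every rotation of the queue list, one ordering per process."""
    return [queues[i:] + queues[:i] for i in range(len(queues))]


def sidekiq_command(queues, concurrency=10):
    """Build a command line in the same shape as the examples above."""
    flags = " ".join(f"-q {q}" for q in queues)
    return f"/home/mastodon/.rbenv/shims/bundle exec sidekiq -c {concurrency} {flags}"


QUEUES = ["pull", "default", "ingress", "push", "mailers"]

# Print one command per rotation; the first two match the examples above.
for order in rotations(QUEUES):
    print(sidekiq_command(order))
```

Five rotations at -c 10 would mean 50 threads in total, so on a small server you would probably pick a subset of these or lower the concurrency.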
-
@michael that’s where I’m headed next I think.
I’d hoped that just increasing the number of threads for the single service would be enough, but it seems like the default queue prioritization results in a backlog and idle workers.
So dedicating a number of threads per queue seems like the next sensible step.
Thanks for the suggestion!
-
@mick just to be clear, what I’d suggest is not to dedicate them, but to prioritise.
Maybe you mean the same thing, but if you set up a service with
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q pull
And another with
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q push
Then that first process will sit idle when there is nothing in the pull queue, even if the push queue might be full.
If, on the other hand, you have a service defined as
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q pull -q push
And another as
/home/mastodon/.rbenv/shims/bundle exec sidekiq -c 10 -q push -q pull
Then that first command will process the push queue, after the pull queue has been emptied. And the second one will process the pull queue after the push queue has been emptied. Thus potentially wasting fewer resources.
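As a toy model of the difference (not Sidekiq's actual fetch code), assume a process checks its queues strictly in the order listed, which is how I understand Sidekiq to behave when no weights are given. The function name and backlog numbers below are hypothetical.

```python
def next_queue(order, backlog):
    """Return the first queue in `order` that has waiting jobs, or None if all are empty."""
    for name in order:
        if backlog.get(name, 0) > 0:
            return name
    return None


# Hypothetical moment in time: pull is empty, push has 25 jobs waiting.
backlog = {"pull": 0, "push": 25}

print(next_queue(["pull"], backlog))          # None: the dedicated pull process sits idle
print(next_queue(["pull", "push"], backlog))  # push: falls through once pull is empty
print(next_queue(["push", "pull"], backlog))  # push: its first-choice queue has work
```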
-
@michael right, I see the distinction, and hadn’t fully grasped what you were suggesting.
Given the limited resources available on our small server this seems like an excellent idea.
I’ll play with some changes on the staging system. Thanks!
-
Roni Laukkarinen replied to Mick 🇨🇦
@mick What monitoring system is that?
-
Mick 🇨🇦 replied to Roni Laukkarinen
@rolle Prometheus, and this sidekiq-exporter https://github.com/Strech/sidekiq-prometheus-exporter
-
Roni Laukkarinen replied to Mick 🇨🇦
@mick Thanks!
-
Mick 🇨🇦 replied to Roni Laukkarinen
@rolle I have found it extremely useful for keeping tabs on the system.
-
@michael based on this advice I've added some services that prioritize various queues and we're humming along nicely now. Thank you!