@p > If it's targeted, that's great, but a meg of stuff, something useful should have been in there.
Not really. The only good stuff in there would be the Ecto stats but they're not granular enough to be useful. Someone sharing the raw Postgres stats from pg_exporter would have been better.
> but "the directory listing is 187MB" is a real problem (even if it's not a pointer-chase, you're still reading 187MB from the disk and you're still copying 187MB of dirents into userspace and `ls` takes 45s), and that gets marked as a dup of a "nice to have" S3 bug, but this is the default Pleroma configuration. It's stuff you can't write off, you know? You hit real constraints.
Where is the "ls" equivalent happening for Pleroma? Fetching a file by name is not slow even when there are millions in the same directory.
> Yeah, BEAM using CPU isn't the bottleneck, though. Like I said in the post you're replying to, it's I/O.
BEAM intentionally holds the CPU in a spinlock when it's done doing work in the hopes that it will get more work. That's what causes the bottleneck. It might not look like high CPU usage percentage-wise, but it's preventing the kernel from context switching to another process.
And what IO? Can someone please send dtrace, systemtap, some useful tracing output showing that BEAM is doing excessive unecessary IO? BEAM should be doing almost zero IO; we don't read and write to files except when people upload attachments. Even if you're not using S3 your media files should be served by your webserver, not Pleroma/Phoenix.
> That is cool, but if you could do that for fetching objects from the DB, you'd have a bigger bump.
patches welcome, but I don't have time to dig into this in the very near future.
> Anyway, I am very familiar with fedi's n*(n-1)/2 problems. (Some time in the future, look for an object proxy patch.)
plz plz send
> But you know, back-pressure, like lowering the number of retries based on the size of the table, that could make a big difference when a system gets stressed.
patches welcome. You can write custom backoff algorithms for Oban. It's supported.
> You could ping graf; it's easier to just ask stressed instances than to come up with a good way to do stress-testing.
Everyone I've asked to get access to their servers which were stuggling has refused except mint. Either everyone's paranoid over nothing or far too many people have illegal shit on their servers. I don't know what to think. It's not exactly a motivator to solve their problems.
> Oh, yeah, so 403s? What counts as permanent?
Depends on what it is. If you get a 403 on an object fetch or profile refresh they're blocking you, so no point in retrying. If it was deleted you get a 404 or a 410, so no point in retrying that either... (when a Delete for an activity you didn't even have came in, it would put in the job to fetch the activity it was referencing... and kept trying to fetch it over and over and over...)
> You think you might end up with a cascade for those? Like, if rendering TWKN requires reading 10MB...
No, I mean it was hanging to fetch latest data from remote server before rendering the activity, which was completely unnecessary. Same with rich media previews -- if it wasn't in cache, the entire activity wouldn't render until it tried to fetch it. Stupid when it could be fetched async and pushed out over websocket like we do now.
> The schedule doesn't bug me. ... the following bug is a big problem, that's a basic thing that was broken in a release. Some kind of release engineer/QA situation could have caught it.
Again, it wasn't broken in a release. There were two different bugs: one was that we used the wrong source of truth for whether or not you were successfully following someone. The other bug became more prominent because more servers started federating Follow requests without any cc field and for some reason our validator was expecting at least an empty cc field when it doesn't even make sense to have one on a Follow request.
You seem to have a lot of opinions and ideas on how to improve things but nobody else on the team seems to give a shit about any of this stuff right now. So send patches. I'll merge them. Join me.