@phnt
-
@cvnt @phnt @NonPlayableClown @Owl @dj @ins0mniak @transgrammaractivist
> ask me how I know he doesn't test shit before pushing updates.
"there isn't a Pleroma instance that exists which cannot handle the load on available hardware" still irks me because it means that the person that wrote it was ignoring the performance issues that had been reported in the bug tracker as well as in messages on fedi. (I don't think most people could have convinced Pleroma to stay up when subjected to FSE's load.) I guess nothing has changed since FSE's last merge with upstream. -
@p @cvnt @phnt @NonPlayableClown @Owl @dj @ins0mniak @transgrammaractivist Nobody proved there was an *Oban* bottleneck and still haven't.
I'm always running my changes live on my instances. They were massively overpowered. Now I have a severely underpowered server and it's still fine.
If I could reproduce reported issues it would be much easier to solve them but things generally just work for me.
A ton of work has been put into correctness (hundreds of Dialyzer fixes) and tracking down elusive bugs and looking for optimizations like reducing JSON encode/decode work when we don't need to, avoiding excess queries, etc.
I'm halfway done with an entire logging rewrite and telemetry integration which will make it even easier to identify bottlenecks.
It's actually been going really great -
@feld @NonPlayableClown @Owl @cvnt @ins0mniak @phnt @transgrammaractivist
> Nobody proved there was an *Oban* bottleneck and still haven't.
Well, this was a remark years back. (It does still irk me.) Everything I know about the current Oban bug is second-hand; I am running what might be the only live Pleroma instance with no Gleason commits (happy coincidence: I was dodging another extremely expensive migration and then kicked off the other project, which meant I didn't want to chase a moving target if I could avoid it, so I stopped pulling); at present, I backport a security fix (or just blacklist an endpoint) once in a while.
Unless you mean the bug with follows, but I haven't run 2.7.0, so I don't know what that bug is.
> If I could reproduce reported issues it would be much easier to solve them but things generally just work for me.
I mean, like I mentioned, the Prometheus endpoints were public at the time. You could see my bottlenecks. (I think that would be cool to reenable by default; they'd just need to stop having 1MB of data in them if people are gonna fetch them every 30s, because enough people doing that can saturate your pipe.)
> A ton of work has been put into correctness (hundreds of Dialyzer fixes) and tracking down elusive bugs and looking for optimizations like reducing JSON encode/decode work when we don't need to, avoiding excess queries, etc.
I'm not sure what the Dialyzer is (old codebase), but improvements are good to hear about. That kind of thing gets you a 5%, 10% bump to a single endpoint, though. The main bottleneck is the DB; some cleverness around refetching/expiration would get you some much larger performance gains, I think; using an entire URL for an index is costing a lot in disk I/O. There's a lot of stuff to do, just not much of it is low-hanging, I think.
> It's actually been going really great
:bigbosssalute: That is awesome to hear. -
@p
> I mean, like I mentioned, the Prometheus endpoints were public at the time.
Problem is that this data is useful for monitoring overall health of an instance but doesn't give enough granular information to track down a lot of issues. With the metrics/telemetry work I have in progress we'll be able to export more granular Pleroma-specific metrics that will help a lot.
> The main bottleneck is the DB
So often it's just badly configured Postgres. If your server has 4 cores and 4 GB of RAM you can't go use pgtune and tell it you want to run Postgres with 4 cores and 4GB. There's nothing left over for the BEAM. You want at least 500MB-1GB dedicated to BEAM, more if your server has a lot of local users so it can handle memory allocation spikes.
And then what else is running on your OS? That needs resources too. There isn't a good way to predict the right values for everyone. Like I said, it's running *great* on my little shitty thin client PC with old slow Intel J5005 cores and 4GB RAM. But I have an SSD for the storage and almost nothing else runs on the OS (FreeBSD). I'm counting a total of 65 processes before Pleroma, Postgres, and Nginx are running. Most Linux servers have way more services running by default. That really sucks when trying to make things run well on lower specced hardware.
You also have to remember that BEAM is greedy and will intentionally hold the CPU longer than it needs because it wants to produce soft-realtime performance results. This needs to be tuned down on lower resource servers because BEAM itself will be preventing Postgres from doing productive work. It's just punching itself in the face then. Set these vm.args on any server that isn't massively overpowered:
+sbwt none
+sbwtdcpu none
+sbwtdio none
> using an entire URL for an index is costing a lot in disk I/O
For the new Rich Media cache (link previews stored in the db so they're not constantly refetched) I hashed the URLs for the index for that same reason. Research showed a hash and the chosen index type were super optimal.
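Roughly the shape of it, as a sketch (the table and column names here are made up, and the real migration may use a different digest or index type):
defmodule Example.Repo.Migrations.HashedUrlIndex do
  use Ecto.Migration

  def change do
    # Index a fixed-width digest of the URL instead of the URL text itself,
    # so index entries stay small no matter how long the URL gets.
    create index(:rich_media_cards, ["sha256(convert_to(url, 'UTF8'))"],
             name: :rich_media_cards_url_hash_index
           )
  end
end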
Another thing I did was I noticed we were storing *way* too much data in Oban jobs. Like when you federated an activity we were taking the entire activity's JSON and storing it in the jobs. Imagine making a post with 100KB of content that needs to go to 1000 servers? Each delivery job in the table was HUGE. Now it's just the ID of the post and we do the JSON serialization at delivery time. Much better, lower resource usage overall, lower IO.
Even better would be if we could serialize the JSON *once* for all deliveries but it's tricky because we gotta change the addressing for each delivery. The Jason library has some features we might be able to leverage for this but it doesn't seem important to chase yet. Even easier might be to put placeholders in the JSON text, store it in memory, and then just use regex or cheaper string replacement to fill those fields at delivery time. Saves all that repeat JSON serialization work.
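A sketch of the placeholder idea (illustrative only, not code that exists in Pleroma; assumes Jason is available): encode the activity once with placeholder tokens, then patch the per-recipient addressing in with plain string replacement at delivery time.
activity_data = %{"id" => "https://example.social/activities/1", "type" => "Create"}

# Serialize once, with placeholders where the per-delivery addressing goes.
template =
  activity_data
  |> Map.put("to", ["__TO__"])
  |> Map.put("cc", ["__CC__"])
  |> Jason.encode!()

# At delivery time this is a cheap string replace, not another full encode.
render_for = fn to, cc ->
  template
  |> String.replace(~s(["__TO__"]), Jason.encode!(to))
  |> String.replace(~s(["__CC__"]), Jason.encode!(cc))
end

json_for_inbox = render_for.(["https://remote.example/users/alice"], ["https://example.social/users/bob/followers"])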
Other things I've been doing:
- making sure Oban jobs that have an error we should really treat as permanent are caught and don't allow the job to repeat. It's wasteful for us, rude to remote servers when we're fetching things
- finding every possible blocker for rendering activities/timelines and making those things asynchronous. One of the most recent ones I found was with polls. They could stall rendering a page of the timeline if the poll wasn't refreshed in the last 5 mins or whatever. (and also... I'm pretty sure polls were still being refreshed AFTER the poll was closed 🤬)
I want Pleroma to be the most polite Fedi server on the network. There are still some situations where it's far too chatty and sends requests to other servers that could be avoided, so I'm trying to plug them all. Each of these improvements lowers the resource usage on each server. Just gotta keep striving to make Pleroma do *less* work.
I do have my own complaints about the whole Pleroma releases situation. I wish we were cutting releases like ... every couple weeks if not every month. But I don't make that call. -
@feld
> export more granular Pleroma-specific metrics that will help a lot.
If it's targeted, that's great, but out of a meg of stuff, something useful should have been in there.
> So often it's just badly configured Postgres.
Well, yeah, the defaults are bad, but you don't even get to the big performance issues if you're still doing that kind of thing.
> If your server has 4 cores and 4 GB of RAM you can't go use pgtune and tell it you want to run Postgres with 4 cores and 4GB.
I think there's a "mixed" dropdown but that's one of the reasons I split early on: it's not just that, I/O bandwidth is a killer, and BEAM doesn't know how to leave the disk alone. In terms of CPU/memory, Pleroma's pretty lightweight. I'm addressing you from this ARM cluster and this instance is just a small one while the big one comes back, and it's fine: BEAM is eating 2% of the RAM, 80% of one core, it's around 10r/s (bump from bae.st's closure, but still about a third of FSE), but I've also got all the CPU-heavy stuff killed (no filemagic, no imagemagick, no Twitter cards) and my own code is serving the /objects/.
At any rate, I had Postgres sensibly configured, Moon did, sjw did. graf has to delete everything older than three months (or something like that) and he has a ridiculous amount of hardware. Reconfiguring Postgres does not help you with a 1kB foreign key, though. Throwing more cores and more RAM at it doesn't make a 1kB key act like an 8-byte key, and you've still got to store it *somewhere*.
> And then what else is running on your OS? That needs resources too. There isn't a good way to predict the right values for everyone.
Man, if we're gauging from your picture, I've been doing this a little longer than you, so I really don't need to hear basic sysadmin stuff. Regardless of intent, it looks like condescension and fine, whatever, I'm an adult, but it wastes time to say things that, you know, you could reason that I know these things if I've been running Pleroma for six years. I think most of the people in the thread are running machines dedicated to Pleroma, anyway.
I'm currently trying to cope with sjw's 1TB DB and the "everything in a flat directory" problem ( https://git.pleroma.social/pleroma/pleroma/-/issues/1513 ) but multiplied across 3TB of waifus, right. I mean, "Well, you're at the mercy of VFS" and that's a trivial observation, but "the directory listing is 187MB" is a real problem (even if it's not a pointer-chase, you're still reading 187MB from the disk and you're still copying 187MB of dirents into userspace and `ls` takes 45s), and that gets marked as a dup of a "nice to have" S3 bug, but this is the default Pleroma configuration. It's stuff you can't write off, you know? You hit real constraints.
> Most Linux servers have way more services running by default. That really sucks when trying to make things run well on lower specced hardware.
Well, sure, but I think anyone running it on a Pi is going to be configuring the Pi for a dedicated setup, I don't think they're running Gnome on it. Sysadmin 101 doesn't help here. That last box, flaky hardware aside, that was 32 cores, 384GB RAM, Postgres got its own dedicated NVMe SSD (currently the same disk has been repurposed for the apotheosis because the DB was already on it), heavily tuned kernel, it should have screamed (it did for most stuff), but hashtags never got fast enough that I could turn them back on.
> You also have to remember that BEAM is greedy and will intentionally hold the CPU longer than it needs because it wants to produce soft-realtime performance results.
Yeah, BEAM using CPU isn't the bottleneck, though. Like I said in the post you're replying to, it's I/O. It's already a problem if any of this is CPU-bound, anyway: it's network software.
> Set these vm.args on any server that isn't massively overpowered:
> +sbwt none
> +sbwtdcpu none
> +sbwtdio none
Thanks. Those are probably useful. I don't know, I'd have to dig into the BEAM docs because I don't know the implications of "sbwtdio" off the top of my head, but if you're expecting CPU usage to be too high, it's not that; CPU usage is fine. FSE never used much CPU, fast_html ate a lot of CPU, Postgres hardly needed CPU. So I can squeeze about as much out of this RK3588 as I could those 32 Xeon cores because it's just I/O bandwidth.
> For the new Rich Media cache (link previews stored in the db so they're not constantly refetched) I hashed the URLs for the index for that same reason. Research showed a hash and the chosen index type were super optimal.
That is cool, but if you could do that for fetching objects from the DB, you'd have a bigger bump. You know, just something like activities_unique_apid_index, which is just `data->>'id'`, right? Or activities_recipients_index. These indexes get huge and it'd be faster to toss a little CPU at Postgres if it has to read fewer pages from the disk. This is a big, slow one:
CREATE INDEX activities_create_objects_index ON public.activities USING btree (COALESCE(((data -> 'object'::text) ->> 'id'::text), (data ->> 'object'::text)));
Lots of joins use indexes like that. And this is something you don't see in a regular benchmark, only if you've got a lot of concurrent random load, but sometimes the N+1 problem helps: even on a single thread, fetching 100 IDs and then populating the objects one at a time is sometimes faster (seriously, try it, do a test on /api/v1/notifications: dump the queries, break them up, then try without the joins). In any case it means that one process no longer ties up a connection from the pool for six minutes, so you get better throughput, and sometimes it ends up faster even if it's the only request; the query plan is often counterintuitive, and Pleroma's mature enough that you more or less know which queries you're going to run, so you'll want to run `EXPLAIN` a lot. I mean, the connection pool has the same characteristics as a system with voluntary preemption, so you want to release connections as frequently as you can.
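To make the split concrete, a hedged Ecto sketch (the Repo, schema, and field names are stand-ins, not Pleroma's actual query code): grab the page of IDs with a cheap query, then let preload populate the associations with its own small queries instead of one big join.
import Ecto.Query

# Cheap query: just the IDs for the page, connection released back quickly.
ids =
  Repo.all(
    from(n in Notification,
      where: n.user_id == ^user_id,
      order_by: [desc: n.id],
      limit: 20,
      select: n.id
    )
  )

# Populate: preload runs separate small queries instead of joining everything.
notifications =
  Repo.all(from(n in Notification, where: n.id in ^ids, order_by: [desc: n.id], preload: [:activity]))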
> Another thing I did was I noticed we were storing *way* too much data in Oban jobs.
Yep; I've had to debug problems with that, and you end up having to walk down some JSON embedded in another JSON blob just to figure out which /inbox a job is bound for.
> Imagine making a post with 100KB of content that needs to go to 1000 servers?
Ha, no one sends the Bee Movie script or the King James bible around fedi any more.
Anyway, I am very familiar with fedi's n*(n-1)/2 problems. (Some time in the future, look for an object proxy patch.) But that table basically fits in memory in most cases. It's 36MB on FSE right now. (It's 1.6GB on Poast right now and that doesn't seem right so I will have to have a look at that after I've done this.)
But you know, back-pressure, like lowering the number of retries based on the size of the table, that could make a big difference when a system gets stressed. Some sort of per-host grouping would be better: re-use connections, or preferentially select jobs for hosts without recent failures, that'd be good. Like, we had to manually clear a lot of shit out of Poast's queue for various reasons, because a timeout eats a worker for 15, 30, 60 seconds, right? Whatever the number is. So someone supplements their instance-block with `-j DROP` and suddenly the Oban queue is full of connection timeouts; the easy solution was to drop all the jobs bound for those instances and then shove an iptables rule in so that the jobs would fail immediately instead of timing out, and people start getting posts delivered again.
You could ping graf; it's easier to just ask stressed instances than to come up with a good way to do stress-testing. I'm happy to help with that stuff (as I have been) but fsebugoutzone.org is running an old version and not as stressed as freespeechextremist.com was, so it's not likely to be helpful. (I'll be updating this instance after FSE proper is back up, but it's still not going to be the same amount of traffic as FSE, so limited utility.)
> Even better would be if we could serialize the JSON *once* for all deliveries but it's tricky because we gotta change the addressing for each delivery.
This is probably a fun problem to work on but I think it's not a major bottleneck. (It is nice to have even the little stuff polished and optimized, so I get it. I'm not saying it's a bad thing to work on, it's cool you're doing that.)
> - making sure Oban jobs that have an error we should really treat as permanent are caught and don't allow the job to repeat. It's wasteful for us, rude to remote servers when we're fetching things
Oh, yeah, so 403s? What counts as permanent? {:rejected, "Go away"}?
> - finding every possible blocker for rendering activities/timelines and making those things asynchronous.
You think you might end up with a cascade for those? Like, if rendering TWKN requires reading 10MB of data from the disk and that's split across several pages in several different indexes, you're still reading 10MB, just doing it asynchronously means you're holding up a connection in the background and it's still got to read 10MB and it's still gotta do 2500 seek()s or whatever. 10MB coming off the disk is still 10MB coming off the disk, it's still 10MB going through the bus, and a seek() is still not free (even on an SSD, even a fancy NVMe SSD).
> They could stall rendering a page of the timeline if the poll wasn't refreshed in the last 5 mins or whatever.
Ah, okay, yeah, that kind of thing you can serve stale.
> also... I'm pretty sure polls were still being refreshed AFTER the poll was closed
Ah, ha, that's an easy one I bet.
> I want Pleroma to be the most polite Fedi server on the network.
That would be ideal, yeah. Some of the Pleroma issues I have had the opportunity to address (it's much easier to analyze something that has been built already than it is to do it from scratch, so I owe Pleroma), that's one of them: no simultaneous requests for the same URL flooding the other server, no repeated requests for the same URL, things like that. If you could beat me on politeness before I get mine out the door, that'd be cool. (I mean, the initial seed for it was that I wanted to make an object proxy/repository so that N cofespace servers didn't need to make N requests to the same server and keep N copies of the same object.)
> I do have my own complaints about the whole Pleroma releases situation. I wish we were cutting releases like ... every couple weeks if not every month. But I don't make that call.
The schedule doesn't bug me. I think the breakage is a bad thing: you see the thing people are talking about in the thread (when the thread gets around to real shit instead of drama), the following bug is a big problem, that's a basic thing that was broken in a release. Some kind of release engineer/QA situation could have caught it.
A decent stress/fuzzer setup would be useful but you almost don't need one because of this: http://demo.fedilist.com/instance?software=pleroma&sort=users . 1,083 live instances, and they're all constantly being flooded with dirty data from weird experimental servers, some of them are Soapbox but a lot of the Soapbox instances are going to be moving back to Pleroma soon. (If there is a tweak to fedilist that would help debugging, I'm happy to make it, time permitting: search by version number or whatever. It's been useful for tracing bugs, like the peer count for the Misskey/Mastodon activitypub-troll.cf bug that Pleroma was immune to, so we could see when the number of peers spiked and find patient zero, ping the admins that were affected. Say if "Follow" activities stop coming from servers and those servers all have the same version number; it's hard to know what you're looking for before people start complaining about it.) -
@p
> If it's targeted, that's great, but a meg of stuff, something useful should have been in there.
Not really. The only good stuff in there would be the Ecto stats but they're not granular enough to be useful. Someone sharing the raw Postgres stats from pg_exporter would have been better.
> but "the directory listing is 187MB" is a real problem (even if it's not a pointer-chase, you're still reading 187MB from the disk and you're still copying 187MB of dirents into userspace and `ls` takes 45s), and that gets marked as a dup of a "nice to have" S3 bug, but this is the default Pleroma configuration. It's stuff you can't write off, you know? You hit real constraints.
Where is the "ls" equivalent happening for Pleroma? Fetching a file by name is not slow even when there are millions in the same directory.
> Yeah, BEAM using CPU isn't the bottleneck, though. Like I said in the post you're replying to, it's I/O.
BEAM intentionally holds the CPU in a spinlock when it's done doing work in the hopes that it will get more work. That's what causes the bottleneck. It might not look like high CPU usage percentage-wise, but it's preventing the kernel from context switching to another process.
And what IO? Can someone please send dtrace, systemtap, some useful tracing output showing that BEAM is doing excessive unnecessary IO? BEAM should be doing almost zero IO; we don't read and write to files except when people upload attachments. Even if you're not using S3 your media files should be served by your webserver, not Pleroma/Phoenix.
> That is cool, but if you could do that for fetching objects from the DB, you'd have a bigger bump.
patches welcome, but I don't have time to dig into this in the very near future.
> Anyway, I am very familiar with fedi's n*(n-1)/2 problems. (Some time in the future, look for an object proxy patch.)
plz plz send
> But you know, back-pressure, like lowering the number of retries based on the size of the table, that could make a big difference when a system gets stressed.
patches welcome. You can write custom backoff algorithms for Oban. It's supported.
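The hook is the worker's backoff/1 callback; a minimal sketch with made-up module and queue names, not something that exists in Pleroma today:
defmodule Example.PublisherWorker do
  use Oban.Worker, queue: :federator_outgoing, max_attempts: 10

  # backoff/1 returns the delay in seconds before the next attempt; a
  # back-pressure-aware version could also weigh queue or table depth here.
  @impl Oban.Worker
  def backoff(%Oban.Job{attempt: attempt}) do
    trunc(:math.pow(2, attempt)) + :rand.uniform(10)
  end

  @impl Oban.Worker
  def perform(%Oban.Job{args: args}) do
    Example.Publisher.publish(args)
  end
end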
> You could ping graf; it's easier to just ask stressed instances than to come up with a good way to do stress-testing.
Everyone I've asked to get access to their servers which were struggling has refused except mint. Either everyone's paranoid over nothing or far too many people have illegal shit on their servers. I don't know what to think. It's not exactly a motivator to solve their problems.
> Oh, yeah, so 403s? What counts as permanent?
Depends on what it is. If you get a 403 on an object fetch or profile refresh they're blocking you, so no point in retrying. If it was deleted you get a 404 or a 410, so no point in retrying that either... (when a Delete for an activity you didn't even have came in, it would put in the job to fetch the activity it was referencing... and kept trying to fetch it over and over and over...)
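Shape-wise it's roughly this (a sketch, not the actual Pleroma worker; module names are made up): return {:cancel, reason} from perform/1 so Oban records the job as cancelled instead of retrying it.
defmodule Example.RemoteFetcherWorker do
  use Oban.Worker, queue: :remote_fetcher, max_attempts: 5

  @impl Oban.Worker
  def perform(%Oban.Job{args: %{"id" => ap_id}}) do
    # Example.Fetcher stands in for whatever does the HTTP fetch.
    case Example.Fetcher.fetch_object(ap_id) do
      {:ok, object} ->
        {:ok, object}

      # Blocked (403) or gone (404/410): permanent, don't retry.
      {:error, {:status, status}} when status in [403, 404, 410] ->
        {:cancel, {:status, status}}

      # Anything else is treated as transient and retried with backoff.
      {:error, reason} ->
        {:error, reason}
    end
  end
end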
> You think you might end up with a cascade for those? Like, if rendering TWKN requires reading 10MB...
No, I mean it was hanging to fetch latest data from remote server before rendering the activity, which was completely unnecessary. Same with rich media previews -- if it wasn't in cache, the entire activity wouldn't render until it tried to fetch it. Stupid when it could be fetched async and pushed out over websocket like we do now.
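Pattern-wise it's roughly this (hedged sketch, made-up module names): render immediately, fetch the preview in a background task, and push the card out over the streaming socket when it arrives.
# Don't block rendering on the preview; fetch it off the request path.
Task.Supervisor.start_child(Example.TaskSupervisor, fn ->
  case Example.RichMedia.fetch_card(url) do
    {:ok, card} ->
      # Deliver the finished card to connected clients over the websocket.
      Example.Streamer.push(user_id, {:rich_media, activity_id, card})

    {:error, _reason} ->
      :ok
  end
end)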
> The schedule doesn't bug me. ... the following bug is a big problem, that's a basic thing that was broken in a release. Some kind of release engineer/QA situation could have caught it.
Again, it wasn't broken in a release. There were two different bugs: one was that we used the wrong source of truth for whether or not you were successfully following someone. The other bug became more prominent because more servers started federating Follow requests without any cc field and for some reason our validator was expecting at least an empty cc field when it doesn't even make sense to have one on a Follow request.
You seem to have a lot of opinions and ideas on how to improve things but nobody else on the team seems to give a shit about any of this stuff right now. So send patches. I'll merge them. Join me. -
@feld If this comes out hasty, it's because we are currently being swamped; it's been a while since FSE got DDoS'd/scraped, it feels nostalgic! Anyway, it's enough that I'm getting 100ms ping times and apparently 85% packet loss periodically; see attached. At any rate, if I seem curt or distracted, it is because I am distracted, don't take it personally.
> The only good stuff in there would be the Ecto stats but they're not granular enough to be useful. Someone sharing the raw Postgres stats from pg_exporter would have been better.
Makes sense.
> Where is the "ls" equivalent happening for Pleroma? Fetching a file by name is not slow even when there are millions in the same directory.
The `ls` illustration isn't because Pleroma does an `ls`, it's a gauge. `rm` is quadratic. Some filesystems have a hard limit on the number of entries you can put into a directory. If you are not hosting UGC by strangers, it doesn't matter to you personally that finding and removing a file takes ten minutes, sure. If you do not need to write and run a backup script, this doesn't matter to you, sure.
What *does* matter is that when you unlink a file, most filesystems just scribble over the dirent for the unlinked file and the space is never reclaimed. See the figures in !1513. Adding a file is usually appending, but it's going to be O(n) anyway because it'll usually scan the entire chain of dirents and put it at the end. (ext4 has some facilities to mitigate this, it'll occasionally repack, see the tune2fs man page, but it's a mess even if it does and that option isn't always enabled.)
And, you know, anyone with open registrations is going to have trouble like this. There's some forum where a guy just went down the list of places that have open registrations and dumped some videos that no one wants to host and then checked back in a week to see which accounts got banned and which were still up. FSE got .exe files dumped by someone that needed a host for some email-based phishing campaign. (UGC is a goddamn nightmare.)
> BEAM intentionally holds the CPU in a spinlock when it's done doing work in the hopes that it will get more work. That's what causes the bottleneck. It might not look like high CPU usage percentage-wise, but it's preventing the kernel from context switching to another process.
Well, here's a hypothetical situation: let's say that you notice something is slow as shit, and you watch Postgres and see that the query for fetching the notifications takes 10s and the request takes 10.1s. Then let's say you paste that query into Postgres and fill in a value for the user_id and it takes 10s. Then say someone says "You see, this is a BEAM spinlock issue" instead of asking what makes you think that, and you have already eliminated that or you wouldn't have said that the query is slow: what do you say?
Come on, dude. If you're skeptical, go ahead and ask for details. I am happy to give details, but if I'm not certain, I wouldn't say it, I'd qualify it with "I think" or I'd ask. The CPU is not the bottleneck for most of the load that I see.
Anyway, the query is slow. Notifications, that is a really slow query; part of it is that notifications never go away and one activity is one notification, so maybe they should be collapsed. I've had to abridge the table and this is difficult to do in a reasonable way that makes sense to people, like maybe you get rid of everything that has been marked as seen and is older than three months, but you try to keep a minimum of 100 notifications per user, right, because it gets insanely slow to query after a certain point. (Right now FSE's getting close to its 10,000,000th notification and I think there are only about 200k in the DB.) Same thing with hashtags, same thing with search: it's not BEAM, it's the DB.
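For the notification pruning specifically, the policy I mean looks roughly like this (Repo stands in for Pleroma.Repo, column names are from memory, and in practice you'd batch it rather than run one giant delete):
Repo.query!("""
DELETE FROM notifications
WHERE seen
  AND inserted_at < now() - interval '3 months'
  AND id NOT IN (
    SELECT id FROM (
      SELECT id,
             row_number() OVER (PARTITION BY user_id ORDER BY id DESC) AS rn
      FROM notifications
    ) ranked
    WHERE rn <= 100
  )
""")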
> Even if you're not using S3 your media files should be served by your webserver, not Pleroma/Phoenix.
This has nothing to do with S3, it has nothing to do with serving the files: the files are stored in a way that you learn not to do pretty quickly, because a pointer-chase across the disk just to read the directory trashes your performance even if you don't know that it's a pointer-chase across the disk. As noted, it starts to grind on adding new files, it hurts backups, it hurts a lot of normal sysadmin tasks you have to perform.
You know what happens if you delete a file? It doesn't shrink the directory on most filesystems: `mkdir testdir; cd testdir; awk 'BEGIN{for(i=0;i<(2^17);i++){printf "%x", i > i; close(i)}}'; find . -type f -delete; ls -lhd .` Appending a file to the list is O(n) and hopefully it's cached but this is how the data is laid out on disk, and "n" is potentially all of the files that ever existed in the directory rather than the ones that are there now.
> patches welcome, but I don't have time to dig into this in the very near future.
That's reasonable, but it's a thing to keep in mind; I might not have the time to dig into this, either. I could build out the index and see that it's smaller, but how well that translates into real-world query performance is not a trivial thing to determine. I do know that the big strings in the JSONB bloat the indexes, but finding a minimally disruptive way to fix that without a rewrite, that's a big deal, and then making the migration fast is sometimes non-trivial. (The flakeid migration took the better part of three *days* on FSE with that stupid ASCII-art version of Clippy telling me cutesy bullshit about the behavior of rabbits the entire time. I'm not sure that *could* have been made faster the way the activities table is structured.)
> plz plz send
This one I definitely will. (I'll have to update to send a patch so don't look for it this week, you know, it'll be a minute. But if it works fine with objects, there are a lot of things that ought to be able to work this way. You'll probably want a low-priority job to check some percentage of the proxied objects, things like that.) I do not have a plan for removing objects (while it's a join, you can't as easily make it trigger a refetch if the object is missing), but fetching them from other sources should hopefully alleviate situations like a post on a tiny single-user instance blowing up and saturating the pipe, and it should be trivial to use as an external media proxy. If I'm lucky, the software ends up symbiotic and we both benefit, but in any case, if an instance is down but you can still fetch posts, that's good, and if an instance is less swamped because load gets distributed, that's nice.
> patches welcome. You can write custom backoff algorithms for Oban. It's supported.
Sure; I'm dealing with another codebase, and hopefully it's a nice complement to Pleroma but I don't know how much work I'll be able to put into Pleroma directly. (Obviously the object proxy thing I plan to do.) It's less "Do this!" and more "Do you have thoughts on this?" I'm well aware that it's one thing to come up with a plan and another to actually work it into a codebase.
> Everyone I've asked to get access to their servers which were stuggling has refused except mint. Either everyone's paranoid over nothing or far too many people have illegal shit on their servers. I don't know what to think.
Sure, I get it. I don't think you'd hose anyone, but there are people with legitimate privacy concerns. There are probably people on some of these servers that would close their accounts on the server if anyone besides their admin had access to the server. (Ironically, there are people on Poast that are there because they're afraid of me. So they bailed from FSE to...another server I have root on.) So it's not just the admin's paranoia, it's the LCM of all of the paranoias of everyone on the box, and you know, even if it's just a big repository of shitposts, you kinda feel responsible for things you do that affect the people using it, you know what I mean? No one's making any money off this stuff, so it's a bunch of people trying to do a good job and provide a reliable service just because they care about doing that.
> It's not exactly a motivator to solve their problems.
Well, in my case, I can solve my own problems; it's just that a bunch of hatchet-job patches to FSE's codebase (at present, because 2.2.2 does not run with current OpenSSL, I had to actually patch _deps in-place) only solve *my* problems and they do not help anyone else out. A lot of people ask me about this or that and I expect this is true for a lot of people running higher-volume Pleromae, so multiply however many questions I get by the number of people running those servers, and if it were fixed upstream, no one would need to ask.
> If you get a 403 on an object fetch or profile refresh they're blocking you, so no point in retrying.
Right, so it's just not retrying the current job, not a list of things to never retry, right? (I think the former is a better idea unless you're doing the Postgres equivalent of a circular buffer; misconfigurations are frequent.)
> (when a Delete for an activity you didn't even have came in, it would put in the job to fetch the activity it was referencing... and kept trying to fetch it over and over and over...)
Yeah, I know this one. :terrylol2:
> No, I mean it was hanging to fetch latest data from remote server
Right, I follow now.
> Same with rich media previews
Ah, I have those completely eliminated; it'd be nice if this were a user-level setting, I think. I don't want to fetch an image from Facebook's CDN, this is the "good web" that I'm glad to browse rather than the hostile web, and Facebook can't be the only company doing "shadow profiles".
> Again, it wasn't broken in a release.
What I have is second-hand and by the time I update FSE, it'll probably be fixed. The extent of what I know is that people complained of breakage, they downgraded, the breakage went away.
> You seem to have a lot of opinions and ideas on how to improve things but nobody else on the team seems to give a shit about any of this stuff right now. So send patches. I'll merge them. Join me.
Ha, appreciated, that's encouraging. Time and energy permitting, sure; I end up spelunking a lot in either case, but most of the ideas I have are architectural, and doing something like that ends up requiring a really good overview of the codebase: you can see the symptoms and reason out what the architecture must be because you are familiar with how these things act (in a language-independent way), but reasoning from the outside is a long way from making a big change to the inside. I'm barely conversant in Elixir (which seems to change very frequently), and Elixir seems to favor too much of the AOP-style stuff that Rails/Java like, which makes behavior hard to track down; you get "spooky functors at a distance". So I am not certain that I will ever get enough time to do the kind of stuff that I'd like to do with Pleroma, but small stuff I come across I can send up, like the DB indexes one. (Relatively self-contained and heavier on Postgres than Elixir.) You know? Like, I see the refetching bug with the accidental DDoS and I can reason that it must not be limiting the outgoing reqs and it must not be keeping track of objects that it fetches and then rejects via an MRF, and I can ask around at Moon and moth and lanodan and they can confirm things, but then I discuss it with mint and there are two HTTP request libraries in use, which uses which changes between versions, and a fix is doable but non-trivial.
I am curious how your IPFS experiment went, because, so far, IPFS has been a disaster every time I tried to integrate it into anything. Revolver nominally supports it but I have it turned off on every node because it's—not remotely exaggerating—a 100x slowdown. (I like keeping it around because it gives a side-channel for distributing blocks and I like it in theory but IPFS's codebase is a mess spread across multiple repos and actually using it has been nothing but pain. I tried to track down some behavior that diverged from the documentation (storing raw blocks) but the origin of the behavior was split across three repos, it was a mess.) Dumping objects into IPFS might actually be tenable if IPFS worked right (I imagine you could use the IPNS stuff).
swamped.png -
@p I spent so many hours fighting with IPFS I think it's a dead end. You can even find a stale MR where I upgraded the storage backend from Badger to Badger2 (because the default filesystem store sucks) and it didn't help much. Nobody working on Kubo has any interest in fixing the storage scalability problems it seems. To use it for fedi we'd really want the garbage collection to work and not spin 100% CPU and IO for hours but that's what it does when you get enough data in there 🤬
-
@feld
> I spent so many hours fighting with IPFS I think it's a dead end.
Yeah, that's my conclusion, too. It should be great but is not even good. FSE had the start of a patch but I didn't get too far actually integrating it. (
https://git.freespeechextremist.com/gitweb/?p=fse;a=commit;h=2dec0a138612eebe79f757b5bc04810b81d65951 )
> Nobody working on Kubo has any interest in fixing the storage scalability problems it seems.
I think it's a little past them, if I'm being honest. They could probably get away with just integrating tkrzw.
> To use it for fedi we'd really want the garbage collection to work and not spin 100% CPU and IO for hours but that's what it does when you get enough data in there 🤬
Yeah, you wanna see a real horror, look at the directory where it stores the data. I don't know if it's still doing that but it was storing different permutations of the same hash as different files last I checked. So the Qm$whatever or the bafy$whatever, those were just stored as separate files instead of normalizing the hash.
8MB is too large for you to get fortuitous collisions, and it's not a very good chunk size for transfer; just the entire thing is like a parody. -
@p Unfamiliar with tkrzw; I was thinking they should build on LMDB.
-
find you on :butterfedy1: fediverse replied to feld
@feld @p @silverpill @jeffcliff ok, very interesting to read these thoughts on the degraded state of IPFS and how unfit it currently is for use with fedi (in browser as an add-on and as a server).
it fits with my thesis that foss on m$github is getting slowly compromised, by the shell accounts that the corporate state have on that weaponized platform.... a controlled platform, at best, and a platform that deanonymizes devs, which is a way to control them at the very least.
I'd consider going back to 2021 and looking at their codebase then, and comparing it to today; there were some very prominent independent media folk talking about IPFS back then, and so i suspect that THAT is when the controlled demolition may have started. My understanding is once the corporates have their sights on a codebase they force the programmers to accept bad tradeoffs. Trade-offs exist everywhere, and permutations of tradeoffs can be chosen ad-infinitum. my learned experience is that foss on mshithub tends to not improve in ways that would allow it to compete with corposhit, especially in the #ux department.
to accelerate :butterfedyC: fediverse-centric adaptations: just over a week ago i wrote about having #ipfs on another repo, and now, having read both your comments, i think it may indeed need to be a fork. a week ago i suggested the acronym DCN (decentralised content network), a tongue-in-cheek reference/reversal of CDN....
so yeah if anyone wants to run with that idea, u have my blessing
-
silverpill replied to find you on :butterfedy1: fediverse
@frogzone @jeffcliff @p @feld Performance issues with IPFS were there from the beginning. Many hoped that Protocol Labs would fix them, but that didn't happen, perhaps due to the inherent limitations of the protocol, or due to incompetence.
In any case, this is not a "controlled demolition". There was a shift to NFTs and then Filecoin around 2020, but I don't think that affected IPFS in a negative way -
pistolero replied to find you on :butterfedy1: fediverse
@frogzone @silverpill @jeffcliff @feld
> foss on mshithub tends to not improve in ways that would allow it to compete with corposhit,
The mass "Rewrite it in Rust" bit tends to have the side effect of converting (A)GPL software to MIT/BSD.
> so yeah if anyone wants to run with that idea,
:ocelot: I may be a bit ahead of you on this. :revolvertan: -
find you on :butterfedy1: fediverse replied to pistolero
@p @silverpill @jeffcliff @feld
> re rust and (A)GPL to BSD
I'll need to keep an eye on that
> ocelot/revolver
sounds like i may need to look closer
the reason i thought we could do it with ipfs was that it already was being used in i2p, but its only being used as a proof of concept at this stage.
-
pistolero replied to find you on :butterfedy1: fediverse
@frogzone @silverpill @jeffcliff @feld
> sounds like i may need to look closer
I explained the block splitting to nyanide's final boss a few days ago: https://fsebugoutzone.org/notice/AnGc6MOGfUsuVOsblQ . Really, the better thing to read, though, would be the Kademlia paper (attached) and the venti paper, which is at http://doc.cat-v.org/plan_9/4th_edition/papers/venti/ .
> the reason i thought we could do it with ipfs was that it already was being used in i2p, but its only being used as a proof of concept at this stage.
I hesitate to say too much about numbers but I will say that I think it would be hard to make something as slow as IPFS. You'd have to try really hard.
kpos.pdf -
silverpill replied to pistolero
@p @frogzone @jeffcliff @feld What is your current stance on FEP-ef61? Does it conflict with what you're doing in Revolver, or complement that?
Significant progress has been made since we last discussed it. Today there are two implementations, Mitra and Streams, and they are partially interoperating.
-
@silverpill @frogzone @jeffcliff @feld I haven't kept up with it. I have some notes but you can feel free to ignore them; I don't want to get in the way of some people doing a thing. Wherever it lands, I'll try to make sure that we can interoperate without breaking the FEP semantics if possible, but some places, it's not going to be possible. So, here are the notes (and quick read, so I may have missed things or I may have the wrong idea, because the stupid DDoS yesterday cost me a lot of time and I'm trying to catch up today):
> ap://did:example:123456/path/to/object?name=value#key-id
If it's a new protocol, I don't know why we're still doing a fragment for the key ID. I don't love JSON (anti-web) but we've got to live with it; on the other hand, I don't want to abuse JSON or URLs.
> Ed25519
I'm using RSA keys so that it's easy to port myself from Pleroma. It is entirely possible that there is something I am missing and that we shouldn't be using RSA keys; maybe key algo is overspecifying.
> /.well-known/apgateway
I like this, but I wonder if there's a way around it without hard-coding another URL.
> The client MUST specify an Accept header with the application/ld+json; profile="https://www.w3.org/ns/activitystreams" media type.
This is actually objectively wrong. I don't know why this should be done rather than using the HTTP standard interpretation of "Accept:" headers. If the client doesn't include "application/ld+json" or "*/*", then you just return a 406. The way Pleroma/Masto do these things kind of shits on RFC 2616 and I really love RFC 2616. (This is why I'm not doing the terrible sniffing algorithm to differentiate between a browser and another server on the /objects/$id URLs.) If there is a good reason, maybe, but if we're tossing out the well-defined HTTP semantics that HTTP clients rely on, we can't call it HTTP.
> Portable actors
This looks roughly compatible with what I am doing; there's an ID for an actor, the network protocol is used to find the actor, etc., and then to use an AP gateway, you sign a request to be canonically located on one, the owner of that domain approves or ignores, and that's the PoT for your account as far as any other AP server knows. So your /inbox and /outbox, the server with the responsibility for delivering your posts when it sees new ones, etc.
So this section looks like it's roughly compatible with that. Instead of ordered gateways, Revolver can just send one gateway.
> Portable objects
This part, I don't know how I'm going to manage this. The "attributedTo" being tied to the gateway URLs, and then those potentially changing, that's going to be a mess with content-addressed storage. If you create a new activity by signing the head of a tree that contains pointers to objects in a stream, then those objects changing (e.g., because a gateway changed) is going to make hopping computationally infeasible once you have a long enough history (which is going to be almost all of the initial users, since they will mostly be people on FSE/bae.st that follow the move to Revolver). -
>I'll try to make sure that we can interoperate without breaking the FEP semantics if possible, but some places, it's not going to be possible.
If you want to see how it all works in practice, there is @nomad
It is a Mastodon-compatible FEP-ef61 actor that is managed from a client.
>If it's a new protocol, I don't know why we're still doing a fragment for the key ID.
'ap' URLs resolve to ActivityStreams documents, and fragments are useful for pointing to specific sections within those documents. The URL you're citing is just an example, and implementers are not required to use fragments for key IDs (although that might be a good idea).
I shall change fragment ID in this example to something neutral.
>>Ed25519
>I'm using RSA keys so that it's easy to port myself from Pleroma. It is entirely possible that there is something I am missing and that we shouldn't be using RSA keys; maybe key algo is overspecifying.
Nowadays security specialists advise against using RSA (1, 2), and some modern standards do not support it (for example, there is no RSA cryptosuite for Data Integrity, which we use for signing objects).
Fediverse needs to migrate to something else and EdDSA seems to be the best option (according to the rough consensus among fedi devs). Also, Ed25519 key size is very small, that's pretty cool.
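(For illustration, nothing fediverse-specific: stock OTP already does Ed25519, and the keys really are tiny.)
# Ed25519 keypair straight from OTP's :crypto; the public half is 32 bytes,
# versus a few hundred bytes for an RSA-2048 key.
{public_key, _private_key} = :crypto.generate_key(:eddsa, :ed25519)
byte_size(public_key)  # => 32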
>>/.well-known/apgateway
>I like this, but I wonder if there's a way around it without hard-coding another URL.
Yes, a follow-your-nose approach is mentioned in "Discussion - Discovering locations". Existing implementations rely on well-known locations, but I think we can support follow-your-nose if needed.
>> The client MUST specify an Accept header with the application/ld+json; profile="https://www.w3.org/ns/activitystreams" media type.
>This is actually objectively wrong. I don't know why this should be done rather than using the HTTP standard interpretation of "Accept:" headers. If the client doesn't include "application/ld+json" or "*/*", then you just return a 406. The way Pleroma/Masto do these things kind of shits on RFC 2616 and I really love RFC 2616.
But the FEP should say something about the Accept header and media type, I don't see any way around that. The exact same requirement exists in the ActivityPub spec.
How do you prefer it to be written?
>> Portable objects
>This part, I don't know how I'm going to manage this. The "attributedTo" being tied to the gateway URLs, and then those potentially changing
IDs are not supposed to change. You include this special query parameter ("location hint") when you create an object, and then it stays there even if you start using a different gateway.
Maybe there's a better way to provide location hints, I don't know. Existing implementations use compatible IDs (regular 'http' URLs with 'ap' URL appended), so not much thought has been put into pure 'ap' IDs.
>that's going to be a mess with content-addressed storage
How do you process Update activities then? -
@silverpill @frogzone @jeffcliff @feld
> Fediverse needs to migrate to something else and EdDSA seems to be the best option (according to the rough consensus among fedi devs). Also, Ed25519 key size is very small, that's pretty cool.
Ah, okay. Reasonable. I'll put it on the list; I had seen support for it in Honk but because Pleroma didn't have it, I just put a note in a comment and ignored it.
> But the FEP should say something about Accept header and media type, I don't see any way around that.
The Accept header mechanics don't need redefining; you just say it's served with this MIME-type, right, and then servers that handle Accept headers correctly will handle it correctly. If you're worried an informal reading will make sloppy implementations create a de facto bad standard, then you could put a note in that the client and server are expected to implement HTTP-compliant semantics.
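Concretely, in Plug terms it's only something like this (a sketch of plain RFC-style negotiation, not what any server currently does verbatim):
def show_object(conn, object_json) do
  accept = conn |> Plug.Conn.get_req_header("accept") |> Enum.join(",")

  cond do
    # No Accept header means "anything"; otherwise check for an acceptable type.
    accept == "" or
        String.contains?(accept, ["application/ld+json", "application/activity+json", "*/*"]) ->
      conn
      |> Plug.Conn.put_resp_content_type("application/activity+json")
      |> Plug.Conn.send_resp(200, object_json)

    # The standard answer for "I can't give you a form you accept" is 406.
    true ->
      Plug.Conn.send_resp(conn, 406, "Not Acceptable")
  end
end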
> The exact same requirement exists in ActivityPub spec.
It's broken in the ActivityPub spec. It is not just broken in the ActivityPub spec, but it is irreparably brain-damaged. Feel free to skip these next two paragraphs; I explain why what the ActivityPub spec says can/should be ignored, but I also cannot type this without ranting.
Why call it the "Accept" header if the server isn't going to treat it that way? If the ActivityPub spec says that a byte is nine bits, it's exceeded its authority. cwebber wants to treat it like a secret handshake because cwebber is a greasy illiterate dingus and this is network software: fuckups take years to resolve.
"Accept:" is intended for content negotiation, not like a secret handshake like cwebber seems to think: the idea was initially that a server can represent content in several possible forms, clients say which forms they prefer and indicates preference, the server tries to give the client one of its preferred forms. Support for parameters *besides* preference was explicitly dropped, so the ActivityPub spec if not just broken, it fails to comply with an obsolete version. You sure as hell don't put some stupid fucking XML-style wank in there where you cram a *namespace* into the Accept header. Dear god, I haven't seen a worse idea than ` The client MUST specify an Accept header with the application/ld+json; profile="https://www.w3.org/ns/activitystreams" media type in order to retrieve the activity.` in forever.
Ahem.
> How do you prefer it to be written?
If I were writing it, I'd leave it out: you take it as read that if they are doing web server software, they should be following HTTP's semantics. Maybe a "See also RFC 9110 sections 8.3, 12, and 15.5.7" or something. You don't specify that the client MUST send some value for the Accept header, you take a well-formed HTTP message from the client, the semantics are in the spec for what you've got to do when the message is well-formed but semantically wrong, and you've not broken anyone's clients as long as you stick to the spec. So you say what the server should do in case the message is well-formed but the semantics specify something the server can't do, and you refer to the other specs for semantics and behavior of lower layers instead of re-specifying it.
Again, you know, you're the one putting in the work so take anything I say about as seriously as is warranted: here are my thoughts.
> Maybe there's a better way to provide location hints,
Well, I think, like, you embed the location hints in the user's profile but you expect that if someone has the object, then they can figure out how to get the user. You don't get an object out of context, it's part of someone's stream.
I think maybe "how the URI is constructed and how it should be interpreted semantically" should probably be a different document.
> How do you process Update activities then?
Updates update the Actor, and those are trivial, because the actor's metadata is a slot, not a stream. Other stuff, longer answer, and food is ready. -
@p @silverpill @frogzone @jeffcliff We'll do ed25519 if it's gonna actually stick. I don't think it's been revisited since the original proposal was seen and I didn't even know Honk was supporting it.
Anyone else? Mastodon?