I'm sorry, it took *how* many servers to post a single long message from Ghost to 5k fediverse accounts and handle some replies?
-
Yeah, I get that, I think my confusion is it feels to me a bit like saying:
"If you are going across the water, you can cross Lake Pontchartrain faster in a motorboat than a cybertruck"
You absolutely can. But even if you couldn't, does it matter?
-
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee I appreciate this more in-depth discussion. I just reread the post and the comments. They don't mention what they are autoscaling on. I agree that node services are a different beast than systems that default to blocking I/O. Knowing that they didn't have queueing to create throttling and back pressure, I have other suspicions about what went wrong.
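For readers unfamiliar with the term, here is a minimal sketch of what "queueing to create throttling and back pressure" could look like. It is not how Ghost or any other implementation actually works. The names `deliver`, `worker`, and `fan_out` are hypothetical, and the delivery step is a stand-in that does no network I/O. The idea is a bounded queue: when it is full, the producer has to wait, so a burst of outgoing deliveries cannot grow without limit.

```python
import asyncio


async def deliver(inbox: str, sent: list) -> None:
    # Hypothetical stand-in for an HTTP POST to a remote inbox (no real I/O).
    await asyncio.sleep(0)
    sent.append(inbox)


async def worker(queue: asyncio.Queue, sent: list) -> None:
    # Each worker pulls one delivery at a time, which caps concurrency.
    while True:
        inbox = await queue.get()
        try:
            await deliver(inbox, sent)
        finally:
            queue.task_done()


async def fan_out(inboxes: list, max_queued: int = 100, concurrency: int = 10) -> list:
    # A bounded queue: put() waits when the queue is full (back pressure).
    queue = asyncio.Queue(maxsize=max_queued)
    sent = []
    workers = [asyncio.create_task(worker(queue, sent)) for _ in range(concurrency)]
    for inbox in inboxes:
        await queue.put(inbox)
    await queue.join()  # wait until every queued delivery has been processed
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return sent
```

Calling `asyncio.run(fan_out(list_of_inbox_urls))` processes every inbox while never holding more than `max_queued` pending deliveries or running more than `concurrency` at once. A real system would also need retries and persistence, which this sketch leaves out.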
-
Marco Rogers replied to Marco Rogers
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee to be fair though. I really wanna talk about my other question. The current design of these federation protocols is way too resource intensive for what should be a relatively small network. 5000 followers is nothing compared to what will happen if this gets bigger. I'm not convinced that it's a matter of just adding some queues and optimizing some code.
-
Hrefna (DHC) replied to Marco Rogers
100% agreed. These protocols are all a trashfire when it comes to efficiency and they really don't have to be.
@thisismissem @jenniferplusplus @kissane @fediversereport @hongminhee
-
Marco Rogers replied to Hrefna (DHC)
@hrefna @thisismissem @jenniferplusplus @kissane @fediversereport @hongminhee I know I'm being annoying about it. I'm at the edge of my depth in terms of being able to describe a better system. I just have enough experience trying to scale things to know that having problems at this amount of data is a really bad sign. I want somebody to talk to me about the fediverse version of the Justin Bieber problem.
-
@thisismissem @johnonolan @kissane @fediversereport
+1 object signatures
I was saying at TPAC last week "I sure wish the old W3C SocialWG process/chairs didn't require scrubbing object signatures from the 2017 Candidate Recommendation from the final TR"
ActivityPub
The ActivityPub protocol is a decentralized social networking protocol based upon the [ActivityStreams] 2.0 data format. It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content.
(www.w3.org)
-
Jenniferplusplus replied to Marco Rogers
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
No you're entirely right. This is a big part of what puts me off from efforts to make solo servers more common. If we can make this network a good place to be, then it could grow 100x, maybe 1,000x. But the design of the protocol and nodes will make that infeasible in terms of performance and cost. We have the most leeway to optimize within nodes, and the necessity of scaling will push nodes to become larger.
Jenniferplusplus replied to Jenniferplusplus
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
Designing for scale between nodes, at the protocol level, is made complicated by the protocol itself, for the same reasons that also complicate baseline interop. The protocol doesn't put bounds or requirements on anything. It just sort of envisions a kind of universal social graph, over which apps can tailor unique views. But it hand-waves on how to actually accomplish that, and even how to coordinate doing it later.
Jenniferplusplus replied to Jenniferplusplus
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
There are many very reasonable optimizations that could be made, and would be conformant with the spec. But that really just means it's not prohibited, not that it's supported. You could batch messages, for instance. That's a valid thing to do in AP docs. But I would have no faith that any other software would know what that means or even be able to deserialize it in a useful way. So, no batching.
It's the same in so many ways.
-
@bengo @johnonolan @kissane @fediversereport this makes a fair amount of sense to me, shame it was removed
-
Erin 💽✨ replied to Jenniferplusplus
@jenniferplusplus @polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee even within the bounds of the spec you can foresee e.g. a group of instances sharing one sharedInbox (which handles fanout between them). You could define some kind of even more optimised inbox property, if you wished.
But if this were a major problem (as opposed to being a problem of building your application in a scalable manner + tuning things like your HTTP/2 connection keep-alives) I suspect that Mastodon gGmbH would have done it to keep mastodon.social working.
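To make the sharedInbox point concrete, here is a rough sketch of how a sender might collapse a follower list into distinct delivery targets. It assumes follower actors are already fetched as plain dicts with the standard ActivityPub `inbox` field and the optional `endpoints.sharedInbox` field. The function name `delivery_targets` and the sample data in the usage note are made up for illustration, and no particular server is claimed to do it exactly this way.

```python
from collections import defaultdict


def delivery_targets(followers: list) -> dict:
    """Map each distinct delivery URL to the follower ids it covers.

    Prefers an actor's endpoints.sharedInbox when present, otherwise
    falls back to the actor's personal inbox.
    """
    targets = defaultdict(list)
    for actor in followers:
        shared = (actor.get("endpoints") or {}).get("sharedInbox")
        targets[shared or actor["inbox"]].append(actor["id"])
    return dict(targets)
```

With three followers where two live on the same server and advertise the same sharedInbox, this yields two delivery URLs instead of three. The number of HTTP requests then tracks the number of distinct servers rather than the number of followers.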
-
Marco Rogers replied to Jenniferplusplus
@jenniferplusplus @hrefna @thisismissem @kissane @fediversereport @hongminhee you know, I think I need to be trying to narrow my engagement to talking about single user instances. It's becoming clear that it's a pretty different conversation than scaling large shared instances. I worry that I end up talking past people because they're responding to different contexts.
-
@jenniferplusplus @fediversereport @hongminhee @hrefna @kissane @polotek @thisismissem I think it's important to understand that even at an individual level social media is actually a firehose of data much larger than most developers' initial assumptions. You very quickly pass the point where naïve implementations of things fall apart.
Even with the protocol as it is today, implementations like Akkoma and GoToSocial scale very well to delivering posts to thousands of users (Pleroma & derivatives may be quite famous for scaling problems, but mostly those are on the side of the Mastodon API implementation; the federation queue itself screams) in quite low amounts of resources.
Now, there are other ways of implementing things, but they mostly just move the hard problem around (AP generally places the costs on the poster; ATProto puts the costs on the reader, who must follow approximately the entire firehose)
-
Jenniferplusplus replied to Erin 💽✨
@erincandescent @hongminhee @hrefna @thisismissem @kissane @fediversereport @polotek
mastodon dot social essentially *is* that big shared inbox. It has like 20% of the active fediverse population.
Erin 💽✨ replied to Jenniferplusplus
@jenniferplusplus @hongminhee @hrefna @thisismissem @kissane @fediversereport @polotek right, but they feel the challenge of fanning out posts to followers as much as anyone
-
@erincandescent @jenniferplusplus @polotek @hrefna @kissane @fediversereport @hongminhee there is a new multibox FEP too: https://codeberg.org/fediverse/fep/src/branch/main/fep/0499/fep-0499.md
-
@thisismissem @jenniferplusplus @polotek @hrefna @kissane @fediversereport @hongminhee This is what I’ve sometimes referred to as “enveloped inbox” by analogy to SMTP (though the primary motivation was user collections & Bto/Bcc)
-
@jenniferplusplus @fediversereport @hongminhee @hrefna @kissane @polotek @thisismissem anyway most of what i’m trying to get across is that these problems are not inevitable. Fanning things out to my 2300 followers from my single user instance doesn’t really do anything major on the CPU usage graph and I’m fairly confident it could go much higher before I started noticing anything each time I posted.
-
John O'Nolan replied to Erin Kissane
@kissane @fediversereport @thisismissem @bengo @evanprodromou absolutely - we’ll try to keep publicly documenting gotchas and things we struggle with, to create more shared resource and knowledge
Hopefully along the way we’ll also document some solutions
-
The Nexus of Privacy replied to Marco Rogers
you're not being annoying, it's a great topic of discussion. @julian has made some related points. it's so valuable getting new eyes on things that everybody had fallen into the habit of taking for granted. I once asked how much scaling analysis had been done in the standardization process and the answer was unsurprisingly "none".
@polotek @hrefna @thisismissem @jenniferplusplus @kissane @fediversereport @hongminhee