I'm sorry, it took *how* many servers to post a single long message from Ghost to 5k fediverse accounts and handle some replies?
-
Hrefna (DHC) replied to Emelia
It does, and maybe I'm misunderstanding, I'm just not convinced it matters when the reason you are doing something is not performance related in the first place.
Like NIO is classically slower than synchronous IO in Java, but the benefits of NIO aren't about performance but about program design and flexibility with different arch (and if the performance difference is enough to matter you probably shouldn't be using Java).
-
Hrefna (DHC) replied to Hrefna (DHC)
Like we'd always benchmark apps with the queue, but comparing them to the non-queue scenario seems very weird to me
Because I can't think of a reason you wouldn't already want a queue in a distributed system like this for non-performance reasons
So of course you benchmark, but only the case with a queue means anything and then you tune your queue against different scenarios, or compare types of queues. If that makes sense
-
Noah Kennedy replied to Emelia
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee it shouldn't really unless the queue itself becomes a bottleneck or conversely enables better fanout
but usually it just shouldn't
-
Noah Kennedy replied to Noah Kennedy
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee note that the queue mainly impacts perf when it is the bottleneck or when it enables a better architecture
-
Emelia replied to Hrefna (DHC)
@hrefna @jenniferplusplus @kissane @fediversereport @hongminhee
I think the benchmark is more to show how much faster the activitypub processing is when you defer that work by using a queue vs not, i.e., the HTTP requests duration drops significantly most likely; but autoscaling on open http requests / connections probably isn't right for node.js, instead you'd want a combination of event loop lag, mem/cpu, and open requests
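A minimal sketch of what "a combination of event loop lag, mem/cpu, and open requests" could look like as a scale-out decision. Everything here is an assumption made for illustration: the `NodeMetrics` and `Thresholds` shapes, the `shouldScaleOut` name, and the "at least two signals must trip" rule are hypothetical and do not come from the Ghost post or from any autoscaler's API.

```typescript
// Hypothetical metric snapshot for one Node.js instance.
interface NodeMetrics {
  eventLoopLagMs: number;    // e.g. a high percentile of event loop delay
  cpuUtilization: number;    // fraction in the range 0..1
  memoryUtilization: number; // fraction in the range 0..1
  openRequests: number;      // in-flight HTTP requests
}

// Hypothetical limits; real values would have to come from load testing.
interface Thresholds {
  maxLagMs: number;
  maxCpu: number;
  maxMemory: number;
  maxOpenRequests: number;
}

// Scale out only when at least two independent signals agree.
// Open requests alone never trigger scaling, because a Node.js process
// can hold many idle, awaiting requests without being overloaded.
function shouldScaleOut(m: NodeMetrics, t: Thresholds): boolean {
  const signals: boolean[] = [
    m.eventLoopLagMs > t.maxLagMs,
    m.cpuUtilization > t.maxCpu || m.memoryUtilization > t.maxMemory,
    m.openRequests > t.maxOpenRequests,
  ];
  return signals.filter(Boolean).length >= 2;
}
```

With this rule, a burst of slow but idle connections does not add instances by itself, while the same burst combined with rising event loop lag does.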
-
Emelia replied to Noah Kennedy
@noah @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee
So specifically here it's the HTTP throughput to measure, not the "I processed the activity" throughput, does that make sense? So synchronous work versus add to queue
-
Noah Kennedy replied to Emelia
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee ah, so you want to measure the impact of moving this out of the critical path
that makes a lot more sense
-
Jenniferplusplus replied to Noah Kennedy
@noah @thisismissem @hrefna @kissane @fediversereport @hongminhee
I think Emelia means that AP message delivery is currently happening in-band with the request-response cycle. That manifests as slow responses, and drives autoscaling to continue serving demand. The queue (I assume) moves delivery out of band, so the response can complete earlier.
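The in-band versus out-of-band distinction can be sketched roughly as follows. This is an illustration under stated assumptions, not how Ghost or any particular ActivityPub library does it: `Delivery`, `DeliveryQueue`, `handlePublish`, and `drain` are hypothetical names, the queue is a plain in-memory array, and the actual network send is passed in as a callback.

```typescript
// Hypothetical unit of work: deliver one activity to one inbox.
type Delivery = { inbox: string; activityId: string };

// Hypothetical in-memory FIFO queue; a real deployment would
// more likely use a durable queue or message broker.
class DeliveryQueue {
  private items: Delivery[] = [];
  enqueue(d: Delivery): void {
    this.items.push(d);
  }
  dequeue(): Delivery | undefined {
    return this.items.shift();
  }
  get length(): number {
    return this.items.length;
  }
}

// Request path: record the work and return right away, so the HTTP
// response does not wait on thousands of outbound deliveries.
function handlePublish(queue: DeliveryQueue, activityId: string, inboxes: string[]): number {
  for (const inbox of inboxes) {
    queue.enqueue({ inbox, activityId });
  }
  return 202; // Accepted: delivery happens later
}

// Worker path: runs outside the request-response cycle and sends
// at most `max` deliveries per call, which also bounds the send rate.
function drain(queue: DeliveryQueue, send: (d: Delivery) => void, max: number): number {
  let sent = 0;
  while (sent < max) {
    const next = queue.dequeue();
    if (next === undefined) break;
    send(next);
    sent++;
  }
  return sent;
}
```

In this shape, the request duration depends only on how long it takes to enqueue, and the `max` argument gives the worker a simple form of throttling.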
-
Noah Kennedy replied to Jenniferplusplus
@jenniferplusplus @thisismissem @hrefna @kissane @fediversereport @hongminhee i understand now, i replied elsewhere in the reply tree
testing the impact of moving this out of the critical path makes a lot more sense
-
Emelia replied to Jenniferplusplus
@jenniferplusplus @noah @hrefna @kissane @fediversereport @hongminhee yup, exactly.
-
Noah Kennedy replied to Noah Kennedy
@jenniferplusplus @thisismissem @hrefna @kissane @fediversereport @hongminhee as soon as i saw her clarification i understood what she was doing, i've done the same thing before a few times
-
Hrefna (DHC) replied to Emelia
Yeah, I get that, I think my confusion is it feels to me a bit like saying:
"If you are going across the water, you can cross Lake Pontchartrain faster in a motorboat than a cybertruck"
You absolutely can. But even if you couldn't, does it matter?
-
Marco Rogers replied to Emelia
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee I appreciate this more in depth discussion. I just reread the post and the comments. They don't mention what they are autoscaling on. I agree that node services are a different beast than systems that default to blocking I/O. Knowing that they didn't have queueing to create throttling and back pressure, I have other suspicions about what went wrong.
-
Marco Rogers replied to Marco Rogers
@thisismissem @hrefna @jenniferplusplus @kissane @fediversereport @hongminhee to be fair though, I really wanna talk about my other question. The current design of these federation protocols is way too resource intensive for what should be a relatively small network. 5000 followers is nothing compared to what will happen if this gets bigger. I'm not convinced that it's a matter of just adding some queues and optimizing some code.
-
Hrefna (DHC) replied to Marco Rogers
100% agreed. These protocols are all a trashfire when it comes to efficiency and they really don't have to be.
@thisismissem @jenniferplusplus @kissane @fediversereport @hongminhee
-
Marco Rogers replied to Hrefna (DHC)
@hrefna @thisismissem @jenniferplusplus @kissane @fediversereport @hongminhee I know I'm being annoying about it. I'm at the edge of my depth in terms of being able to describe a better system. I just have enough experience trying to scale things to know that having problems at this amount of data is a really bad sign. I want somebody to talk to me about the fediverse version of the Justin Bieber problem.
-
@thisismissem @johnonolan @kissane @fediversereport
+1 object signatures
I was saying at TPAC last week, "I sure wish the old W3C SocialWG process/chairs didn't require scrubbing object signatures from the 2017 Candidate Recommendation from the final TR"
ActivityPub
The ActivityPub protocol is a decentralized social networking protocol based upon the [ActivityStreams] 2.0 data format. It provides a client to server API for creating, updating and deleting content, as well as a federated server to server API for delivering notifications and content.
(www.w3.org)
-
Jenniferplusplus replied to Marco Rogers
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
No, you're entirely right. This is a big part of what puts me off from efforts to make solo servers more common. If we can make this network a good place to be, then it could grow 100x, maybe 1,000x. But the design of the protocol and nodes will make that infeasible in terms of performance and cost. We have the most leeway to optimize within nodes, and the necessity of scaling will push nodes to become larger.
-
Jenniferplusplus replied to Jenniferplusplus
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
Designing for scale between nodes, at the protocol level, is made complicated by the protocol itself, for the same reasons that also complicate baseline interop. The protocol doesn't put bounds or requirements on anything. It just sort of envisions a kind of universal social graph, over which apps can tailor unique views. But it hand-waves on how to actually accomplish that, and even how to coordinate doing it later.
-
Jenniferplusplus replied to Jenniferplusplus
@polotek @hrefna @thisismissem @kissane @fediversereport @hongminhee
There are many very reasonable optimizations that could be made, and would be conformant with the spec. But that really just means it's not prohibited, not that it's supported. You could batch messages, for instance. That's a valid thing to do in AP docs. But I would have no faith that any other software would know what that means or even be able to deserialize it in a useful way. So, no batching.
It's the same in so many ways.