How Decentralized Is Bluesky Really?
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
This may seem like an enormous aside, but it isn't. The big sell currently is that "you don't need to run a relay because you can run your own PDS!" but as I have illustrated here, the distribution and syndication power dynamics matter a lot.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
So. It isn't enough to self-host your own PDS. Whether or not people can run their own relays/appviews/etc actually matters *a lot* if we want this stuff to survive.
So, can we? How hard is it to run your own AppView/Relay/etc?
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Today, only one organization runs a Relay that really matters, or an AppView that people use for anything other than fun aggregation of statistics. There is nothing that resembles meaningful decentralization of the network. It's all run by one company: Bluesky.
But could we change that?
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
People are trying; most notably alice has done some great work recently: https://alice.bsky.sh/post/3laega7icmi2q
So now someone *can* run their own Relay (not the AppView yet, but maybe soon), and we're getting a sense of the cost and scale. This is good news; we didn't know before.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
In fact we also have an idea of the rate of growth. Approximately 4 months prior, @bnewbold.net posted an article detailing how to run a Bluesky relay: https://whtwnd.com/bnewbold.net/entries/Notes%20on%20Running%20a%20Full-Network%20atproto%20Relay%20(July%202024)
This is great. We need more people trying to do so to get a sense of how decentralized things can be.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Just focusing on storage: in July, @bnewbold.net estimated that running a Bluesky relay would take approximately 1 terabyte of storage. Just 4 months later, at the start of this month (November), alice estimates nearly 5 terabytes.
This is a fast growth rate and this is *before* the big post-election influx.
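As a rough back-of-the-envelope check on that growth rate, here is a minimal Python sketch. It assumes, purely for illustration, that the roughly 1 TB (July) and nearly 5 TB (November) estimates can be compared directly, and that growth between them was either steady or compounding. Neither assumption comes from the posts being cited, and the function names are made up for this example.

```python
import math

# Storage estimates quoted in the thread (terabytes), about 4 months apart.
JULY_TB = 1.0
NOVEMBER_TB = 5.0  # "nearly 5 terabytes", rounded up here (assumption)
MONTHS_BETWEEN = 4


def linear_growth_per_month(start_tb, end_tb, months):
    """Average terabytes added per month, assuming steady growth."""
    return (end_tb - start_tb) / months


def compound_growth_factor(start_tb, end_tb, months):
    """Monthly multiplier, assuming growth compounds (an assumption)."""
    return (end_tb / start_tb) ** (1 / months)


linear_rate = linear_growth_per_month(JULY_TB, NOVEMBER_TB, MONTHS_BETWEEN)
monthly_factor = compound_growth_factor(JULY_TB, NOVEMBER_TB, MONTHS_BETWEEN)
# Time for storage to double if the compounding reading held.
doubling_months = math.log(2) / math.log(monthly_factor)

print(f"Linear reading: about {linear_rate:.1f} TB added per month")
print(f"Compounding reading: about x{monthly_factor:.2f} per month")
print(f"Doubling time under compounding: about {doubling_months:.1f} months")
```

On either reading, the storage estimate multiplied roughly fivefold in four months; which model fits better cannot be decided from two data points.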
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
I tried estimating how much this would cost. As a lazy approximation, I priced out a machine with 5 terabytes of storage on Linode to see what self-hosting would cost, and it came to approximately $55k a year: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
That's a lazy estimate, but that's also what many people make in the US every year.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
However, @bnewbold pointed out, correctly, that there were cheaper options available. Even using Linode's block storage would make the storage component cheaper (but still expensive), and this is true: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
In fact, @bnewbold and alice had gotten the server down to close to $200/month in their estimates, much cheaper than mine, by choosing a dedicated server plan.
But there's a problem: that's cheap because you've got a server that has a dedicated disk...
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Even if we take the dedicated hosting provider that @bnewbold suggested in July and scale the cost to the pre-election storage requirements, we are adding a massive amount of cost every month: over $400/month more.
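To put the figures quoted so far on the same footing, the sketch below converts them to both monthly and yearly amounts. It assumes US dollars throughout, and it assumes that the "over $400/month more" is added on top of the roughly $200/month dedicated-server estimate, which is one reading of the posts rather than something they state outright. The variable and function names are illustrative only.

```python
MONTHS_PER_YEAR = 12


def to_yearly(monthly_usd):
    """Convert a monthly cost to a yearly cost."""
    return monthly_usd * MONTHS_PER_YEAR


def to_monthly(yearly_usd):
    """Convert a yearly cost to a monthly cost."""
    return yearly_usd / MONTHS_PER_YEAR


# Figures as quoted in the thread; all are rough estimates.
linode_yearly = 55_000        # lazy Linode estimate, per year
dedicated_monthly = 200       # "close to $200/month"
extra_storage_monthly = 400   # "over $400/month more", so a lower bound

# Assumption: the extra storage cost stacks on the dedicated-server price.
scaled_monthly = dedicated_monthly + extra_storage_monthly

estimates_monthly = {
    "Lazy Linode estimate": to_monthly(linode_yearly),
    "Dedicated server": dedicated_monthly,
    "Dedicated server, scaled storage": scaled_monthly,
}

for label, monthly in estimates_monthly.items():
    print(f"{label}: ~${monthly:,.0f}/month, ~${to_yearly(monthly):,.0f}/year")
```

Because the source says "over" $400/month, the scaled figure here is best read as a floor rather than an estimate.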
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
But worse, we have reached the limits of what is possible to do with a dedicated server. We *have to* move to abstracted storage from this point forward because we're starting to hit the limits of what's offered for cheap dedicated storage on one machine. And this number will only grow, and as said previously, is growing at an enormous rate.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
I have spent a lot of time focusing on the cost of storage, but storage is only one of the costs involved. So far, these estimates have been made against servers that *nobody is actually using*. The cost of servers that people are using will be much higher, because more needs to happen than just storing things.
And that is not even to mention the challenges of administration: dealing with takedown requests, illegal content, etc., which are probably much more serious.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Let's take a break. The analysis of server costs is boring and I don't like doing it, and I'm sure people will throw at me the absolute race-to-the-bottom hosting numbers they can find to store and run all this stuff, but really that's not interesting to me.
Let's do a comparison.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Remember that the idea of "fully self-hosting" on Bluesky/ATProto at this point is primarily abstract; nobody is really doing it. But of course there's a place where tens of thousands of people are running their own servers for millions of users, and that's the fediverse/ActivityPub.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
As said, tens of thousands of people are self-hosting *today*. Fediverse software doesn't just scale up, it scales *down*.
GoToSocial is light enough on resources that you can run it for family and friends on a Raspberry Pi or a spare laptop you have sitting around.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Now you're hitting the point in this thread where some of you may be thinking "aha! this is where Christine is saying that the fediverse/activitypub are awesome and atproto is terrible!"
You have NO IDEA HOW MUCH I CRITICIZE THE FEDIVERSE. I do it all the time, and I will later here.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
The fediverse has a lot of flaws. Oh trust me, we're gonna get to that.
But comparison-wise: what I mean to say is that architectural decisions matter, and scaling up isn't the only thing that's important, *scaling down matters too*.
If you care about decentralization, anyway.
-
Jason Lefkowitz replied to Christine Lemmer-Webber
@cwebber This is great. Thank you for taking the time to write it all up.
One thing I like about this space is that people on all sides seem collegial. They acknowledge the issues with their own solution and praise the strengths of others. You don't get that much anymore.
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
Now look, we're about 1/3 of the way done here. There's a lot more to say, and a lot more said in my article; it's about 24 pages long if you print it out.
This is because in the age of TikTok I somehow have decided to model myself after David Foster Wallace, sorry.
"Consider the Fediverse" I guess
-
Christine Lemmer-Webber replied to Christine Lemmer-Webber
But now, I will break for lunch. Enjoy your intermission because I will be back. We still have to get through the remaining 2/3 of the analysis, after all.
======= LUNCH BREAK HERE =======