How Decentralized Is Bluesky Really?
-
Christine Lemmer-Webberreplied to Christine Lemmer-Webber last edited by
I participated a bit in the process back when Bluesky was Jack Dorsey and Parag Agrawal's personal project. I also believe Jack and Parag were sincere about Bluesky as a decentralized social network protocol that Twitter would adopt, which is the directive that Bluesky was given as an organization.
When Jay Graber was awarded the position to lead Bluesky, I was not surprised. To me, Jay was the obvious choice to deliver what Bluesky was being directed to do, and I do think Jay is an excellent leader.
-
There is also something which Bluesky gets right which the fediverse does not. I mentioned that Bluesky uses decentralization *techniques*, and the most important of those is content-addressing. This allows content to exist even when a server goes down.
This is a great decision and I have advocated that the fediverse do so as well. In fact several years ago I wrote a demo in @spritely's early days showing off how one could build a content-addressed ActivityPub in a spec-compatible way.
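To make the content-addressing idea concrete, here is a minimal sketch in Python. It is an illustration only and not ATProto's actual record format or my old demo: the `content_address` helper, the `Store` class, and the `sha256:` prefix are all hypothetical names chosen for this example. The point it shows is that when content is named by the hash of its bytes, any server holding a copy can serve it, and the reader can verify the copy without trusting that server.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Hypothetical helper: name content by the SHA-256 hash of its bytes."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


class Store:
    """Toy stand-in for any server that holds content keyed by address."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        addr = content_address(data)
        self._blobs[addr] = data
        return addr

    def get(self, addr: str):
        return self._blobs.get(addr)


def fetch(addr: str, stores):
    """Try each store in turn; accept only bytes that hash to the address."""
    for store in stores:
        data = store.get(addr)
        if data is not None and content_address(data) == addr:
            return data
    return None


post = b"hello fediverse"
origin, mirror = Store(), Store()
addr = origin.put(post)
mirror.put(post)

# Simulate the origin server going down by replacing it with an empty store.
origin_down = Store()
recovered = fetch(addr, [origin_down, mirror])
```

In this sketch `recovered` is the original post even though the origin no longer has it, because the address depends only on the bytes and not on where they live.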
-
So I have opened here with the things that Bluesky does well. As you may guess, we are about to move into critique territory, and it's a lot of critiques from a *decentralization*/*federation* perspective. It doesn't erase the "credible exit" goals, which I still think are good.
Let's dive in...
-
A frequent way of describing Bluesky's decentralization, including by Bluesky's team, is "it's like a bunch of blogs (Personal Data Stores), and then the relay/appview/etc pieces are like search engines."
This is a reasonable starting point for thinking about things, so let's run with it.
-
In fact ATProto's own tutorial even says "Think of our app like a Google": https://atproto.com/guides/applications
And indeed this is a good way to think about things. But it doesn't seem so bad, because we have Personal Data Stores like blogs, so probably things are fine, right?
-
While most people would argue that blogs and websites are open, few would argue that *Google* is open. So this is a curious place to begin thinking, and yet structurally, it is actually quite apt.
PDS'es are like blogs, the rest is like Google. But relays/appviews/etc do a lot *more* than Google.
-
Relays, AppViews, etc don't just index information. Blogs and their interactions are generally slow-moving, but social media is direct and responsive. Notifications and fast interactions are key. So search engines, yes, but we should also think of these components as doing much more.
-
But let's stay on this blog/search engine analogy for a while before we unpack what it means on a *technical* level, which is interesting. Let's analyze for the moment from a power dynamics level.
Building a web search engine is actually pretty easy these days; you can do so with off-the-shelf tools. And yet there are only a couple of search engines *really*: Google and Bing (DDG mostly uses Bing). The information is right there. *Anyone* could run their own engine. Why don't they?
-
Furthermore there is an interesting connection between blogs and social media: the death of blogs + feed aggregation directly aligns with the death of social media.
How many of you were around for the birth and awkward death of blog engine feeds? Because I was! Oh, remember Google Reader?
-
Feed readers are also simple, and in fact they were even easy to self-host, even on the desktop! But Google Reader came in and was such a good design that everyone used it.
When it went away, blogs were still *there*. But blogging as a *syndication medium* died. One big player left, and it's gone.
-
This was sad for me especially; my favorite medium on the internet ever was webcomics. Webcomics still exist, sort of, but the loss of independent publishing and aggregation meant that they had to change to survive.
Webcomics started to be shaped to fit Twitter's image box.
-
This may seem like an enormous aside, but it isn't. The big sell currently is that "you don't need to run a relay because you can run your own PDS!" but as I have illustrated here, the distribution and syndication power dynamics matter a lot.
-
So. It isn't enough to self-host your own PDS. Whether or not people can run their own relays/appviews/etc actually matters *a lot* if we want this stuff to survive.
So, can we? How hard is it to run your own AppView/Relay/etc?
-
Today, there is only one real organization running a Relay that really matters or an AppView that people use for anything other than fun aggregation of statistics. Nothing that resembles meaningful decentralization of the network. It's all run by one company: Bluesky.
But could we change that?
-
People are trying; most notably alice has done some great work recently: https://alice.bsky.sh/post/3laega7icmi2q
So now someone *can* run their own Relay (not the AppView yet, but maybe soon), and we're getting a sense of the cost and scale. This is good news; we didn't know before.
-
In fact we also have an idea of the rate of growth. Approximately 4 months prior, @bnewbold.net posted an article detailing how to run a Bluesky relay: https://whtwnd.com/bnewbold.net/entries/Notes%20on%20Running%20a%20Full-Network%20atproto%20Relay%20(July%202024)
This is great. We need more people trying to do so to get a sense of how decentralized things can be.
-
Just focusing on storage: in July, @bnewbold.net estimated that running a Bluesky relay would take approximately 1 terabyte of storage. Just 4 months later, at the start of this month (November), alice estimates nearly 5 terabytes.
This is a fast growth rate and this is *before* the big post-election influx.
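To put a rough number on that growth rate, here is a small back-of-the-envelope sketch in Python. It uses only the two estimates above (about 1 TB in July, nearly 5 TB four months later, rounded here to 5). Two data points cannot tell us the shape of the curve, so both the compound and the linear model below are assumptions, and the variable names and the `project` helper are made up for this illustration.

```python
import math

# The two storage estimates from the thread, in terabytes.
tb_july = 1.0
tb_november = 5.0  # "nearly 5 terabytes", rounded up for simplicity
months_elapsed = 4

# Assumption 1: growth compounds monthly at a constant rate.
monthly_factor = (tb_november / tb_july) ** (1 / months_elapsed)
doubling_months = math.log(2) / math.log(monthly_factor)

# Assumption 2: growth is linear at a constant number of TB per month.
linear_tb_per_month = (tb_november - tb_july) / months_elapsed


def project(months_ahead: int, model: str) -> float:
    """Extrapolate storage from November under one of the two assumed models."""
    if model == "compound":
        return tb_november * monthly_factor ** months_ahead
    return tb_november + linear_tb_per_month * months_ahead


print(f"compound: x{monthly_factor:.2f} per month, "
      f"doubling every {doubling_months:.1f} months")
print(f"linear: {linear_tb_per_month:.1f} TB per month")
print(f"12 months out: {project(12, 'compound'):.0f} TB compound, "
      f"{project(12, 'linear'):.0f} TB linear")
```

Under the compound assumption storage grows by roughly half again every month; under the linear assumption it grows by 1 TB a month. The two diverge enormously within a year, which is why more measurements from people actually running relays matter.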
-
I tried estimating how much this would cost; as a lazy approximation I priced out a 5 terabyte machine on Linode to see what self-hosting would cost, and it came to approximately $55k a year: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
That's a lazy estimate, but that's also what many people make in the US every year.
-
However, @bnewbold pointed out (correctly!) that there were cheaper options available. Even using Linode's block storage would be cheaper (but still expensive) for the storage component, and this is true: https://bsky.app/profile/dustyweb.bsky.social/post/3lah5n3kld42q
-
In fact @bnewbold and alice had gotten the server down to close to $200/month in their estimate by choosing a dedicated server plan. Much, much cheaper than my number!
But there's a problem: that's cheap because you've got a server that has a dedicated disk...