Ozone, Bluesky's stackable moderation system, is up and open-sourced. https://bsky.social/about/blog/03-12-2024-stackable-moderation
I think it's interesting in obvious ways and risky in some less obvious ones (that have less to do with "O NO BILLIONAIRES" ...
-
Erlend Sogge Heggen replied to Erin Kissane
@kissane didn’t think you were anyway, excitedly looking forward to your post(s) on this subject!
-
@tchambers For the Bluesky app, it's the Bluesky moderation team, which is pretty active in the classical/centralized mode.
Bluesky 2023 Moderation Report - Bluesky
We have hired and trained a full-time team of moderators, launched and iterated on several community and individual moderation features, developed and refined policies both public and internal, designed and redesigned product features to reduce abuse, and built several infrastructure components from scratch to support our Trust and Safety work.
Bluesky (bsky.social)
Layers-wise, I think there are two things—the official Bluesky client enforces the use of Bluesky's in-house moderation work (at the level of the labeler) but then the network-wide, truly non-optional stuff—which is very limited in scope—happens at the level of the relay:
-
@tchambers So although there's a lot of concern across fedi about Bluesky "not moderating," they actually do both takedowns and de-indexing for the Bluesky platform itself *and* they're inserting a centralized kill switch for the worst-of-the-worst stuff into the network architecture, which would prevent the whole "we just defederate from CSAM" situation we have here.
It's not difficult to think of a variety of failure modes for this architecture, but I think they're interesting trade-offs!
-
@kissane @tchambers You are correct, but I think there are things to like about the “we just defederate…” model. It demonstrably provides a powerful incentive for the people who control instances not to let them turn into Nazi bars.
-
Erin Kissane replied to Tim Bray
@timbray @tchambers Agreed.
My main source of frustration about the cross-protocol conversations is that their underlying assumptions about the way AP/fedi and ATP/Bluesky work are kind of freely floating above the observed and stated realities of both protocols. There are lots of reasons for this, but it's getting in the way of crunchier thinking, imo.
edit: meaningful typo fix
-
@kissane Yes, I did know that centralized base moderation is handled by Bluesky staff, but I'm not sure at which part of the Bluesky / AT ecosystem… It can’t be at PDS servers as those swap out, as do the crawler indexers, so where does centralized non-optional moderation occur? Maybe at the app layer?
-
@tchambers For just Bluesky-the-app, as opposed to other potential future ATP apps?
Looking at the whitepaper plus Paul's most recent comments, I *think* it's labelers hardwired into the official client app(s) and actioned at the App View layer plus takedowns at the Bluesky-run relay and Bluesky-run PDSes, right?
Caveat that my understanding is highly fragmented. (As are their docs and explanations, bless them.)
-
@kissane I *THINK* you and I see it the same way, but I personally am still feeling like I have a 20 percent or more chance of being wrong. Wish they would clearly state that.
Side question related to that: So if Gab.com built its own full AT Protocol coalition of services (its own app, own labeler, own PDSes etc.) could the Bluesky service "defederate" entirely from GabSky?
If so, would that also be done at the Bluesky service's app?
-
jonny (good kind) replied to Tim Chambers
@tchambers
@kissane
"Can you opt in to labeling" is the whole tension of labeling for content moderation - the answer necessarily has to be "no" at some level or else it wouldnt work, ppl posting harmful shit dont want it to be labeled as harmful, but then it becomes a vector for abuse, as when eg. the Gab system labels all queer people for targeting. They have a view on abuse that "if you dont see it, then its not abuse" - so eg. here you dont see labels from labelers you dont subscribe to: https://github.com/bluesky-social/atproto/blob/a0625ba40f3d66528488a18f6d32854d9165e191/packages/api/src/moderation/decision.ts#L262
But it does look like any labeling service can label any event - there isnt a good way of making that opt in on the protocol. Im going to set up a test instance later and try and label myself and see what happens in my PDS.
The risk is that identity is so cheap that there doesnt need to be a stable "gab" labeling service - if it gets blocked at an infra level by atproto, cool, signal through a side channel to your followers youre minting a new DID and block successfully evaded. So it is effectively impossible to stop labels as designed from being used as abuse vectors.
I raised this last June and Paul did respond once and from a first look it doesnt seem like any changes were made https://github.com/bluesky-social/proposals/issues/19
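A minimal sketch of the two points above, under stated assumptions: that a client only applies labels from labelers the viewer subscribes to, and that an infrastructure-level block keyed on a labeler's DID stops working once that labeler mints a new DID. The `Label` fields, the function names, and every `did:example:` identifier are hypothetical and are not taken from the atproto codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    src: str  # DID of the labeler that issued the label (hypothetical field name)
    uri: str  # subject the label is attached to
    val: str  # label value, e.g. "spam"


def visible_labels(labels, subscribed):
    """Client side: keep only labels from labelers the viewer subscribes to."""
    return [label for label in labels if label.src in subscribed]


def accepted_labels(labels, blocked_dids):
    """Infra side: drop labels whose issuer DID is on a blocklist."""
    return [label for label in labels if label.src not in blocked_dids]


subject = "at://did:example:alice/post/1"
labels = [
    Label(src="did:example:trusted", uri=subject, val="spam"),
    Label(src="did:example:hostile", uri=subject, val="target"),
    # Same hostile operator after minting a fresh DID.
    Label(src="did:example:hostile-2", uri=subject, val="target"),
]

# A viewer subscribed only to the trusted labeler never sees the hostile labels.
shown = visible_labels(labels, {"did:example:trusted"})

# Blocking the first hostile DID does nothing about the second one.
passed = accepted_labels(labels, {"did:example:hostile"})
```

In this toy model `shown` contains only the "spam" label, while `passed` still carries the "target" label issued under the new DID, which is the evasion path described above.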
-
Right. I don't think they've fully thought through the implications of the underlying design. I asked @bnewbold over there if they had done threat modeling but didn't get a response, oh well.
@tchambers @kissane that's also how I see it with Bluesky-run relays and PDS's, but they've also said that it's only illegal content and spam. Masnick's paper talked about *not* removing Alex Jones at this level. So it's not clear that Gab would have to have their own PDSs or relay. (1/N)
-
Erin Kissane replied to jonny (good kind)
@jonny Yep, shared moderation lists are absolutely a potential attack vector. It's one of the most bullet-biting hard tradeoffs of locked-open + decentralized shared mods layer.
Blocking isn't a strong way to prevent regular user lists from being used adversarially either—I see your issue suggesting otherwise, but I have near-zero faith in blocking as a mitigation for the kind of brigading you describe.
(cont'd)
-
@jonny I guess for me, the question becomes whether other parts of their model can mitigate the harms concentrated by adversarial use of lists and shared mods.
(My current suspicion is that without strong place boundaries, it's always going to be an arms race, but we also haven't really seen "arms race" on this exact configuration of semi-decentralized services yet and there are a lot of variables.)
-
jonny (good kind) replied to Erin Kissane
@kissane
Totally agree. And the labels are a different vector than lists alone too since they are applied to the post/account itself, rather than the post/account being indexed and etc. Also agree that blocking is at best a reactive measure, even if identity had more friction. I think youre right on diagnosing lack of place as the core of it, and its a really nasty downside of "frictionless all-to-all platform" as design goal. Fedi fiefdoms are not great, but having no sense of place doesnt feel like an alternative either.
-
On the other note, I think the "illegal content and network abuse only" refers to the moderation that extends beyond Bluesky-the-reference-app/platform, in a larger future system.
Bluesky as a platform—which is what I *think* Tim and I were discussing—does takedowns and deletions for lots of things that don't rise to that level, and the team talks about that in their moderation report and other places. (I know you know this, I just want to try to keep the thread clear-ish.)
-
Erin Kissane replied to jonny (good kind)
@jonny Let us not even begin to speak of Nostr
-
@kissane @jonny I think on labelling it won't actually make a possible "list of targets" since it's never "filter in this stuff I don't follow" but "filter out this stuff I might see"
So because it's subtractive, you don't know the content you don't know, as an end user. Yeah, the label operator would have a list of accounts / hashtags / etc to monitor, but that'd be internal information to them.
-
@thisismissem @kissane @jonny "Feed generators" get those labels and can opt posts in based on that.
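A rough sketch of the distinction in the last two posts, assuming labels arrive as a mapping from post URI to a set of label values: the same data can be used subtractively (hide labeled posts from a timeline) or additively (a feed that selects posts because they carry a label). The function names, URIs, and label values are all hypothetical and do not describe how Bluesky feed generators are actually implemented.

```python
def filter_out(posts, labels_by_uri, hidden_vals):
    """Subtractive use: drop any post carrying one of the hidden label values."""
    return [uri for uri in posts if not (labels_by_uri.get(uri, set()) & hidden_vals)]


def opt_in(posts, labels_by_uri, wanted_vals):
    """Additive use: keep only posts carrying one of the wanted label values."""
    return [uri for uri in posts if labels_by_uri.get(uri, set()) & wanted_vals]


posts = ["at://a/1", "at://b/1", "at://c/1"]
labels_by_uri = {
    "at://b/1": {"spam"},
    "at://c/1": {"some-tag"},
}

# A viewer hiding "spam" loses one post and never learns which accounts were labeled.
timeline = filter_out(posts, labels_by_uri, {"spam"})

# A feed built on "some-tag" surfaces exactly the labeled posts.
feed = opt_in(posts, labels_by_uri, {"some-tag"})
```

The additive path is the one that can turn a label into a list of targets, since the output of `opt_in` is by construction the set of labeled posts.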
-
@kissane They've certainly thought it through a lot more than ActivityPub and Mastodon did at the equivalent stage! Bryan said they've done red-teaming, perhaps that included threat modeling as well. If so it'd be a first, no social network that I know of has ever done this early in their lifecycle (or for that matter later). Time will tell.
-
Caspar C. Mierau replied to Erin Kissane
@kissane @joshwayne @mergesort BlueSky - as a commercial company that is going to earn money from what they build - makes it obvious that they consider a central moderation instance as bad because it is like a "Supreme Court". Which is by itself already a strange way of criticism. What they don't say here is: moderation costs money. Yes, it does. And social network platforms hate paying people for this hard job - which is necessary, and it is just fair that they do their job on a platform where they also earn money from users generating content. The result is an ecosystem where it is ok to be harassed as you are free to move to another instance. This is just making a bad system worse and selling it as a new technical feature. If BlueSky would finally agree on its responsibility, build a well-paid moderation team and then introduce "composable" moderation, yes, that would be fine. As it would be an add-on. But this is cost reduction by technical implementation.
When Jack initially announced BlueSky the first (!) point he made was the following:
»First, we’re facing entirely new challenges centralized solutions are struggling to meet. For instance, centralized enforcement of global policy to address abuse and misleading information is unlikely to scale over the long-term without placing far too much burden on people.«
So he argues that being a responsible company that is obliged to international laws - and its users - is a "burden on people". Well: the people here are the stakeholders of billion dollar platforms. And that is what BlueSky is the solution to.
I would have loved to see a blue sky in BlueSky but besides looking nice I mainly see a platform that aims towards deregulation.
https://bsky.social/about/blog/4-13-2023-moderation
https://twitter.com/jack/status/1204766082206011393