Hi Fediverse denizens. I've been working on a project I hope will help Fediverse devs make software that federates across ALL services, not just Mastodon-plus-a-few-others.

Laurens Hof

a: this is a super valuable tool, very helpful, it's great you're building it!

b: I genuinely do not think the fediverse can exist if we take seriously the idea that surfacing ActivityPub data formats requires opt-in consent. Taking that idea to its logical conclusion means that it's literally impossible to build a federating AP server.

Emelia 👸🏻

@YakYak_OG @darius my understanding is that as the software is just recording the shape of the protocol level messages, the actual contents of the messages (the free speech) is completely irrelevant.

i.e., it doesn't matter what the contents are or who said it because the software ignores that. It records “<string>” or “<uri>” or whatever.
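A minimal sketch of the "record the shape, not the contents" idea described above. This is not the project's actual code: the function name `skeleton`, the placeholder labels other than “<string>” and “<uri>”, and the choice to keep `type` values verbatim are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumption: "type" values (Create, Note, ...) describe the protocol shape,
# not anyone's speech, so this sketch keeps them verbatim.
KEEP_VERBATIM = {"type"}


def _looks_like_uri(text):
    """Return True for http(s) strings that have a host part."""
    try:
        parsed = urlparse(text)
    except ValueError:
        return False
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def skeleton(value):
    """Recursively replace leaf values with a placeholder naming their kind."""
    if isinstance(value, dict):
        return {
            key: child
            if key in KEEP_VERBATIM and isinstance(child, str)
            else skeleton(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [skeleton(child) for child in value]
    if isinstance(value, bool):  # check bool before numbers: bool is an int in Python
        return "<boolean>"
    if isinstance(value, (int, float)):
        return "<number>"
    if value is None:
        return "<null>"
    if isinstance(value, str):
        return "<uri>" if _looks_like_uri(value) else "<string>"
    return "<unknown>"


activity = {
    "type": "Create",
    "actor": "https://example.social/users/alice",
    "object": {"type": "Note", "content": "hello", "sensitive": False},
}
print(skeleton(activity))
```

Who posted and what they said never survive this transformation; only key names, nesting, and value kinds do.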

Emelia 👸🏻

@darius I think this is going to be fundamental as a tool for compatibility between Fediverse software as the network grows. Understanding what is being sent, schema-wise, is essential to being able to handle it.

I also wouldn't mind running this as a proxy in front of a server, where I can submit information about activities I receive, but anonymizing everything before forwarding to the collector.

Scott Feeney

@darius please hand off Hometown to another maintainer! It's cool you're moving on to these new, bigger projects, but it's sadly ironic that those of us who liked your ideas about social media the most — and therefore started Hometown servers — are left stuck without updates or bug fixes.

Firecat

@darius great tool but the Mastodon developers do not want to comply with other rules for different federations and ActivityPub standards. Mastodon will only work in Mastodon. It’s the reason why you won’t see Misskey posts in Mastodon or Pixelfed photos, the Mastodon developers deliberately ignore everything outside the Mastodon community.

Hugh

@darius @djsundog Is this how a new software would be added? e.g. if I wanted the Observatory to include BookWyrm, I'd convince a BookWyrm server admin to join the relay?

Darius Kazemi

@unchartedworlds means a lot coming from someone on the scicomm server!!!

Darius Kazemi

@graue yes, I want to do this and have plans for it, but I needed to get this out first

Darius Kazemi

@hugh @djsundog yes! Though BookWyrm would also need to support relays

Deborah Pickett

@darius How do you counteract poisoning of the data by bad actors? Do you have a blocklist of peers or addresses to discard data from? Can you remove poisoned anonymized datapoints if they’re discovered long after collection?

Does the relay work with peers who have AUTHORIZED_FETCH turned on? Most relays don’t.

The Nexus of Privacy

@darius thanks for taking the time to write it up so thoroughly and get feedback! It seems like a great tool -- I was impressed by the demo at FediForum -- and I really appreciate you thinking so deeply about the privacy aspects and taking a consent-based approach. I certainly hope that this sets the bar for future projects.

Opt-in at the server level makes a lot of sense to me, and I like the specific approach you described in your reply to @djsundog ... it's a mechanism server admins are already familiar with. The discussion of how you can't leverage existing opt-in/opt-out signals makes it clear that trying to do so would compromise user privacy (and also exposes a limitation of the current design -- not your issue but something I hope developers think about).

Scrubbing the data is a great example of data minimization, and the example makes it easy to understand. The exceptions you list all seem very sensible.

A question about the additional opt-out mechanism ... does this do anything more than the admin undoing the opt-in by unsubscribing? If not, then it might be overkill ... although certainly nothing the matter with having an email-based opt-out as well.

Darius Kazemi

@futzle yes to blocklist and yes to removing poisoned data manually

No it doesn't work with authorized fetch, and I assume that a server with that turned on doesn't want my tool anywhere near them anyway

Irenes (many)

@darius okay! this seems solid to us. we do suspect it errs on the side of not capturing quite enough data to solve real compatibility problems, but we're supportive of resolving that iteratively by looking at the data captured this way, then studying what else it's useful to capture one thing at a time.

Darius Kazemi

@thenexusofprivacy @djsundog if I get any data from a server that's opted out via "announce leaking" I will instantly drop it instead of recording it
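A rough sketch of that "drop instead of record" check, under stated assumptions: the helper names `_host` and `should_record` are hypothetical, and the sketch assumes the origin server can be read from the host of the `actor` and `object` URIs, which is how a leaked Announce from a subscribed server would point back at a server that never opted in.

```python
from urllib.parse import urlparse


def _host(ref):
    """Extract the hostname from a URI string or an embedded object's "id"."""
    if isinstance(ref, dict):
        ref = ref.get("id")
    if not isinstance(ref, str):
        return None
    try:
        return urlparse(ref).hostname
    except ValueError:
        return None


def should_record(activity, opted_out):
    """Return False if the actor or the object originates from an opted-out host."""
    hosts = {_host(activity.get("actor")), _host(activity.get("object"))}
    hosts.discard(None)
    # No recognizable origin at all is treated as "drop", to err on the safe side.
    return bool(hosts) and hosts.isdisjoint(opted_out)


# Hypothetical example: a subscribed server announces a post from an opted-out one.
leaked = {
    "type": "Announce",
    "actor": "https://subscribed.example/actor",
    "object": "https://optout.example/notes/1",
}
print(should_record(leaked, {"optout.example"}))
```

The check runs before any scrubbing or storage, so a dropped activity leaves no trace in the collected data.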

The Nexus of Privacy

@darius got it. that makes sense then.

The Nexus of Privacy

@darius also with my pedantic hat on, your approach very much aligns with the principles in https://www.cell.com/patterns/fulltext/S2666-3899(23)00323-9, which is good! I'll update my original post to mention that.

Mike [SEC=OFFICIAL]

@darius I'd love to read the blog post but light text, black background is a massive accessibility problem for a lot of people. If you're able to incorporate some detection of user colour theme requirements into the styles for the project and its documentation that'd be extremely helpful. Thanks.

Rimu

@darius @nuintari For example, Lemmy uses Announce totally differently and with a different meaning and intent than Mastodon does. That information cannot be seen by looking at the JSON (the "results" that @nuintari was referring to).

As another example - in Lemmy a Note is a comment on a post and posts are shared over the wire using a Page object.

Still, a great tool and sorely needed. Thank you!

Steffo :deadlock_dynamo:

@darius opposite question!

will a relay explicitly for opting in be available?

some people may not want to join a public relay due to how heavy they are on server load, but may want to provide data to your project nonetheless…

Pumpkin Amber

@darius Hello! I am glad I found this post again, it got lost in my feed. I emailed you a couple of questions. I see you already answered the authorized fetch one. I bring the following counter argument though:

I run a queer instance, and I feel pretty confident with what you’ve shown off so far when it comes to your data “anonymization” because you’re stripping everything and just changing it to a skeleton highlighting types and the various nested structures. I’m okay with this, but authorized fetch is a protection against instances that pose a threat to my community. I can’t exactly disable it (even though I am completely okay with the collection method you outline) just to opt in. Would you consider looking into that in the future? (I don’t believe you need an entire instance implementation, just something that can sign the requests so my instance is happy).
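For context on what "something that can sign the requests" involves: servers with authorized fetch enabled expect an HTTP Signature on incoming GET requests. The sketch below only assembles the text to be signed and the resulting `Signature` header, in the older draft-style format that Mastodon-compatible servers commonly accept. The function names, the example URLs, and the key ID are hypothetical, and the RSA-SHA256 signing step itself is left out because it needs a cryptography library and a published actor key that the receiving server can look up.

```python
import base64
from urllib.parse import urlparse


def signing_string(method, url, date):
    """Build the text covered by the signature: request target, host, and date."""
    parsed = urlparse(url)
    target = parsed.path or "/"
    if parsed.query:
        target += "?" + parsed.query
    return "\n".join([
        f"(request-target): {method.lower()} {target}",
        f"host: {parsed.netloc}",
        f"date: {date}",
    ])


def signature_header(key_id, signature_bytes):
    """Format the Signature header from an already computed RSA-SHA256 signature."""
    encoded = base64.b64encode(signature_bytes).decode("ascii")
    return (
        f'keyId="{key_id}",algorithm="rsa-sha256",'
        f'headers="(request-target) host date",signature="{encoded}"'
    )


# Hypothetical request; signing this text with the collector's private key
# would produce the bytes passed to signature_header().
print(signing_string("GET", "https://queer.example/users/alice",
                     "Fri, 07 Jun 2024 20:51:35 GMT"))
```

The receiving instance fetches the public key named by `keyId` to verify the signature, so the collector would still need to serve at least a minimal actor document alongside the signer.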