Hi Fediverse denizens. I've been working on a project I hope will help Fediverse devs make software that federates across ALL services, not just Mastodon-plus-a-few-others.
-
Darius Kazemireplied to The Nexus of Privacy last edited by
@thenexusofprivacy @djsundog if I get any data from a server that's opted out via "announce leaking" I will instantly drop it instead of recording it
-
The Nexus of Privacyreplied to Darius Kazemi last edited by
@darius got it. that makes sense then.
-
The Nexus of Privacyreplied to The Nexus of Privacy last edited by
@darius also with my pedantic hat on, your approach very much aligns with the principles in https://www.cell.com/patterns/fulltext/S2666-3899(23)00323-9, which is good! I'll update my original post to mention that.
-
Mike [SEC=OFFICIAL]replied to Darius Kazemi last edited by
@darius I'd love to read the blog post but light text, black background is a massive accessibility problem for a lot of people. If you're able to incorporate some detection of user colour theme requirements into the styles for the project and its documentation that'd be extremely helpful. Thanks.
-
@darius @nuintari For example, Lemmy uses Announce totally differently and with a different meaning and intent than Mastodon does. That information cannot be seen by looking at the JSON (the "results" that @nuintari was referring to).
As another example - in Lemmy a Note is a comment on a post and posts are shared over the wire using a Page object.
Still, a great tool and sorely needed. Thank you!
-
Steffo :deadlock_dynamo:replied to Darius Kazemi last edited by
@darius opposite question!
will a relay explicitly for opting in be available?
some people may not want to join a public relay due to how heavy they are on server load, but may want to provide data to your project nontheless…
-
@[email protected] Hello! I am glad I found this post again, it got lost in my feed. I emailed you a couple of questions. I see you already answered the authorized fetch one. I bring the following counter argument though:
I run a queer instance, and I feel pretty confident with what you’ve shown off so far when it comes to your data “deanonymization" because you’re stripping everything and just changing it to a skeleton highlighting types and the various nested structures. I’m okay with this, but authorized fetch is a protection against instances that pose a threat to my community. I can’t exactly disable it (even though I am completely okay with the collection method you outline) just to opt in. Would you consider looking into that in the future? (I don’t believe you need an entire instance implementation just something that can sign the requests so my instance is happy). -
Pumpkin Amberreplied to Steffo :deadlock_dynamo: last edited by
@[email protected] @[email protected] this is one of the reasons why I suggested perhaps intercepting an instance's job queue. There are certain things that can’t be collected by a relay such as forwarded reports which are still just as important to handle.
-
Jen :TransButterfly: :3hearts: :Green:replied to Darius Kazemi last edited by
@darius Honestly, you could have just left this at "I'm not assuming that I am entitled to the content of every public post on the federated network" and I'd have already known you were more on the level than most scrapers and bridges. But you seem to have put a lot of effort into bot only not hanging on to the content of every post on the network, but into being transparent about the how and why. I'd give you a thumbs-up react, but most Masto servers don't support it. (So obviously I agree with your stated goal as well. :neobot_giggle:)
One thing that occurred to me is when a fedi instance uses custom forks. For example, Anarres.family uses a customized fork of Glitch Social (itself a fork of Mastodon), and the version number reported on our web interface, and presumably what your software will see, is
v4.4.0-alpha.1+glitch.anarres.family
, which is a pretty common practice for these forks, though not all of them include the full URL of the instance like ours does. So with your example of "we saw n polls from [software] x.y.z," in our case it would not actually anonymize us because it would display "50 polls by Masto...anarres.family." Not terribly critical since you're planning on scrubbing post data, so there's not much that could be shared that we'd likely want anonymized.You could get around that by truncating or obfuscating the version, and possibly rolling it in with the equivalent Masto or Glitch version, but that's relevant data; one of the things our fork adds is the emoji reaction code from Chuckya, so our instance is capable of sending and receiving AP messages that are not supported by other Glitch instances.
-
NeoDB Open Source Softwarereplied to Darius Kazemi last edited by
@darius nice idea. consider opt in with https://eggplant.place
meanwhile it would be nice to share your code url, host name, and nodeinfo once you are ready to open your service. much as you'd like to research others' schema, other admin/dev may want to know more about yours, by looking into nodeinfo and code.
-
@darius cool! I didn't dig in too deeply, so maybe you cover this, but a related helpful feature/tool might be knowing which software (libraries, implementations) would be able to parse data/schemas. getting in to https://caniuse.com/ territory, or things like ACID.
also often helpful if folks (eg, devs) can paste in data (JSON) and see how it validates, using the exact code being used to "observe", without that being captured and included in database.
-
Darius Kazemireplied to The Nexus of Privacy last edited by
@thenexusofprivacy thank you! I've been talking with Rob a lot
-
@ireneista 100% correct! That's the idea. Release a version that's very safe, see how useful it is, figure out the gaps, figure out how to safely address those gaps, do another public comment period, rinse and repeat
-
Darius Kazemireplied to Steffo :deadlock_dynamo: last edited by
@steffo yes actually I'm turning this server into a relay itself! So you can just join the relay in order to opt in. It'll be a kind of fake relay where data flows in (and gets scrubbed) but no data flows out
-
@puppygirlhornypost2 @steffo I'm open to that for future iterations if I can figure out how to do that as a safe opt in. But I'm also big on incremental development so I'm building this part first
-
@bnewbold yeah for sure, if I can create something lile caniuse that would be huge! That's been a goal of mine the whole time really though I'm not there yet
-
Darius Kazemireplied to NeoDB Open Source Software last edited by
@neodb correct and I will do so (I already have nodeinfo working)
-
Darius Kazemireplied to Jen :TransButterfly: :3hearts: :Green: last edited by
@SymTrkl yeah I've already noticed that! I considered scrubbing stuff past the + in the semver (so no additional data like most recent commit hash etc) but also kind of the point here is to find patterns from edge cases.
I wonder though, maybe what I'll do is anonymize extra metadata in the semver for enumerated software for a given schema until there are at least N servers emitting that schema.
-
@puppygirlhornypost2 I might be able to make it work as an opt in relay that DOES work with authorized fetch. We'll see
-
Veronika Cheplyginareplied to Darius Kazemi last edited by
@darius I think I support this but the scrolling thing on the side makes it difficult to read the thing