Excited to announce that I will be at #fediforum today speed demo-ing my latest project: an ActivityPub data observatory!
-
@darius nice! The only folks who *I* could imagine insisting on this being opt-in are Oracle's legal team, and they were told in no uncertain terms that this sort of data isn't even *eligible* for opt-out, even in the US of A.
-
Darius Kazemi replied to blaine
@blaine every morning I ask myself: "Am I going to do something today that Oracle's legal team won't like?" and if the answer is no I have already failed
-
@darius I'd say it's fine since it's not collecting user data. However, given how much sensitivity jerks have caused around this, I'd suggest an explanation page that uses some of your own posts as examples, with detailed explanations. And for usability/accessibility reasons, it should be in text, and with much higher contrast. Machine representations look forbidding to non-technical people anyhow, but especially so when dark and hard to read.
-
@williampietri Yes, sorry, this is something I whipped up in a few minutes for a microblog post and is not going to be what my macroblog post looks like
-
Emelia 👸🏻 replied to Darius Kazemi
@darius oooh, I see, <uri> isn't a placeholder for an actual value, it's just an indicator of the value type
-
Emelia 👸🏻 replied to Darius Kazemi
@darius I think capabilities comes from either pixelfed or gotosocial?
-
@darius it would probably be very useful to also run the data through a JSON-LD processor and flag e.g. URIs being serialised as strings, undefined properties, etc
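
To make the suggestion concrete, here is a rough Python sketch of the two checks mentioned (URIs serialised as plain strings, and undefined properties). It is not a real JSON-LD processor: `KNOWN_TERMS` is a tiny hand-written stand-in for a resolved @context, and `exampleLink`, `lint`, and the finding labels are hypothetical names made up for illustration. An actual implementation would expand the document with a JSON-LD library instead.

```python
import re

# Illustrative stand-in for a resolved @context (an assumption, not real data).
# True means the term is declared so that its values are IRIs ("@type": "@id").
KNOWN_TERMS = {
    "id": True,
    "type": False,
    "inReplyTo": True,
    "url": True,
    "content": False,
    "exampleLink": False,  # hypothetical extension term declared without "@type": "@id"
}

URL_RE = re.compile(r"^https?://\S+$")


def lint(doc: dict, terms: dict = KNOWN_TERMS) -> list:
    """Return (property, finding) pairs for the top-level keys of doc."""
    findings = []
    for key, value in doc.items():
        if key == "@context":
            continue
        if key not in terms:
            # A JSON-LD processor would silently drop this term on expansion.
            findings.append((key, "undefined-property"))
        elif not terms[key] and isinstance(value, str) and URL_RE.match(value):
            # Looks like a URI, but the context treats it as a plain string.
            findings.append((key, "uri-as-string"))
    return findings


if __name__ == "__main__":
    sample = {
        "@context": "https://www.w3.org/ns/activitystreams",
        "id": "https://example.social/notes/1",
        "exampleLink": "https://example.social/somewhere",
        "capabilities": {},  # treated as undefined only for the sake of this example
    }
    for finding in lint(sample):
        print(finding)
```

The sketch only looks at top-level keys; nested objects and per-server extension contexts would need the real expansion algorithm.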
-
Emelia 👸🏻 replied to Darius Kazemi
@darius that's why there's the new security context in mastodon: https://github.com/mastodon/mastodon/pull/31871
I think that's a backport candidate, as the context was used but not actually present in the @context's object
-
Evan Prodromou replied to Darius Kazemi
@darius can you compare to browser.pub?
-
Emelia 👸🏻 replied to Darius Kazemi
@darius given it's just collecting the unique shapes of data, I think it's perfectly fine to be opt-out, since there's no user data at all. (assuming you never store the raw data anywhere)
-
@erincandescent yeah! any pointers to things that can help me infer more stuff would be great.
-
Darius Kazemi replied to Evan Prodromou
@evan yeah. on browser.pub I can say "hey help me take a look at these particular messages I know about". This observatory will surface information about stuff floating around the fedi that I don't even know about. For example I am already learning about server software I've never even heard of, and I would not have put that into browser.pub because I wouldn't have known it existed
-
Evan Prodromou replied to Darius Kazemi
@darius Ah, OK, interesting. Where does your network tap plug in?
-
Darius Kazemi replied to Emelia 👸🏻
@thisismissem yes I am just storing the inferred type! I use https://github.com/triggerdotdev/schema-infer
-
Darius Kazemi replied to Evan Prodromou
@evan still figuring it out. Right now I am subscribing to a public relay as that is the most software-neutral source I could think of, but I am looking at other ingestion methods too. Importantly I want to ingest AP only... I'm not going to hit proprietary API endpoints like most scrapers do
-
Evan Prodromou replied to Darius Kazemi
@darius barf, no
-
the bus lane enthusiast replied to Darius Kazemi
@darius Hm. Am I reading it right you would be logging that person x made a post with URL y on date z?
that might interfere with some people's want to not have their posts seen off fedi; that info could be used against someone even if they delete it later. "why're you posting while on the clock" fer a basic example.
the "in reply to" field as well might expose the shape of who you talk to in a concerning way
edit: it's clear that I don't get it but will try again after coffee
-
@darius Are you tracking #goblin? https://indieweb.social/@goblin@goblin.band
-
@tchambers never heard of it!
-
Darius Kazemi replied to the bus lane enthusiast
@t54r4n1 no, I am logging that "some person somewhere (I don't know who or where, because I threw away that data) made a post with a 'URL' field that contains some kind of URL in it (I don't know what, because I threw away that data)"
I'm not even logging the time something was posted! Just "there is a time field in this and it contains a time but I don't know what time"
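
For readers who want to see what "keep the shape, throw away the data" can look like, here is a minimal Python sketch. It is a hand-rolled stand-in, not the schema-infer library linked above and not the observatory's actual code; `classify_string`, `infer_shape`, `observe`, and the `<uri>`-style labels are assumptions chosen for illustration.

```python
import json
import re
from datetime import datetime

URL_RE = re.compile(r"^https?://\S+$")


def classify_string(value: str) -> str:
    """Return a label for the kind of string; the string itself is discarded."""
    if URL_RE.match(value):
        return "<uri>"
    if "T" in value:
        try:
            # Older Pythons do not accept a trailing "Z", so normalise it first.
            datetime.fromisoformat(value.replace("Z", "+00:00"))
            return "<datetime>"
        except ValueError:
            pass
    return "<string>"


def infer_shape(value):
    """Replace every value with a type label, keeping only property names."""
    if value is None:
        return "<null>"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "<boolean>"
    if isinstance(value, (int, float)):
        return "<number>"
    if isinstance(value, str):
        return classify_string(value)
    if isinstance(value, list):
        # Keep each distinct element shape once, in a deterministic order.
        unique = {json.dumps(infer_shape(v), sort_keys=True) for v in value}
        return [json.loads(s) for s in sorted(unique)]
    if isinstance(value, dict):
        return {key: infer_shape(v) for key, v in value.items()}
    raise TypeError(f"unsupported JSON value: {type(value).__name__}")


def observe(activity: dict, seen: set) -> bool:
    """Record the activity's shape in seen; return True if the shape is new."""
    key = json.dumps(infer_shape(activity), sort_keys=True)
    is_new = key not in seen
    seen.add(key)
    return is_new
```

Two posts from different people at different times collapse to the same entry, because only the property names and value types survive; nothing in `seen` says who posted what or when.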