kinda weird constructing the serialization/deserialization for #activityPub types in #rust because it does mean I'm doing a lot of curl requests against my own server (and those of other people) to see what the raw data looks like in the wild
aaaaaand I'm very glad I took the "everything can be there, nothing has to be there" seriously because yeah.
I spent most of the last week looking for a job, so today is the first day I've programmed all week. While I don't yet have all the basic data types finished (lots of todo stubs), I'm slowly getting things there, and the way I've set up the From/To methods + workflow and hacked in inheritance is going to work just fine.
I really need to work on the OrderedCollections next because that's quite critical for the inbox and outbox and such.
#techPosting
-
It's also nice to see the other MIME types in the wild. For instance, my markdown posts (possibly all of them?) use text/x.misskeymarkdown.
-
(also @[email protected] I see you liked this, and your profile/outbox was the one I poked at to see some Mastodon AP JSON. sorry / thank you!)
-
@aud
Yo I have been trying to figure out the best way to do like dataclass/pydantic style models in rust, in part for exactly this (want to make AP server that has core routines in rust but modeling tools exposed to python) and so can I see how u are writing the models? Like I know how to make structs and whatnot, but I am finding I need to do an awful lot of boilerplate to do even simple things like optional args or type unions, and I have yet to figure out how to neatly write constraints like "a string matching this regex pattern".
I figure there must be an "advanced" type modeling crate somewhere but I have yet to find it.
Anyway just me casually asking like "hey can I see ur work I wanna learn"
-
d@nny "disc@" mc² replied to jonny (good kind):
@jonny @aud for things like "a string matching this regex pattern" i generally create a struct containing a string and impl TryFrom for generic code. here is an example from the compiler i accidentally made for the zip crate cli https://github.com/zip-rs/zip2/blob/69670e35376d20709e694466190d6c866f932074/cli/src/args/extract.rs#L304 from https://github.com/zip-rs/zip2/pull/235/files
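A minimal sketch of that "struct containing a string plus TryFrom" pattern, for anyone following along. Everything here is hypothetical: the names `MimeString`, `InvalidMime` and `is_token` are made up for illustration, and a hand-written character check stands in for a real regex so the example only needs the standard library.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Hypothetical newtype: a string that has been checked to look like
/// `type/subtype`, e.g. `text/x.misskeymarkdown`.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct MimeString(String);

/// Hypothetical error type carrying the rejected input.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct InvalidMime(pub String);

impl fmt::Display for InvalidMime {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "not a MIME-like string: {:?}", self.0)
    }
}

impl std::error::Error for InvalidMime {}

/// Stand-in for a regex such as `^[a-z0-9.+-]+$` (an assumption, not a
/// full MIME grammar).
fn is_token(s: &str) -> bool {
    !s.is_empty()
        && s.chars().all(|c| {
            c.is_ascii_lowercase() || c.is_ascii_digit() || matches!(c, '.' | '+' | '-')
        })
}

impl TryFrom<&str> for MimeString {
    type Error = InvalidMime;

    /// The only way to build a `MimeString` is through this check, so any
    /// value of the type is known to have passed it.
    fn try_from(raw: &str) -> Result<Self, Self::Error> {
        match raw.split_once('/') {
            Some((kind, subtype)) if is_token(kind) && is_token(subtype) => {
                Ok(MimeString(raw.to_owned()))
            }
            _ => Err(InvalidMime(raw.to_owned())),
        }
    }
}

impl MimeString {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

Generic code can then ask for `T: TryFrom<&str>` and never see an unvalidated string; swapping the hand-written check for a regex would only touch `try_from`.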
-
@[email protected] sure!!
So the whole thing is here: https://codeberg.org/Astatide/satyr
the AP specific stuff is ... it might be good to start looking at the base Object struct: https://codeberg.org/Astatide/satyr/src/branch/main/src/primitives/activity_pub/core/object.rs
Since a ton of AP stuff uses this as the base, I wrote up custom serializers (and debuggers) and getter/setter methods for it, basically. This way I don't have to replicate any code elsewhere for things that would 'inherit' it.
The getter/setter methods live in a trait, ObjectFields: https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/core/object.rs#L85 The implementation is pretty basic; it just returns the value of the field.
I use a macro to derive the ObjectFields trait on anything that wants to 'inherit' or extend: https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/crates/ap-inheritance/src/lib.rs#L6
Since AP objects can have any extra field they want, I store anything that doesn't conform inside of a BTreeMap.
For anything that inherits/extends from Object, I use this _extends field. You can see it in Actor: https://codeberg.org/Astatide/satyr/src/branch/main/src/primitives/activity_pub/core/actor.rs
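Roughly, the shape described above could look like the sketch below. This is not the code in the repo: the fields, the `object()`/`object_mut()` accessors, the `extra_field` helper and the `Person` type are all assumptions for illustration, the trait impls are written by hand where the real project uses a derive macro, and the extras map holds plain strings to stay std-only.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the base AP object (assumed fields only).
#[derive(Debug, Default, Clone)]
pub struct Object {
    pub id: Option<String>,
    pub name: Option<String>,
    /// Anything that doesn't match a known field is kept here.
    pub _extra: BTreeMap<String, String>,
}

/// Getters/setters shared by everything that "inherits" from Object.
pub trait ObjectFields {
    /// Hypothetical accessors; the default methods are built on them.
    fn object(&self) -> &Object;
    fn object_mut(&mut self) -> &mut Object;

    fn id(&self) -> Option<&str> {
        self.object().id.as_deref()
    }
    fn set_id(&mut self, id: &str) {
        self.object_mut().id = Some(id.to_owned());
    }
    fn extra_field(&self, key: &str) -> Option<&str> {
        self.object()._extra.get(key).map(String::as_str)
    }
}

impl ObjectFields for Object {
    fn object(&self) -> &Object {
        self
    }
    fn object_mut(&mut self) -> &mut Object {
        self
    }
}

/// "Inherits" from Object by embedding it in `_extends`.
#[derive(Debug, Default, Clone)]
pub struct Actor {
    pub _extends: Object,
    pub preferred_username: Option<String>,
}

impl ObjectFields for Actor {
    fn object(&self) -> &Object {
        &self._extends
    }
    fn object_mut(&mut self) -> &mut Object {
        &mut self._extends
    }
}

/// A second level: Person -> Actor -> Object, delegating through the chain.
#[derive(Debug, Default, Clone)]
pub struct Person {
    pub _extends: Actor,
}

impl ObjectFields for Person {
    fn object(&self) -> &Object {
        self._extends.object()
    }
    fn object_mut(&mut self) -> &mut Object {
        self._extends.object_mut()
    }
}
```

Each level only has to say where its embedded parent lives; the default methods on the trait do the rest, which is the part a derive macro can generate.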
The serialization/deserialization and To/From methods are basically used to jam in object oriented code without having too much fuss.
Here's how I'm doing it for Actor:
https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/core/actor.rs#L53
And for Object:
https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/core/object.rs#L514
You'll notice heavy use of the following enums: ROX, RDF, XSD. They're defined here and are basically wrapping up the potential legal values: https://codeberg.org/Astatide/satyr/src/branch/main/src/primitives/activity_pub/core/wrappers.rs
Then for anything that allows you to have multiple things (like an Object or Link or anything that inherits from it) I have these types: https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/core/wrappers.rs#L172
So far I'm still at the serialization/deserialization stage and making adjustments as I go. There's other crates that do this stuff in one regard or another, but I wanted as much strong type checking as I could get.
As for regex... I'm just starting to look at the MIME types and wondering if I couldn't do something for that. I could probably do some regex parsing in the impl MIME block... or write a custom From<APValue> to MIME function that would do the regex parsing there and ignore it if it's not valid.
https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/core/wrappers.rs#L159 <- that's where I'd put the regex.
-
jonny (good kind) replied to d@nny "disc@" mc²:
@hipsterelectron
@aud
Nice ok that is helpful. I figure this can be done with macros, and I also am assuming someone has already done it, but it would be awesome to be able to do stuff like
    struct MyStruct {
        pattern_field: Pattern<"whatever.regex">,
        enum_field: Enum<"one", "two">,
    }
and then have the actual code to make that work get generated by the macro. Like for situations when you're parsing external input and don't expect it to already be in the enum type, or it's a single use thing and you won't need to refer to it again.
Again I assume someone has done this, but like for the case asta is talking about here with the super heterogeneous, incomplete, and etc. AP objects, some modeling shorthands would be lovely
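For the enum half of that wish, a declarative macro already gets fairly close. The sketch below is only an illustration: `string_enum!` and `Choice` are made-up names, not an existing crate, and it generates a plain enum plus `FromStr` rather than the `Enum<"one", "two">` type-level syntax from the example above.

```rust
/// Hypothetical helper: declares an enum whose variants map to fixed
/// strings, plus parsing and printing for it.
macro_rules! string_enum {
    ($name:ident { $($variant:ident => $text:literal),+ $(,)? }) => {
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        pub enum $name {
            $($variant),+
        }

        impl $name {
            /// The wire representation of this variant.
            pub fn as_str(&self) -> &'static str {
                match self {
                    $($name::$variant => $text),+
                }
            }
        }

        impl ::std::str::FromStr for $name {
            type Err = String;

            /// Accepts exactly the listed strings; anything else is an error.
            fn from_str(s: &str) -> Result<Self, Self::Err> {
                match s {
                    $($text => Ok($name::$variant),)+
                    other => Err(format!(
                        "unknown {}: {:?}",
                        stringify!($name),
                        other
                    )),
                }
            }
        }
    };
}

// One line per "single use" enum; the boilerplate lives in the macro.
string_enum!(Choice { One => "one", Two => "two" });
```

External input then goes through `"two".parse::<Choice>()`, so unexpected strings surface as an `Err` instead of leaking into the rest of the model. The regex-pattern half would need a proc macro or a validating newtype instead.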
-
@aud
This is sick as hell thank you thank you THANK you! I am bookmarking this bc I need to be working on a Halloween costume, but u are tempting me to not
-
@[email protected] I'm also using APValue, which is literally almost just exactly Value from the serde_json crate, except I didn't want to write From<serde_json::Value> functions for my AP types because I only want to try and construct AP types under specific circumstances. Plus that allows me to implement AP specific functionality on the JSON. (I have a From<serde_json::Value> for APValue function that does a clean conversion.) In fact, I DO have an impl APValue block that does a little matching on the context (and will probably do more later): https://codeberg.org/Astatide/satyr/src/commit/eac98f43ea5b3baa170e5c1bbc94f0509716293e/src/primitives/activity_pub/parser.rs#L293
-
Asta [AMP] replied to jonny (good kind):
@[email protected] no, thank you! I'm glad it's helping to at least inspire!!
I actually am thinking of breaking it out into its own crate? Because I think I want to use it for something else, too, since there's nothing limiting AP as a social media FP protocol.
There's other crates, but they assume you're using it for social media (the lemmy one has it baked in with axum). This would only need to rely on like, serde, serde_json, maybe a few other commonly used things.
-
d@nny "disc@" mc² replied to jonny (good kind):
@jonny @aud i really really liked @flaviusb's approach to metaprogramming here https://mastodon.social/@flaviusb/113117546857169170 - proc macros generally let you output arbitrary token streams but then you have to output arbitrary token streams
-
d@nny "disc@" mc² replied to d@nny "disc@" mc²:
@jonny @aud rust is generally very very very big on being "explicit" about "tradeoffs" and unfortunately what that means is lots of boilerplate because people feel that's "safer" when that's not what a lot of uses of rust need. it's a very corporate framing that i feel makes the language worse for hacking
-
@[email protected] the _extends and ObjectFields trait work for things with multiple inheritance levels, too. I'll be doing the same thing for Links, too.
-
Asta [AMP] replied to d@nny "disc@" mc²:
@[email protected] @[email protected] yeah, I've had a few partial reworks of what I've been writing (you can see their ghosts in the form of commented out blocks)
But even with something that I think minimizes boilerplate, my object.rs file is almost 1000 lines long and I have derive macros for very boilerplate stuff.
-
d@nny "disc@" mc² replied to Asta [AMP]:
@aud @jonny i generally do a lot of the kind of metaprogramming jonny is trying to do and i do it through trait logic bc that makes a nice API but i have banged my head against the wall for many many hours to understand how to do that and how to do that nicely and extensibly (i guess extensibility might be my bias too) but traits just let you express composition they don't avoid boilerplate in fact they require boilerplate to impl. dyn traits create c++-like vtables and can let you achieve some dynamism but not metaprogramming. i obv like rust but i feel what jonny is trying to do is underserved and could be done better (i don't like saying "it's the wrong language" bc that is an excuse for mediocrity imo)
-
Asta [AMP] replied to d@nny "disc@" mc²:
@[email protected] @[email protected] I mean! You know, it's probably the most raw Rust I've written, period. I'm hardly an expert.
And technically I have yet to confirm that this is a 'good solution' for data in the wild; hence why I'm working with partial serialization/deserialization of wild data to see if things match up the way they should. I do think it will, as it should just basically hit a todo!() and error out or just use None, so while I'm testing against raw data, I can update as necessary to either make it handle anything invalid... or specific weird use cases.
I'm trying to keep it as close to the spec as possible while maintaining strong typing and compatibility. Since this is also my way of gaining Rust experience and learning ActivityPub, it felt like a good way to go about it.
-
Asta [AMP] replied to d@nny "disc@" mc²:
@[email protected] @[email protected] I like extensibility, too. I'm not a strict object-oriented-kitten, but dyn traits are definitely not a 1 to 1 transfer of the concept and if you have to implement a spec that inherently uses a lot of object oriented ideas... well, I've thought about switching to more dyn traits. I suspect that's how the other crates may or may not implement them (I understand more of why the lemmy activitypub implements Object as a trait), but when you don't actually know the spec (hi)... I wasn't totally sure what I did or did not need on any object, basically.
Plus I figured translating a more OO schema into Rust would help me understand the language more.
-
@[email protected] @[email protected] Plus, I don't think just using traits would allow me to implement To/From, which I feel like allows me more transparent strong typing.
I feel like that approach would result in a lot of data getting thrown away, but to be fair, I expect that a lot of clients/servers throw away huge chunks of the incoming data anyway (for instance, my server spits out _misskey_summary as part of a note, and obviously Mastodon isn't gonna give a shit about that). But especially if I'm using it for multiple projects, which I would like to, this means I keep all the incoming data in one form (rather than... I assume serde just silently dumps data if there's extra payload data that isn't in the struct? I dunno) or another and implementations can choose to extend that by sending/receiving expected data in the _extra attribute. Or just by deriving ObjectFields.
-
@[email protected] @[email protected] A lot of this is conjecture on my part, for the record.
Speaking of conjecture, with the minor exception of me shoehorning inheritance in here, I feel like wrapping up different types in enums is very Rust-like...
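As a small illustration of that enum-wrapping style (a sketch only; `ObjectOrLink`, `OneOrMany` and their fields are assumed names here, not the definitions in wrappers.rs):

```rust
/// A property value that may be either a full object or just a link
/// (both heavily simplified).
#[derive(Debug, Clone, PartialEq)]
pub enum ObjectOrLink {
    Object { id: String },
    Link { href: String },
}

impl ObjectOrLink {
    /// The IRI this value points at, whichever variant it is.
    pub fn target(&self) -> &str {
        match self {
            ObjectOrLink::Object { id } => id,
            ObjectOrLink::Link { href } => href,
        }
    }
}

/// AP properties often accept either a single value or an array of them.
#[derive(Debug, Clone, PartialEq)]
pub enum OneOrMany<T> {
    One(T),
    Many(Vec<T>),
}

impl<T> OneOrMany<T> {
    /// Iterates uniformly, so callers don't care which shape arrived.
    pub fn iter(&self) -> std::slice::Iter<'_, T> {
        match self {
            OneOrMany::One(item) => std::slice::from_ref(item).iter(),
            OneOrMany::Many(items) => items.iter(),
        }
    }
}
```

The compiler forces every `match` to handle each variant, which is where the "strong type checking" comes from without needing inheritance for this part.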