Hang on, I think attaching semantics to schemas, rather than data, solves 100% of the problems with both semantics and schemas.
-
Marco 🦝 :verified: :grumpycat: replied to Jenniferplusplus last edited by
@jenniferplusplus found this https://www.mdpi.com/2076-3417/11/24/11978
As well as some discussions in the json repo
-
Jenniferplusplus replied to Marco 🦝 :verified: :grumpycat: last edited by
@m2vh yeah, that *kind* of thing. I found SAWSDL, and these SAWSDL-for-json proposals unsurprisingly have the same problem: embedding semantic annotations into other things makes those things harder to work with. I imagine it was easier to ignore that when the structure was XML, because XML is designed to be self describing. But it would work better if the definition is outside the subject, the way schemas operate.
So you get
Data <- syntactic structure (schema) <- semantic meaning (vocabulary)
-
Jenniferplusplus replied to Jenniferplusplus last edited by
@m2vh otherwise, you get the same dynamic as always: system developers can't even imagine why or how you would even possess data whose semantic meaning you don't already know. (Fair, btw.) And the demands made by semantic data nerds seem intrusive and burdensome. (Because they are.)
-
Marco 🦝 :verified: :grumpycat: replied to Jenniferplusplus last edited by
@jenniferplusplus I remember an article that proposed keeping the semantics external, like a sub-schema, but I can't find it. It was proposed by people from the Web of Things community. I found the same discussion in the area of geolocation.
I think examples will show the benefits. In the era of AI it could be really useful if your data factory can gather semantic information on the data by following a link.
I think you should start a repo and buy a domain
-
Marco 🦝 :verified: :grumpycat: replied to Marco 🦝 :verified: :grumpycat: last edited by
@jenniferplusplus and maybe the people developing OData could have some ideas on how to implement something like the semantic schema extension. They already work with linking to additional information.
-
Jenniferplusplus replied to Marco 🦝 :verified: :grumpycat: last edited by
@m2vh my concern is mostly in the realm of getting semantic data nerds to get out of my way and stop making everything harder than it needs to be. I also hate the idea of making data more legible to LLMs.
So. This is a battle for someone else to fight.
-
You might find this interesting:
GitHub - common-workflow-language/schema_salad: Semantic Annotations for Linked Avro Data
Basically, everything defined in the schema has a corresponding semantic node, documents are written in YAML but have a corresponding RDF representation, and there is robust support for including fields outside the core vocabulary in an unambiguous way.
-
@tetron This appears to be a project to define schemas for linked data documents? And that is, again, backwards. I want to attach (but not embed) vocabularies to schemas. Mostly so that I stop having to deal with it. It can be entirely the problem of the people who want it, instead of them making it my problem.
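A minimal sketch of what "attach but not embed" could look like, assuming a made-up vocabulary document that lives outside the schema and points into it by JSON Pointer. The `describes` and `terms` keys, the `example.com` URLs, and both helper functions are hypothetical, not part of any existing standard or tool.

```python
# Hypothetical example: the schema itself knows nothing about the vocabulary.
schema = {
    "$id": "https://example.com/schemas/note.json",
    "type": "object",
    "properties": {
        "author": {"type": "string"},
        "body": {"type": "string"},
    },
    "required": ["author", "body"],
}

# Maintained separately, by whoever wants the semantics. It names the schema
# by its $id and individual properties by JSON Pointer (RFC 6901).
vocabulary = {
    "describes": "https://example.com/schemas/note.json",
    "terms": {
        "/properties/author": "https://example.com/vocab#creator",
        "/properties/body": "https://example.com/vocab#text",
    },
}


def resolve_pointer(document, pointer):
    """Resolve a JSON Pointer against nested dicts (objects only, no arrays)."""
    node = document
    for token in pointer.split("/")[1:]:
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


def semantic_terms(schema, vocabulary):
    """Map each annotated property name to its vocabulary term."""
    if vocabulary["describes"] != schema["$id"]:
        raise ValueError("vocabulary does not describe this schema")
    result = {}
    for pointer, term in vocabulary["terms"].items():
        resolve_pointer(schema, pointer)  # raises KeyError if the pointer is stale
        result[pointer.rsplit("/", 1)[-1]] = term
    return result


terms = semantic_terms(schema, vocabulary)
```

The schema author never touches the vocabulary file, and a processor that only validates data never has to load it.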
-
@jenniferplusplus
I think you want something like a json-ld context, which describes how json fields map to semantic nodes without necessarily specifying a schema, but even then it is hard to avoid asserting schema-like details such as whether a field takes a single value or an array of values. But ultimately it is a problem for the schema design, because common anti-patterns like reusing the same field name to mean different things in different contexts make it challenging to assign semantics.
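To make the mapping idea concrete, here is a toy illustration only loosely modeled on a context. It is not the JSON-LD expansion algorithm, and `expand_keys` is a hypothetical helper. It renames fields to IRIs and nothing else.

```python
# A simplified "context": short field name -> IRI of the semantic node.
# Real JSON-LD contexts support much more (type coercion, containers, scoping).
context = {
    "name": "https://schema.org/name",
    "knows": "https://schema.org/knows",
}


def expand_keys(document, context):
    """Rename mapped fields to their IRIs and drop fields with no mapping."""
    expanded = {}
    for key, value in document.items():
        iri = context.get(key)
        if iri is None:
            continue  # unmapped fields carry no semantics in this toy
        expanded[iri] = value  # value is passed through untouched
    return expanded


person = {"name": "Alice", "knows": ["Bob", "Carol"], "internal_flag": True}
expanded = expand_keys(person, context)
```

Nothing in this mapping says whether `knows` holds one value or many, which is the schema-like detail that tends to leak in.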
-
@tetron No, I extremely don't want that. I want the people who do want that to stop forcing it on me. I promise I know about json-ld, and I hate it.
-
Jenniferplusplus replied to Jenniferplusplus last edited by
@tetron I want to give my json schema and human readable documentation to the people who want that. And I want them to go off on their own and devise their own method to attach semantic meaning to things, one that doesn't burden me with solving this problem that I don't have and don't care about.
-
@jenniferplusplus
I'm not very familiar with the ActivityPub spec but this is about AP isn't it?
-
@tetron That is certainly the largest and most immediate contributor, yes.
But it's a concern almost any time that almost any W3C standard or working group is involved with something that needs to operate at high QPS.
-
@jenniferplusplus
So the irony is that linked data semantic web stuff is totally designed for annotating external resources the way you want, but only if the resource itself has a linked data mapping (i.e. there's a way to refer to individual elements in the document), and schema documents written with json schema don't. Which is why the schemas need to be linked data themselves. Cue the endless screaming.
-
@tetron That's not really ironic, so much as tangential. I get the benefits in a reference context. But at best it's useless in a processing context. To the extent that it displaces techniques that enable processing, it's actually a detriment.
-
@jenniferplusplus
So I'm writing from the perspective of the particular thing I linked earlier, but I just want to mention a couple of things it has:
a) code generators for a bunch of languages including C#, which use the schema to write the data structures and parsing/validation for you, which is very fast and there's no lunacy like having to transit through an rdf triple store
b) knowing which fields are identifiers or references to other things has some nice properties for validation
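As a rough illustration of point (b), and not schema_salad's actual API: once you know which field is the identifier and which fields are references, you can check that every reference points at something that exists. The function name and the sample data are hypothetical.

```python
def dangling_references(objects, id_field, ref_fields):
    """Return (object id, field, target) for every reference that resolves to nothing."""
    known_ids = {obj[id_field] for obj in objects}
    problems = []
    for obj in objects:
        for field in ref_fields:
            target = obj.get(field)
            if target is not None and target not in known_ids:
                problems.append((obj[id_field], field, target))
    return problems


# Made-up workflow steps: "id" is the identifier, "next" is a reference.
steps = [
    {"id": "#align", "next": "#sort"},
    {"id": "#sort", "next": "#report"},  # "#report" is never defined
]
problems = dangling_references(steps, id_field="id", ref_fields=["next"])
```

A plain structural validator would accept both steps, since `next` is a well-formed string in each.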
-
@tetron That would be helpful if there was a defined schema, or if it was even possible to define a schema. But with activitypub, that's not actually possible.
-
@jenniferplusplus
So if we're talking about https://www.w3.org/TR/activitystreams-vocabulary/
there is a machine readable formal model under there, it is just defined in OWL. I don't offhand know of tools that take in OWL and give you data models in more practical languages but that doesn't mean they don't exist. For ActivityStreams specifically it doesn't look like it would be all that hard.
At this rate I'm going to talk myself into writing a proof of concept, which is dangerous.
-
@tetron I'm pretty sure both the owl and context are broken and contradict the spec. The spec also mandates that fields with a single value must serialize as a value rather than an array, which creates an enormous number of problems.
I'm pretty sure that combines to make it impossible. But if you can somehow turn it into a proper schema, you'd be advancing fediverse development by years.
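For a sense of what the single-value rule costs in practice, here is a sketch of the normalization step a consumer ends up needing on every multi-valued field. `as_list` is a hypothetical helper and the two documents are made-up examples.

```python
def as_list(value):
    """Normalize a field that may be absent, a single value, or a list of values."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# The same field arrives in two different shapes depending on how many values it has.
note_one = {"type": "Note", "to": "https://example.com/users/alice"}
note_many = {
    "type": "Note",
    "to": ["https://example.com/users/alice", "https://example.com/users/bob"],
}

recipients_one = as_list(note_one.get("to"))
recipients_many = as_list(note_many.get("to"))
recipients_none = as_list(note_one.get("cc"))
```

A static schema has to describe every such field as "value or array of values", which is a big part of why generated data models come out awkward.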