Hey, #LazyWeb!
-
Does anyone know of a #Python package that takes JSON-LD as input, and validates whether it conforms to schema.org schemas?
Bonus points if you can also easily detect fields that don't conform, not just a yes/no answer.
I looked at a bunch of things, but didn't have too much luck with it. I just figured someone must have done this before.
Update: kinda solved it, more below.
-
"validates whether it conforms to schema.org schemas?"
What do you mean by this?
My memory from when I last looked at schema.org was that EVERYTHING IS VALID due to how they use @vocab. I'm unsure how one would claim that the nonsensical
{ "@context": "http://schema.org", "moo": "mooo" }
is invalid. There are reasons why I claim that JSON-LD is not ready. This is one of them.
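To illustrate the @vocab point, here is a much simplified Python sketch of what happens to a term that the context does not define. The `expand_term` helper is hypothetical, not a real library API, and it ignores most of the actual JSON-LD expansion algorithm. It assumes the schema.org context sets `@vocab` to `http://schema.org/`.

```python
# Simplified illustration only: real JSON-LD expansion handles far more cases.
SCHEMA_VOCAB = "http://schema.org/"  # assumed @vocab value of the schema.org context


def expand_term(term, vocab=SCHEMA_VOCAB):
    """Expand a term the way @vocab does for names the context does not define."""
    if term.startswith("@") or ":" in term:
        # Keywords, compact IRIs and absolute IRIs are not touched by @vocab.
        return term
    return vocab + term


doc = {"@context": "http://schema.org", "moo": "mooo"}
expanded = {expand_term(key): value for key, value in doc.items() if key != "@context"}
print(expanded)  # {'http://schema.org/moo': 'mooo'}
```

Any unknown key simply becomes an IRI in the schema.org namespace, so nothing in this step can reject "moo".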
-
@helge well, it's less whether that is valid. But if you e.g. have:
{
  "@context": "https://schema.org/",
  "@type": "Thing",
  "name": 3.14
}
It should probably tell me it's not valid, because the name property should be Text (it's a little difficult here because in a textual representation as JSON, everything is text, but from a typing point of view, this isn't).
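A minimal sketch of the kind of check described here, in plain Python. The `EXPECTED_TYPES` table is hand-written for illustration and covers only two properties; a real tool would have to derive it from the schema.org vocabulary definitions. `find_nonconforming` is a hypothetical helper, not an existing package.

```python
# Hand-written expectations for illustration; not derived from schema.org itself.
# Maps property name -> (schema.org type label, accepted Python types).
EXPECTED_TYPES = {
    "name": ("Text", (str,)),
    "alternateName": ("Text", (str,)),
}


def find_nonconforming(doc):
    """Return (property, reason) pairs for values whose type is not the expected one."""
    problems = []
    for key, value in doc.items():
        if key.startswith("@"):
            continue  # skip JSON-LD keywords such as @context and @type
        if key not in EXPECTED_TYPES:
            continue  # unknown to this tiny table; a stricter check could flag it
        label, python_types = EXPECTED_TYPES[key]
        if not isinstance(value, python_types):
            problems.append((key, f"expected {label}, got {type(value).__name__}"))
    return problems


doc = {"@context": "https://schema.org/", "@type": "Thing", "name": 3.14}
print(find_nonconforming(doc))  # [('name', 'expected Text, got float')]
```

This returns the offending fields rather than a yes/no answer, which is the "bonus points" part of the original question.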
-
AFAIK: There is nothing in json-ld that tells you this is invalid. The schema.org validator agrees.
-
@helge I'm not saying it's invalid JSON-LD.
-
@jens @helge *points at her sign* “RDF (and by extension JSON-LD) is highly structured, schemaless, garbage that you may find useful data in”
So yeah, there is really no validation of schema or expected types. schema.org is a misnomer in that JSON-LD and RDF just don't care whether something is a string, URI, boolean, float, whatever.
Only ShEx/SHACL really start to touch on schemas.
-
@jens @helge you might like this really old post of mine on socialhub: https://socialhub.activitypub.rocks/t/linked-data-undersold-overpromised/2268/29?u=thisismissem
-
@thisismissem @helge I understand that.
And yet, the descriptions on schema.org are enough to perform validation with.
This isn't a JSON-LD or RDF question, really. It just happens to be the case that's what my data is expressed in.
-
@jens As @thisismissem points out, schema.org is not really specifying but suggesting types to be used, same with sdo-types as domain/range. If you want constraints, you must define them yourself with ShEx/SHACL or JSON Schema. For AMB, a schema.org-based metadata profile for educational resources on the web, we chose JSON-LD plus a normative JSON Schema so that people unfamiliar with RDF can also use it easily.
Spec (German): https://w3id.org/kim/amb/20231019
JSON Schema: https://w3id.org/kim/amb/20231019/schemas/schema.json
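As a rough sketch of the JSON Schema route, the following uses the third-party `jsonschema` Python package with a toy schema. The schema is invented for illustration and is not the AMB schema linked above; `list_violations` is a hypothetical helper name.

```python
from jsonschema import Draft7Validator  # third-party: pip install jsonschema

# Toy schema for illustration only; it is NOT the AMB schema.
THING_SCHEMA = {
    "type": "object",
    "properties": {
        "@type": {"const": "Thing"},
        "name": {"type": "string"},
    },
    "required": ["@type", "name"],
}


def list_violations(doc):
    """Return (path, message) pairs for every constraint the document violates."""
    validator = Draft7Validator(THING_SCHEMA)
    return [
        ("/".join(str(part) for part in error.path), error.message)
        for error in validator.iter_errors(doc)
    ]


doc = {"@context": "https://schema.org/", "@type": "Thing", "name": 3.14}
for path, message in list_violations(doc):
    print(path, "->", message)
```

Because `iter_errors` yields every violation instead of stopping at the first, the result points at the individual fields that do not conform.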