i think honestly what i want out of a storage solution is to like maybe upload stuff to some kind of object storage and then separately have a graph database or rdf quad store for the metadata.
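a rough sketch of what that split could look like (everything below is a stand-in: the dicts stand in for a real object store and quad store, and the urn scheme is made up):
```python
# the content goes into "object storage", the metadata goes into a "quad store",
# and the metadata only *points at* the content. both backends here are plain
# python stand-ins, and the urn scheme is made up.
import hashlib

object_store = {}  # key -> raw bytes (stand-in for s3/minio/etc.)
quad_store = []    # (subject, predicate, object, graph) tuples (stand-in for an rdf store)

def put_content(data: bytes) -> str:
    """store the raw content and return a content-addressed key for it."""
    key = hashlib.sha256(data).hexdigest()
    object_store[key] = data
    return key

def describe(key: str, predicate: str, value: str, graph: str = "default"):
    """attach metadata that links to the content instead of inlining it."""
    quad_store.append((f"urn:blob:{key}", predicate, value, graph))

key = put_content("just a paragraph of text, nothing else".encode())
describe(key, "dc:title", "an example article")
describe(key, "dc:creator", "alice")
print(object_store[key].decode())  # the content, untouched by any metadata format
print(quad_store)
```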
-
@tech_himbo the proposal is basically "store content separately from metadata" and "have metadata link to content instead of inlining it in your canonical data format"
-
@trwnh so, they need to agree on a serialization format for articles, and we need a mapping from A’s metadata format to B’s metadata format to enable import/export. is that right?
-
@alice i would split the head off into a separate descriptor
but yes, this is basically the issue here. you have a head and a body; the content is ideally the last remaining body after you unwrap all the layers and extract whatever profiles. you shouldn't be required to use any specific format or container just to pass some atomic content around (metadata optional)
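a rough sketch of the unwrapping, just to make it concrete, using python's stdlib html parser (the hard-coded title and field names are just for illustration):
```python
# start with the usual .html container, peel the document layers off, and keep only
# the body's inner text as the atomic content. the head becomes its own descriptor
# (extraction of the head is elided here; the title is hard-coded for brevity).
from html.parser import HTMLParser

class BodyText(HTMLParser):
    def __init__(self):
        super().__init__()
        self.in_body = False
        self.chunks = []
    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False
    def handle_data(self, data):
        if self.in_body:
            self.chunks.append(data)

doc = "<html><head><title>metadata lives here</title></head><body><p>the actual content.</p></body></html>"
parser = BodyText()
parser.feed(doc)
descriptor = {"title": "metadata lives here"}  # the head, split off separately
content = "".join(parser.chunks).strip()       # -> "the actual content."
print(content)
```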
-
@tech_himbo no, they need to agree on the *semantics* of the "content". serializations and formats can be anything, and can be negotiated between peers ("i understand a b c", "i understand c d e", "okay let's agree to use c for this session")
this is about the semantic content model basically
in practical terms: say instead of sending you an entire HTML document, i just sent you a single paragraph element, or perhaps only its inner text
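a toy version of that negotiation, assuming each peer just advertises a list of serializations it understands (the format names here are arbitrary):
```python
# each peer advertises the serializations it understands; they settle on any common
# one, and the semantic content stays the same either way. format names are arbitrary.
def negotiate(mine, theirs):
    """pick the first serialization both sides understand, preferring my order."""
    for fmt in mine:
        if fmt in theirs:
            return fmt
    return None

print(negotiate(["a", "b", "c"], ["c", "d", "e"]))  # -> "c"
print(negotiate(["a", "b"], ["d", "e"]))            # -> None, no common format
```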
-
@trwnh ok, so the requirement is that both services must support a common format, even if that format isn’t a broad standard. is that right?
-
@tech_himbo yes, but also the most straightforward solution to content is to literally just pass it along 1:1 without any containers or metadata
the "problem" is essentially that, for something like an HTML document saved to disk as .html, we pre-bundle the content in the middle of a bunch of presentational stuff that is not content. or for a JSON document, we put an escaped string as the value of some key. i'm saying we don't need to always do that
-
... if you're going to export huge amounts of data, really huge - have the decency to write a reader and encoder for the data you're trying to preserve. When someone, in the distant future, tries to decode your trove, they'll appreciate your foresight.
-
@tuban_muzuru @erincandescent @alice in most cases "always bet on plain text" is good enough for that kind of thing imo. this is more about strategy and architecture of like... managing content. a sort of storage strategy, one that can handle abstract backends
it's probably going to look less like an sql database and a lot more like object storage in the end: the blob being the content (even if it's as simple as a literal string), and the metadata being whatever attribute-value pairs
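a sketch of what that interface might boil down to: two methods, with whatever backend behind them (the in-memory one below is just a stand-in for s3 / the filesystem / whatever):
```python
# one narrow interface: put a blob with some attribute-value pairs, get them back.
# the in-memory backend is a stand-in; an s3 or filesystem backend would implement
# the same two methods.
from abc import ABC, abstractmethod

class ContentStore(ABC):
    @abstractmethod
    def put(self, key: str, blob: bytes, metadata: dict) -> None: ...
    @abstractmethod
    def get(self, key: str) -> tuple: ...

class MemoryStore(ContentStore):
    def __init__(self):
        self._blobs = {}
        self._meta = {}
    def put(self, key, blob, metadata):
        self._blobs[key] = blob           # the content, even if it's a literal string
        self._meta[key] = dict(metadata)  # whatever attribute-value pairs describe it
    def get(self, key):
        return self._blobs[key], self._meta[key]

store = MemoryStore()
store.put("note-1", b"always bet on plain text", {"mediaType": "text/plain", "creator": "alice"})
print(store.get("note-1"))
```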
-
@trwnh @tuban_muzuru @alice I've got a bit of a visceral reaction to this after dealing with formats with 3 nested layers of Base64
But it's reasonable for purely textual data
-
@erincandescent @alice @tuban_muzuru the goal of this thought exercise is to use exactly 0 layers of nesting