The fact that AP defaults to HTML for the content annoys me to no end. It just increases the attack surface by so so much x_x
-
@hrefna what other format would you use?
-
@hrefna like
There are only two widely implemented markup languages with an open and mostly consistently implemented specification
One of them is plain text, and the other is HTML
-
@erincandescent By default? Plain text or no default and force it to declare explicitly, which should default to the safest possible processing if unspecified.
HTML just has too much variety in its parsing and too much of an attack surface, especially between iframes, image loading, script tags, CSS, and what is required for even basic rendering.
The "safest possible default choice" is always plain text, followed by a known safe subset of HTML, but that subset isn't specified anywhere that I can find.
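To make the "safest possible processing if unspecified" idea concrete, here is a minimal Python sketch. It shows the behavior being proposed in this post, not what ActivityPub currently specifies (which, as noted at the top of the thread, defaults to HTML). The function name `render_content` and the `sanitize_html` callback are hypothetical names made up for illustration.

```python
import html
from typing import Callable, Optional


def render_content(
    content: str,
    media_type: Optional[str] = None,
    sanitize_html: Optional[Callable[[str], str]] = None,
) -> str:
    """Return markup that should be safe to embed in a page.

    HTML is only honored when it is declared explicitly AND a sanitizer
    is supplied. Everything else is treated as plain text and escaped.
    """
    # Strip parameters such as "; charset=utf-8" and normalize case.
    declared = (media_type or "").split(";")[0].strip().lower()

    if declared == "text/html" and sanitize_html is not None:
        return sanitize_html(content)

    # Unspecified, text/plain, unknown type, or no sanitizer available:
    # fall back to the safest option and escape everything.
    return html.escape(content)
```

With this shape, a missing or unrecognized media type never reaches the HTML path, so a sender has to opt in to the riskier processing rather than getting it by default.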
-
@hrefna @erincandescent the feedparser sanitization rules are good practice https://feedparser.readthedocs.io/en/latest/html-sanitization.html but you can start simpler like mastodon does at the moment https://github.com/mastodon/mastodon/pull/23913
I'm not sure it is fully documented outside the code though https://github.com/mastodon/mastodon/blob/main/lib/sanitize_ext/sanitize_config.rb
-
@hrefna what should it default to?
-
@[email protected] @[email protected] I hate to put forth a "slippery slope" argument, but I think it fits here.
If each implementor has their own specific rules for sanitization (NodeBB has their own, too: an allow-list of tags/attributes), then we could potentially end up in a situation not unlike email, where the list of acceptable tags is so restrictive that everybody ends up defaulting to sending out a janky subset of HTML so it comes through properly.
That said, one thing that's going for AP-at-current is that at least there's no expectation that federated content be rendered like a full webpage.
<marquee>that would be tragic</marquee>
-
@julian Yeah, my least favorite way to do this is what we're doing, which is "everyone is responsible for doing their own sanitization on both outputs and inputs with no agreement on how we do that sanitization consistently and very little mentioned about the need for it in the first place."
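For readers who haven't seen one, the per-implementation allow-list sanitization discussed in this thread looks roughly like the Python sketch below. It uses only the standard library. The allow-list itself (`ALLOWED_TAGS`, `SAFE_SCHEMES`) is a hypothetical example chosen for illustration; it is not Mastodon's, NodeBB's, or feedparser's actual rule set, and a real deployment would more likely lean on a maintained sanitizer library.

```python
import html
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Hypothetical allow-list: tag name -> attributes permitted on that tag.
ALLOWED_TAGS = {"p": set(), "br": set(), "a": {"href"}, "em": set(), "strong": set()}
VOID_TAGS = {"br"}                       # tags that never get a closing tag
DROP_WITH_CONTENT = {"script", "style"}  # drop these tags AND their contents
SAFE_SCHEMES = {"http", "https"}


def _safe_href(value: str) -> bool:
    """Accept only absolute http(s) URLs; reject anything unparseable."""
    try:
        return urlsplit(value.strip()).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class AllowListSanitizer(HTMLParser):
    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a script/style element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # unknown tag: drop the tag itself, keep its text
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_TAGS[tag]:
                continue
            if name == "href" and not _safe_href(value):
                continue
            kept.append(f' {name}="{html.escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            # Text arrives with entities decoded, so re-escape it on output.
            self.out.append(html.escape(data, quote=False))


def sanitize(fragment: str) -> str:
    parser = AllowListSanitizer()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.out)
```

The catch raised in this thread still applies: two servers with different `ALLOWED_TAGS` will render the same post differently, and nothing in a sketch like this tells the sender what will survive on the other end.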
-
@[email protected] sounds like you're volunteering yourself for an FEP snerk
-