How hard is it to process untrusted SVG data to strip out any potentially harmful tags or attributes (like stuff that might execute JavaScript)?
-
Simon Willison replied to Simon Willison
Even better... it looks like I can point to them from a regular img tag and the SVG spec has me covered: https://www.w3.org/TR/SVG2/conform.html#secure-static-mode
zellyn (Hachyderm.io)
I *believe* if you use an SVG as the `src` of an image, it turns off all the JavaScript/onclick handlers, etc. Of course, you might *want* some of the JavaScript.
-
João S. O. Bueno replied to Simon Willison
@simon based on the experience of people who have tried to create a Python sandbox over the decades, I'd say it is pretty much impossible (save for having the browser separate the content into another page box, i.e. a "Frame").
-
Simon Willison replied to Simon Willison
... and it looks like that means I can do an img tag with an src that points to a base64 encoded SVG object and any nasty JavaScript etc will be disabled for me - here's an example which seems to demonstrate that working https://gistpreview.github.io/?03f0076446027b9b12e1ea14315db52b
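
A minimal server-side sketch of the approach described in this post, assuming the untrusted SVG arrives as a string. The helper name `svg_to_img_tag` and the sample SVG are made up for illustration; the point is only that the SVG ends up inside a `data:` URI on an `img` tag rather than inline in the page.

```python
import base64
import html


def svg_to_img_tag(svg_source: str, alt: str = "") -> str:
    """Return an <img> tag whose src is a base64 data: URI holding the SVG.

    Hypothetical helper: the SVG is not sanitized here. The sketch relies on
    the browser rendering SVG loaded through <img> without running scripts.
    """
    encoded = base64.b64encode(svg_source.encode("utf-8")).decode("ascii")
    src = f"data:image/svg+xml;base64,{encoded}"
    # Base64 output only contains A-Z, a-z, 0-9, "+", "/" and "=", so it
    # cannot close the attribute. The alt text is caller-supplied, so escape it.
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'


# Sample untrusted input with an event handler and a <script> element.
untrusted_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="100" height="100" '
    'onload="alert(1)"><script>alert(2)</script>'
    '<circle cx="50" cy="50" r="40" fill="red"/></svg>'
)
img_tag = svg_to_img_tag(untrusted_svg, alt='user "upload"')
```

The resulting tag can be dropped into a template as-is. The script content is still present inside the encoded payload, so this only helps as long as the SVG is displayed through `img` and never inlined.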
-
Simon Willison replied to João S. O. Bueno
@gwidion I think JavaScript sandboxes are a whole lot easier than Python, because browsers are already the most widely-deployed sandboxes in the world
-
João S. O. Bueno replied to Simon Willison
@simon I agree that a "document" in a tab or a frame is a good sandbox. But I doubt very much that one can achieve further segregation within a document. There are way too many ways of linking back to JavaScript from HTML or SVG tags, for example. And JS, for its part, has no segregation or protection whatsoever: one is free to manipulate all the DOM and beyond.
-
Simon Willison replied to João S. O. Bueno
@gwidion it looks to me like https://claude.ai has a robust solution to this, using a combination of iframes with the sandbox attribute and CSP headers, plus web workers with CSP headers and careful application of postMessage
I'm still trying to reverse engineer how their solutions work though
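
To make the "CSP headers" part concrete, here is a rough sketch of building a restrictive `Content-Security-Policy` header for a document that will be loaded inside a frame. This is not a reconstruction of what claude.ai does; the helper `build_csp` and the chosen directive values are assumptions picked for illustration.

```python
def build_csp(directives: dict[str, list[str]]) -> str:
    """Join CSP directives into a single header value (hypothetical helper)."""
    return "; ".join(
        " ".join([name, *values]) for name, values in directives.items()
    )


# Illustrative policy for framed, untrusted content:
# - default-src 'none' blocks every fetch type that is not explicitly allowed
# - inline scripts and styles are permitted, since the content is one document
# - images may only come from data: URIs
# - the sandbox directive applies iframe-style sandboxing via the header itself
FRAME_CSP = build_csp(
    {
        "default-src": ["'none'"],
        "script-src": ["'unsafe-inline'"],
        "style-src": ["'unsafe-inline'"],
        "img-src": ["data:"],
        "sandbox": ["allow-scripts"],
    }
)

response_headers = {"Content-Security-Policy": FRAME_CSP}
```

Because `connect-src` falls back to `default-src 'none'`, scripts in a document served with this header should be unable to make network requests, which is one way to limit what untrusted code can exfiltrate.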
-
Jake Archibald replied to Simon Willison
@simon <iframe sandbox> is useful here. You can even allow JavaScript but have it run in an opaque origin.
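
A small sketch of what that can look like when the markup is generated server-side. The helper `sandboxed_iframe` is hypothetical; it uses the `srcdoc` attribute to carry the untrusted HTML and never adds `allow-same-origin`, so the framed document stays in an opaque origin even when scripts are allowed.

```python
import html


def sandboxed_iframe(untrusted_html: str, allow_scripts: bool = False) -> str:
    """Wrap untrusted HTML in an <iframe sandbox> via srcdoc (hypothetical helper)."""
    # An empty sandbox attribute applies every restriction. Adding only
    # allow-scripts lets JavaScript run while the origin stays opaque, because
    # allow-same-origin is deliberately never included.
    tokens = ["allow-scripts"] if allow_scripts else []
    sandbox_value = " ".join(tokens)
    # Escape the document so it cannot break out of the srcdoc attribute.
    srcdoc = html.escape(untrusted_html, quote=True)
    return f'<iframe sandbox="{sandbox_value}" srcdoc="{srcdoc}"></iframe>'


frame = sandboxed_iframe(
    '<script>document.title = "hi"</script><p>Hello</p>', allow_scripts=True
)
locked_down = sandboxed_iframe("<p>Static only</p>")
```

With `allow_scripts=True` the script runs but should not be able to read the embedding page's cookies, storage, or DOM. Combining `allow-scripts` with `allow-same-origin` for same-origin content would let the frame remove its own sandbox, which is why this sketch offers no way to add it.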
-
Simon Willison replied to Jake Archibald
@jaffathecake I'm desperately keen on learning the true ins and outs of that, but I've found detailed documentation (including browser support) on all of the options you can stuff in that sandbox attribute frustratingly difficult to locate
-
@simon something to check if you do this: users can right click on the image and open it in a new tab. If they do this, scripts will then run. Check that the URL doesn’t share an origin with your site. I know that blob: URLs do…
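
One way to automate that check, sketched with the standard library. The helpers `origin_of` and `shares_origin` and the example hostnames are assumptions for illustration; the logic compares scheme, host, and port, unwraps `blob:` URLs to the origin embedded in them, and treats `data:` and other non-HTTP schemes as having no comparable origin.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str):
    """Return (scheme, host, port) for a URL, or None if it has no such origin."""
    parts = urlsplit(url)
    if parts.scheme == "blob":
        # blob: URLs embed the creating origin, e.g. blob:https://example.com/<uuid>
        return origin_of(url[len("blob:"):])
    if parts.scheme not in DEFAULT_PORTS:
        # data:, file:, etc. are treated as opaque in this sketch.
        return None
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)


def shares_origin(image_url: str, site_url: str) -> bool:
    """True when both URLs resolve to the same non-opaque origin."""
    image_origin = origin_of(image_url)
    return image_origin is not None and image_origin == origin_of(site_url)


SITE = "https://example.com/"  # hypothetical site URL
same_host = shares_origin("https://example.com:443/uploads/a.svg", SITE)
blob_url = shares_origin("blob:https://example.com/1234-abcd", SITE)
data_uri = shares_origin("data:image/svg+xml;base64,AAAA", SITE)
other_host = shares_origin("https://usercontent.example.net/a.svg", SITE)
```

Here `same_host` and `blob_url` come out `True`, which flags the risky cases, while `data_uri` and `other_host` come out `False`.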
-
Jake Archibald replied to Simon Willison
@simon the table at the bottom of https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe is decent
-
Simon Willison replied to Jake Archibald
@jaffathecake it's the best I've seen but it still leaves me with so many questions... how good is browser support for each of those allowX things? What do browser security experts advise in terms of using them?
I'm really paranoid
-
Jake Archibald replied to Simon Willison
@simon the browser support for the various allow features is in the table at the end of the page
-
Simon Willison replied to Jake Archibald
@jaffathecake wow I missed that! Thank you, this helps a LOT
-
@ben_lings that's a good call - I checked and as far as I can tell the base64 URL, when opened in a new page, has no relationship at all to the page it was originally hosted on
-
Frederik Braun replied to Simon Willison
@simon @jaffathecake if you just want the SVG displayed, put them in an <img> tag. Otherwise, your favorite sanitizer library DOMPurify has great SVG support. (Iframe sandbox works really great too!!)
-
Jake Archibald replied to Frederik Braun
-
Frederik Braun replied to Jake Archibald
@jaffathecake @simon yes, totally. Dunno if Simon would want scripts in the images. If you want them, sandbox gives better controls. If you want to police the exact set of allowed elements, a sanitizer is even better.
But if all you want is to safely display them, img is really simple (don’t host the user supplied files on the same origin in either of these cases though :))
-
Simon Willison replied to Frederik Braun
@freddy @jaffathecake I think I can even get away with not serving the images from a separate domain if I instead inline them as base64 SVG in the img src attribute
(Running off a separate domain is OK for me but makes things harder for my users if I release open source code for other people to self-host)