How hard is it to process untrusted SVG data to strip out any potentially harmful tags or attributes (like stuff that might execute JavaScript)?
I feel like this is well trodden ground for HTML these days, are there robust solutions for the SVG version of this problem?
-
Simon Willison replied to Simon Willison
I'm wondering if I can give untrusted authors the ability to go wild with custom SVG in a framed-off fixed size area of a web page, without breaching the security of the wider page or application
-
Simon Willison replied to Simon Willison
This is great! This Cloudflare Rust library includes a detailed test suite that tells me everything I wanted to know https://mastodon.theorangeone.net/@jake/113370469717181352
-
@simon sounds like you want a sandboxed iframe?
-
@simon the JS in an SVG cannot interact with anything outside of itself.
So while an SVG can do all sorts of crazy things, it can't escape its sandbox.
-
@polotek yeah probably! I'm still trying to work up my confidence in those, detailed and comprehensive documentation on exactly what the sandbox attribute does has been hard to come by
-
Simon Willison replied to Simon Willison
Even better... it looks like I can point to them from a regular img tag and the SVG spec has me covered: https://www.w3.org/TR/SVG2/conform.html#secure-static-mode
-
João S. O. Bueno replied to Simon Willison
@simon based on the experience of people who have tried to create a Python sandbox over the decades, I'd say it is pretty much impossible (save for a browser context separated off as another page box, i.e. a "Frame").
-
Simon Willison replied to Simon Willison
... and it looks like that means I can do an img tag with an src that points to a base64 encoded SVG object and any nasty JavaScript etc will be disabled for me - here's an example which seems to demonstrate that working https://gistpreview.github.io/?03f0076446027b9b12e1ea14315db52b
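
A minimal sketch of the approach described above, assuming the page is generated server-side in Python. The function name `svg_to_img_tag` and its `width`, `height` and `alt` parameters are made up for illustration. It only builds the markup; the protection itself comes from the browser treating SVG loaded through an img tag as a static image.

```python
import base64
from html import escape


def svg_to_img_tag(svg_source: str, width: int = 300, height: int = 200,
                   alt: str = "User-supplied SVG") -> str:
    """Wrap untrusted SVG source in an <img> tag via a base64 data: URL.

    Hypothetical helper: the SVG is not sanitized here. The idea is that
    the browser renders SVG referenced from <img> without running scripts.
    """
    # Base64 output only contains A-Z, a-z, 0-9, +, / and =, so it cannot
    # break out of the double-quoted src attribute.
    encoded = base64.b64encode(svg_source.encode("utf-8")).decode("ascii")
    src = f"data:image/svg+xml;base64,{encoded}"
    return (
        f'<img src="{src}" width="{int(width)}" height="{int(height)}" '
        f'alt="{escape(alt, quote=True)}">'
    )


if __name__ == "__main__":
    untrusted = (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10">'
        "<script>alert(1)</script><circle cx='5' cy='5' r='4'/></svg>"
    )
    print(svg_to_img_tag(untrusted))
```

The fixed `width` and `height` keep the untrusted image inside a fixed-size area of the page, which matches the goal stated earlier in the thread.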
-
Simon Willison replied to João S. O. Bueno
@gwidion I think JavaScript sandboxes are a whole lot easier than Python, because browsers are already the most widely-deployed sandboxes in the world
-
João S. O. Bueno replied to Simon Willison
@simon I agree that a "document" in a tab or a frame is a good sandbox. But I doubt very much one can achieve further segregation within a document. There are way too many ways of linking back to JavaScript from HTML or SVG tags, for example. And JS, on its side, has no segregation or protection whatsoever: one is free to manipulate all the DOM and beyond.
-
Simon Willison replied to João S. O. Bueno
@gwidion it looks to me like https://claude.ai has a robust solution to this, using a combination of iframes with the sandbox attribute and CSP headers, plus web workers with CSP headers and careful application of postMessage
I'm still trying to reverse engineer how their solutions work though
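
As a rough illustration of the "sandbox plus CSP headers" idea, here is a sketch of response headers a server might send with untrusted content. This is not claude.ai's actual configuration, which the post above says is still being reverse engineered. The function name `untrusted_content_headers` and the specific directive choices are assumptions.

```python
def untrusted_content_headers(allow_scripts: bool = False) -> dict[str, str]:
    """Build response headers for a document that holds untrusted content.

    Hypothetical helper. The CSP `sandbox` directive applies restrictions
    similar to the iframe sandbox attribute, even if the URL is opened
    directly rather than inside a frame.
    """
    # Without allow-same-origin the document gets an opaque origin.
    sandbox = "sandbox allow-scripts" if allow_scripts else "sandbox"
    directives = [
        sandbox,
        "default-src 'none'",         # block network fetches by default
        "img-src data:",              # allow only inline data: images
        "style-src 'unsafe-inline'",  # allow inline styles in the content
    ]
    if allow_scripts:
        # Inline scripts only, and they still cannot fetch anything.
        directives.append("script-src 'unsafe-inline'")
    return {
        "Content-Security-Policy": "; ".join(directives),
        "X-Content-Type-Options": "nosniff",
    }


if __name__ == "__main__":
    for name, value in untrusted_content_headers().items():
        print(f"{name}: {value}")
```

Serving the untrusted document from a separate domain would add another layer, but that is outside what this sketch covers.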
-
Jake Archibald replied to Simon Willison
@simon <iframe sandbox> is useful here. You can even allow JavaScript but have it run in an opaque origin.
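
A small sketch of what that could look like when generating markup in Python. The helper name `sandboxed_iframe` and its parameters are hypothetical. An empty `sandbox` value applies every restriction. Adding `allow-scripts` without `allow-same-origin` lets scripts run, but in an opaque origin.

```python
from html import escape


def sandboxed_iframe(untrusted_html: str, allow_scripts: bool = False,
                     width: int = 300, height: int = 200) -> str:
    """Embed untrusted markup in an <iframe sandbox> using srcdoc.

    Hypothetical helper. allow-same-origin is deliberately never added,
    so the framed document stays in an opaque origin.
    """
    sandbox_value = "allow-scripts" if allow_scripts else ""
    # Escape the markup so it stays inside the srcdoc attribute value.
    srcdoc = escape(untrusted_html, quote=True)
    return (
        f'<iframe sandbox="{sandbox_value}" srcdoc="{srcdoc}" '
        f'width="{int(width)}" height="{int(height)}"></iframe>'
    )


if __name__ == "__main__":
    print(sandboxed_iframe('<svg onload="alert(1)"></svg>'))
```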
-
Simon Willison replied to Jake Archibald
@jaffathecake I'm desperately keen on learning the true ins and outs of that, but I've found detailed documentation (including browser support) on all of the options you can stuff in that sandbox attribute frustratingly difficult to locate
-
@simon something to check if you do this: users can right click on the image and open it in a new tab. If they do this, scripts will then run. Check that the URL doesn’t share an origin with your site. I know that blob: URLs do…
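
One way to reason about that check, sketched in Python. The helper names `origin_of` and `shares_origin` and the example URLs are made up for illustration. The sketch assumes that a blob: URL carries the origin of the page that created it, while a data: URL has an opaque origin that matches nothing.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_of(url: str):
    """Return (scheme, host, port) for http(s) and blob: URLs.

    Returns None for anything treated here as an opaque origin, such as
    data: URLs. Hypothetical helper, not a full origin implementation.
    """
    parts = urlsplit(url)
    if parts.scheme == "blob":
        # blob:https://example.com/<uuid> wraps the creating page's origin.
        parts = urlsplit(parts.path)
    if parts.scheme not in DEFAULT_PORTS:
        return None
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return (parts.scheme, parts.hostname, port)


def shares_origin(site_url: str, image_url: str) -> bool:
    """True if opening image_url directly would land in the site's origin."""
    site = origin_of(site_url)
    image = origin_of(image_url)
    return site is not None and site == image


if __name__ == "__main__":
    site = "https://example.com/gallery"
    for candidate in (
        "blob:https://example.com/0f5c1a2e",
        "data:image/svg+xml;base64,PHN2Zy8+",
        "https://usercontent.example.net/a.svg",
    ):
        print(candidate, shares_origin(site, candidate))
```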
-
Jake Archibald replied to Simon Willison
@simon the table at the bottom of https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe is decent
-
Simon Willison replied to Jake Archibald
@jaffathecake it's the best I've seen but it still leaves me with so many questions... how good is browser support for each of those allowX things? What do browser security experts advise in terms of using them?
I'm really paranoid
-
Jake Archibald replied to Simon Willison
@simon the browser support for the various allow features is in the table at the end of the page
-
Simon Willison replied to Jake Archibald
@jaffathecake wow I missed that! Thank you, this helps a LOT
-
@ben_lings that's a good call - I checked and as far as I can tell the base64 URL when opened in a new page has no relationship at all to the page it was originally hosted on