Basically, the question popped up while I was thinking of installing plugin-adsense (which did not work at all for me, by the way, on 1.11.2).
[nodebb-plugin-markdown] Markdown Parser
-
The first parser for NodeBB. This used to be integrated directly, but has since been taken out in order to allow other users to utilise different parsers (e.g. BBCode).
This plugin is installed and enabled by default (as of v0.0.7).
-
This is not really a NodeBB issue, but I'm not sure where else to ask.
I am migrating 350k+ records between post.content and user.signatures. My migrator is almost ready, but I am having serious memory issues when converting HTML to Markdown using html-md: I keep getting segmentation faults and out-of-memory errors. Again, clearly not your problem, but I guess this module is not designed to convert at a large scale; it is more of a client-side tool.
I am wondering if I can disable HTML sanitization on nodebb-plugin-markdown and then write a nodebb-plugin- that will sanitize and convert HTML to Markdown on the client side, using the same html-md module, but I am not sure how to do that. I can clearly inject some JS to do the job, but can I count on .post-content and .post-signature always being there? That seems so fragile. Also, what about the "recent-replies" sidebar?
Also, I'm not sure that's even possible: the script tags are easy to remove, but what about other tags with inline JavaScript in them, such as `<a href="javascript:doEvil();">`?
Again, that's not really NodeBB's problem, but I, and anyone else migrating to NodeBB, would appreciate some advice.
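To make the `javascript:` href concern concrete, here is a minimal sketch of a scheme check, assuming Node's built-in `URL` class. The helper name `isSafeHref`, the allowed-scheme list, and the placeholder base URL are all made up for illustration; none of them come from NodeBB, marked, or html-md. It only covers `href` values, so inline event handlers such as `onclick` would still need to be stripped separately.

```javascript
// Hypothetical allow-list of URL schemes; adjust to whatever your forum needs.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

// Hypothetical helper: returns true only when the href resolves to an allowed scheme.
function isSafeHref(href) {
  let url;
  try {
    // Resolve against a placeholder base so relative links like "/topic/1" parse too.
    url = new URL(href, 'http://placeholder.invalid/');
  } catch (err) {
    // Unparseable values are treated as unsafe.
    return false;
  }
  // The parser lowercases the scheme, so "JavaScript:" is caught as well.
  return ALLOWED_PROTOCOLS.has(url.protocol);
}

console.log(isSafeHref('javascript:doEvil();'));
console.log(isSafeHref('https://example.com/'));
```

Parsing the value instead of matching it with a regular expression means scheme normalization (case, leading whitespace) is handled by the URL parser rather than by hand.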
Thanks
-
-
No worries, I can tell you're busy from the commit frequency.
Yeah, so, for whoever else lands here:
I disable nodebb-plugin-markdown.sanitizeHTML (which uses marked), but I use nodebb-plugin-sanitizehtml (which uses sanitize-html) to stay safe.
It would be nice if marked accepted options for sanitization, so I wouldn't need another plugin, but that works.
Thanks.
-
The latter (sanitize-html) accepts options such as `allowedTags: [ "a", "b", "i", ... ]` and `allowedAttributes: { "a": [ "href", "target" ], ... }`, whereas chjj/marked is more aggressive and does not allow any tags.
Even if you allow `<a>` tags with the `href` attribute, just like in the example, it still sanitizes content such as `href="javascript:alert('hai')"` out of them, and of course `<script>` tags by default. Tested.
I'd like to move away from HTML content completely, but I can't until I convert all of my HTML content to Markdown :(. Until then, it's more work for the server to parse each post, and I would rather do that on the client side, but it's OK for now.
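For anyone wanting a starting point, here is a rough sketch of an options object in the shape sanitize-html accepts, built from the two options mentioned above. The variable name `sanitizeOptions` and the exact tag and attribute lists are illustrative. The `allowedSchemes` entry is an extra sanitize-html option not discussed in this thread, added here as an assumption to make the permitted link schemes explicit. How nodebb-plugin-sanitizehtml wires its own options is not shown here.

```javascript
// Illustrative options in the shape sanitize-html accepts.
// The tag and attribute lists are assumptions; adjust them to your content.
const sanitizeOptions = {
  allowedTags: ['a', 'b', 'i'],
  allowedAttributes: {
    a: ['href', 'target'],
  },
  // Assumption: list link schemes explicitly instead of relying on the defaults.
  allowedSchemes: ['http', 'https', 'mailto'],
};

// Usage, assuming the sanitize-html package is installed:
//   const sanitizeHtml = require('sanitize-html');
//   const clean = sanitizeHtml(dirtyHtml, sanitizeOptions);
```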