[nodebb-plugin-markdown] Markdown Parser

  • The first parser for NodeBB. This used to be integrated directly, but has since been taken out in order to allow other users to utilise different parsers (e.g. BBCode).

    This plugin is installed and enabled by default (as of v0.0.7).

    Git repository

  • This is not really a NodeBB issue, but I'm not sure where else to ask.

    I am migrating 350k+ records between post.content and user.signatures. My migrator is almost ready, but I am having serious memory issues when converting HTML to Markdown using html-md: I keep getting segmentation faults and out-of-memory errors. Again, clearly not your problem, but I guess this module is not designed to convert at a large scale; it is more of a client-side tool.

    I am wondering if I can disable HTML sanitization on nodebb-plugin-markdown and then write a nodebb-plugin- that will sanitize and convert HTML to Markdown on the client side, using the same html-md module, but I am not sure how to do that. I can clearly inject some JS to do the job, but can I count on .post-content and .post-signature always being there? That seems so fragile. Also, what about the "recent-replies" sidebar?

    I'm also not sure that's even possible: the script tags are easy to remove, but what about other tags with inline JavaScript in them, such as <a href="javascript:doEvil();">?

    Again, that's not really NodeBB's problem, but I and anyone else migrating to it would appreciate some advice.

    Thanks
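
The `javascript:` href concern raised above can be illustrated with a small scheme allowlist. This is a minimal sketch, not code from NodeBB, html-md, or any plugin: `isSafeHref` and `SAFE_SCHEMES` are hypothetical names, the list of schemes is an assumption, and the input is assumed to be an attribute value whose HTML entities have already been decoded. A maintained sanitizer is still the safer choice for real content.

```typescript
// Hypothetical helper: decides whether an href value is safe to keep.
// Assumes HTML entities in the attribute value were decoded beforehand.
const SAFE_SCHEMES = new Set<string>(["http:", "https:", "mailto:"]);

function isSafeHref(href: string): boolean {
  // Browsers ignore tabs, newlines and leading control characters in URLs,
  // so remove whitespace and control characters before looking at the scheme.
  const cleaned = href.replace(/[\u0000-\u0020\u007f]+/g, "");

  // No scheme at the start means a relative URL, which cannot run script.
  if (!/^[a-zA-Z][a-zA-Z0-9+.-]*:/.test(cleaned)) {
    return true;
  }

  // Compare the scheme (including the colon) against the allowlist.
  const scheme = cleaned.slice(0, cleaned.indexOf(":") + 1).toLowerCase();
  return SAFE_SCHEMES.has(scheme);
}
```

Allowing a short list of schemes, rather than blocking `javascript:` alone, also rejects variants such as `data:` or `vbscript:` URLs.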

  • Ah, sorry about never getting back to you -- yes, you can disable HTML sanitization from the control panel

  • No worries, I can tell you're busy from the commit frequency.

    Yeah, so for whoever else lands here:

    I disable nodebb-plugin-markdown.sanitizeHTML (which uses marked) but I use nodebb-plugin-sanitizehtml (which uses sanitize-html) to stay safe.

    It would be nice if marked accepted options for sanitization; then I wouldn't need another plugin. But that works.

    Thanks.

  • What's the difference between chjj/marked's HTML sanitization and the sanitize-html method?

  • The latter accepts options such as "allowedTags = [ "a", "b", "i", .... ]" and "allowedAttributes: { "a": [ "href", "target"], ... } ", whereas chjj/marked is more aggressive and does not allow any tags.
    And even if you allow <a> tags with the href attribute, as in the example, it still strips content such as href="javascript:alert('hai')", and of course <script> tags by default. Tested.

    I'd like to move away from HTML content completely, but I can't until I convert all of my HTML content to Markdown :(. Until then it's more work for the server to parse each post, and I would rather do that on the client side, but it's OK for now.
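
For readers landing here, the options mentioned in the post above might be written out as follows. This is only a sketch of an options object in the shape sanitize-html accepts: the variable name `sanitizeOptions` is hypothetical, the tag and attribute lists are shortened from the post's elided examples rather than taken from the plugin's real configuration, and `allowedSchemes` is added as an assumption to show how URL schemes can be restricted.

```typescript
// Illustrative options for sanitize-html; the values are assumptions,
// not the configuration nodebb-plugin-sanitizehtml actually ships with.
const sanitizeOptions = {
  // Tags that survive sanitization; everything else is dropped.
  allowedTags: ["a", "b", "i"],
  // Per-tag attribute allowlist: only href and target are kept on <a>.
  allowedAttributes: { a: ["href", "target"] },
  // URL schemes permitted in href values; "javascript" is deliberately absent.
  allowedSchemes: ["http", "https", "mailto"],
};
```

With the library installed, an object like this would be passed as the second argument to its sanitizing function, for example `sanitizeHtml(dirtyHtml, sanitizeOptions)`, where `dirtyHtml` stands for the stored post content.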

  • Wow, that's pretty interesting... I did notice that chjj's marked was more of an "all or nothing" approach to HTML sanitization. It's good to know that your plugin can work safely alongside mine to provide an even better third option.

