HTML-manipulation in Node.js?



  • Hey guys,
    I'm trying to manipulate an .html file before sending it to the client, for example by inserting a <link> element directly after the <head> tag. Is there a clever way to do this? Until now I have done it with string operations (like split), which is a horrible way to do it. Does any smart person know a solution to my dilemma?

    Greetings!
    Ron


  • GNU/Linux

    The simplest solution that comes to mind would be String.prototype.replace().

    "<html><head></head></html>".replace(/(<head>)/, "$1<link .../>");
    

    Depending on the use case (e.g. respecting HTML comments), more effort would be needed.



  • That's how I have done it until now. It does not work when the user writes < head > or <head > or anything like that instead of <head>; it's super fragile. I'm looking more for something like DOM manipulation as in browser JavaScript (document.head.appendChild(newLinkTag);) that won't break my code if the user changes something.


  • GNU/Linux

    This would require a package like jsdom, domino or similar (I have no experience with those, though). I assume it's a pretty non-performant way to handle a task as simple as adding a single element to the head.

    I'd rather improve the regex a bit, e.g. /(<head(?:\s[^>]*)?>)/i, which should be sufficient (apart from not handling HTML comments).

    "<html><heAd some-attr=\"tes\" ></heAd></html>".replace(/(<head(?:\s[^>]*)?>)/i, "$1<link .../>");
    > "<html><heAd some-attr="tes" ><link .../></heAd></html>"
    

    < head> is not valid HTML 😉

    It's up to you how to solve this.
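
    The improved regex above could be wrapped in a small helper. This is only a sketch: the function name insertAfterHeadTag is made up for illustration, and it still ignores HTML comments, as noted above.

```javascript
// Hypothetical helper: inserts `snippet` right after the first opening <head> tag.
function insertAfterHeadTag(html, snippet) {
  // Same pattern as in the post: case-insensitive, optional attributes/whitespace.
  const headOpenTag = /(<head(?:\s[^>]*)?>)/i;
  if (!headOpenTag.test(html)) {
    return html; // no <head> tag found, leave the input untouched
  }
  // A replacer function keeps "$" sequences in `snippet` from being
  // interpreted as replacement patterns such as "$1" or "$&".
  return html.replace(headOpenTag, (match) => match + snippet);
}

console.log(insertAfterHeadTag('<html><heAd some-attr="tes" ></heAd></html>', "<link .../>"));
```

    The pattern requires whitespace or ">" directly after "<head", so a tag like <header> is not matched by accident.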



  • Hey and thanks for the support 😉
    In the end, the project is meant to become a Node.js module that is used by OTHER people, so I would like to make it fail-safe and idiot-proof. Does anybody have experience with the libraries mentioned (jsdom, domino) or with cheerio? Or is the solution shown by frissdiegurke the best one for this?

    Sorry, I'm new to node development, especially to the libraries^^

    Greetings!


  • Community Rep

    @Metalmind I have used and can recommend cheerio. It is excellent at parsing html, if you can live with its very big limitations. It's very small and fast.

    It doesn't simulate a browser DOM like JSDOM; instead it implements a subset of the jQuery API, so you can manipulate your HTML string the way you would expect. If you are comfortable using jQuery to manipulate HTML, cheerio should be a perfect fit. It's also very good at handling badly or incorrectly formatted HTML.

    JSDOM I would not recommend for simple html parsing, it's more suited to browser simulation. I use it when I need to pre-render something in a browser-like environment, and for testing.



  • Hey guys,
    I tried cheerio, but it seems it does not support injecting any kind of handlers into the code. Is there a way to do this, or do I need a different library?

    Greetings and thank you!


  • Community Rep

    @Metalmind What do you mean by inject any kind of handlers to the code?



  • Hey yariplus,
    let me show you:
    I wanted to add an onclick handler to a button, which I have now done like this:

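    var fs = require('fs');
    var $ = require('cheerio'); // assumption: $ refers to the cheerio module
    var htmlString = fs.readFileSync(FilePath, 'utf8'); // read as a string, not a Buffer
    var parsedHTML = $.load(htmlString);
    parsedHTML('.interactive').attr('onclick', "clickEmit(this.id)");
    var result = parsedHTML.html(); // serialize the whole modified document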

    Now the clickEmit function is called when I click any element with the interactive class.

    The parsed HTML file currently contains this code:

    <script src='/socket.io/socket.io.js'></script>
    <script>
    function clickEmit(id){
      var socket = io.connect('http://localhost:8080');
      socket.emit('buttonClick', id);
    }
    </script>

    My next step would be to take the code above (beginning with <script src='/socket.io/socket.io.js'></script>), delete it from the .html file and re-inject it via cheerio's prepend function.

    So now I'm trying something like:

    var content = "<script src='/socket.io/socket.io.js'></script><script> function clickEmit(id){ ... ... ...} </script>"
    parsedHTML('head').prepend(content);

    But when I do this, the code appears at the same position as before, yet clicking any button no longer does anything. Why is that?

    Greetings, and thanks for the good advice!



  • Solved:

    I moved the code into a .js file and included that file in the .html file via prepend. Nevertheless: great forum, many thanks!


  • Community Rep

    @Metalmind Excellent, glad you got it working.

    There is probably a way to get that raw script code running, but using an external js file as you did is definitely better.

