• So I've got a strange issue with my YouTube plugin: it doesn't seem to handle parameters after the YouTube ID very well.

    I've got a var that looks like this:

    id = $el.data('youtube-id')
    

    Which is parsed via the following regex

    var regularUrl = /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtube\.com)\/(?:watch\?v=)([\w\-_]+)?&([\w\-_]+)">.+<\/a>/g;
    var shortUrl = /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be)\/([\w\-_]+)">.+<\/a>/g;
    var embedUrl = /<a href="(?:https?:\/\/)?(?:www\.)youtube.com\/embed\/([\w\-_]+)">.+<\/a>/;
    

    Except it doesn't capture just the ID; it also includes any parameters a user adds afterwards, such as start times. That breaks things, because the ID becomes tGZlwK2qTCI&t=3m20s, which isn't a valid video ID. Normally this would only be a problem for the thumbnail that I fetch, but I also append ?autoplay=1 to the URL here:

    src="//www.youtube.com/embed/' + id + '?autoplay=1"
    

    I assume something is up with my regex. I would like $1 to be just the video ID (11 characters) and everything after that to become part of $2, so I can add the parameters back in afterwards.

    Basically what I'm after is the youtube URL looking like

    src="//www.youtube.com/embed/' + id + '?autoplay=1' + parameters + '"
    

    So $1 would be the 11 character youtube ID, and $2 would be all other parameters after that ID.

  • GNU/Linux Admin

    Is this client-side or server-side? Usually I can tell, except with Node, it's all js 😄


  • GNU/Linux Admin

    Server-side, then.

    Use this module: http://nodejs.org/api/url.html#url_url_parse_urlstr_parsequerystring_slashesdenotehost

    It's going to make your life a million times easier than parsing an URL via regex.

    For client-side, use the Location object, built in. But that's another topic 😄
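
    To make the suggestion above concrete, here is a minimal sketch of URL-based parsing. Note the swap: instead of the url.parse function linked above, it uses the built-in URL and URLSearchParams classes, which current Node versions and browsers both provide. The helper name parseYouTubeUrl is hypothetical and not part of the plugin, and only the three link shapes from this thread are handled.

```javascript
// Sketch: pull the video ID and any extra parameters out of a YouTube link
// with the built-in URL class instead of a regex.
// parseYouTubeUrl is a hypothetical helper name, not part of the plugin.
function parseYouTubeUrl(href) {
  // Allow scheme-less links such as "youtu.be/ID" by assuming https.
  const url = new URL(/^https?:\/\//.test(href) ? href : 'https://' + href);
  const host = url.hostname.replace(/^www\./, '');
  const params = new URLSearchParams(url.search);
  let id = null;

  if (host === 'youtu.be') {
    id = url.pathname.slice(1); // "/ID" -> "ID"
  } else if (host === 'youtube.com' && url.pathname === '/watch') {
    id = params.get('v'); // works wherever v= appears in the query
    params.delete('v');
  } else if (host === 'youtube.com' && url.pathname.startsWith('/embed/')) {
    id = url.pathname.slice('/embed/'.length);
  }

  // params holds everything except v=, serialised back to "a=1&b=2" form.
  return { id, params: params.toString() };
}
```

    Because the query string is parsed rather than pattern-matched, the ID stays clean whether or not extra parameters follow it, and v= does not have to come first.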


  • @julian said:

    Server-side, then.

    Use this module: http://nodejs.org/api/url.html#url_url_parse_urlstr_parsequerystring_slashesdenotehost

    It's going to make your life a million times easier than parsing an URL via regex.

    For client-side, use the Location object, built in. But that's another topic 😄

    Remember you're talking to an idiot here? 😆 I'll look into that.


  • @Ted sits idly by and stalks the topic, knowing that with a little more time, this will be resolved. 😉


  • @a_5mith Your regex seems fine to me. You can use tools like http://regexr.com to debug them more easily.
    I've just tested the href parameter. Why are you testing everything and not only the actual link?


  • @esiao The regex works, but if you use a parameter, the ID becomes the {ID}&the parameter, which breaks embedding.

    I forked the youtube plugin that psychobunny made, so I've not really changed much of it.

    EDIT: Using that site, I've managed to get what looks right, I'll give it a go and let you know how it goes. 😄

    EDIT again, as you can see from http://regexr.com/39m51 the end of the ID is now being included under $2 if there's no parameters, which also breaks it. 😆 Is there a way of parsing null if there's no parameters? I'm so close. I think.


  • @a_5mith

    With /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be)\/((?:[\w\-_]+){11})\??([^&]+)?(&?[\w&]+)*">.+<\/a>/g
    On <a href="http://youtu.be/foNkJJWFuI8?t=47s&parameter">something</a>
    It creates three groups
    1: id
    2: time
    3: parameter

    Is that what you wanted ? If the time is not used you can make a non capturing group on ([^&]+)


  • That wouldn't work on <a href="http://youtu.be/foNkJJWFuI8?t=47s&parameter=1">something</a> due to the = sign.

    I'm ok with not using the parameters bit, but time would be good to have. As long as I can get the ID without anything else leaking into it, I'm not 100% concerned about parameters etc.
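
    On keeping the time: a rough sketch of turning a value like 3m20s into an embed URL. It assumes the embedded player takes its start offset as a start parameter in whole seconds, which is worth checking against YouTube's player parameter documentation. toSeconds and buildEmbedSrc are hypothetical helper names, not part of the plugin.

```javascript
// toSeconds and buildEmbedSrc are hypothetical helpers for illustration.
function toSeconds(t) {
  // Accepts "200", "47s", "3m20s" or "1h2m3s"; anything else yields 0.
  const m = /^(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s?)?$/.exec(t);
  if (!m || !t) return 0;
  return (Number(m[1]) || 0) * 3600 + (Number(m[2]) || 0) * 60 + (Number(m[3]) || 0);
}

function buildEmbedSrc(id, t) {
  const start = toSeconds(t || '');
  // Assumption: the embed player reads the offset from "start" (seconds).
  return '//www.youtube.com/embed/' + id + '?autoplay=1' + (start ? '&start=' + start : '');
}
```

    For example, buildEmbedSrc('tGZlwK2qTCI', '3m20s') gives //www.youtube.com/embed/tGZlwK2qTCI?autoplay=1&start=200.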


  • @a_5mith Just adding (&?[\w&=]+) instead of (&?[\w&]+) should do the trick.
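
    For comparison, a sketch of one pattern that covers the three link shapes from this thread. It is not the plugin's actual code, and youTubeLink and extract are made-up names. [\w-]{11} pins $1 to exactly 11 characters, and the whole query part is optional, so $2 simply comes back empty when there are no parameters.

```javascript
// Sketch only: youTubeLink and extract are hypothetical names.
// $1 = exactly 11 ID characters, $2 = everything after the ID (may be missing).
const youTubeLink =
  /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:watch\?v=|embed\/))([\w-]{11})(?:[?&]([^"]*))?">.+?<\/a>/;

function extract(html) {
  const m = youTubeLink.exec(html);
  // An unmatched optional group is undefined, so fall back to ''.
  return m ? { id: m[1], rest: m[2] || '' } : null;
}
```

    Capturing everything up to the closing quote with [^"]* avoids having to list = and & explicitly, as the [\w&=] class does.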


  • Hey @esiao, thanks for the code. There's a slight issue, though, and it appears to be regex-based: each pattern only fires once. If I embed the same URL twice, only one is embedded. However, if I change the second embed to one of the other URL variations (replacing watch?v= with /embed/, for example), it embeds fine. As I can't read regex: is there something in this that stops it from firing again afterwards? 😕


  • @a_5mith Yes you're right, here's the fix

    /(?:<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be)\/((?:[\w\-_]+){11})\??([^&]+)?(&?[\w&=]+)*">[^<a]+<\/a>)+/gm

    But if the links are like <a href="">link</a><a href="">link</a> it will not work.


  • @esiao said:

    Unfortunately, that doesn't seem to work either. Even if I put the works of Shakespeare between the two YouTube URLs, it still only displays one.

    Also it doesn't seem to match watch?v=videoID either. But it's probably a slightly different regex.
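
    One possible cause of the "only embeds once" behaviour, which the thread doesn't confirm, is the greedy .+ before <\/a> in the original patterns: when two links sit on the same line, it runs to the last </a> and swallows both into a single match. A small sketch with simplified patterns (not the plugin's full regexes) shows the difference a lazy .+? makes:

```javascript
const html =
  '<a href="http://youtu.be/foNkJJWFuI8">one</a> text <a href="http://youtu.be/foNkJJWFuI8">two</a>';

// Greedy: ".+" runs to the LAST </a> on the line, so both links become one match.
const greedy = /<a href="(?:https?:\/\/)?youtu\.be\/([\w-]{11})">.+<\/a>/g;
// Lazy: ".+?" stops at the FIRST </a>, so each link is matched separately.
const lazy = /<a href="(?:https?:\/\/)?youtu\.be\/([\w-]{11})">.+?<\/a>/g;

const greedyCount = (html.match(greedy) || []).length; // 1
const lazyCount = (html.match(lazy) || []).length; // 2
```

    This would also explain why switching one link to a different URL shape helps: the two links are then handled by different patterns, so neither greedy match can swallow the other.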

  • Plugin & Theme Dev

    I'd like to help you out, but I'd need more specific inputs you'd like to read the ID and the parameters from.

    For now I can only tell you that something like [\w\-_] isn't a clean regex, since it's equivalent to [\w-], and the shorter the pattern, the easier it is to read 😄
    Also, the [^<a]+ in @esiao's last full regex would stop at the first a, not only at the first <a as it may suggest.
    So there are a few weak spots in each regex I've seen so far, and you didn't consider users who put the v=... parameter after other parameters in the regularUrl links. And are you sure it'll always be <a href="...">...</a>, and that the a-tag can never get another attribute? (My emoji-extended broke at some version because the code-tags got ^^)...

    If you want me to help with cleaner regexes (to the best of my knowledge), I'd gladly do so if I get a few example URLs that cover all cases.
    Also if you'd be willing to learn regex syntax I'd try to explain my results afterwards 😉

    But for now I have to sleep first, good night zzz
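
    The two character-class points above can be checked with a short sketch; the sample strings and variable names are made up for illustration.

```javascript
// [\w\-_] and [\w-] accept the same characters, because \w already covers "_".
const verbose = /^[\w\-_]+$/;
const concise = /^[\w-]+$/;
const samples = ['foNkJJWFuI8', 'a_b-c', 'bad id', 'x&y'];
const sameResults = samples.every((s) => verbose.test(s) === concise.test(s));

// [^<a]+ is a character class: it rejects "<" and "a" individually,
// so link text containing the letter "a" cannot match.
const linkText = /^<a href="x">[^<a]+<\/a>$/;
const matchesLink = linkText.test('<a href="x">link</a>'); // true: no "a" in "link"
const matchesWatch = linkText.test('<a href="x">watch</a>'); // false: "a" in "watch"
```

    To stop only at the start of the next tag, [^<]+ or a lazy .+? would be the usual choices.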
