Still could use assistance if anyone can offer it. The new forum we are developing with NodeBB is delayed until the next release, but if someone can point me in the right direction I would highly appreciate it.
Regex Question.
-
So I've got a strange issue with my youtube plugin, it doesn't seem to handle parameters after the youtube ID very well.
I've got a var that looks like this:
id = $el.data('youtube-id')
Which is parsed via the following regex
var regularUrl = /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtube\.com)\/(?:watch\?v=)([\w\-_]+)?&([\w\-_]+)">.+<\/a>/g;
var shortUrl = /<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be)\/([\w\-_]+)">.+<\/a>/g;
var embedUrl = /<a href="(?:https?:\/\/)?(?:www\.)youtube.com\/embed\/([\w\-_]+)">.+<\/a>/;
Except it doesn't just put the ID in, it also includes all of the parameters that a user may add afterwards, like start times etc. Which breaks, because the ID becomes
tGZlwK2qTCI&t=3m20s
which isn't a valid video ID. Normally this would only be a problem for the thumbnail that I fetch, but I append &autoplay to the URL here: src="//www.youtube.com/embed/' + id + '?autoplay=1"
I assume it's something up with my regex, I would like $1 to be just the video ID (11 characters), and everything after that to become a part of $2 so I can parse the parameters back in afterwards.
Basically what I'm after is the youtube URL looking like
src="//www.youtube.com/embed/' + id + '?autoplay=1' + parameters + '"
So $1 would be the 11-character YouTube ID, and $2 would be all other parameters after that ID.
-
@julian said:
Is this client-side or server-side? Usually I can tell, except with Node, it's all js
Ermmm.
https://github.com/a5mith/nodebb-plugin-youtube-lite/blob/master/library.js
&
https://github.com/a5mith/nodebb-plugin-youtube-lite/blob/master/static/lib/lazyYT.js
-
Server-side, then.
Use this module: http://nodejs.org/api/url.html#url_url_parse_urlstr_parsequerystring_slashesdenotehost
It's going to make your life a million times easier than parsing an URL via regex.
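To make that suggestion concrete, here is a minimal sketch of URL-based parsing. It is an illustration only: `parseYoutubeUrl` is a made-up helper name, not something from the plugin, and it uses Node's global `URL` class (the newer WHATWG API) rather than the `url.parse()` call linked above. It assumes the href has already been pulled out of the anchor tag.

```javascript
// Sketch only: parseYoutubeUrl is a hypothetical helper, not part of the plugin.
// It uses Node's global WHATWG URL class instead of the older url.parse().
function parseYoutubeUrl(href) {
  // new URL() needs a scheme, so add one for "youtu.be/..." or "//youtu.be/..." style links.
  var normalized = href;
  if (/^\/\//.test(href)) {
    normalized = 'https:' + href;
  } else if (!/^https?:\/\//i.test(href)) {
    normalized = 'https://' + href;
  }

  var url;
  try {
    url = new URL(normalized);
  } catch (e) {
    return null; // not parseable as a URL at all
  }

  var host = url.hostname.replace(/^www\./, '');
  var id = null;
  if (host === 'youtu.be') {
    id = url.pathname.slice(1);                 // "/ID" -> "ID"
  } else if (host === 'youtube.com' && url.pathname === '/watch') {
    id = url.searchParams.get('v');             // works wherever v= sits in the query
  } else if (host === 'youtube.com' && url.pathname.indexOf('/embed/') === 0) {
    id = url.pathname.slice('/embed/'.length);
  }

  // Assumption from this thread: a video ID is 11 characters of letters, digits, "_" or "-".
  if (!id || !/^[\w-]{11}$/.test(id)) {
    return null;
  }

  url.searchParams.delete('v');                 // everything left is "the other parameters"
  return { id: id, params: url.searchParams.toString() };
}
```

With this, `parseYoutubeUrl('https://www.youtube.com/watch?v=tGZlwK2qTCI&t=3m20s')` gives the ID and the remaining query string as two separate values, so nothing after the ID can leak into it.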
For client-side, use the Location object, built in. But that's another topic
-
@julian said:
Server-side, then.
Use this module: http://nodejs.org/api/url.html#url_url_parse_urlstr_parsequerystring_slashesdenotehost
It's going to make your life a million times easier than parsing an URL via regex.
For client-side, use the Location object, built in. But that's another topic
Remember you're talking to an idiot here?
I'll look into that.
-
@Ted sits idly by and stalks the topic, knowing that with a little more time, this will be resolved.
-
@esiao The regex works, but if you use a parameter, the ID becomes {ID}&the parameter, which breaks embedding.
I forked the youtube plugin that psychobunny made, so I've not really changed much of it.
EDIT: Using that site, I've managed to get what looks right, I'll give it a go and let you know how it goes.
EDIT again, as you can see from http://regexr.com/39m51 the end of the ID is now being included under $2 if there's no parameters, which also breaks it.
Is there a way of parsing null if there's no parameters? I'm so close. I think.
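One way to get "nothing" in $2 when there are no parameters is to fix the ID at exactly 11 characters and wrap the parameter part in an optional group. The sketch below is only an illustration for the youtu.be form; `shortUrlSketch` and `embedSrc` are made-up names, and it assumes the link HTML looks exactly like the examples in this thread.

```javascript
// Sketch only: shortUrlSketch and embedSrc are hypothetical names.
// $1 = exactly 11 ID characters.
// $2 = everything after the first "?" or "&" up to the closing quote,
//      or undefined when the link has no parameters.
var shortUrlSketch = /<a href="(?:https?:\/\/)?(?:www\.)?youtu\.be\/([\w-]{11})(?:[?&]([^"]*))?">.+?<\/a>/g;

function embedSrc(html) {
  return html.replace(shortUrlSketch, function (match, id, params) {
    // Only append the extra parameters when the optional group actually matched.
    var extra = params ? '&' + params : '';
    return '//www.youtube.com/embed/' + id + '?autoplay=1' + extra;
  });
}
```

Because the ID group is `{11}` rather than `+`, the tail of the ID cannot spill into $2, and because the whole parameter part is optional, a link without parameters still matches.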
-
With
/<a href="(?:https?:\/\/)?(?:www\.)?(?:youtu\.be)\/((?:[\w\-_]+){11})\??([^&]+)?(&?[\w&]+)*">.+<\/a>/g
on <a href="http://youtu.be/foNkJJWFuI8?t=47s&parameter">something</a>
it creates three groups:
1: id
2: time
3: parameter
Is that what you wanted? If the time is not used you can make a non-capturing group on ([^&]+)
-
That wouldn't work on
<a href="http://youtu.be/foNkJJWFuI8?t=47s&parameter=1">something</a>
due to the = sign.
I'm OK with not using the parameters bit, but time would be good to have. As long as I can get the ID without anything else leaking into it, I'm not 100% concerned about parameters etc.
-
Hey @esiao, thanks for the code. There's a slight issue though, which appears to be regex-based: each pattern only fires once. If I embed the same URL twice, it will only embed one of them, not the other. However, if I change the second video embed to one of the other URL variations, replacing watch?v= with /embed/ for example, then it embeds fine. As I can't read regex, is there something in this that is stopping it from firing again afterwards?
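I can't tell from the thread which pattern is actually in use, so treat this as a guess rather than a diagnosis. Two regex behaviours can each produce "only one embed" when two links sit on the same line: a greedy `.+` before `<\/a>`, and a missing `g` flag. The sketch below uses simplified youtu.be patterns and made-up names (`greedy`, `lazy`, `lazyOnce`, `countEmbeds`) purely to show the difference.

```javascript
// Two links to the same video on one line.
var html = '<a href="http://youtu.be/foNkJJWFuI8">one</a> ' +
           '<a href="http://youtu.be/foNkJJWFuI8">two</a>';

// Greedy ".+" runs to the LAST "</a>" on the line, so both links become one match.
var greedy = /<a href="(?:https?:\/\/)?youtu\.be\/([\w-]+)">.+<\/a>/g;

// Lazy ".+?" stops at the FIRST "</a>", so each link is matched on its own.
var lazy = /<a href="(?:https?:\/\/)?youtu\.be\/([\w-]+)">.+?<\/a>/g;

// Same as lazy but without the "g" flag: replace() only touches the first match.
var lazyOnce = /<a href="(?:https?:\/\/)?youtu\.be\/([\w-]+)">.+?<\/a>/;

// Replace every match with a marker and count how many markers were produced.
function countEmbeds(pattern) {
  var replaced = html.replace(pattern, '[embed:$1]');
  return (replaced.match(/\[embed:/g) || []).length;
}
```

Here `countEmbeds(greedy)` and `countEmbeds(lazyOnce)` both give 1, while `countEmbeds(lazy)` gives 2.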
-
I'd like to help you out, but I'd need more specific inputs you'd like to read the ID and the parameters from.
Currently I can just tell you that something like [\w\-_] is not a clean regex, since it's equivalent to [\w-], and the shorter the pattern, the easier it is to read.
Also the [^<a]+ in the last full regex of @esiao would stop at the first a occurrence, not only at the first <a occurrence as it may suggest.
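Both points can be checked in a few lines. The sample strings below are made up for illustration; the lookahead version at the end is just one possible way to say "everything up to the next `<a`", not the pattern from the plugin.

```javascript
// \w already includes "_", so [\w\-_] and [\w-] accept exactly the same characters.
var verbose = /^[\w\-_]+$/;
var compact = /^[\w-]+$/;
var samples = ['tGZlwK2qTCI', 'abc-def_123', 'abc def', 'a&b'];
var sameResults = samples.every(function (s) {
  return verbose.test(s) === compact.test(s);
});

// [^<a]+ is a negated character class: it stops at any "<" OR any "a",
// not at the two-character sequence "<a".
var linkText = 'watch a video<a href="#">';
var matched = linkText.match(/^[^<a]+/)[0];      // stops before the "a" in "watch"

// A lazy match with a lookahead stops only where "<a" actually starts.
var upToTag = linkText.match(/^.*?(?=<a)/)[0];
```

In this example `matched` is just `'w'`, while `upToTag` is `'watch a video'`.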
So there are a few not-so-good parts in each regex I've seen so far, and you didn't consider users who put the v=... parameter after other parameters within the regularUrls. And are you sure that it'll always be like <a href="...">...</a> and that in no case the a-tag could get another attribute? (My emoji-extended broke at some version because the code-tags got ^^)...
If you want me to help you out with cleaner regex (to the best of my knowledge), I'd be glad to help if I get a few example URLs that cover all cases.
Also, if you'd be willing to learn regex syntax, I'd try to explain my results afterwards.
But for now I have to sleep first, good night zzz