How to have nice URLs?
-
When creating new topics with these characters:
á é í ó and ú, I get the URL with them in it, and they are replaced by their percent-encoded equivalents (for example, %C3%A1 for á).
Is it possible to configure NodeBB to replace those characters when creating the URL so they look nice (or replace them with _)?
Or to define the URL per post, independent of the title? Thanks
-
There are some solutions here: https://stackoverflow.com/questions/18176661/copying-a-utf-8-url-from-browsers-address-bar-gives-only-the-ugly-encoded-one such as using Chrome extensions. I think this is mostly a browser thing: browsers encode the URL when you copy the whole thing from the address bar. If you copy just a substring, it is not encoded. See below:
// copy-paste the whole URL
http://127.0.0.1:4567/topic/4846/a-title-with-speacial-chars-%C3%A1-%C3%A9-%C3%AD-%C3%B3-in-it
// copy-paste only the NodeBB path
/topic/4846/a-title-with-speacial-chars-á-é-í-ó-in-it
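To make the difference between the two forms concrete, here is a small sketch that uses the standard encodeURI and decodeURI functions as a stand-in for what the browser does on copy. That the browser behaves exactly like encodeURI is an assumption; the point is only that both strings refer to the same path.

```javascript
// The path as NodeBB produces it, with the accented characters intact.
const path = '/topic/4846/a-title-with-speacial-chars-á-é-í-ó-in-it';

// Percent-encode the non-ASCII characters as UTF-8 bytes,
// similar to what ends up on the clipboard when the whole URL is copied.
const encoded = encodeURI(path);
console.log(encoded);
// /topic/4846/a-title-with-speacial-chars-%C3%A1-%C3%A9-%C3%AD-%C3%B3-in-it

// Decoding gives back the original path, so nothing is lost either way.
console.log(decodeURI(encoded) === path); // true
```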
-
What I am looking for is a config option within NodeBB that converts all characters into nice characters.
This happens nicely in WordPress, for example: you write the title with any characters, but only the nice ones are used to create the "slug", so the slug is always nice, and that removes the need for any extensions.
From your reply, it seems NodeBB does not have anything like that?
-
@darkpollo @baris NodeBB just runs slugify(), which is fairly restrictive... I believe it only allows for the Latin alphabet for broadest compatibility, even though URL paths can contain a broader set of UTF-8 chars.
The problem here is that we "slugify" strings for two reasons:
- URL Safety and Readability, so invalid characters are removed and a "human-readable" slug is produced
- Compatibility, so when data is sent across to other sites, the chance of the same data coming out the other side is increased.
I think the solution here is to refactor slugify so that there are two types: one for URL safety and readability, and another for compatibility.
-
Slugify doesn't remove the special characters @darkpollo mentioned.
const s = await app.require('slugify');
console.log(s('a title with speacial chars á é í ó in it'))
a-title-with-speacial-chars-á-é-í-ó-in-it
The issue I saw was that they were encoded if the entire URL is copied from the address bar.
-
@baris
Yes, I understand, but we need them to be removed so the URLs are nice.
What @julian suggests, having an option for this, seems great.
From what I see in slugify, there is an option for this:
https://www.npmjs.com/package/slugify
strict: false, // strip special characters except replacement, defaults to false
This is how it is done in WordPress (sorry for the reference, but this is where I am coming from...): https://developer.wordpress.org/reference/functions/remove_accents/
And from what I can see in the code:
https://github.com/NodeBB/NodeBB/blob/45eabbf5ba2100201fe887a9d6fc81d88379d414/src/slugify.js#L2
It is supposed to strip those as well.
https://github.com/NodeBB/NodeBB/blob/45eabbf5ba2100201fe887a9d6fc81d88379d414/public/src/modules/slugify.js#L23
function string_to_slug(str) {
  str = str.replace(/^\s+|\s+$/g, ''); // trim
  str = str.toLowerCase();

  // remove accents, swap ñ for n, etc
  var from = "àáäâèéëêìíïîòóöôùúüûñç·/_,:;";
  var to   = "aaaaeeeeiiiioooouuuunc------";
  for (var i=0, l=from.length ; i<l ; i++) {
    str = str.replace(new RegExp(from.charAt(i), 'g'), to.charAt(i));
  }

  str = str.replace(/[^a-z0-9 -]/g, '') // remove invalid chars
    .replace(/\s+/g, '-') // collapse whitespace and replace by -
    .replace(/-+/g, '-'); // collapse dashes

  return str;
}
So maybe this is a bug?
I think that to make nice URLs and slugs, all of those characters should be replaced...
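For illustration, here is a minimal sketch of the kind of replacement I mean. It is not NodeBB code: asciiSlug is a hypothetical helper, and it relies on the standard String.prototype.normalize to split accented letters into a base letter plus a combining mark, instead of a hand-written from/to table.

```javascript
// Hypothetical helper, not part of NodeBB: builds an ASCII-only slug.
function asciiSlug(title) {
  return title
    .normalize('NFD')                 // split "á" into "a" + combining acute accent
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks, keeping the base letter
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')      // any other run of characters becomes a single dash
    .replace(/^-+|-+$/g, '');         // trim leading and trailing dashes
}

console.log(asciiSlug('a title with speacial chars á é í ó in it'));
// a-title-with-speacial-chars-a-e-i-o-in-it
```

Something like this could sit behind an option, so forums that prefer to keep the accented characters in their URLs are not affected. Note that characters with no Latin base letter would simply be dropped by this sketch, so it only suits titles written in Latin-based alphabets.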