User, group, tag and more pages do not have canonical URL
-
Thank you so much @baris! Amazing response time. One more reason it was a good decision to choose NodeBB over Discourse. Looking forward to v3.12.0.
-
@baris I looked at the code and noticed that there may be an issue with the query string being included in some of the newly added canonical URLs.
Also, I noticed that the og:url meta includes the query string everywhere, and sometimes the page position. og:url should be the same as canonical.
-
@Justas-Azna I believe query params are supposed to be in canonical links if they change the content of the page, like ?page=x or, in the case of /users, the ?section=x query param which controls which list of users to display.
@busker the previews are created by https://community.nodebb.org/topic/17384/nodebb-plugin-link-preview-rich-embeds-and-link-previews
-
@baris if they change the content, you could argue that it should be in canonical, yes.
On the other hand, in the case of /users, ?section=x just filters out a subset of all users. And the ?section=x versions of the /users page don't appear in the sitemap (they should appear there if they are truly distinct pages).
Here's what ChatGPT thinks (https://chatgpt.com/share/67590127-15f4-8012-b412-b2a553961dfc):
Justas:
Should query parameter versions of a page (e.g. /users?filter=x, where x has finite options) be shown as the canonical URL and og:url metas?
E.g. both /users and /users?filter=x appear as canonical URLs. ?filter=x merely filters a subset of all users that appear in /users
ChatGPT:
No, query parameter versions of a page (e.g., /users?filter=x) that merely filter subsets of the data shown on the base page (e.g., /users) should not be set as the canonical URL or og:url. Instead, the canonical URL should point to the base page (/users) to prevent search engines from indexing multiple variations of essentially the same content. Here's why:
1. Duplicate Content Risks
- Search engines may interpret /users?filter=x as a different page from /users, leading to duplicate content issues.
- This can dilute ranking signals (e.g., backlinks and authority) across multiple URLs instead of consolidating them under the main page.
2. Purpose of Canonical and og:url
- The canonical tag signals the "preferred" version of the URL to search engines, consolidating ranking signals.
- The og:url tag (used for Open Graph meta) ensures consistent sharing on social platforms and should align with the canonical URL unless the query parameter creates fundamentally unique content.
3. Finite Options Still Cause Fragmentation
- Even if the query parameter has finite options, search engines may still treat them as separate pages. For example, /users?filter=active and /users?filter=inactive are treated as distinct URLs unless explicitly canonicalized.
Best Practices
- Use Canonical to Point to /users
  - Set <link rel="canonical" href="/users" /> on all filtered versions (/users?filter=x), as they are subsets of the same core content.
- Avoid Indexing Query-Based URLs
  - Use meta robots (noindex, follow) or other measures like robots.txt to prevent indexing of query-based pages unless the content is truly unique and valuable.
- Canonicalize Unique and Valuable Filtered Pages
  - If a query parameter version (e.g., /users?filter=active) offers distinct and valuable content that users would actively search for, you can set it as canonical. However, avoid doing this indiscriminately.
- Dynamic Open Graph Tags
  - If filtered pages need unique Open Graph meta for sharing (e.g., /users?filter=x displays filtered results explicitly), you can use dynamic Open Graph tags, but ensure the canonical tag still points to /users unless necessary to highlight the filtered page.
By adopting this approach, you ensure better search engine optimization (SEO), reduced content duplication, and a consistent user experience.
In the end it's your decision. But my suggestion would be to make sure that neither canonical nor og:url contains any query parameters at all, anywhere in NodeBB. This would avoid duplicate content warnings in Google Search Console.
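To make the suggestion above concrete, here is a minimal Python sketch of building both tags from the same query-free URL. It is only an illustration, not NodeBB's code: the function names and the forum.example.com URL are hypothetical.

```python
from html import escape
from typing import List
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def head_tags(url: str) -> List[str]:
    """Build canonical and og:url tags from the same stripped URL."""
    clean = escape(strip_query(url), quote=True)
    return [
        f'<link rel="canonical" href="{clean}" />',
        f'<meta property="og:url" content="{clean}" />',
    ]


# Hypothetical request URL, used only for illustration.
for tag in head_tags("https://forum.example.com/users?section=online&page=2"):
    print(tag)
```

Deriving both tags from one value is what keeps og:url and canonical from drifting apart.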
-
Even with ?page=n URLs, the best practice is to keep canonical and og:url as /users.
If you really want to go all the way, there are some additional recommendations for pagination: namely, to have rel=prev and rel=next links and to consider including an "All users" page (only if practical).
Follow-up ChatGPT prompt:
Justas:
Understood about ?filter=x. What if /users only shows the first page of results, and to see all the results, the search engine would have to go through pages using ?page=n URLs (?page=n URLs are not in the sitemap)?
ChatGPT:
In the case where /users only shows the first page of results, and subsequent pages are accessible via query parameters like ?page=n, there are specific best practices to manage pagination for SEO effectively:
1. Canonical Tags
- Canonical URL Should Be /users: Always set the canonical URL on all paginated pages (/users?page=n) to /users. This consolidates link equity and signals to search engines that /users is the primary page to index.
- Do not canonicalize each /users?page=n page to itself unless the content on those pages is truly standalone and critical (e.g., highly distinct and valuable results on deeper pages). For typical paginated lists, pointing all to /users is preferred.
2. Pagination-Specific Meta Tags
To help search engines understand the relationship between paginated pages:
- Use rel="next" and rel="prev" links in the <head> section of your HTML to connect paginated pages in a sequence.
  <!-- Page 1 -->
  <link rel="next" href="/users?page=2" />
  <!-- Page 2 -->
  <link rel="prev" href="/users?page=1" />
  <link rel="next" href="/users?page=3" />
- These tags indicate that the pages are part of a series, helping search engines crawl and interpret them effectively.
3. Avoid Blocking ?page=n URLs
- If search engines cannot crawl beyond the first page, they may miss significant content. Ensure ?page=n URLs are not blocked by robots.txt or set to noindex.
- You don’t need to include these pages in the sitemap, but they should still be crawlable to allow search engines to discover deeper content.
4. Optimize Internal Linking
- Use clear internal links to paginated pages (e.g., next/previous buttons or numbered pagination).
- Include contextually relevant links to deeper pages where appropriate, ensuring that search engines and users can navigate your content efficiently.
5. Provide an "All Users" Page if Possible
- If feasible, create an "all users" page or a consolidated page that displays the entire dataset (or as much as practical). This page can serve as the canonical version, simplifying search engine access to all content.
6. Manage Crawl Budget
- For very large datasets, paginated URLs can impact crawl budget. If some pages are low value, consider using noindex, follow to de-prioritize them for indexing while still allowing link discovery.
7. Content Uniqueness Across Paginated Pages
- Avoid having paginated pages with only slight variations. Ensure each page contains meaningful and distinct content to minimize risks of perceived duplication.
Summary
- Set the canonical tag for all /users?page=n pages to /users.
- Use rel="next" and rel="prev" to indicate the sequence of paginated pages.
- Ensure paginated URLs are crawlable, even if they aren’t in the sitemap.
- Avoid indexing low-value paginated pages unless they provide unique content.
- Consider an "all results" page for improved discoverability if practical.
This strategy ensures search engines can navigate, index, and understand your paginated content while prioritizing /users as the central page.
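For illustration, the rel="prev"/rel="next" tags from the answer above could be generated along these lines. This is a rough Python sketch with hypothetical names (pagination_links, base_path, page_count), not NodeBB's implementation. Note also that Google has said it no longer uses these tags as an indexing signal, although other search engines may.

```python
from html import escape
from typing import List


def pagination_links(base_path: str, page: int, page_count: int) -> List[str]:
    """Return rel="prev"/rel="next" link tags for one page of a paginated list."""
    if not 1 <= page <= page_count:
        raise ValueError("page out of range")
    links = []
    if page > 1:
        # Every page except the first links back to its predecessor.
        href = escape(f"{base_path}?page={page - 1}", quote=True)
        links.append(f'<link rel="prev" href="{href}" />')
    if page < page_count:
        # Every page except the last links forward to its successor.
        href = escape(f"{base_path}?page={page + 1}", quote=True)
        links.append(f'<link rel="next" href="{href}" />')
    return links


# Page 2 of 3 gets both a prev and a next link.
for tag in pagination_links("/users", 2, 3):
    print(tag)
```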
-
Any updates @baris ?
-
Don't use the first page of a paginated sequence as the canonical page. Instead, give each page its own canonical URL.
Source: Pagination Best Practices for Google | Google Search Central | Documentation | Google for Developers (developers.google.com)
I am not an SEO expert, but Google seems to suggest giving each page its own unique canonical URL. You could test on your own installation by removing the query params and seeing how Google Search Console responds.
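For anyone who wants to experiment with both approaches, here is a minimal Python sketch of a canonical URL builder that keeps only an allowlist of content-changing parameters. The allowlist (page, section), the function name, and the example URLs are assumptions for illustration only; this is not how NodeBB builds its canonical links.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allowlist: parameters assumed to change the page content.
CONTENT_PARAMS = ("page", "section")


def canonical_url(url: str, keep=CONTENT_PARAMS) -> str:
    """Drop every query parameter that is not in the allowlist."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    # Emit parameters in allowlist order so equivalent URLs compare equal.
    kept.sort(key=lambda kv: keep.index(kv[0]))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Each page keeps its own canonical URL; tracking parameters are dropped.
print(canonical_url("https://forum.example.com/users?utm_source=x&page=2"))
# Passing an empty allowlist strips the query string entirely.
print(canonical_url("https://forum.example.com/users?page=2", keep=()))
```

Switching between the default allowlist and keep=() is one way to compare the two options discussed in this thread on a test installation.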