Google crawl error after site migration
-
I recently moved my site from sudonix.com to sudonix.org and placed a redirect at Cloudflare's edge to handle anyone using the old domain suffix. This all works as intended, although Google Search Console insists that it won't crawl the pages of my forum because it thinks they have been redirected.
The sitemaps seem ok and reflect the new domain name, but is there anywhere else that may retain the previous .com domain causing Google to think it's a redirect?
I changed config.json to point to the new domain and everything works as expected. Interestingly, Bing doesn't seem to have any issue and indexes as expected. Any thoughts before I raise this with Google directly?
Thanks
-
Change of Address Tool - Search Console Help
Move your site from one domain to another. About this tool: Use the Change of Address tool when you move your website from one domain or subdomain to another: for instance, from example.c
(support.google.com)
-
You tried using the change of address tool and it didn't work or what?
-
A 301 redirect should instruct Google to assign the SEO score to the new URL, so that should be OK?
If it shows errors on the old URLs, that might be OK insomuch as the new URL is already indexed properly?
You may need to confirm with Webmaster Tools, etc.
-
I think I found the answer here. It's Cloudflare's "Bot Fight Mode" that causes this.
[Screenshots: Mode on, Mode off]
Evidently, it's a well-known issue. If you want crawlers to ignore your site, just switch that bad boy on with no exceptions
-
-
@julian I mean, the purpose of the product is to fight bad bots, not known good bots (like search engines). But it seems that, at least on free plans, it blocks spiders a lot. And even on a paid plan, you apparently have to do quite a bit of work to get it working properly.
-
@tankerkiller125 that's exactly what it's doing. In addition, it also blocks its own page speed test! I uploaded over 2000 IP addresses into Cloudflare to build a bypass rule for Google's ASN, but that never worked at all, despite being suggested by Cloudflare in the first place.
Even the pro plan has issues with this as you suggest. I've switched it off, and now all crawl errors on my site have cleared down.
-
-
Back again with this. I thought the issue was solved, but it appears not. I've discussed this with Google themselves, and they tell me that an HTTP 307 redirect is being returned to their crawler.
Is there any special permission that needs to be applied to spiders in NodeBB? The page, according to Googlebot, redirects to /register, which it shouldn't, of course.
In the browser, everything works as expected and you end up on the page you requested. I have no idea why Googlebot thinks this is a redirect when it isn't. Bing seems to have no issues indexing the site either.
Here's an example of the 307 redirect:
https://view.hugo-decoded.be/?scheme=https&url=sudonix.org%2Ftags%2Fjavascript&ua=Googlebot&ref=Google
Edit - it seems Googlebot is receiving a 307 and being redirected to /register/complete.
-
I suppose the obvious question here is why every single link, when queried by Googlebot, is redirected to /register/complete? I thought that the purpose of spider permissions was so that crawlers would work unhindered?
It seems currently, having checked properly, that every single page generated by NodeBB is seen as a 307 redirect to /register/complete by Google, Bing, and Yandex - they can't all be wrong.
This site:
Bulk URL HTTP Status Code, Header & Redirect Checker
Redirect checker to easily check status codes, response headers, and redirect chains.
(httpstatus.io)
is pretty useful and is at least mobile friendly. If I check the URL of this page, for example, I get an HTTP 200, so nothing wrong here, but my install obviously says differently, and I'd like to fix that.
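For anyone who wants to reproduce this kind of check locally rather than through a web tool, here is a minimal sketch in Python (standard library only). It sends a single GET with a chosen User-Agent and reports the status code and Location header without following the redirect. The function name check_status and the browser User-Agent string are just illustrative choices, and it only tells you what the server returns for that User-Agent string, not what the real Googlebot sees from Google's own IP ranges.

```python
import http.client
from urllib.parse import urlsplit

# User-Agent string that Googlebot commonly identifies itself with.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Illustrative browser-like User-Agent (an assumption, any browser string will do).
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0"


def check_status(url, user_agent, timeout=10.0):
    """Issue one GET and return (status, location) without following redirects."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path, headers={"User-Agent": user_agent})
        response = conn.getresponse()
        # http.client never follows redirects, so a 307 shows up as-is.
        return response.status, response.getheader("Location")
    finally:
        conn.close()
```

Calling check_status("https://sudonix.org/tags/javascript", GOOGLEBOT_UA) and then again with BROWSER_UA would show whether the two User-Agents get different answers for the same URL.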
-
@phenomlab likely related to recent changes re: email handling. That's a regression that I should fix; it shouldn't act like that for spiders.
Best guess is I handled it by doing a check for != 0 when it should be > 0.
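To illustrate why that one comparison could matter, here is a small sketch in Python (not NodeBB's actual JavaScript). It assumes, and this is not stated in the thread, that guests carry uid 0 and spiders carry a negative uid such as -1. The function names and the has_email flag are hypothetical.

```python
GUEST_UID = 0    # assumption: anonymous visitors
SPIDER_UID = -1  # assumption: crawlers are assigned a negative uid


def must_complete_registration_buggy(uid: int, has_email: bool) -> bool:
    # "!= 0" is true for the negative spider uid as well,
    # so a crawler without an email would be sent to /register/complete.
    return uid != 0 and not has_email


def must_complete_registration_fixed(uid: int, has_email: bool) -> bool:
    # "> 0" only matches real registered accounts; guests and spiders pass through.
    return uid > 0 and not has_email
```

Under those assumptions, the two versions agree for guests and registered users and differ only for the spider uid, which would match every page redirecting for crawlers while looking fine in a browser.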
-
@julian funny you should say that, as my other site hostrisk.com doesn't have this issue and is running 3.0.1 (needs an upgrade).
But the caveat here is that hostrisk does not permit registrations, so I guess that regression (if it existed in 3.0.1) won't apply? I'll upgrade that and see if the issue remains without registration enabled.