Our crawler was not able to access the robots.txt file on your site

Hello Mozzers!

I've received an error message saying the site can't be crawled because Moz is unable to access the robots.txt file. I've spoken to the webmaster, and he can't understand why Moz can't access robots.txt.

https://www.thefurnshop.co.uk/robots.txt

Google isn't flagging anything to us either.

Does anyone know how to solve this problem?

Thanks

@LoganRay This was our issue. We didn't know that Moz tries to retrieve the HTTP robots.txt first. Our HTTPS redirect was broken for static files only, so the HTTP request for robots.txt was failing. We did not notice because the HSTS policy was making the browser switch to HTTPS on its own.
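For anyone who wants to reproduce this outside a browser (where HSTS hides the problem), here is a minimal Python sketch that requests the plain-HTTP robots.txt once, without following redirects. The function names and the User-Agent string are my own illustrative choices, and I don't know exactly how rogerbot issues its requests, so treat this as a rough check rather than a replica of the Moz crawler.

```python
import http.client
from typing import Optional, Tuple


def fetch_without_redirects(host: str, path: str = "/robots.txt",
                            timeout: float = 10.0) -> Tuple[int, Optional[str]]:
    """Request http://host/path once and return (status, Location header).

    http.client never follows redirects and knows nothing about HSTS,
    so this shows the raw answer to a first plain-HTTP request.
    """
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        # Hypothetical User-Agent, used only to label the request.
        conn.request("GET", path, headers={"User-Agent": "robots-check-sketch"})
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def summarize(status: int, location: Optional[str]) -> str:
    """Turn a (status, Location) pair into a one-line diagnosis."""
    if status in (301, 302, 303, 307, 308):
        if location is None:
            return f"{status}: redirect with no Location header"
        return f"{status}: redirects to {location}"
    if status == 200:
        return "200: served directly over HTTP"
    return f"{status}: robots.txt not reachable over HTTP"
```

Calling `summarize(*fetch_without_redirects("www.example.com"))` with your own host in place of the placeholder should show whether the HTTP path answers at all, and where it redirects.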

Wanted to jump back in on this topic as I've just confirmed my initial suspicion.

I just added a new client to our Moz account and had the exact same issue: the crawler was unable to access the robots.txt file. It's a secure site and was configured in Moz without the HTTPS. When I go to the robots.txt file without https://www, it redirects the same way yours does, with the / between the TLD and the page path removed.

Reconfigure your site in Moz with the HTTPS URL and it should begin to work.

There are two parts of your robots.txt that could be causing this, and it all depends on how each bot interprets the patterns in your robots.txt:

First, your Disallow: /? can be read as "disallow all paths starting with "/", followed by zero or more characters and then a "?"". Try replacing this part with Disallow: /*? so that nothing with a query string gets crawled (which is what I believe you were going for).

Second, you have an open Disallow followed by User-agent: rogerbot. While it should not be read this way, once again it all depends on how each bot reads the directives. To fix this, change the Disallow that follows your Googlebot-Image entry to Disallow: /
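To compare the two patterns above, here is a small Python sketch of how a parser that follows the RFC 9309 wildcard conventions would evaluate them: "*" matches any run of characters, a trailing "$" anchors the end, and everything else, including "?", is a literal character. I can't confirm that rogerbot's parser behaves the same way, and the helper names and the /sofas path are made up for illustration.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes RFC 9309 style matching: '*' is a wildcard, a trailing '$'
    anchors the end, and every other character is matched literally.
    """
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(rule_path: str, url_path: str) -> bool:
    """Return True if url_path (path plus query string) matches the rule."""
    if not rule_path:
        # An empty Disallow value blocks nothing.
        return False
    # re.match anchors at the start, like robots.txt prefix matching.
    return rule_to_regex(rule_path).match(url_path) is not None


# Under these assumptions:
#   is_blocked("/?", "/?page=2")        is True  (path literally starts with "/?")
#   is_blocked("/?", "/sofas?page=2")   is False
#   is_blocked("/*?", "/sofas?page=2")  is True  (any path containing "?")
```

Under that reading, Disallow: /? only covers query strings on the homepage, while Disallow: /*? covers query strings on any path.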

Hi there,

There's something odd going on when I try to access your robots.txt file without the www. The www gets added back on, but when it does, the slash between the TLD and the page path gets deleted (see below). I'm guessing your domain in Moz is configured without the www, which means RogerBot is getting redirected to this slash-less version of the file.

https://www.thefurnshop.co.ukrobots.txt
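A quick way to see why that URL fails: without the slash, the file name becomes part of the hostname. The Python sketch below uses the standard library's `urlsplit` to check a redirect target; the helper name `redirect_target_ok` is my own, and it only assumes you already have the Location value from the redirect.

```python
from urllib.parse import urlsplit


def redirect_target_ok(location: str, expected_host: str,
                       expected_path: str = "/robots.txt") -> bool:
    """Check that a redirect Location points at expected_host/expected_path."""
    parts = urlsplit(location)
    return (parts.scheme in ("http", "https")
            and parts.hostname == expected_host
            and parts.path == expected_path)


good = "https://www.thefurnshop.co.uk/robots.txt"
bad = "https://www.thefurnshop.co.ukrobots.txt"

# In the slash-less URL the path is empty and the hostname has
# swallowed "robots.txt", so no crawler can resolve it.
print(urlsplit(bad).hostname, repr(urlsplit(bad).path))
print(redirect_target_ok(good, "www.thefurnshop.co.uk"))
print(redirect_target_ok(bad, "www.thefurnshop.co.uk"))
```

The usual cause is a rewrite rule that concatenates the host and the request path without a separating slash, though I can't see the server config to confirm that here.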
