Blocked URLs – robots.txt – Google Webmaster reports

  • #1061078

    timethief
    Member

    I don’t log in to my Google Webmasters account very often, and this is what greeted me when I did:

    robots.txt file: http://onecoolsitebloggingtips.com/robots.txt
    Blocked URLs (robots.txt): 20
    Downloaded: 13 hours ago
    Status: 200 (Success)

    robots.txt file: http://thistimethisspace.com/robots.txt
    Blocked URLs (robots.txt): 18
    Downloaded: 9 hours ago
    Status: 200 (Success)

    I am asking for Staff help with rectifying this situation, please, by removing the robots.txt entries that block these URLs.

    The blog I need help with is onecoolsitebloggingtips.com.

    #1061138

    timethief
    Member

    P.S. I’d appreciate Staff fixing my dyslexic typo in the thread title too.

    #1061139

    auxclass
    Member

    Is My Blog Working shows that the robots.txt file allows search engines to index your site. Some files and folders, such as program files, are blocked all the time.

    Last time I linked to Is My Blog Working I was caught in the spam filter. Maybe things will be better this time:

    http://ismyblogworking.com/onecoolsitebloggingtips.com

    #1061225

    macmanx
    Staff

    That’s actually quite normal, and you can see the blocked directories in the robots.txt files at http://onecoolsitebloggingtips.com/robots.txt and http://thistimethisspace.com/robots.txt.

    All pages below those directories are blocked, but they’re all pages that you don’t want indexed anyway.

    User-agent: *
    Disallow: /next/

    Keeps search engines from indexing each Next page on your blog. It’s better for the search engines to index the individual posts, not Page 1 of posts, Page 2 of posts, and so on. It also prevents duplicate content.

    User-agent: *
    Disallow: /activate/

    Prevents search engines from indexing your blog’s activation URL. Who knows how they would get it, but regardless they shouldn’t index it.

    User-agent: *
    Disallow: /wp-login.php

    Prevents search engines from indexing your login form.

    User-agent: *
    Disallow: /signup/

    Prevents search engines from indexing signup forms. Again, who knows how they would get it, but regardless they shouldn’t index it.

    User-agent: *
    Disallow: /related-tags.php

    Prevents search engines from indexing what is essentially a gigantic pile of words when not parsed correctly.

    User-agent: *
    Disallow: /cgi-bin/

    Prevents search engines from indexing any operational CGI scripts.

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    Prevents search engines from indexing your admin directory and WordPress resource files.

    So, in summary, there are a lot of pages blocked by blocking those directories, but you don’t want any of those pages on Google.
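    If you want to check for yourself which URLs those rules cover, here is a minimal sketch using Python’s standard urllib.robotparser module. It is only an illustration, not how Google evaluates the file. The rules are the ones quoted above, merged into a single User-agent group because some parsers honor only one group for *. The is_blocked helper and the example post path are hypothetical names made up for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Rules quoted in this thread, merged into one "User-agent: *" group.
ROBOTS_TXT = """\
User-agent: *
Disallow: /next/
Disallow: /activate/
Disallow: /wp-login.php
Disallow: /signup/
Disallow: /related-tags.php
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

# Parse the rules from the string above; no network request is made.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_blocked(path, base="http://onecoolsitebloggingtips.com"):
    """Return True if a generic crawler may not fetch base + path.

    Hypothetical helper for this sketch; Disallow rules match by path prefix.
    """
    return not parser.can_fetch("*", base + path)


if __name__ == "__main__":
    # The last path is a made-up post URL, used only for illustration.
    for path in ["/next/2/", "/wp-admin/index.php", "/wp-login.php",
                 "/2013/01/15/example-post/"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

    Because each Disallow line matches by prefix, one rule such as /wp-admin/ covers every page beneath that directory, which is how a handful of rules can add up to 18 or 20 blocked URLs in the report.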

    #1061226

    timethief
    Member

    @macmanx
    Thanks so much for the clarity. I really appreciate it.

    #1061234

    macmanx
    Staff

    You’re welcome!

The topic ‘Blocked URLs – robots.txt – Google Webmaster reports’ is closed to new replies.