Stopping Google from Searching a page

Discussion in 'Web Design and Development' started by Sdashiki, Jan 4, 2007.

  1. Sdashiki macrumors 68040

    Sdashiki

    Joined:
    Aug 11, 2005
    Location:
    Behind the lens
    #1
    I have a Google search box on my website that searches the website's content.

    typical setup.


    I have a page or two I'd like NOT to EVER show up in the search results, regardless of the search criteria.


    How do I hide these pages from the Google spider?

    Or better yet, can I hide a directory from the bot?
     
  2. seanf macrumors 6502

    Joined:
    Aug 8, 2006
    Location:
    UK
    #2
    Use a robots.txt file. The following will stop all well-behaved bots from accessing the specified directory:

    Code:
    User-agent: *
    Disallow: /directoryname/
    See: http://www.robotstxt.org/wc/robots.html
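
    To check what a rule like this does before uploading it, Python's standard urllib.robotparser module can evaluate it locally. This is only a sketch: www.example.com and the page names are made-up placeholders, and a real crawler may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# The rule from this post; "directoryname" is a placeholder.
ROBOTS_TXT = """\
User-agent: *
Disallow: /directoryname/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Hypothetical URLs, used only to exercise the rule.
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/directoryname/page.html"))
    print(is_allowed(ROBOTS_TXT, "http://www.example.com/index.html"))
```

    The first URL sits under the disallowed directory and the second does not, so the two checks should differ.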

    Sean :)
     
  3. Sdashiki thread starter macrumors 68040

    Sdashiki

    Joined:
    Aug 11, 2005
    Location:
    Behind the lens
    #3
    These ARE NOT real links, of course.



    When you go to my site

    www.blahblah.com

    it takes you automatically to the index.html file found in a subdirectory of the "main site".

    I didn't set this up; my predecessor did.

    So really the homepage, where the search box is, is:

    www.blahblah.com/main/


    I've placed the robots.txt file in the main server directory and also in the /main/ directory.

    Neither stops the search box from finding the directory I'm trying to hide, which is one level further in:

    www.blahblah.com/main/downloads/


    When I placed the txt file at the root website directory:

    and when inside the /main/ directory:

    but neither gives me any results?!
     
  4. redeye be macrumors 65816

    redeye be

    Joined:
    Jan 27, 2005
    Location:
    BXL
    #4
    Your links don't work :confused:

    ;)


    The robots.txt file asks robots not to crawl the paths it lists.
    If you're already indexed, you will have to await the next crawl by Googlebot. I guess your site will be deleted from their index at that point (although I'm not sure).

    Edit:
    If you're searching for a specific URL, I guess nothing can stop a search engine from giving a result. It's like typing it in your address bar.
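
    Two things about the layout in post #3 are worth checking. Crawlers only request robots.txt from the root of the host, so a copy inside /main/ is never read. Disallow paths are also matched from the site root, so the rule has to name /main/downloads/ in full. A rough sketch with Python's standard library follows; the helper names and file.zip are invented for illustration.

```python
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the only robots.txt location a crawler consults for page_url."""
    return urljoin(page_url, "/robots.txt")


def is_blocked(disallow_path: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if 'Disallow: disallow_path' (for all bots) blocks url."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", f"Disallow: {disallow_path}"])
    return not parser.can_fetch(user_agent, url)


# Placeholder URL built from the example paths in post #3; file.zip is hypothetical.
TARGET = "http://www.blahblah.com/main/downloads/file.zip"

if __name__ == "__main__":
    print(robots_url_for("http://www.blahblah.com/main/"))
    print(is_blocked("/main/downloads/", TARGET))  # full path from the site root
    print(is_blocked("/downloads/", TARGET))       # path written relative to /main/
```

    Only the rule written with the full path from the root should report the download as blocked.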
     
  5. Sdashiki thread starter macrumors 68040

    Sdashiki

    Joined:
    Aug 11, 2005
    Location:
    Behind the lens
    #5
    It's one of those search boxes that has radio buttons underneath for "the web" and "your site",

    so a keyword search only looks on your site.

    I thought this was done on a per-search basis, and not simply by searching the database of my site that Google already had?

    What I mean is, I thought I was using Google's search engine live each and every time someone searches for a keyword on my site, so any changes would be immediate.
     
  6. redeye be macrumors 65816

    redeye be

    Joined:
    Jan 27, 2005
    Location:
    BXL
    #6
    I would think it just adds a site:blablabla.com to the search.
    You could check the URL of the results page you get from your in-site search.
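
    As a rough illustration of that check, the sketch below builds a site-restricted query and reads the query back out of a results URL. It assumes the search box sends its terms in Google's q parameter with a site: operator appended; a given search box may instead pass the domain in a separate parameter, so treat the function names and URL shape as assumptions.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def site_search_url(keywords: str, domain: str) -> str:
    """Build a Google results URL that restricts the search to one domain."""
    query = f"{keywords} site:{domain}"
    return "https://www.google.com/search?" + urlencode({"q": query})


def query_from_results_url(url: str) -> str:
    """Extract the decoded search query from a results-page URL."""
    return parse_qs(urlparse(url).query).get("q", [""])[0]


if __name__ == "__main__":
    url = site_search_url("downloads", "blahblah.com")
    print(url)
    print(query_from_results_url(url))
```

    If the query read back from your own results page contains site: followed by your domain, the box is querying Google's existing index of the site rather than crawling it on demand.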
     
  7. ravenvii macrumors 604

    ravenvii

    Joined:
    Mar 17, 2004
    Location:
    Melenkurion Skyweir
    #7
    Yup, that's just what it does.
     
  8. Sdashiki thread starter macrumors 68040

    Sdashiki

    Joined:
    Aug 11, 2005
    Location:
    Behind the lens
    #8
    So I need to wait for the spider to come around again before it will ignore those directories?
     
  9. redeye be macrumors 65816

    redeye be

    Joined:
    Jan 27, 2005
    Location:
    BXL
  10. xoreu macrumors member

    Joined:
    Nov 12, 2006
    #10
    You can also use Google's Webmaster Tools to monitor the status of these restricted pages (there's a whole page dedicated to robots.txt on there) and find out a lot of other useful stuff about your site.
     
