# While these settings are okay, a robots.txt file can tip a hacker off to files
# that they might find interesting and would otherwise not know about. It is much
# safer to edit the PHP files themselves to include a robots meta tag instead.
#
# Use your HTML editor to search all files in your osCommerce install in source
# mode for "<head>" (quotes just delimit what you are looking for and should not
# be included in the actual search). Immediately under this tag add one of the
# following two tags:
#
# If the file in question is located in the admin directory it should not get
# indexed at all, so add the meta tag that prevents any robot access to the file:
#
#   <meta name="robots" content="noindex, nofollow">
#
# For files in your main catalog that you don't want indexed (such as the ones
# listed below in the sample robots.txt file) you will still want the robot to
# continue indexing the remainder of the site, so use:
#
#   <meta name="robots" content="noindex, follow">
#
# Frankly I'm a bit surprised that the osCommerce development team didn't include
# such tags when they originally wrote the application. You might want to keep the
# robots.txt file with just the command to disallow the Google Image Bot as listed
# at the bottom of this file, since that wouldn't give away any important
# information.
#
# Sample robots.txt file (make sure the filename is ALL LOWERCASE on Linux/Unix systems)
# This file should go in your web site's ROOT directory
# The root directory is where your site's main /index.html file would be found
# It is usually found in /yourhomedir/public_html/ or /yourhomedir/httpdocs
# Where "yourhomedir" is your user account's name
#
# We invite you to also check out our popular contribution: Simple Template System (STS)
# It lets you lay out or change your OSC look-and-feel by modifying a single HTML file
# http://www.oscommerce.com/community/contributions,1524 or SimpleTemplateSystem.com
# Enjoy!
# - Brian Gallagher @ DiamondSea.com

# This says to apply these settings to ALL search engine spiders/crawlers
User-agent: *
# These settings will keep spiders from indexing your unwanted pages
# This assumes that your OSC install is in your web site's ROOT directory
# e.g. http://www.yoursite.com/index.php <- Use if this brings up your OSC main page
Disallow: /perso
Disallow: /includes
Disallow: /cgi-bin
Disallow: /account.php
Disallow: /account_edit.php
Disallow: /account_history.php
Disallow: /account_history_info.php
Disallow: /account_password.php
Disallow: /add_checkout_success.php
Disallow: /address_book.php
Disallow: /address_book_process.php
Disallow: /advanced_search.php
Disallow: /checkout_confirmation.php
Disallow: /checkout_payment.php
Disallow: /checkout_payment_address.php
Disallow: /checkout_process.php
Disallow: /checkout_shipping.php
Disallow: /checkout_shipping_address.php
Disallow: /checkout_success.php
Disallow: /contact_bean.php
Disallow: /cookie_usage.php
Disallow: /create_account.php
Disallow: /create_account_success.php
Disallow: /login.php
Disallow: /password_forgotten.php
Disallow: /popup_image.php
Disallow: /shopping_cart.php
Disallow: /product_reviews_write.php
#
# These settings will keep spiders from indexing your unwanted pages
# This assumes that your OSC install is in your web site's /catalog directory
# e.g. http://www.yoursite.com/catalog/index.php <- Use if this brings up your OSC main page
# In that case, prefix each path above with /catalog (for example, /catalog/account.php)
#
# Feel free to add any other pages on your site that you don't want to be indexed by
# the search engines.
# PLEASE NOTE: Any pages that you list here should be secured by other means if you
# don't want people to be able to view them, as some malicious users will look at a
# robots.txt file to try to find "hidden" or "secret" areas of web sites to find
# confidential information.
# Just uncomment a line or add new ones as you see fit.
# Disallow: /private
# Disallow: /hidden
#
# IF YOU DO NOT WISH TO HAVE THE GOOGLE IMAGE BOT SCAN YOUR DOMAIN FOR IMAGES
# THEN YOU CAN INCLUDE THE FOLLOWING IN YOUR ROBOTS FILE.
# I FOUND THAT MY BANDWIDTH USAGE DROPPED BY A MASSIVE AMOUNT AFTER I GOT RID
# OF THE GOOGLE IMAGE BOT. ALL I HAD WAS IMAGE HUNTERS STEALING PRODUCT SHOTS
# AND NOT EVEN BROWSING THE SITE.
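
# A minimal sketch of the image-bot rule described above. Assumption: Google's
# image crawler identifies itself with the user-agent token "Googlebot-Image".
# "Disallow: /" asks that crawler to stay out of the whole site; leave this
# group out if you want your product shots to show up in Google image search.
User-agent: Googlebot-Image
Disallow: /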