GoogleBot & Mod_security, problems at cPanel + RHE 3
Sh4ka
I'm having an issue with one website: Googlebot seems to be rejected (with a 403 error status) when it tries to index the pages of one site (call it "N").

I have just checked the audit_log (the mod_security log), but it shows no entries for that site with Googlebot being rejected. I think a mod_security rule is responsible, but I can't find which one.

I have also been running the CSF firewall for about a month with its normal rules, but I don't think that is the problem.

These are my current mod_security rules. Could any of them cause a problem with Googlebot?

And how can I add all Googlebot IPs (or any *.googlebot.com host) to a whitelist in mod_security? Any idea where I can get the Googlebot IPs?

CODE
<IfModule mod_security.c>

# Turn the filtering engine On or Off
SecFilterEngine On

# Change Server: string
#SecServerSignature "IIS"

# This setting should be set to On only if the Web site is
# using the Unicode encoding. Otherwise it may interfere with
# the normal Web site operation.
SecFilterCheckUnicodeEncoding Off

# The audit engine works independently and
# can be turned On or Off on the per-server or
# on the per-directory basis. "On" will log everything,
# "DynamicOrRelevant" will log dynamic requests or violations,
# and "RelevantOnly" will only log policy violations
SecAuditEngine DynamicOrRelevant

# The name of the audit log file
SecAuditLog logs/audit_log

# Should mod_security inspect POST payloads
SecFilterScanPOST On

# Action to take by default
SecFilterDefaultAction "deny,log,status:403"

## ## ## ## ## ## ## ## ## ##
## ## ## ## ## ## ## ## ## ##

# Require Content-Length to be provided with
# every POST request
SecFilterSelective REQUEST_METHOD "^POST$" chain
SecFilterSelective HTTP_Content-Length "^$"

# Don't accept transfer encodings we know we don't handle
# (and you don't need it anyway)
SecFilterSelective HTTP_Transfer-Encoding "!^$"

# Protecting from XSS attacks through the PHP session cookie
SecFilterSelective ARG_PHPSESSID "!^[0-9a-z]*$"
SecFilterSelective COOKIE_PHPSESSID "!^[0-9a-z]*$"

# Regex metacharacters escaped so the literal "viewtopic.php?" and "chr(NNN)" are matched
SecFilter "viewtopic\.php\?" chain
SecFilter "chr\(([0-9]{1,3})\)" "deny,log"

# Block various methods of downloading files to a server
SecFilterSelective THE_REQUEST "wget "
SecFilterSelective THE_REQUEST "lynx "
SecFilterSelective THE_REQUEST "scp "
SecFilterSelective THE_REQUEST "ftp "
SecFilterSelective THE_REQUEST "cvs "
SecFilterSelective THE_REQUEST "curl "
SecFilterSelective THE_REQUEST "telnet "
SecFilterSelective THE_REQUEST "ssh "
SecFilterSelective THE_REQUEST "echo "
SecFilterSelective THE_REQUEST "links -dump "
SecFilterSelective THE_REQUEST "links -dump-charset "
SecFilterSelective THE_REQUEST "links -dump-width "
SecFilterSelective THE_REQUEST "links http:// "
SecFilterSelective THE_REQUEST "links ftp:// "
SecFilterSelective THE_REQUEST "links -source "
SecFilterSelective THE_REQUEST "mkdir "
SecFilterSelective THE_REQUEST "cd /tmp "
SecFilterSelective THE_REQUEST "cd /var/tmp "
SecFilterSelective THE_REQUEST "cd /etc/httpd/proxy "
SecFilterSelective THE_REQUEST "/config.php?v=1&DIR "
SecFilterSelective THE_REQUEST "&highlight=%2527%252E "
SecFilterSelective THE_REQUEST "changedir=%2Ftmp%2F.php "
SecFilterSelective THE_REQUEST "arta.zip "
SecFilterSelective THE_REQUEST "cmd=cd\x20/var "
SecFilterSelective THE_REQUEST "HCL_path=http "
SecFilterSelective THE_REQUEST "netenberg "
SecFilterSelective THE_REQUEST "psybnc "

</IfModule>


Thanks.
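
One way to narrow down which rule is returning the 403 might be to switch on the mod_security debug log for a short time and tag rules with an id and a msg, so that a log entry names the rule that matched. The fragment below is only a sketch and is untested here: it assumes mod_security 1.9 or later (where the id and msg actions exist), the log path, the rule IDs (9001, 9002) and the message texts are made-up placeholders, and the two rules were picked arbitrarily from the config above, not because they are known to be the cause. A rejection is normally also written to the Apache error_log of the affected site, so that file is worth checking as well as audit_log.

```apache
<IfModule mod_security.c>
    # Temporary debug log. Path and level are placeholder choices;
    # higher levels are more verbose, so set it back to 0 afterwards.
    SecFilterDebugLog logs/modsec_debug_log
    SecFilterDebugLevel 2

    # Two existing rules, repeated with hypothetical IDs and messages.
    # The action list restates the default (deny,log,status:403) explicitly.
    SecFilterSelective HTTP_Transfer-Encoding "!^$" "deny,log,status:403,id:9001,msg:'Transfer-Encoding header present'"
    SecFilterSelective THE_REQUEST "echo " "deny,log,status:403,id:9002,msg:'echo in request line'"
</IfModule>
```

After reproducing a blocked request, searching the logs for the client IP or for "Googlebot" should show which id fired, and that single rule can then be adjusted instead of whitelisting everything.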
Sh4ka
I googled for a while and found some ideas for letting Googlebot through mod_security without problems. I really don't know if this will work:

CODE
# GoogleBot using IP ranges...
# (dots escaped; the range rules match on the address prefix)
SecFilterSelective REMOTE_ADDR "^216\.239\.57\.99$" nolog,allow
SecFilterSelective REMOTE_ADDR "^209\.185\." nolog,allow
SecFilterSelective REMOTE_ADDR "^216\.33\." nolog,allow
SecFilterSelective REMOTE_ADDR "^64\.233\." nolog,allow
SecFilterSelective REMOTE_ADDR "^64\.68\." nolog,allow
SecFilterSelective REMOTE_ADDR "^66\.249\." nolog,allow
SecFilterSelective REMOTE_ADDR "^72\.14\." nolog,allow
# Intended prefix unclear; as posted, this pattern cannot match a full IPv4 address
SecFilterSelective REMOTE_ADDR "^86.48$" nolog,allow

# GoogleBot by user-agent...
# Patterns are unanchored, so "Google" alone already matches
# every User-Agent that contains that string.
SecFilterSelective HTTP_USER_AGENT "Google" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Googlebot" nolog,allow
SecFilterSelective HTTP_USER_AGENT "GoogleBot" nolog,allow
SecFilterSelective HTTP_USER_AGENT "googlebot" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Googlebot-Image" nolog,allow
##
SecFilterSelective HTTP_USER_AGENT "AdsBot-Google" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Googlebot-Image/1.0" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Googlebot/2.1" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Googlebot/Test" nolog,allow
SecFilterSelective HTTP_USER_AGENT "Mediapartners-Google/2.1" nolog,allow
####################
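
The "*.googlebot.com" idea from the first post could perhaps be done with REMOTE_HOST instead of IP prefixes. This is an untested sketch, and it rests on an assumption: that REMOTE_HOST holds the client's resolved hostname, which needs hostname lookups enabled in Apache (they are off by default and add DNS queries to each request). With a double lookup the name is checked in both directions, so unlike a User-Agent header it can't be faked just by sending a string that contains "Google".

```apache
# Assumption: hostname lookups are acceptable on this server.
# "Double" makes Apache do a reverse lookup followed by a forward lookup.
HostnameLookups Double

<IfModule mod_security.c>
    # Let requests whose resolved hostname ends in .googlebot.com
    # skip the remaining mod_security rules.
    SecFilterSelective REMOTE_HOST "\.googlebot\.com$" "nolog,allow"
</IfModule>
```

Whichever whitelist is used, the allow rules would need to come before the deny rules in the config, since rules are processed in order and allow stops further rule processing for that request.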