Full Version: your basic htaccess file
Ric
Here is the standard .htaccess file we drop into every domain. I just thought it might be useful to some of you, comments follow.


ErrorDocument 401 /401.html
ErrorDocument 403 /403.html
ErrorDocument 404 /404.html
ErrorDocument 500 /500.html

DirectoryIndex index.html index.shtml index.shtm index.cgi /403.html


deny from all



#Get rid of file sucking utilities
[edited]
this list has been revised, follow the thread
#End

deny from .id
deny from .interpacket.net
deny from .lt
deny from .mk
deny from .my
deny from .ro
deny from .yu
deny from 139.92
deny from 152.158
deny from 161.142
deny from 194.102.130
deny from 194.165
deny from 202.134
deny from 202.145
deny from 202.146
deny from 202.147
deny from 202.148
deny from 202.149
deny from 202.150
deny from 202.151
deny from 202.152
deny from 202.153
deny from 202.154
deny from 202.155
deny from 202.156
deny from 202.157
deny from 202.158
deny from 202.159
deny from 202.160
deny from 202.162
deny from 202.164
deny from 202.168
deny from 202.171
deny from 202.178
deny from 202.180
deny from 202.183
deny from 202.184
deny from 202.185
deny from 202.186
deny from 202.187
deny from 202.188
deny from 202.189
deny from 202.190
deny from 202.4
deny from 202.46
deny from 202.47
deny from 202.57
deny from 202.58
deny from 202.93
deny from 202.95
deny from 207.192.198
deny from 210.14
deny from 210.16
deny from 210.186
deny from 210.19
deny from 210.56
deny from 212.138
deny from 212.19
deny from 212.50
deny from 212.59
deny from 213.169
deny from 213.240
deny from 216.3.242.10
deny from 217.9
deny from 62.220.194
deny from 64.110
deny from 64.49

Deny from 203.106
Deny from 203.130.254
Deny from 208.210.48
Deny from 208.210.49
Deny from 208.210.50
Deny from 208.210.51


The first 4 lines specify your custom error pages.

The next statement will prevent casual directory listing by web users if there is no index file.
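
Another common way to get the same effect, offered here only as a sketch: switch off Apache's automatic directory listings instead of relying on the /403.html fallback in DirectoryIndex. This assumes the host's AllowOverride setting permits Options in .htaccess; if it does not, the line will trigger a 500 error.

```apache
# Sketch, not part of the original file: disable auto-generated directory
# listings. Requires "AllowOverride Options" (or All) for this directory.
Options -Indexes
```

With this in place, a request for a directory that has no index file gets a 403 Forbidden response, which is then handled by the ErrorDocument 403 line.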

The rewrite statement blocks some of the most popular site downloaders. We are compiling a more complete list, anyone have some to add? [edited] see a revised list below.

The deny list locks out the entire country of Indonesia due to the fact 99.5% of our credit card fraud attempts originated from there. Even if the domain is not an e-commerce site we still deny access to this list, as long as CC fraud is allowed to run rampant in that country, they don't deserve access.

Rick
pry
I have been fighting these file snatchers for over a year. They cost me a bundle in excess bandwidth at my old ISP.

Anyway, add the following to your lists:

RewriteCond %{HTTP_User_Agent} ^HTTrack.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebStripper.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebCapture.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Scooter-W3.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebCopier.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^FlashGe.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Webdupe.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^NetAnts.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Pockey.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Disco Pump.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Internet Ninja.* [NC,OR]

I would also like to point out that your

RewriteRule ^.*$ /wait.html [L]

will cause the httpd daemon to hang forever, sucking up your CPU resources looking for /wait.html, if you do not have a file with that name in your root directory.
mmoncur
I put some similar code in my .htaccess file a couple of weeks ago and then spent a week tracking down the mysterious crashes. Be very careful with mod_rewrite, as it's very ill-behaved when anything isn't to its liking.

The answer, or at least an answer, is to set the rules up like this:
CODE
RewriteBase /
RewriteCond...
RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [NC]
RewriteRule .* wait.html [L]


The RewriteBase directive tells Apache where the paths of rewritten files start (in the virtual site's virtual directory space) and avoids the error, and then you don't need the leading slash for wait.html.
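
A fuller sketch of that layout, for illustration only. The two agent names come from this thread; the extra REQUEST_URI condition is an assumption added here so that the rule never rewrites a request for wait.html itself. It still assumes wait.html exists in the site's document root.

```apache
# Sketch only: send selected download agents to an explanatory page.
RewriteEngine on
RewriteBase /

# Assumption, not from the posts above: leave requests for the target page alone
RewriteCond %{REQUEST_URI} !^/wait\.html$

# Agent names taken from this thread; only the last one omits OR
RewriteCond %{HTTP_USER_AGENT} ^GetRight [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar [NC]

RewriteRule .* wait.html [L]
```

Read top to bottom, the conditions mean "not wait.html, and (GetRight or JetCar)".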

I've learned the hard way to test rules like this immediately after installing them so that I don't give the robots I'm trying to block the power to crash my site. Here's an easy test from the command line:
CODE
lynx -useragent=GetRight http://www.yoururl.com/

This requests the site using "GetRight" as the user agent.

If you ever have a rule that causes this problem, when you run the test or a robot accesses your site, it grabs an httpd thread that eats the CPU and memory until you kill it. It's easy to tell that this has happened if you look at the top of the list in the "top" utility. If you have an httpd thread using 90% of the CPU (or two using 45% each, or more...) then the worst has happened.

Once you notice that, kill Apache (/psa/rc.d/httpd stop) and fix the .htaccess file, then restart Apache (/psa/rc.d/httpd start) - if you do this within a few minutes there will be no ill effects.

Good luck and thanks for the list of UAs to block!
Ric
That is strange. It has been working fine on our Sun for years, and that is a shared server. I wonder if it is a Linux fluke. The Sun is running the same version of Apache; I tried your test and it had no adverse effect on server CPU usage, even with several clients connected that way.

We have not even started moving our domains to our new server yet so I am glad you told me about this. I would have never suspected that to be the problem and would have probably fought it for weeks!

We use another rewrite statement to deny bandwidth thieves the ability to serve images from our server. It does not deny access, just direct linking. We use these selectively, for serious abuse only, because a lot of them will suck CPU time on a high-bandwidth site...


RewriteEngine on
RewriteCond %{HTTP_REFERER} http://www.ripoff.com
RewriteRule ^(.*)$ http://www.ripoff.com


Rick
mmoncur
It could easily be a bug specific to one Linux version, one Apache version, or even Plesk somehow. I've never heard of anyone else with the problem until pry mentioned it in this thread - I've done lots of Web searches and found only a few vague mentions.

The mod_rewrite docs do say that you need the RewriteBase directive to use it in a virtual site's .htaccess, but they don't say anything about the dire consequences. :)

Considering that every "blocking bad robots" example on the web ends with the same [L] flag and would cause this bug, it must only happen on certain systems.

I use a similar set of rewrite rules to prevent hot-linking of images. This one's a bit more strict and only allows my specific URLs or a blank as the referrer.
CODE
### Prevent "hot linking" of my images from other sites
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://my.url.1/.*$     [NC]
RewriteCond %{HTTP_REFERER} !^http://www.my.url.1/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://my.url.2/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://my.ip.address/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG)$ - [F,L]
Griffith
I tried to add the "ErrorDocument" code for a domain on my RaQ server and got this error:

The server encountered an internal error or misconfiguration and was unable to complete your request.

Any ideas why I got it?

Where should I put it: in domain/web or in domain/?
pry
Note, this is either an Apache or a Linux problem, because I am running an Ensim box. I tend to think it is a Linux problem because I did not have this problem on my old FreeBSD box.

Actually, I came across this thread while looking for something else, noticed the error, and thought I would mention it and also add more clients to the file snatcher list. I really hate those things...

I spent an entire day trying to figure out what was wrong before I stumbled upon the cause. mod_rewrite is very unforgiving.

In Rick's example you get the following condition because of the "RewriteRule ^.*$ /wait.html [L]":

6:15am up 16 min, 1 user, load average: 1.02, 0.68, 0.31
73 processes: 70 sleeping, 3 running, 0 zombie, 0 stopped
CPU states: 99.2% user, 0.7% system, 0.0% nice, 0.0% idle
Mem: 1012160K av, 227808K used, 784352K free, 1352K shrd, 3936K buff
Swap: 530104K av, 0K used, 530104K free 58988K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
  940 apache    18   0  118M 118M  6684 R    98.7 11.9   4:52 httpd
 1446 root      11   0  1084 1084   840 R     0.5  0.1   0:02 top

PID 940 is hung, using 118 MB of memory and 98.7% of the CPU.

If you check netstat -p -o you will see the process locked in a terminal "Close_Wait" state. The -p option shows the PID and the -o option shows the time the process has been in this state.

You can kill -9 the process or restart Apache.

"My fix" is the same configuration Michael posted. It seems to work without any adverse reactions.

Michael, thanks for the image hot link rules, that is exactly what I was looking for in the first place.

Griffith, make sure your .htaccess file has

Options +FollowSymlinks

as the first item listed. That should cure your problem.

Paul
Ric
Yes! That is a far better solution for blocking direct links, thanks! Don't forget the https:// entries for the approved list if you're doing anything with SSL on your site.

I edited the first post so people won't grab it without reading the rest of the thread, a combined revised list follows.


#Get rid of file sucking utilities
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_User_Agent} ^HTTrack.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebStripper.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebCapture.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Scooter-W3.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^WebCopier.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^FlashGe.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Webdupe.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^NetAnts.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Pockey.* [NC,OR]
RewriteCond %{HTTP_User_Agent} ^Disco.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GoZilla.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^wget.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SiteSnagger [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ImageGrab.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [NC]
RewriteRule .* wait.html [L]
#End


I had to shorten...
RewriteCond %{HTTP_User_Agent} ^Disco Pump.* [NC,OR]
to...
RewriteCond %{HTTP_User_Agent} ^Disco.* [NC,OR]

and I had to remove...
RewriteCond %{HTTP_User_Agent} ^Internet Ninja.* [NC,OR]

because spaces are not allowed, and shortening it to just "Internet" is probably not a good idea :)

Rick
pry
Options +FollowSymlinks

ErrorDocument 404 /error404.html


### Prevent "hot linking" of image files from other sites
RewriteEngine on
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.xxx/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.xxx/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG)$ - [F,L]

### Prevent "File Snatching Agents"
RewriteEngine on
RewriteBase /
#RewriteConditions
RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Teleport.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Offline.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^GetRight.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^GoZilla.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^wget.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^WebCapture.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Scooter-W3.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^FlashGe.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Webdupe.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^Pockey.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^DiscoPump.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^InternetNinja.* [NC]
RewriteRule .* wait.html [L]

This is off a running virtual domain on my box, so everything works.

Notes:
I only use a custom error 404 page; the others are Apache generated. Rename it to whatever your custom 404 page is.

On the image hot link, just put in your domain and IP where indicated. (This works great, thanks Michael.) I tested it on a few sites that were lifting my images and all that displays now is a broken image link icon... Sweet...

On the file snatchers, I reposted it here because it is case sensitive: {HTTP_User_Agent} will not work, {HTTP_USER_AGENT} will.
The last 2 lines are subjective...
I won't know for sure if they will work in this context until they get hit.

I'll go over my log reports from the old server and see if there are more to add to this list.

Paul
mmoncur
pry, you need [NC,OR] on all of the lines in your file-snatching section except the last one. Otherwise it defaults to "AND" and nothing will match.

(Don't use OR in the hotlink prevention code - that one uses ! (NOT) for each URL so AND is correct.)
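
To make the chaining concrete, here is a minimal sketch with three of the agents named in this thread. Conditions are ANDed unless a line carries OR, so each line except the last gets [NC,OR] and the last gets plain [NC]. Leaving OR on the final condition is also worth avoiding: as far as can be told from mod_rewrite's behaviour, the rule may then fire for requests that matched none of the agents.

```apache
# Sketch: block a short list of download agents with a 403 response.
RewriteEngine on
RewriteCond %{HTTP_USER_AGENT} ^JetCar   [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet [NC,OR]
# Last condition: no OR, because nothing follows it
RewriteCond %{HTTP_USER_AGENT} ^Teleport [NC]
RewriteRule .* - [F,L]
```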

Oh, with the hotlink code you can put in an image URL (using the same format as the RewriteRule in the file snatching section) instead of "-" and replace the image - it's fun to create a huge "stolen image" graphic and watch it appear on other people's sites when they've inlined your image. I don't bother since it's a waste of bandwidth, though.
Ric
NOTE:

You can't use the OR arg with Jetcar, it must remain...

RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [NC]

I don't know why, it is the only one that produces an error.

Another difference in using mod rewrite under Solaris... HTTP_User_Agent works as well as HTTP_USER_AGENT or at least it does not give an error and I see none of the clients specified that way in the log.

I have tried leaving out the spaces in the past, by the way, and it won't work. Shortening it to just Disco will, though, and it won't block any legitimate clients.
Ric
Ok... slightly different syntax for blocking hot links and serving a thief.gif instead...

RewriteRule .*\.(gif|GIF|jpg|JPG)$ http://www.yourdomain.xxx/thief.gif [R,L]

Have to add the same...

RewriteBase /

as with the robot block.
mmoncur
I think the trouble using OR with JetCar is just that it's the last one in your list (so there's nothing to OR with)... I always leave it off the last one.

Oh, if you have names that include spaces I think you can put in a period (meaning any character) instead of the space without any trouble.
Ric
No, it is the first one on my list and I don't delete the OR on the last entry either.

The OR will cause an error in the jetcar client line no matter where you put it.

I will try the period, I looked for a single char wildcard awhile back and couldn't find it.
mmoncur
Really? That's odd. I haven't tried adding JetCar to my list yet.
Griffith
"I tried to add the "ErrorDocument" code for a domain at my raq server... I got this error:

The server encountered an internal error or misconfiguration and was unable to complete your request.

Any ideas why I got it?

Where should I put it? domain/web or in domain/"

Why does this happen?
Griffith
Got it working :D

Had to edit my access.conf file and allow it :P
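
For anyone hitting the same 500 error: the post does not say which setting was changed, so the following is only a guess at the kind of edit involved. Apache ignores or rejects .htaccess directives unless AllowOverride permits them for that directory. The directory path below is a placeholder, not a real RaQ path.

```apache
# Hypothetical access.conf / httpd.conf fragment; adjust the path to your layout.
<Directory "/path/to/domain/web">
    # FileInfo covers ErrorDocument and the mod_rewrite directives,
    # Indexes covers DirectoryIndex, Limit covers Deny/Allow,
    # Options covers the Options +FollowSymlinks line.
    AllowOverride FileInfo Indexes Limit Options
</Directory>
```

Changes to the main server configuration only take effect after Apache is restarted.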
4web-space
OK, while you guys were busy confusing each other, can we have all this amalgamated into one post? Also, could you give instructions on where the .htaccess file should go and how to enable .htaccess? The wait.html part of the thread completely lost me. What is the wait.html file?

If you would post this as a HowTo as well, I'm sure myself and others would much appreciate it.

Thanks

Robbie
4web-space
lol, sorry if that sounded a bit wrong, it wasn't supposed to.

I know what an .htaccess file is and what it does.

Some of the things you were discussing were a little new to me, mostly the agent blocking and image linking. What was causing me problems was that one minute someone would say "do it like this" and then someone else would say "no, that's wrong, do it this way".

I think this is a great resource you guys have produced for .htaccess. I just think that if you were to consolidate it into one post and stick it in the HowTo forum, it would be very valuable to the masses at rackshack!

Thanks again

Robbie
Ric
Sorry I jumped to conclusions, you have to admit it sounded pretty sarcastic but I have been guilty of the same thing at times.

There are so many things you can do with an .htaccess file that it really has to be customised for each person's needs. In fact, we have a number of different ones that we use on different domains. This is the standard one we use on most of them, though...



To implement this just copy the code between the snip lines, save it as a plain text file and upload it (using ASCII upload mode) to the web root directory (httpdocs, public_html, etc.) of your site naming it .htaccess

#SNIP BELOW - OUR BASIC .HTACCESS FILE - START

###SERVE CUSTOM ERROR DOCS - BEGIN
Options +FollowSymlinks
ErrorDocument 401 /401.html
ErrorDocument 403 /403.html
ErrorDocument 404 /404.html
ErrorDocument 500 /500.html
###SERVE CUSTOM ERROR DOCS - END

###STOP CASUAL DIR BROWSING - BEGIN
DirectoryIndex index.html index.shtml index.shtm index.php index.cgi /403.html


deny from all

###STOP CASUAL DIR BROWSING - END

###STOP HOT LINKERS - BEGIN
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://www.yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG)$ - [F,L]
###STOP HOT LINKERS - END

###STOP ROBOT DOWNLOADERS - BEGIN
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_USER_AGENT} ^JetCar.* [NC]
RewriteCond %{HTTP_USER_AGENT} ^FlashGet.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Teleport.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebZIP.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Offline.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetRight.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Go!Zilla.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GoZilla.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebReaper.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ia_archiver.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^wget.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^HTTrack.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebStripper.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCapture.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter-W3.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^WebCopier.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^FlashGe.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Webdupe.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NetAnts.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Pockey.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^DiscoPump.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^InternetNinja.* [NC,OR]
RewriteRule .* - [F,L]
###STOP ROBOT DOWNLOADERS - END

###BASIC DENY LIST START
Deny from .id
Deny from .interpacket.net
Deny from .lt
Deny from .mk
Deny from .my
Deny from .ro
Deny from .yu
Deny from 139.92
Deny from 152.158
Deny from 161.142
Deny from 194.102.130
Deny from 194.165
Deny from 202.134
Deny from 202.145
Deny from 202.146
Deny from 202.147
Deny from 202.148
Deny from 202.149
Deny from 202.150
Deny from 202.151
Deny from 202.152
Deny from 202.153
Deny from 202.154
Deny from 202.155
Deny from 202.156
Deny from 202.157
Deny from 202.158
Deny from 202.159
Deny from 202.160
Deny from 202.162
Deny from 202.164
Deny from 202.168
Deny from 202.171
Deny from 202.178
Deny from 202.180
Deny from 202.183
Deny from 202.184
Deny from 202.185
Deny from 202.186
Deny from 202.187
Deny from 202.188
Deny from 202.189
Deny from 202.190
Deny from 202.4
Deny from 202.46
Deny from 202.47
Deny from 202.57
Deny from 202.58
Deny from 202.93
Deny from 202.95
Deny from 207.192.198
Deny from 210.14
Deny from 210.16
Deny from 210.186
Deny from 210.19
Deny from 210.56
Deny from 212.138
Deny from 212.19
Deny from 212.50
Deny from 212.59
Deny from 213.169
Deny from 213.240
Deny from 216.3.242.10
Deny from 217.9
Deny from 62.220.194
Deny from 64.110
Deny from 64.49
Deny from 61.5
Deny from 203.106
Deny from 203.130.254
Deny from 208.210.48
Deny from 208.210.49
Deny from 208.210.50
Deny from 208.210.51
Deny from 211.104
Deny from 211.105
Deny from 211.106
Deny from 211.107
Deny from 211.108
Deny from 211.109
Deny from 211.110
Deny from 211.111
Deny from 211.112
Deny from 211.113
Deny from 211.114
Deny from 211.115
Deny from 211.116
Deny from 211.117
Deny from 211.118
Deny from 211.119
Deny from 213.137
Deny from 207.115.179
###BASIC DENY LIST END

#SNIP ABOVE - OUR BASIC .HTACCESS FILE - END
Ric
Now remember that although that works fine on our Sun server under Solaris, apparently running under Linux, at least with the Red Hat setup used here, you have to add this as the first line for your custom error docs to work...

Options +FollowSymlinks

Additionally, don't forget to edit the ...

### Prevent "hot linking" of image files from other sites
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://www.yourdomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteRule .*\.(gif|GIF|jpg|JPG)$ - [F,L]

part to reflect your own domain name where it says yourdomain.com and your IP where it says xxx.xxx.xxx.xxx

Rick
4web-space
Just to clarify:

RewriteRule .* wait.html [L]

Are we required to create the wait.html file, and where should it be placed?

Thanks

Robbie

PS Does .htaccess need enabling in httpd.conf, or is it enabled by default in psa2.5? I have the code for doing this, I just want to check.
mmoncur
You should (MUST) create the wait.html file. It should be in the root directory for a domain if you used "RewriteBase /". (i.e. in the /usr/local/psa/home/vhosts/domain.name/httpdocs directory).

If the file doesn't exist, mod_rewrite tends to let you know by eating all of the server's memory and eventually crashing it. One of those things that make you wish for a good old-fashioned Windows Blue Screen of Death. B)

.htaccess is enabled by default for all domains in Plesk, at least it is in version 2.0. I haven't bothered with the upgrade to 2.5 yet.
4web-space
Is the wait.html file called when someone tries to download with one of the download agents?

So basically it would have a "Bugger off" message on it, lol? Or not?

How would you go about stopping your clients from deleting these and crashing the server?

Robbie
Ric
That is a new one on me, I have never used a wait.html file. We determined that Unix or at least our Sun under Solaris handles that switch a little differently than Linux though.

Just to be safe I think I will add a wait.html in the root directory. I am just going to copy our standard 403.html error doc to use for it.

Rick
mmoncur
Well, if you're using a rule like this:

RewriteRule .* wait.html [L]

...then you should definitely create the file (wait.html in this case). Otherwise you can just use "-" for the filename and change the [L] to a [F,L] - this will give them your standard 403 error response and you don't need to bother creating a file.

I redirect to a file that explains why they were redirected and includes a link to contact me if they're a legitimate user. This hasn't happened yet, but I like to be friendly just in case.
4web-space
That sounds like a better solution for now, because if you had to have an extra wait.html file in the client's httpdocs you are bound to get a client who deletes it and crashes your system. If someone wanted to use the wait method, could you use chown or something similar on the file to stop them deleting it?

Or would it be a case of dumping something in a cron job to check that the file exists and otherwise rewrite it?

Hope this doesn't sound too foolish lol

Robbie
Ric
You're right, it is all too likely that I will forget to upload a wait.html myself. I edited that line of my post to read

RewriteRule .* - [F,L]

to prevent anyone from accidentally doing the same.
aussie
QUOTE
Originally posted by Ric
The deny list locks out the entire country of Indonesia due to the fact 99.5% of our credit card fraud attempts originated from there. Even if the domain is not an e-commerce site we still deny access to this list, as long as CC fraud is allowed to run rampant in that country, they don't deserve access.

Rick


I totally agree with you after the events on our system over the past few days. Thanks for this list. I am doing the same.
Ric
I don't remember if it was mentioned in this thread or not and you might know this already but it is important to redirect 403 errors to another domain without a deny list on it.

A 403 >404 >403 loop can knock a server down. We redirect to a domain with little on it and specify a different name for the multi site 403 than the standard 403.html on that server...

ErrorDocument 401 /401.html
ErrorDocument 403 http://www.domain.com/401_multi_site.html
ErrorDocument 404 /404.html
ErrorDocument 500 /500.html
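
A different way to avoid the loop, sketched here as an assumption and not something used in this thread: keep the error page local but exempt it from the deny list, so that a blocked visitor can still fetch 403.html. This uses the Order/Allow/Deny syntax of the Apache versions discussed here.

```apache
ErrorDocument 403 /403.html

# Sketch: let everyone fetch the error page itself, even blocked hosts.
<Files "403.html">
    Order allow,deny
    Allow from all
</Files>

# ...deny list continues here, for example:
Deny from 203.106
```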

Rick
madsere
I've just moved my Ensim Howto/FAQ pages from www.webscorpion.com/ensim to ensim.webscorpion.com. As a temporary measure I've added a .htaccess file to www.webscorpion.com/ensim that does a simple redirect to ensim.webscorpion.com. The .htaccess file looks like this:

ErrorDocument 401 http://ensim.webscorpion.com/
ErrorDocument 403 http://ensim.webscorpion.com/
ErrorDocument 404 http://ensim.webscorpion.com/
ErrorDocument 500 http://ensim.webscorpion.com/

I'd prefer a more intelligent approach though. The above cuts off any path/file that comes after and just redirects to the base of the site. Also, it doesn't update the browser's URL field with the new URL.

I've seen a rewritehack somewhere that basically rewrites the url, adds whatever path/file was after the domain name and puts that back to the browser ... though I can't seem to find it anywhere ... Anyone here know the quick & dirty on this?
Ric
RewriteRule ^anypage.html$ http://www.domain.com/path/anypage.html [R]


^ specifies the beginning of the matching string and $ specifies the end of the match.

[R] is the redirect flag, and it should put the new URL into the browser.

Leaving the [R] off would still redirect, but it won't carry the new URL over.

Does that help?

Rick
madsere
Hm, that only helps for anypage.html if I understand right.

I need a hack that will take, say, http://www.webscorpion.com/ensim/[anything] and rewrite that to http://ensim.webscorpion.com/[anything], not just for one specific page.
Ric
Yes, you're correct. I have only used it for a specific page and I can't recall ever seeing a rule for a site-wide redirect, short of the one you already mentioned.

The Apache docs are somewhat ambiguous on this, I usually figure out rules by trial and error if I can't find better docs on a specific rule.

Maybe not a good solution but you could use the rule I mentioned and add a rule for each page you want to redirect. Not very efficient if you have a lot of pages but not bad if you just have a handful.

There very well might be a rewrite statement that does exactly what you want but I don't know how.

As a side note, I have had problems with rewrites to sub-domains. I forget exactly what the deal was but I remember I finally gave up on it.

Rick
madsere
Oh, I have subdomains working. ensim.webscorpion.com is a user-subdomain to www.webscorpion.com (www.webscorpion.com/~ensim)

Too bad about the wildcard redirects. I know it's possible, I've had a hack that did it, just lost it and can't find it again (and too busy right now to do the trial/error thing)
Ric
I think I was attempting to do an http redirect to an existing subdomain on another server when I was not able to make it work, could have been a number of things not related to .htaccess though.

Does Ensim support subdomains out of the box or is it something you did?

We had a client who wanted to use them on our Plesk based server but the Plesk forum replies I received about it said it could not be done.

Rick
madsere
It's "something I did". http://ensim.webscorpion.com :) Ensim only supports plain aliases out of the box.
mmoncur
madsere: I think this will work for your situation. This would be the .htaccess file under www.webscorpion.com:
CODE
RewriteRule ^ensim/(.*) http://ensim.webscorpion.com/$1 [R=301]


The (.*) captures whatever is after /ensim/ and the $1 regurgitates it into the redirect URL.
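
Spelled out a little more fully, as a sketch: the rule only runs if the rewrite engine is switched on in that .htaccess file, and adding L stops further rules from touching the redirected request. The hostnames are the ones from madsere's post; the L flag, the trailing $ and the RewriteEngine line are additions made for the example.

```apache
# In the .htaccess of the www.webscorpion.com document root (sketch)
RewriteEngine on

# /ensim/anything  ->  http://ensim.webscorpion.com/anything
# (.*) captures the rest of the path; $1 inserts it into the target URL
RewriteRule ^ensim/(.*)$ http://ensim.webscorpion.com/$1 [R=301,L]
```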
madsere
I was looking for something like this, but there must be something missing. I just get a "Forbidden. You don't have permission to access /test/ on this server.". Try for yourself: http://www.webscorpion.com/test. In the directory test is just one file, .htaccess owned by the correct site admin and containing one line:
CODE
RewriteRule ^ensim/(.*) http://ensim.webscorpion.com/$1 [R=301]

What am I missing?
mmoncur
I think the rewrite command needs to be in the .htaccess file of the main httpdocs directory for www.webscorpion.com. Otherwise the "ensim" won't be visible in the URL. (in other words, if I access http://www.webscorpion.com/ensim/whatever, the test directory isn't used at all. If I access www.webscorpion.com/test, the rule doesn't do anything because the URL doesn't contain "ensim".)

As to why the test directory gives an error, that's probably because there are no files there. :)
greyboy
Rick,

Just a note about the " " (space), such as in "Internet Ninja.*" You can simply escape it like this:

RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja.* [NC,OR]


Here is some other info that may be of interest to some.

If you need to protect against (stupid) email harvesters, you can use mod_rewrite against them as well:

RewriteCond %{HTTP_USER_AGENT} ^Bullseye.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailCollector.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailSiphon.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EmailWolf.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^ExtractorPro.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^Crescent.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^CherryPicker.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^fastlwspider.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^SurfWalker.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^GetWebPage.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^lwp-trivial.* [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^NICErsPRO.*
# In .htaccess the path is matched without its leading slash, so match on .*
RewriteRule .* - [F,L]


and I'm sure there are many others.

One way to find spiders that are not honoring robots.txt (and are instead using it to find where to go) is to place something like this in robots.txt

Disallow: /someDirThatIsntActuallyUsed

and simply track those that are trying to access it.
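
One possible way to do the tracking, sketched here on the assumption that you can edit the server or virtual host configuration (CustomLog is not allowed in .htaccess) and that mod_setenvif is loaded. The variable name trap_hit and the log path are made up for illustration, and "combined" assumes the usual LogFormat nickname is defined in your config.

```apache
# Flag any request for the decoy directory named in robots.txt
SetEnvIf Request_URI "^/someDirThatIsntActuallyUsed" trap_hit

# Write only the flagged requests to their own log (hypothetical path)
CustomLog logs/robots_trap.log combined env=trap_hit
```

Anything that shows up in that log asked for a path that was only ever mentioned in robots.txt.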

And there are many other uses for mod_rewrite. Just be aware, though, that it can increase server load and CPU usage. Use it when needed, but only as much as you need.

N
Ric
QUOTE
Originally posted by greyboy
Rick,

Just a note about the " " (space), such as in "Internet Ninja.*": you can simply escape it with a backslash, like this:

RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja.* [NC,OR]


Thanks! That was a sticky one, since ^InternetNinja.*, ^Internet_Ninja.* and ^Internet.Ninja.* do not work. I could not find anything in the Apache docs about a single-character wildcard.
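
For reference, a short sketch of two ways the space can be written so that Apache does not treat it as an argument separator. The quoted form relies on Apache's usual handling of double-quoted arguments and is worth testing on your own version.

```apache
RewriteEngine On
# Backslash-escaped space, as suggested above
RewriteCond %{HTTP_USER_AGENT} ^Internet\ Ninja [NC,OR]
# Same pattern with the whole argument in double quotes
RewriteCond %{HTTP_USER_AGENT} "^Internet Ninja" [NC]
RewriteRule .* - [F,L]
```

Left unescaped, the space splits the pattern into two arguments, and Apache will normally reject the line.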

Rick
cscs
have any of you actually gotten the thief image RewriteRule to work? it just loads and loads on the offending site, without ever loading the image (just displays a 'bad image' image). i know the image exists, but it's just not showing.

I have +FollowSymlinks plus this:

CODE
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://domain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.domain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://xxx.xxx.xxx.xxx/.*$ [NC]
RewriteRule .*.(gif|GIF|jpg|JPG|PNG|png)$ http://www.domain.com/banner.gif [F,L]


but it won't do anything.

also, for all the 40?'s, why don't you guys just direct to /index.php ? how many viewers actually want / need to see a 501 / 404 / 403 ? i just drop my users to frontpage.
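
For anyone who wants to try that, a minimal sketch, assuming the front page lives at /index.php as in the post above:

```apache
# Serve the front page instead of the stock error pages
ErrorDocument 403 /index.php
ErrorDocument 404 /index.php
```

With a local path like this, the browser still receives the original error status. A full URL (http://...) would make Apache send a redirect instead.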

great idea this thread! thanks! most of it works for me, even tho i'm on ensim.

oh yeah, wanted to also ask how to do a redirect from domain.com to www.domain.com ? to aid in cookie issues (need to be coming from www.domain not domain.com).
Ric
It works fine for me... of course you need to put the thief.gif on another server.

I hardly ever use that method; it is better to use the other method, which just denies the images instead of serving a thief.gif file.

Rick
cscs
QUOTE
Originally posted by Ric
It works fine for me... of course you need to put the thief.gif on another server.

weird, wonder why it's not working for me?

thief.gif is in http://www.domain.com/thief.gif, but it's still no show. :(
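
One possible explanation, offered as a guess rather than a diagnosis: with thief.gif on the same domain, the redirected request for thief.gif still carries the offending site's referer, so it matches the rule again and never gets served. The [F] flag in the earlier rule also forces a 403, so the substitution URL is never used. A sketch that keeps the image on the same server by excluding it from the rule, with domain.com and thief.gif as placeholders:

```apache
RewriteEngine On
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://(www\.)?domain\.com/ [NC]
# Do not rewrite the replacement image itself, or the redirect loops
RewriteCond %{REQUEST_URI} !^/thief\.gif$
RewriteRule \.(gif|jpg|png)$ http://www.domain.com/thief.gif [NC,R,L]
```

Putting the image on another server, as suggested above, avoids the loop without the extra condition.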

can someone suggest how i might do a:

domain.com -> www.domain.com rewrite?

tried this but no go, what'm i doing wrong?

CODE
RewriteCond ^http://domain.com/.*$ [NC]
RewriteRule http://www.domain.com [R]
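
A common way to do this, sketched with domain.com as a placeholder. RewriteCond needs a test string such as %{HTTP_HOST} before the pattern, and RewriteRule needs a pattern before the target URL; both are missing in the attempt above. The sketch assumes it goes in the .htaccess of the web root.

```apache
RewriteEngine On
# Only act when the request came in on the bare domain
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# Redirect to the same path on the www host
RewriteRule ^(.*)$ http://www.domain.com/$1 [R=301,L]
```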
theguy
where exactly do i need to upload this file?

/home/virtual/domain/var/www/html/ ?

it's not working
siderman
I've tried millions of different ways of inserting the code into my .htaccess and none of it works. Here's my current .htaccess that's working:

CODE
# -FrontPage-

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

<Limit GET POST>
order deny,allow
deny from all
allow from all
</Limit>
<Limit PUT DELETE>
order deny,allow
deny from all
</Limit>
AuthName www.mydomain.com
AuthUserFile /home/virtual/site2/fst/var/www/html/_vti_pvt/service.pwd
AuthGroupFile /home/virtual/site2/fst/var/www/html/_vti_pvt/service.grp


And then when I put in:

CODE
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^http://mydomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://www.mydomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^http://xx.xxx.xxx.xxx/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://mydomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://www.mydomain.com/.*$ [NC]
RewriteCond %{HTTP_REFERER} !^https://xxx.xx.xxx.xxx/.*$ [NC] RewriteRule .*.(gif|GIF|jpg|JPG)$ - [F,L]



I get a 500 internal server error. I have no clue what I'm doing wrong, I followed what you guys were saying but nothing's working :(
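
If the last RewriteCond and the RewriteRule really are on the same line in the file, as they appear in the post above, that alone could explain the 500: Apache expects one directive per line and rejects a RewriteCond with extra arguments. A sketch of the same rules with each directive on its own line, using mydomain.com as in the post and leaving out the IP-address conditions for brevity:

```apache
RewriteEngine on
RewriteBase /
RewriteCond %{HTTP_REFERER} !^$
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?mydomain\.com/ [NC]
# RewriteRule must start on its own line
RewriteRule \.(gif|jpg)$ - [NC,F,L]
```

The server error log usually names the directive it could not parse, which is the quickest way to confirm or rule this out.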
ChingChong
I uploaded some mp3 files to my server.

I set up hotlink protection for them.

but now when i stream them, Winamp and Real One players cannot play them because they are hotlink-protected, can anyone tell me how i can hotlink-protect my MP3's and STILL stream them

P.S. some one earlier, not in this thread, said it has something to do with {HTTP_USER_AGENT}

can anyone help me thanx
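
A sketch of one approach, assuming the protection is the referer-based mod_rewrite rule used earlier in the thread. Media players generally send no Referer header, so the empty-referer condition should already let them through. The User-Agent patterns below (WinampMPEG, RealPlayer, RMA) are assumptions about what those players send and should be checked against your own access log. mydomain.com is a placeholder.

```apache
RewriteEngine On
# Requests with no referer at all (typical for media players) are allowed
RewriteCond %{HTTP_REFERER} !^$
# Requests coming from your own pages are allowed
RewriteCond %{HTTP_REFERER} !^http://(www\.)?mydomain\.com/ [NC]
# Assumed player User-Agent strings; verify them in your logs
RewriteCond %{HTTP_USER_AGENT} !(WinampMPEG|RealPlayer|RMA) [NC]
RewriteRule \.mp3$ - [NC,F,L]
```

All three conditions must hold before the request is refused, so a player that matches either the empty-referer test or the User-Agent test still gets the file.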
Ric
QUOTE
Originally posted by siderman
I've tried millions of different ways of inserting the code into my .htaccess and none of it works. Here's my current .htaccess that's working:

CODE
# -FrontPage-


FrontPage uses the .htaccess file for its own stuff, and I doubt you can integrate standard Apache statements without screwing it up.

I am pretty sure FrontPage only uses the .htaccess in the web root, though, so you can use a regular .htaccess file in any sub-directory.

Rick
ChingChong
QUOTE
Originally posted by ChingChong
I uploaded some mp3 files to my server.

I set up hotlink protection for them.

but now when i stream them, Winamp and Real One players cannot play them because they are hotlink-protected, can anyone tell me how i can hotlink-protect my MP3's and STILL stream them

P.S. some one earlier, not in this thread, said it has something to do with {HTTP_USER_AGENT}

can anyone help me thanx


can ANYONE help me? it would be really great, thanx guys