Full Version: HOW-TO: PRM (Process Resource Monitor)
anand
Does anyone have an answer to my problem?
Erwin
I can't believe I've just installed this. It works WONDERFULLY! Should have done this a long time ago.

Thanks, rfxn, for another wonderful free resource.

It works great. I've not ignored httpd processes for now - if an httpd decides to go rogue, it gets taken down. So far, server loads are low.

Of course, it's not peak time yet - I'll wait and see how it performs in another few hours.
Erwin
Just a tip:

In this directory:

/usr/local/prm/rules

There are special rules for httpd and mysql you need to edit. :)
BoiTaiTui
Will this work on RHEL with Ensim 3.7?
Goliath
I'm on RHEL w/ Ensim Pro 3.7. Seems to run fine.
z3roon3
Thanks for sharing.

However, this is killing my whole httpd, not just the PID that exceeded the limits:

ensim prm(22772): process 20266 exceeded resource limits, killed.

ensim prm(22772): check /usr/local/prm/killed/20266 for process specific information.

ensim prm(22772): get_pinfo() value asignment error; aborting.

By that point Apache is already down (locked).
Any ideas?
My system is RH 7.2.
Goliath
My httpd was initially shut down for having too many processes. It's best to set the limits for critical services to levels at which they cannot be shut down. This is done in the rules dir, IIRC. I'm not sure what hooks this program specifically uses, but it may kill offending processes with something similar to:

killall httpd

Instead of:

kill 12345

Hope that makes more sense to you.
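
To make the distinction above concrete, here is a minimal sketch that uses two throwaway `sleep` processes as stand-ins for httpd children. It is not taken from PRM's source, and whether PRM kills by name or by PID is not confirmed anywhere in this thread; the variable names are made up for the illustration.

```shell
#!/bin/bash
# Sketch only: two background 'sleep' processes stand in for httpd children.
# Nothing here is taken from PRM's source.

sleep 300 &
pid_a=$!
sleep 300 &
pid_b=$!

# Signal exactly one process by PID; its sibling is left alone.
kill -9 "$pid_a"
wait "$pid_a" 2>/dev/null || true

# kill -0 sends no signal; it only reports whether the PID still exists.
if kill -0 "$pid_b" 2>/dev/null; then
    survivor="running"
else
    survivor="gone"
fi
echo "sibling $pid_b is $survivor"

# By contrast, 'killall sleep' would signal every process named sleep at once,
# which is the equivalent of taking down all of httpd.

# Clean up the remaining stand-in.
kill "$pid_b"
wait "$pid_b" 2>/dev/null || true
```

Killing by PID removes only the offender, while killing by name removes every process that shares the command name, including healthy ones.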
z3roon3
Thanks :)

This helped me a lot.

Now it works like a charm.
rkenney
I've got "detected multiple prm processes; aborting." in my log file.

I followed the instructions. What does this mean?

Russ
rfxn
Make sure you're running the current version of PRM: http://www.r-fx.org/prm.php
Savage1
I've installed 0.5, and I have 2 php processes that need to be killed. No matter what I set the conf.prm values or the rules/php values to, I can't get PRM to kill ANYTHING.

This is what I want to kill, as listed in top.


PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
26878 admin32 25 0 15240 14M 3872 R 37.2 3.0 27:42 php
28263 admin32 25 0 15232 14M 3872 R 36.3 3.0 20:49 php

This is what I set in /usr/local/prm/rules/php just to try and make it kill ANYTHING php.


# seconds to wait before rechecking a flaged pid (pid's noted resource
# intensive but not yet killed).
WAIT="5"

# counter limit that a process must reach prior to kill. the counter value
# increases for a process flaged resource intensive on rechecks.
KILL_TRIG="2"

# argument to pass onto kill commands
KARG="9"

# Max CPU usage readout for a process - % of all cpu resources (decimal values unsupporte$
MAXCPU="5"

# Max MEM usage readout for a process - % of system total memory (decimal values unsuppor$
MAXMEM="1"

# Max processes for a given command - this is not max processes for user but rather the e$
MAXPS="30"

Obviously these settings are overkill, but even with them those PHP processes kept on going. I can't get PRM to kill anything.

Help?

-Sav
jrap
Could someone post their edited rules/httpd and rules/mysql files? I would like to take 'httpd' and 'mysqld' out of my ignore file, but the default settings are a bit too low I think.
z3roon3
oops sorry double post
z3roon3
QUOTE
Originally posted by Savage1
I've installed 0.5, and I have 2 php processes that need to be killed. No matter what I set the conf.prm values or the rules/php values to, I can't get PRM to kill ANYTHING.

This is what I want to kill, as listed in top.


-Sav

Maybe the server isn't reaching the load average threshold?
z3roon3
QUOTE
Originally posted by jrap
Could someone post their edited rules/httpd and rules/mysql files?  I would like to take 'httpd' and 'mysqld' out of my ignore file, but the default settings are a bit too low I think.

The defaults are not too low. Give it a shot.
z3roon3
I've been using this neat PRM for a month, but sometimes (approximately 4-5 times a week) it kills Apache. Statistically it works great. I have a quite busy server and this helps me a lot. I would appreciate an automatic restart. :)
jeroman
For automatic restarts of, for example, Apache, you can use SIM.
Same place: rfx network.

BUT if you use SSL certs on any sites on the server, be sure to
set the values for httpd really high in PRM.
Otherwise it will kill those processes and then Apache will freeze or fail.

I had a huge issue with this, where Apache failed 20-30 times per day. I installed SIM so that it restarts automatically.
After several days/weeks of investigating by me, Google and the cPanel team, we couldn't find anything.
When I adjusted PRM NOT to kill the httpd processes, the issue was resolved.

I couldn't have known this at first, because the Apache failure issue was there before I installed PRM. After reinstalling OpenSSL and recompiling Apache, the issue slowed down but was not resolved.
So instead of helping, PRM actually made it worse and was still bugging the system after the real fix was made.
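
As an illustration of what "really high" values might look like, here is a sketch of a rules/httpd file. The variable names are the ones shown in the rules/php file posted earlier in this thread; that rules/httpd uses the same names is an assumption, and every number is a placeholder, not a PRM default or a recommendation.

```shell
# Illustrative /usr/local/prm/rules/httpd with raised limits.
# Variable names come from the rules/php file posted earlier in the thread;
# every number below is a placeholder assumption, not a PRM default.

# seconds to wait before rechecking a flagged pid
WAIT="10"

# counter value a flagged process must reach before it is killed
KILL_TRIG="5"

# argument to pass on to kill commands
KARG="9"

# max CPU usage for a single process, in percent
MAXCPU="90"

# max memory usage for a single process, in percent
MAXMEM="90"

# max processes for the command
MAXPS="500"
```

Suitable values depend on the server, so treat these as a starting point to adjust while watching prm_log.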
Savage1
QUOTE
Originally posted by z3roon3
Maybe the server isn't reaching the load average threshold?


The server was pegged at 99% CPU usage overall because of these processes. A bunch of them would pop up, and I had to manually check for them and kill them when they did. A real pain in the a**.

-Sav
Goodspeed
How can I do a clean uninstall of PRM?
Ronny
Just delete the directory /usr/local/prm, then delete the cron job in /etc/cron.d.
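
A cautious way to follow those two steps is to list what would be removed before deleting anything. The sketch below is a dry run only. `list_prm_files` is a hypothetical helper, and because the name of the cron file is not given in this thread, it is found by searching /etc/cron.d for entries that mention the install directory.

```shell
#!/bin/bash
# Sketch of a dry run for the removal steps above. list_prm_files is a
# hypothetical helper; the cron file name is not given in the thread, so it is
# matched by content rather than assumed.

# list_prm_files INSTALL_DIR CRON_DIR
# Prints the paths that would be removed, without deleting anything.
list_prm_files() {
    local install_dir="$1" cron_dir="$2"
    [ -d "$install_dir" ] && echo "$install_dir"
    # Any file directly inside CRON_DIR whose contents mention INSTALL_DIR.
    grep -ls -- "$install_dir" "$cron_dir"/* 2>/dev/null
    return 0
}

list_prm_files /usr/local/prm /etc/cron.d
```

After reviewing the output, the listed paths can be removed by hand as root.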
jameshsi
Do we need to clean the log from time to time?
Is there a script that will do that so we don't have to worry about it?
jameshsi
This is what I got last night:

- Event Summary:
USER: dyncs
PID : 928
CMD : /usr/bin/spamd
CPU%: 0 (limit: 40)
MEM%: 1 (limit: 20)
PROCS: 76 (limit: 25)

- Event Summary:
USER: mixk
PID : 19045
CMD : /usr/bin/spamd
CPU%: 0 (limit: 40)
MEM%: 1 (limit: 20)
PROCS: 64 (limit: 25)

I got quite a few of these emails within one hour, and not for the same user. What does that mean, and what should I do about it?
oziris
I installed PRM version 5. I've been watching top for a while and see that it is not killing the spamd process. What did I do wrong?

15:37:49 up 4 days, 48 min, 1 user, load average: 5.90, 4.18, 2.96
333 processes: 315 sleeping, 5 running, 2 zombie, 11 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 93.8% 0.0% 6.1% 0.0% 0.0% 0.0% 0.0%
Mem: 1031000k av, 988292k used, 42708k free, 0k shrd, 122448k buff
381788k active, 491160k inactive
Swap: 1052248k av, 36828k used, 1015420k free 188252k cached

PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
28509 root 16 0 21120 20M 12132 S 44.2 2.0 7:21 0 spamd
2038 root 16 0 20460 18M 18936 S 10.3 1.8 9:28 0 httpd13
22685 root 17 0 1572 1572 928 R 3.0 0.1 1:51 0 top
rfxn
It is obviously doing its job if you have 76 spamd processes spawning for a single user. :)

However, I'm not exactly sure what to do with regard to spamd; I have the same issue of hundreds of spamd processes spawning. But without PRM, spamd can take out the whole server (it forks too many processes or uses too much memory).

QUOTE
Originally posted by jameshsi
This is what I got last night:

- Event Summary:
USER: dyncs
PID : 928
CMD : /usr/bin/spamd
CPU%: 0 (limit: 40)
MEM%: 1 (limit: 20)
PROCS: 76 (limit: 25)

- Event Summary:
USER: mixk
PID : 19045
CMD : /usr/bin/spamd
CPU%: 0 (limit: 40)
MEM%: 1 (limit: 20)
PROCS: 64 (limit: 25)

I got quite a few of these emails within one hour, and not for the same user. What does that mean, and what should I do about it?
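
To see where a PROCS figure like the ones in these event summaries could come from, here is a minimal sketch that counts processes per user and command and flags the pairs over a limit. `count_over_limit` is a hypothetical helper, not part of PRM, and how PRM itself derives PROCS is not shown in this thread.

```shell
#!/bin/bash
# Sketch: count processes per (user, command) pair and flag pairs over a limit.
# count_over_limit is a hypothetical helper, not part of PRM.

# Reads "USER COMMAND" lines on stdin; prints "COUNT USER COMMAND" for every
# pair whose count exceeds the limit given as $1.
count_over_limit() {
    local limit="$1"
    sort | uniq -c | awk -v limit="$limit" '$1 > limit { print $1, $2, $3 }'
}

# Canned input so the example is repeatable: 3 spamd for dyncs, 1 for mixk.
sample="dyncs spamd
dyncs spamd
mixk spamd
dyncs spamd"

printf '%s\n' "$sample" | count_over_limit 2    # prints "3 dyncs spamd"

# Against a live system (procps ps), the same filter could be fed with:
# ps -eo user=,comm= | count_over_limit 25
```

In the emails above, CPU% and MEM% are well under their limits, so it is the process count (76 and 64 against a limit of 25) that triggered the kill.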
oziris
What about mine? I don't think it even tried to kill it. Do I have to take some process out of the default ignore file for it to be able to kill spamd?

Thanks,
Predrag
omnibus
Sorry for the question...
I've verified that user "root" is in the "ignore" file. Does this mean that I can add a rule named after a particular user (e.g. admin45), so that I can limit that user's processes?
Thank you.
omnibus
Looking at my prm_log file, I see only this:

Sep 21 23:00:02 he4 prm(22059): cleared stale lock file file.
Sep 21 23:00:19 he4 prm(22059): get_pinfo() value asignment error; aborting.
Sep 21 23:04:00 he4 prm(25430): cleared stale lock file file.
Sep 21 23:04:15 he4 prm(25430): get_pinfo() value asignment error; aborting.
Sep 21 23:08:00 he4 prm(27989): cleared stale lock file file.
Sep 21 23:08:15 he4 prm(27989): get_pinfo() value asignment error; aborting.
Sep 21 23:12:00 he4 prm(30296): cleared stale lock file file.

What is this about? Is it a problem?
Thank you.
JLChafardet
This question may be hella dumb, but I need to ask it.

Is this normal?

Nov 04 00:16:01 hve01 prm(13455): system load (0) below check requirment; aborting.
Nov 04 00:20:00 hve01 prm(13607): system load (0) below check requirment; aborting.
Nov 04 00:24:00 hve01 prm(13701): system load (0) below check requirment; aborting.
Nov 04 00:28:00 hve01 prm(13906): system load (0) below check requirment; aborting.

????
koolnyze
QUOTE
Is this normal?


Yes this is normal.
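
The log lines suggest that PRM compares the system load with a minimum before scanning any processes, and simply exits when the load is below it. The sketch below shows that kind of gate. `load_gate` and its threshold argument are hypothetical, and comparing only the integer part of the load is an assumption based on the "(0)" in the log.

```shell
#!/bin/bash
# Sketch of a load gate; load_gate and its threshold argument are hypothetical
# names, not PRM's actual code or variables.

# load_gate LOADAVG MIN_LOAD
# Prints "check" when the integer part of the load average has reached the
# minimum, otherwise "skip".
load_gate() {
    local load_int="${1%%.*}"   # "0.73" -> "0", "5.90" -> "5"
    local min_load="$2"
    if [ "$load_int" -lt "$min_load" ]; then
        echo "skip"
    else
        echo "check"
    fi
}

load_gate 0.73 1    # prints "skip"
load_gate 5.90 1    # prints "check"

# On Linux the live 1-minute value is the first field of /proc/loadavg:
# load_gate "$(cut -d' ' -f1 /proc/loadavg)" 1
```

Under this reading, an "aborting" line on an idle server only means there was nothing worth checking on that run.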
naveen3
QUOTE
I've been using this neat PRM for a month, but sometimes (approximately 4-5 times a week) it kills Apache. Statistically it works great. I have a quite busy server and this helps me a lot. I would appreciate an automatic restart.


Hello friends,
my problem is with Tomcat.
PRM kills a Tomcat thread and all JSP pages show "500 internal server error".

I tried to modify the script so that it restarts Tomcat after killing the Tomcat thread:

if [ "$FCNT" -gt "$KILL_TRIG" ]; then
    eout "process $pid exceeded resource limits, killed."
    eout "check $INSPATH/killed/$pid for process specific information."
    # my code start
    if [ "$cmd" = "/usr/java/j2sdk1.4.0/bin/java" ]; then
        eout "Restarting tomcat"
        cd / ; /etc/rc.d/init.d/tomcat4 restart >> /dev/null 2>&1
    fi
    # my code ends
    cat > $KSP/$pid <
But it does not restart Tomcat, and the 500 error still appears.

Any clue?
dennys
I have this script running, but when a process goes over the threshold, I don't see anything logged. It's almost as if the script is not running. If I kill the process manually, then I start to get the regular log events in prm_log:
CODE
Jan 06 12:20:00 rhes prm(4380): system load (0) below check requirment; aborting.
Jan 06 12:28:00 rhes prm(7142): system load (0) below check requirment; aborting.


Does anyone have any ideas?

Edit: I verified from the cron log that the script is being called. It's just not reporting or doing anything. :(
JLChafardet
QUOTE
JLChafardet
This question may be hella dumb, but I need to ask it.

Is this normal?

Nov 04 00:16:01 hve01 prm(13455): system load (0) below check requirment; aborting.
Nov 04 00:20:00 hve01 prm(13607): system load (0) below check requirment; aborting.
Nov 04 00:24:00 hve01 prm(13701): system load (0) below check requirment; aborting.
Nov 04 00:28:00 hve01 prm(13906): system load (0) below check requirment; aborting.

????


I asked the same thing long ago, and it is normal. :)

regards,
MugenSi00
great thread
[over]
Hello, when I start prm from the shell with the -s option I see this message, but I can't find what the problem is. Can someone help me, please?

Thanks, and sorry, my English is very bad.

usage: /usr/local/sbin/prm [-s] [-q] [-j]
root@babylon [~/prm-0.5]# /usr/local/sbin/prm -s
PRM version 0.5
Copyright © 1999-2003, R-fx Networks
Copyright © 2003, Ryan MacDonald
This program may be freely redistributed under the terms of the GNU GPL

Jul 06 03:02:17 babylon prm(3756): system load (0) below check requirment; aborting.
Pavo
Sick.
merlinpa1969
Hello, I have been getting this A LOT in the last few days:

This is an automated status warning from lancelot.camelot-hosting.com. The process (15612) has exceeded defined resource limits, as such a kill signal was invoked from the process resource monitor.

- Event Summary:
USER: nobody
PID : 15612
CMD : /usr/local/apache/bin/httpd
CPU%: 3 (limit: 65)
MEM%: 75 (limit: 25)
PROCS: 65 (limit: 150)

When it kills them, the server goes completely down (for 2 hrs in this last instance).

How can I trace this to see who the user is?
solokron
PHP Suexec or suPHP combo'd with Apache Status and CPU/Memory/MySQL Usage works great.
merlinpa1969
Any idea why I would get this
for CPU/Memory/MySQL Usage?

Note: These figures are averages since 0000 hours today.
Note: This script will not able to track cgi cpu/memory usage if you do not have suexec installed.
Note: Percentages are based on one cpu. If you have 2 cpus divide the number in half to get the percentage of all cpu power used
User Domain %CPU %MEM Mysql Processes

That's IT; there is nothing else on the page.
mysticeti
QUOTE (naveen3)
Hello friends,
my problem is with Tomcat.
PRM kills a Tomcat thread and all JSP pages show "500 internal server error".

I tried to modify the script so that it restarts Tomcat after killing the Tomcat thread:

if [ "$FCNT" -gt "$KILL_TRIG" ]; then
               eout "process $pid exceeded resource limits, killed."
               eout "check $INSPATH/killed/$pid for process specific information."
#my code  start              
if [ "$cmd" = "/usr/java/j2sdk1.4.0/bin/java" ]; then
                       eout "Restarting tomcat"
                       cd / ; /etc/rc.d/init.d/tomcat4 restart >> /dev/null 2>&1
               fi
#my code ends
               cat > $KSP/$pid <
But it does not restart Tomcat, and the 500 error still appears.

Any clue?


Two years too late, but this worked for me:

CODE
               if [ "$user" == "tomcat4" ]; then
                   eout "Starting tomcat4."
                   /sbin/service tomcat4 start >> $LOG 2>&1
               fi


I added that code right after the USR_ALERT mail.
Keith Read
I see that this thread is a couple of years old, but is PRM still functional? Has the issue of it shutting down Apache been resolved? We have a couple of accounts that are continuously overloading the server, and we are manually suspending the accounts for 10 minutes when the server load gets above 8.00. We are considering PRM but are concerned that it will shut down Apache.

If there is another solution, we would like to know about it.

Thanks,

Keith