- Alexander Leidinger - https://www.leidinger.net/blog -

Cheap process monitoring (no additional software required)

I have an old system (only the hardware is old, it runs -current) which reboots itself from time to time (mostly during the daily periodic(8) [1] run, but also during heavy compiling (portupgrade)). There is no obvious reason (no panic) for this. It could be a hardware defect, or something else. The machine is not important enough for me to put much effort into analyzing the problem. The annoying part is that sometimes apache does not start after a restart. When this happens, the solution is to log in and start the webserver. If the webserver started each time, nearly nobody would notice the reboot (root gets an email on each reboot via an @reboot crontab entry).
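A minimal sketch of such an @reboot notification, assuming it lives in root's crontab and that cron mails command output to the crontab owner as usual. The message text is made up for illustration, it is not the exact entry from this machine:

```
# Runs once when cron starts after boot; cron mails the output to root.
@reboot echo "System rebooted at $(date)"
```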

My pragmatic solution (for services started via a good rc.d script which has a working status command) is a crontab entry which periodically checks whether the service is running and restarts it if not. As an example, for apache and an interval of 10 minutes:

*/10 * * * *    /usr/local/etc/rc.d/apache22 status >/dev/null 2>&1 || /usr/local/etc/rc.d/apache22 restart

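For the use case of this service/machine, this is enough. The status output is discarded, so cron only sends a mail when the restart command runs and prints something. If the service has a persistent problem, a mail with the restart output arrives on every run; otherwise a mail arrives only when the service was actually down, for example after a reboot where it did not come up.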
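If several services should be watched this way, the same check can be moved into a small script. The following is only a sketch under assumptions: the function name `ensure_running` and the install path mentioned below are hypothetical, and it relies on the rc.d script's status command returning a non-zero exit code when the service is not running.

```shell
#!/bin/sh
# Hypothetical helper (not part of the base system): restart a service
# if the status command of its rc.d script reports that it is not running.

ensure_running() {
    # $1 is the full path to an rc.d script, e.g. /usr/local/etc/rc.d/apache22
    if "$1" status >/dev/null 2>&1; then
        # Service is running: stay silent so cron sends no mail.
        return 0
    fi
    # Service is down: the restart output goes to stdout/stderr,
    # which cron mails to the crontab owner.
    "$1" restart
}

# Check every rc.d script given on the command line.
for rc_script in "$@"; do
    ensure_running "$rc_script"
done
```

Assuming the script is saved as /usr/local/sbin/ensure-running.sh (a made-up path) and is executable, the crontab entry would then become:

    */10 * * * *    /usr/local/sbin/ensure-running.sh /usr/local/etc/rc.d/apache22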


2 Comments To "Cheap process monitoring (no additional software required)"

#1 Comment By Bryan D On May 1, 2010 @ 18:22

I used to do this as well, but have since moved to daemontools. It ensures processes are *always* up, without a possible 10-minute window of downtime. Especially useful if your app is crashing as well.

#2 Comment By netchild On May 3, 2010 @ 09:53

Daemontools is not part of the base system, so it would have to be installed additionally. My post is for cases where uptime does not matter that much, and where you do not want to spend the time to install and configure additional software. And the 10 minutes was just an example, you could let cron check every minute if you want. But as I wrote, in my case it does not matter much. Places where I would use daemontools are places where a reboot for an unknown reason, like the one I described, would not be acceptable (there the number one priority would be to find the reason for the reboot and fix the problem).