Cheap process monitoring (no additional software required)

I have an old system (only the hardware is old, it runs -current) which reboots itself from time to time, mostly during the daily periodic(8) run, but also during heavy compiling (portupgrade). There is no obvious reason (no panic) why it does this. It could be a hardware defect, or something else. The machine is not important enough for me to try hard to analyze the problem. The annoying part is that sometimes apache does not start after a restart. When this happens, the solution is to log in and start the webserver. If the webserver started each time, hardly anybody would notice the reboot (root gets an email on each reboot via an @reboot crontab entry).

My pragmatic solution (for services started via a good rc.d script which has a working status command) is a crontab entry which periodically checks if the service is running and restarts it if not. As an example for apache and an interval of 10 minutes:

*/10 * * * *    /usr/local/etc/rc.d/apache22 status >/dev/null 2>&1 || /usr/local/etc/rc.d/apache22 restart

For the use case of this service/machine, this is enough. If there is a problem with the service, a mail with the restart output arrives each time the check runs; otherwise a mail arrives only after a reboot for which the service did not start.
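
If more than one service needs this treatment, the same status/restart check can be moved into a small script. The following is only a sketch under stated assumptions: the helper name ensure_running and the idea of passing the rc.d script paths as arguments are made up for illustration, only the status/restart logic is taken from the crontab entry above.

```sh
#!/bin/sh
# Sketch: restart every given rc.d script whose "status" command fails.
# ensure_running is a hypothetical helper name, not part of rc.subr.

ensure_running() {
    # $1: full path to an rc.d script with a working status command
    if "$1" status >/dev/null 2>&1; then
        return 0        # service is running, stay silent (no cron mail)
    fi
    "$1" restart        # this output ends up in the mail sent by cron
}

# Check all rc.d scripts passed on the command line.
for script in "$@"; do
    ensure_running "$script"
done
```

Saved under a name of your choice, it would be called from the crontab with the rc.d scripts to watch as arguments, e.g. /usr/local/etc/rc.d/apache22. A service that is running produces no output, so cron still sends a mail only when a restart was necessary.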

Interesting projects in the GSoC

I counted 18 projects which were given to FreeBSD in this year's GSoC. For 3 of them I have some comments.

Very interesting to me is the project named “Collective limits on set of processes (a.k.a. jobs)”. This looks a bit like the Solaris contract/project IDs. If this project results in something which allows userland to query which PID belongs to which set, then this allows some nice improvements for start scripts. For example, at work on Solaris each application is a mix of several projects (apache = “name:web” project, tomcat = “name:app” project, Oracle DB = “name:ora” project). Our management framework (written by a co-worker) makes it easy to do something with those projects: a “show” displays the prstat (similar to top) info just for processes which belong to the project, a “kill” sends a kill signal to all processes of the project, and so on. We could do something similar with our start scripts by declaring a namespace (FreeBSD:base:XXX / FreeBSD:ports:XXX?) and maybe a number space (depending on the implementation) as reserved, and use it to see if processes which belong to a particular script are still running, or to kill them, or whatever.

The other two projects I want to comment on here are “Complete libpkg and create new pkg tools” and “Complete Package support in the pkg_install tools and cleanup”. Both projects reference libpkg in their description. I hope the mentors of both projects pay some attention to what is going on in the other project, so as not to cause dependencies/clashes between the students.

That I do not mention other projects does not mean that they are not interesting or similar; it is just that I do not have anything valuable to say about them…

HOWTO mentor in the GSoC (initial communication with the student)

Every mentor in the GSoC has a different way of handling students. Here is what I do.

The student introduced himself to me, as requested by our soc-admins in the initial mail to our students. He looked up which timezone I am in (public info) and presented his timezone (and rough location) to me. That is nice. He also offered different communication channels (basically email and IM).

I confirmed what he looked up, and presented what I did in the past GSoCs in which I participated, so that he has an idea whether I am new to the game or not. I told him that quick/short questions are better asked via IM, while long explanations or questions are better handled via email. I also gave him a rough overview of when he can expect quick answers from me and when I am not available.

Following are some questions I asked him, so that I get an impression of what to expect and can plan a bit (some of those may already be answered in the student application, but I prefer to have everything in one place):

  • From when to when do you intend to spend how much time on the GSoC?
  • Any holidays / non-availability planned during the GSoC?
  • Any university stuff (exams/lessons/…) during this time (the uni has higher priority than the GSoC for Google)?
  • Anything else in parallel to the GSoC (some paid work, taking care of ill (grand-)parents, …)?
  • At what level of knowledge do you see yourself regarding computer-science/programming/OS-concepts (relative to other students and relative to the topic)?
  • How do you want to approach the project (where do you want to start, what do you intend to do… just a quick overview… a bit more than saying “I add X”, but not as far as copy&paste of code examples)?

More important than that (IMO) is to give an idea of what is expected from the student:

  • you have FreeBSD-current installed (on a real PC or in a virtual machine)
  • you give me a status report each week (“did nothing” is also a valid report, it tells me that you are still alive and did not lose interest in the GSoC)
  • if your schedule changes in a significant way, give me a short notification (e.g. “I can not do anything next week”)
  • if you spend more than 30 minutes on a problem, prepare an email with the problem description; if this preparation did not solve your problem, send me the mail (if you solve the problem 5 minutes later, no problem, I prefer to get one mail too many than to have you stuck with something for an incredible amount of time)

A mentor does not know everything, of course, so the student should be subscribed to hackers@ and current@, and if there is a specific list which matches the project he is working on well, then to this mailing list too. This allows the mentor to tell the student to send a mail with the questions to one of those lists, without much preparation needed to receive all answers.

Another helpful resource is the FreeBSD kernel cross-reference. For some people my doxygen-generated docs of parts of the FreeBSD kernel may be helpful (but unfortunately there is not a lot of doxygen markup within our source code).

I also told him to be prepared that I will ask him to send a reference to a patch of his work to an appropriate mailing list, long enough before the GSoC ends, and that comments from there regarding changes he must or should make are not something bad, but a way to improve the result and/or his skills.

Mentoring again in the GSoC

It seems that I will actively mentor again in this Google Summer of Code (as opposed to just reviewing the submissions from students and/or acting as a fall-back mentor).

The project I will mentor is the “Make optional kernel subsystems register themselves via sysctl” one from the FreeBSD ideas page.

The student has already contacted me, and it looks like he is motivated (he is already subscribed to several FreeBSD mailing lists, which is not a requirement we have in our GSoC docs).

One-Time-Passwords for Horde/IMP?

I am searching for a way to use one-time passwords for Horde/IMP on FreeBSD. I do not want to use PAM (local users on the machine). Currently I use authentication via IMAP4 (with a link between the IMAP4 server and postfix via MySQL, to have the same PW for sending and receiving), and I expect that not all users of Horde/IMP will use OTP if it is available, so the problem is not that easy. I can imagine a solution which tries to authenticate via OTP first and, if that succeeds, fetches a password for the login to the IMAP4 server. If the OTP auth fails, it could try the entered password for the login to the IMAP4 server. Migrating existing users to a new solution can be done by telling them to enter the password from the machine of the person doing the migration. The solution needs to log in to the IMAP4 server automatically; entering a password for the IMAP4 server after the OTP login to Horde is not an option.
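
To make the intended login order concrete, here is a minimal sketch of the “OTP first, entered password as fallback” idea. It is not an existing Horde/IMP feature: all function names (verify_otp, lookup_imap_password, imap_login, login) and the hard-coded demo values are hypothetical stand-ins for a real OTP backend, a password store and the IMAP4 server.

```sh
#!/bin/sh
# Sketch of the "OTP first, entered password as fallback" login order.
# All helpers below are hypothetical stubs with hard-coded demo values.

verify_otp() {              # $1: user, $2: entered password
    [ "$2" = "otp-123456" ]
}

lookup_imap_password() {    # $1: user; prints the stored IMAP4 password
    echo "imap-secret"
}

imap_login() {              # $1: user, $2: password for the IMAP4 server
    [ "$2" = "imap-secret" ]
}

login() {                   # $1: user, $2: what the user typed into Horde
    if verify_otp "$1" "$2"; then
        # OTP accepted: fetch the real IMAP4 password for this user
        imap_pw=$(lookup_imap_password "$1")
    else
        # not a valid OTP: treat the input as the IMAP4 password itself
        imap_pw=$2
    fi
    imap_login "$1" "$imap_pw"
}
```

With these stubs, login succeeds both for a user typing the demo OTP and for a user typing the IMAP4 password directly, and fails for anything else. The user never has to enter a second password after the OTP step, which is the requirement stated above.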

Oh, yes, just sending the passwords over SSL is not the answer (that is already the only way to log in there). The goals are to have

  • an easy to remember password for an OTP app on the mobile phone to generate the real password
  • passwords which expire fast, so that a stolen password does not cause much harm
  • not the same login password for different services (mail-pw != jabber-pw != user-pw)