The 7‑stable system that has had stability problems since I updated it from 7.1 to 7.2/7‑stable is now semi-stable.
The watchdog reboots the machine after one minute without a response (currently it manages to run for 3 to 4 hours), and the jails now come up without problems.
The problem with the jails was that, for example, the mysql-server startup went into the STOP state because TTY input was “requested”. I solved this by using /dev/null as input at jail startup. On ‑current I do not see this behavior (I have a 9‑current system with a lot of jails which reboots every X days, and there mysql does not go into the STOP state).
I also start the jails in the background now, so that one blocking jail does not block everything else (as is done in ‑current).
Expressed as a patch:
--- /usr/src/etc/rc.d/jail 2009-02-07 15:04:35.000000000 +0100
+++ /etc/rc.d/jail 2009-12-16 17:03:12.000000000 +0100
@@ -556,7 +556,8 @@
 		fi
 		_tmp_jail=${_tmp_dir}/jail.$$
 		eval ${_setfib} jail ${_flags} -i ${_rootdir} ${_hostname} \
-			\"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1
+			\"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1 \
+			</dev/null
 
 		if [ "$?" -eq 0 ] ; then
 			_jail_id=$(head -1 ${_tmp_jail})
@@ -623,4 +624,4 @@
 if [ -n "$*" ]; then
 	jail_list="$*"
 fi
-run_rc_command "${cmd}"
+run_rc_command "${cmd}" &
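The two ideas in the patch above can be tried outside of rc.d with a small standalone script. This is only a sketch: start_service is a hypothetical stand-in for a jail start command that wants to read from its input, and the jail_a/jail_b/jail_c names and the temporary directory are made up for illustration. It does not reproduce the STOP state itself. It only shows that a reader whose input is /dev/null sees end-of-file at once, and that background starts can be collected with wait.

```sh
#!/bin/sh
# Hypothetical stand-in for a jail start command: it tries to read one
# line from standard input before reporting what happened.
start_service() {
    if read -r _answer; then
        echo "$1: got input"
    else
        echo "$1: no input available, continuing"
    fi
}

# With stdin redirected from /dev/null, read hits end-of-file immediately,
# so the command never waits for input from a terminal.
out_a=$(start_service jail_a </dev/null)
echo "$out_a"

# Start several instances in the background so that a slow one does not
# hold up the others, then wait for all of them to finish.
tmp=$(mktemp -d "${TMPDIR:-/tmp}/jaildemo.XXXXXX")
for name in jail_b jail_c; do
    start_service "$name" </dev/null >"$tmp/$name.log" 2>&1 &
done
wait

out_b=$(cat "$tmp/jail_b.log")
out_c=$(cat "$tmp/jail_c.log")
echo "$out_b"
echo "$out_c"
rm -rf "$tmp"
```

In the real script the redirection sits on the eval line, so it applies to the jail command and everything it starts, while the trailing & only affects the final run_rc_command call.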
I also identified 57 ZFS patches which are in 8‑stable but not in 7‑stable (I do not think they will solve the deadlock, but I do not really know, and now that there is one FS on ZFS, I would like to get as much fixed as possible). Some of them should be merged, some would be nice to merge, and some I do not care much about (but if they are easy to merge, why not…). I already have all the revisions and the corresponding commit logs in an email draft.
Now I just need to write a little bit of text and find some people willing to help (some of the changes need a review to determine whether they are applicable to 7‑stable, and everything should be tested on a scratch box).