Alexander Leidinger

Just another weblog

Dec
30

Some fixes for ZFS on 7-stable (more testers wanted)

Due to the prob­lems with a 7–sta­ble machine, I had a look at some unmerged fixes for ZFS (58 changes not merged).

I back­ported some of those changes from 8-stable to 7-stable, I have this run­ning on one 7-stable machine. I would like to get some more feed­back for it (even an “it works for me” would be great). The main part of this change is that the FreeBSD taskqueue is used now instead of the open­so­laris one (and some other changes which may improve the ZFS experience).

It would also be nice if some­one could have a look at the FIRST_THREAD_IN_PROC part. Can there be more than one thread at this place (I do not think so) and I should use FOREACH_THREAD_IN_PROC_instead?

How to apply:

  • cd /usr/src/
  • fetch http://www.Leidinger.net/FreeBSD/test/releng7_zfs_merge3.diff
  • fetch http://www.Leidinger.net/FreeBSD/test/opensolaris_taskq.c
  • fetch http://www.Leidinger.net/FreeBSD/test/taskq.h
  • mv taskq.h sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h
  • mv opensolaris_taskq.c sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c
  • patch –p 0 –quiet <releng7_zfs_merge3.diff
  • ignore the 2 .rej files
  • rm –f sys/cddl/compat/opensolaris/sys/taskq_impl.h*
  • rm –f sys/cddl/compat/opensolaris/sys/taskq.h*
  • rm –f sys/cddl/contrib/opensolaris/uts/common/os/taskq.c*
  • rebuild ker­nel

I do not list all of those 16 of 58 out­stand­ing patches which are cov­ered here, a detailed list can be found on the sta­ble and fs mail­inglists.

GD Star Rat­ing
load­ing…
GD Star Rat­ing
load­ing…
Share

Tags: , , , , , , , , ,
Dec
16

Sta­bi­liz­ing 7-stable…

The 7–sta­ble sys­tem on which I have sta­bil­ity prob­lems after an update from 7.1 to 7.2/7-stable is now semi-stable.

The watch­dog reboots after one minute of no reac­tion (cur­rently it is able to run 3 – 4 hours), and the jails come up with­out prob­lems now.

The prob­lem with the jails was, that e.g. the mysql–server startup went into the STOP state because TTY-input was “requested”. I solved the prob­lem by using /dev/null as input on jail-startup. On –cur­rent I do not see this behav­ior (I have a 9–cur­rent sys­tem with a lot of jails which reboots every X days, and there mysql does not go into the STOP state).

I also start the jails in the back­ground, so that one block­ing jail does not block every­thing (done like in –current).

To say this with code:

--- /usr/src/etc/rc.d/jail      2009-02-07 15:04:35.000000000 +0100
+++ /etc/rc.d/jail      2009-12-16 17:03:12.000000000 +0100
@@ -556,7 +556,8 @@
 fi
 _tmp_jail=${_tmp_dir}/jail.$$
 eval ${_setfib} jail ${_flags} -i ${_rootdir} ${_hostname} 
-                       \"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1
+                       \"${_addrl}\" ${_exec_start} > ${_tmp_jail} 2>&1 \
+                       </dev/null

 if [ "$?" -eq 0 ] ; then
 _jail_id=$(head -1 ${_tmp_jail})
@@ -623,4 +624,4 @@
 if [ -n "$*" ]; then
 jail_list="$*"
 fi
-run_rc_command "${cmd}"
+run_rc_command "${cmd}" &

I also iden­ti­fied 57 patches for ZFS which are in 8-stable, but not in 7-stable (I do not think they could solve the dead­lock, but I do not really know, and now that there is one FS on ZFS, I would like to get as much fixed as pos­si­ble). Some of them should be merged, some would be nice to merge, and some I do not care much about (but if they are easy to merge, why not…). I already have all revi­sions and the cor­re­spond­ing com­mit logs avail­able in an email-draft.

Now I just need to write a lit­tle bit of text and find some peo­ple will­ing to help (some of the changes need a review if they are applic­a­ble to 7-stable, and every­thing should be tested on a scratch-box).

GD Star Rat­ing
load­ing…
GD Star Rat­ing
load­ing…
Share

Tags: , , , , , , , , ,
Dec
15

Sta­bil­ity prob­lems with 7-stable

On the machine where I host this blog, I have/had some sta­bil­ity prob­lems.

Last week I updated the machine from FreeBSD 7.1-pX to 7.2-p5 (GENERIC ker­nel in both cases). 5 – 10 Min­utes after the reboot into the new ver­sion the machine had a dead­lock. After some road­blocks (order­ing a KVM-switch from the hoster, the KVM-switch not work­ing with a proxy (dur­ing lunchtime at work), a bro­ken video-capture of the KVM-switch and a replace­ment on Mon­day morn­ing to not pay the WE-fees), I spend a big part of the night to get it sta­ble. I tried dis­abling SMP, enabling INVARIANTS and WITNESS, chang­ing the sched­uler, cut­ting the soft­ware mir­ror (to rule out a mis­match between the con­tent of the disks after all the hard reboots) and updat­ing to 7-stable.

Unfor­tu­nately noth­ing helped. :(

Googling a lit­tle bit around (it is a AMD Dual–Core sys­tem with NVidia MCP61 chipset) was lead­ing me to a post on the mail­inglists from 2008 which talks about an issue with the buffer cache. I do not know if this is still an issue (I have send a email to kib@ to ask about it), and my sce­nario is not the same as the one which is described in the mail, but because of this I decided to switch one of the two UFS mir­rors to ZFS.

The first boot into the ZFS caused again a reboot after some min­utes (I do not know if it was because of a mem­ory exhausted panic, or because of a dead­lock), but as I did not tune the ker­nel for ZFS I am tempted to believe that I should not count that. Now, after tun­ing the ker­nel (increas­ing the kmem_size to 700M, no prefetch­ing, lim­it­ing the ARC to 40M) it is up since nearly 2h (as of this writ­ting… cross­ing fin­gers). Before it was not able to sur­vive more than some min­utes with just the jail for the mails up. Now I not only have the mail-jail up, but also the jail for the blog (one jail still dis­abled, but I will take care about that after this post).

I do not know if only increas­ing the kmem_size would have helped with the prob­lem, but as I was test­ing a GENERIC ker­nel + gmir­ror mod­ule in the begin­ning, I expected that the auto-tuning of this value should have been enough for such a sim­ple setup (2GB RAM, 2 disks with 3 par­ti­tions each, one par­ti­tion pair for root, one for swap, one for the jails).

I hope that I sta­bi­lized the sys­tem now. It may be the case that I will test some patches in case some­one comes up with some­thing, so do not be sur­prised if the blog and email to me is a lit­tle bit flaky.

GD Star Rat­ing
load­ing…
GD Star Rat­ing
load­ing…
Share

Tags: , , , , , , , , ,
Dec
10

Progress with Net­worker bugs

Our bug with savep­npc which causes the post-command to start one minute after the pre-command even if the backup is not done yet is now hope­fully near the res­o­lu­tion point. We opened a prob­lem report for this in July, this week we where told that there is a patch for it avail­able. The bad part is, that it is avail­able since 3 weeks and nobody told us. The good part is, that we have it installed on a machine now to see if it helps (all zones there seem to be OK, but we have zones where it some­times works and some­times fails, so we are not 100% sure, but we hope the best). We where told that it will be included in Net­worker 7.5.1.8.

Our other issues are now at least not in a helpdesk-loop any­more, they seem to have reached the devel­op­ers now.

GD Star Rat­ing
load­ing…
GD Star Rat­ing
load­ing…
Share

Tags: , ,
Dec
06

FreeNAS & Sen­sors for FreeBSD

This WE I was told that FreeNAS seems to want to move from FreeBSD to Linux (since then it seems there could be a linux and a FreeBSD ver­sion). One of the rea­sons seems to be a miss­ing sen­sors framework.

As I was com­mit­ting a port of the OpenBSD sen­sors frame­work (pro­duced as part of the Google Sum­mer of Code 2007) to FreeBSD and had to remove it after­wards because one com­mit­ter com­plained very loudly, I was asked what the sta­tus of this is.

The short sta­tus is: Nobody is doing some­thing about it.

Before I explain the long sta­tus, I give  a short overview what this sen­sors frame­work is:

  • a ker­nel API which allows to add sensors
  • an inter­face for the user­land to query the sen­sor data
  • some basic user­land code to show and log the sen­sor info

The API and the query inter­face are more or less inde­pen­dent. For the user­land code it was more a log­ging infra­struc­ture than a real mon­i­tor­ing solu­tion. The rea­son was the real mon­i­tor­ing solu­tions already exist (Nagios, snmpd, …) and can be adapted to query the sen­sors. Ide­ally a query in user­land should be han­dled by a library instead of directly access­ing the sysctl inter­face, this way the kernel<->userland inter­face would be abstracted away (and could b replaced as needs arise). This was not done, it was some­thing to be done later (Rome was not build in a day).

The user­land inter­face also only cared about dumb sen­sors (those which you need to query man­u­ally to get the infor­ma­tion), smart sen­sors (those which are able to send events them­self) where not taken care about in the sense of really send­ing sensor-triggered events, but the ker­nel API allowed to add such sen­sors. The sysctl inter­face has no way of send­ing events, but FreeBSD already has an event inter­face (devd is tak­ing care about it). It would have been not a prob­lem to send events via this chan­nel and let an user­land library take care about the deliv­ery together with other sensor-data in userland.

And now the long sta­tus is:

PHK com­plained loudly about it. First he said he did not look at it but he com­plained that is not good regard­less. After a lot of nag­ging from me he had a look at it and was not happy about the time stuff in it (short: the FreeBSD time­counter code is bet­ter). This was not a prob­lem in my opin­ion, we could have dis­abled this part with­out prob­lems. After such an offer from me, he com­plained that the sen­sors frame­work uses the sysctl inter­face instead of an entry in /dev.

At this point in time already sev­eral user­land util­i­ties used the sysctl frame­work to query for sta­tus data in the ker­nel. So there was already prece­dence for such an use of it. Later some more such uses where added too (e.g. the proc­stat stuff by core team mem­ber Robert Watson).

I saved some of the cor­re­spond­ing mails (to pub­lic mail­ing lists) in a mbox file, read the mess your­self if you want.

The bot­tom line is: Sev­eral com­mit­ters (even some which we could call high pro­file com­mit­ters) told me that they do not see a prob­lem in the use of the sysctl inter­face. They do not seem to want to tell it in pub­lic (nobody of them voiced their opin­ion in the thread, so do not ask me who those peo­ple are). I am not inter­ested in invest­ing more of my spare time into fight­ing wind­mills (it looks like this to me).

So, if some­one is inter­sted in the code, r172631 has it. In the per­force repos­i­tory you can maybe find some sen­sors. I think most of it can still be used with­out much changes.

If some­one tries it with a more recent FreeBSD, please drop me a note if it just applies fine, or a patch (or an URL to it) if it needs some mod­i­fi­ca­tions. Who knows, maybe in a future project it may be use­ful for me.

If there is enough inter­est by sev­eral peo­ple, I can even put up a wiki page where those peo­ple can coor­di­nate, but that is most prob­a­bly all I am will­ing to invest fur­ther into this (at least in my unpaid time).

GD Star Rat­ing
load­ing…
GD Star Rat­ing
load­ing…
Share

Tags: , , , , , , , , ,