ZFS and NFS / on-disk-cache

In the FreeBSD mailing lists I stumbled over a post which refers to a blog post describing why ZFS seems to be slow (on Solaris).

In short: ZFS guarantees that the NFS client does not experience silent corruption of data (an NFS server crash that loses data which, from the client's point of view, is supposed to be on disk already). The recommendation is to enable the disk-cache for disks which are used completely by ZFS, as ZFS (unlike UFS) is aware of disk-caches. This raises the performance to what UFS delivers in the NFS case.

There is no in-depth description of what it means that ZFS is aware of disk-caches, but I think this refers to the fact that ZFS sends a flush command to the disk at the right moments. Leaving aside the fact that there are disks out there which lie to you about this (they report the flush command as finished when it is not), this would mean that this is supported in FreeBSD too.
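To illustrate what I mean by a flush "at the right moment", here is a minimal sketch in Python. It is my own illustration and not taken from the referenced blog post: the function name `durable_write` is made up, and it only shows the ordering an application (or an NFS server) relies on, namely that the data is written, the OS is asked to push it to stable storage, and only then is success reported. Whether that request really reaches the platters depends on the filesystem issuing a cache flush and on the disk being honest about it.

```python
import os
import tempfile


def durable_write(path: str, payload: bytes) -> int:
    """Write payload to path and return only after fsync() has succeeded.

    Hypothetical helper for illustration: it shows the ordering
    write -> flush -> acknowledge, nothing ZFS specific.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        # os.write() may write fewer bytes than requested, so loop.
        while written < len(payload):
            written += os.write(fd, payload[written:])
        # Ask the OS to force the file contents to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
    # Only now would a server tell its client "your data is on disk".
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.dat")
        count = durable_write(target, b"committed data\n")
        print(f"acknowledged {count} bytes")
```

If the disk claims the flush is finished while the data still sits in its volatile cache, the acknowledgement at the end is exactly the lie that leads to silent corruption on the client side.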

So everyone who is currently disabling the ZIL to get better NFS performance (and accepting silent data corruption on the client side): move your zpool to dedicated disks (no other real FS than ZFS; swap and dump devices are OK), make sure they are honest ones, and enable the disk-caches instead of disabling the ZIL.

I also recommend that people who already have ZFS on dedicated (and honest) disks check whether the disk-caches are enabled.

6 thoughts on “ZFS and NFS / on-disk-cache”

    1. Depends upon your hardware and OS version.

      Here in my 9-current with ATA via CAM I can use e.g. “camcontrol identify 0:0:0” to see the status (“write cache”). I do not know of a way to change the write cache setting from the command line, but I think the loader tunable “hw.ata.wc” is honored by ATA via CAM too (set it to 0 to disable the write cache for all disks managed by the ATA part).

      For my USB memory stick I can get a list of “modepages” via “camcontrol modepage 4:0:0 -l”. It shows 0x08 as the “Caching Page”, so “camcontrol modepage 4:0:0 -m 0x08” gives me the status for it (WCE means “Write Cache Enable”). To change this modepage add “-e” to the previous command line (I doubt any consumer-grade USB memory stick has a write cache, but you can use this for USB attached disks).

      For ATA hardware not controlled via CAM, have a look at the atacontrol(8) man-page to see how to determine the current status of your disks.

  1. Disk caches are always enabled on FreeBSD, unless you manually disable them. Due to the way GEOM works, this is even true for non-dedicated disks (i.e. partitioned disks) using ZFS.

    IOW, this blog post is not relevant for FreeBSD ZFS users. 🙂

    1. I beg to differ. This post is highly relevant for people who have worked with FreeBSD for quite a while and know about the recommendation to disable the write caches of disk drives to get acceptable behavior in case of a power failure. There are many people out there who know about this and disable the disk-cache as one of the first things they do.

      BTW: the handling of the disk-caches is not related to GEOM at all; this is something the ATA/CAM subsystems are responsible for. There was even a short moment in time where sos@ switched the default in the ATA driver to disable the caches (IIRC he had to revert it because too many people were of the opinion that performance(-reviews) are more important than data-consistency). AFAIK CAM (the SCSI side of it) does not even touch this setting, as it is a property of the drive (a setting in the drive), so you get whatever the drive vendor has set as factory default (most probably the write cache is enabled).

  2. Again, the caches are enabled by ATA/CAM and not by ZFS/GEOM. ZFS is able to send the flush-cache command in all cases, but this only matters if the cache is enabled. For ATA the caching is enabled by default; for SCSI disks it depends on the disks, see my camcontrol explanation above.
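For those who want to script the check from the camcontrol explanation above, here is a rough sketch in Python. Treat it as an assumption-laden example: the function name `write_cache_state` is made up, and the `SAMPLE` text is only an assumed excerpt of the feature table that “camcontrol identify” prints (a “write cache” row followed by support and enabled columns). The real layout may differ between FreeBSD versions, so compare it with the output on your own system first.

```python
import re

# ASSUMPTION: excerpt modelled on the feature table printed by
# "camcontrol identify"; check the real output on your system.
SAMPLE = """\
Feature                      Support  Enabled   Value           Vendor
read ahead                     yes      yes
write cache                    yes      no
flush cache                    yes      yes
"""


def write_cache_state(identify_output: str):
    """Return (supported, enabled) for the "write cache" row.

    Returns None when no such row is found. Hypothetical helper,
    not part of any FreeBSD tool.
    """
    pattern = re.compile(r"\s*write cache\s+(yes|no)\s+(yes|no)\b")
    for line in identify_output.splitlines():
        match = pattern.match(line)
        if match:
            return match.group(1) == "yes", match.group(2) == "yes"
    return None


if __name__ == "__main__":
    print(write_cache_state(SAMPLE))
```

On a real machine you would feed it the output of “camcontrol identify 0:0:0” (with your own device id) instead of the sample text. For SCSI and USB attached disks this does not apply; there the WCE bit in the caching modepage is the thing to look at.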
