I am currently playing around a bit with my ZFS setup. I want to make it faster, but I do not want to spend a lot of money.
The disks are connected to an ICH5 controller, so an obvious improvement would be either to buy a controller for the PCI slot that can do NCQ with the SATA disks (a siis(4) based one is not cheap), or to buy a new system whose chipset knows how to do NCQ (which would mean new RAM, a new CPU, a new mainboard and maybe even a new PSU). A new controller is a bit expensive for the old system I want to tune. A new system would be nice, and reading the specs of current systems makes me want a Core i5. The problem is that I think the current mainboard offerings for it are far from good. The system should be somewhat future-proof, as I would like to use it for about 5 years or more (the current system is between 5 and 6 years old). This means it should have SATA-3 and USB 3, but what is currently on the market looks like beta versions of hardware with SATA-3 and USB 3 support: according to tests, the maximum speed the controllers achieve varies a lot, there are bugs in the BIOS, or the controllers are attached to a slow bus that prevents them from using the full bandwidth. So there will not be a new system soon.
As I had a 1 GB USB stick lying around, I decided to attach it to one of the EHCI USB ports and use it as a cache device for ZFS. If you want to try this too, be careful which USB port you pick. My mainboard has only 2 USB ports connected to an EHCI (USB 2.0) controller; the rest are UHCI (USB 1.x) ones. This means that only 2 USB ports are fast (sort of: USB 2.0 is nominally 480 Mbit/s, which gives roughly 40 MB/s at best in practice); the rest are only usable for slow things like a mouse, a keyboard or a serial line.
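For anyone who wants to reproduce this, here is a minimal sketch of the step. The underlying command is `zpool add <pool> cache <device>`. The pool name `tank` and the device name `da0` are placeholders I made up for illustration, not the names on my system, and the small Python wrapper is only there so the command can be previewed before it is run.

```python
import shlex
import subprocess


def add_cache_device(pool, device, dry_run=True):
    """Build `zpool add <pool> cache <device>`; run it only if dry_run is False.

    `pool` and `device` are whatever your system uses; "tank" and "da0"
    below are hypothetical placeholders.
    """
    cmd = ["zpool", "add", pool, "cache", device]
    if not dry_run:
        # Needs root privileges and an existing pool.
        subprocess.run(cmd, check=True)
    return shlex.join(cmd)


# Dry run: only prints the command that would be executed.
print(add_cache_device("tank", "da0"))
```

A cache device holds no unique data, so it can be taken out of the pool again later with `zpool remove` if the experiment does not pay off.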
Be warned, this will not give you a lot of bandwidth (even with a fast USB stick, the roughly 40 MB/s you get out of EHCI in practice caps the streaming bandwidth), but the latency of the cache device is great for small random I/O. When I run gstat and look at how long a read operation takes on each device involved, I see between 3 msec and 20 msec for the hard disks (depending on whether they are reading at the current head position or need to seek a lot). For the cache device (the USB stick) I see between about 1 msec and 5 msec. That is a third to a quarter of the latency of the hard disks.
With a “zfs send” I see about 300 IOPS per hard disk (3 disks in a RAIDZ). Obviously this is the optimal streaming case, where the disks do not need to seek much. You can see this in the low latency, which is about 2 msec in this case. In the random-read case, for example when you run a find, the disks cannot sustain this rate, as they need to seek around. And this is where the USB stick shines. I have seen up to 1600 IOPS on it while running a find (if the corresponding data is in the cache, of course). This was with a latency between 0.5 and 0.8 msec.
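As a rough sanity check of these numbers, the latency gives an upper bound on the IOPS a device can deliver. This is only a sketch: it assumes that a single request is in flight at a time and that the per-read time shown by gstat is close to the real service time, neither of which I verified.

```python
def max_iops_at_queue_depth_one(latency_ms):
    """Upper bound on IOPS when only one request is in flight at a time."""
    return 1000.0 / latency_ms


# Read latencies quoted in the text, in msec.
latencies_ms = {
    "hard disk, streaming (zfs send)": 2.0,
    "USB stick, slow end": 0.8,
    "USB stick, fast end": 0.5,
}

for name, latency_ms in latencies_ms.items():
    bound = max_iops_at_queue_depth_one(latency_ms)
    print(f"{name}: {latency_ms} msec -> at most {bound:.0f} IOPS")
```

Under these assumptions the bound is 500 IOPS for a hard disk at 2 msec, so the observed 300 IOPS is plausible, and 1250 to 2000 IOPS for the USB stick, which brackets the 1600 IOPS seen during the find.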
This is the machine at home that takes care of my mail (incoming and outgoing SMTP, IMAP and webmail), runs a squid proxy and acts as a file server. There are not many users (just me and my wife) and there is no regular usage pattern for any of those services. Because of this I did not run any benchmark to see how much time I gain with various workloads (and I am not interested in artificial performance numbers for my webmail session, as the browsing experience is highly subjective in this case). For this system a 1 GB USB stick (which was just collecting dust before) seems to be a cheap way to improve the response time for small, frequently used data. When I use the webmail interface now, my subjective impression is that it is faster. I am talking about listing emails (subject, date, sender, size) and displaying the content of some emails. FYI, my maildir storage holds 849 MB in 35000 files spread over 91 folders.
Bottom line is: do not expect a lot of bandwidth increase with this, but if you have a workload which generates random read requests and you want to decrease the read latency, it could be a cheap solution to add a (big) USB stick as a cache device.