Alexander Leidinger

Just another weblog

Feb
10

Making ZFS faster…

I am currently playing around a little with my ZFS setup. I want to make it faster, but I do not want to spend a lot of money.

The disks are connected to an ICH5 controller, so an obvious improvement would be to either buy a PCI controller which is able to do NCQ with the SATA disks (a siis(4) based one is not cheap), or to buy a new system with a chipset which knows how to do NCQ (this would mean new RAM, a new CPU, a new mainboard and maybe even a new PSU). A new controller is a bit expensive for the old system I want to tune. A new system would be nice, and reading the specs of new systems makes me want a Core i5 system. The problem is that I think the mainboards currently on offer for this are far from good. The system should be reasonably future-proof, as I would like to use it for about 5 years or more (the current system is 5 to 6 years old). This means it should have SATA-3 and USB 3, but what is currently on the market looks like beta versions of hardware with SATA-3 and USB 3 support: according to tests, the maximum speed the controllers achieve varies a lot, there are bugs in the BIOS, or the controllers are attached to a slow bus which prevents them from using the full bandwidth. So there will not be a new system soon.

As I had a 1 GB USB stick lying around, I decided to attach it to one of the EHCI USB ports and use it as a cache device for ZFS. If someone wants to try this too, be careful with the USB ports. My mainboard has only 2 USB ports connected to an EHCI controller, the rest are UHCI ones. This means that only 2 USB ports are fast (sort of: USB 2.0 is nominally 480 Mbit/s, which is roughly 40 MB/s in practice), the rest are only usable for slow things like a mouse, a keyboard or a serial line.

Be warned, this will not give you a lot of bandwidth (even if you have a fast USB stick, the roughly 40 MB/s of the EHCI port is the limit which prevents a big streaming bandwidth), but the latency of the cache device is great when doing small random IO. When I run gstat and look at how long a read operation takes on each involved device, I see something between 3 msec and 20 msec for the hard disks (depending on whether they are reading something at the current head position or need to seek around a lot). For the cache device (the USB stick) I see something between 1 msec and 5 msec. That is about a third to a quarter of the latency of the hard disks.

With a “zfs send” I see about 300 IOPS per hard disk (3 disks in a RAIDZ). Obviously this is an optimal streaming case where the disks do not need to seek around a lot. You can see this in the low latency, which is about 2 msec in this case. In the random-read case, for example when you run a find, the disks cannot sustain this number of IOPS, as they need to seek around. And here the USB stick shines. I have seen up to 1600 IOPS on it while running a find (if the corresponding data is in the cache, of course). This was with a latency of something between 0.5 and 0.8 msec.
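
The IOPS and latency numbers above can be cross-checked with a bit of arithmetic. The following is a minimal sketch, assuming only one request is in flight per device at a time (no NCQ, which matches the ICH5 setup described above). Under that assumption a device cannot complete more than 1000 / latency_ms requests per second. The function name `max_iops` and the labels are made up for illustration.

```python
def max_iops(latency_ms: float) -> float:
    """Upper bound on requests per second with one request in flight."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Latencies as reported by gstat in the text above.
observations = [
    ("USB stick, cached find (best)", 0.5),
    ("USB stick, cached find (worst)", 0.8),
    ("hard disk, streaming zfs send", 2.0),
    ("hard disk, heavy seeking", 20.0),
]

for label, latency_ms in observations:
    print(f"{label}: {latency_ms} ms -> at most {max_iops(latency_ms):.0f} IOPS")
```

The observed 1600 IOPS on the USB stick falls between the bounds for 0.8 msec (1250) and 0.5 msec (2000), and the 300 IOPS per disk during the “zfs send” stays below the bound of 500 for 2 msec, so the measurements are consistent with each other.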

This is the machine at home which takes care of my mail (incoming and outgoing SMTP, IMAP and webmail), runs a squid proxy and acts as a file server. There are not many users (just me and my wife) and there is no regular usage pattern for all those services. Because of this I did not do any benchmark to see how much time I can gain with various workloads (and I am not interested in some artificial performance numbers for my webmail session, as the browsing experience is highly subjective in this case). For this system a 1 GB USB stick (which was just collecting dust before) seems to be a cheap way to improve the response time for frequently used small data. When I use the webmail interface now, my subjective impression is that it is faster. I am talking about listing emails (subject, date, sender, size) and displaying the content of some emails. FYI, my maildir storage has 849 MB with 35000 files in 91 folders.

The bottom line: do not expect a big bandwidth increase from this, but if you have a workload which generates random read requests and you want to decrease the read latency, adding a (big) USB stick as a cache device could be a cheap solution.


Jan
19

Improving the order of directories to back up in tarsnap

I experimented a little with the order of the directories to back up in tarsnap.

Currently I use the following sorting algorithm:

  1. least frequently changed directory-trees first
    Every change, even in meta-data, will affect the following data, as tarsnap is doing the de-duplication in fixed-width blocks (AFAIR 64k).
  2. for those directory-trees which change with the same frequency: list the bigger ones first
    Implicitly I assume that the smaller ones are much smaller than the bigger ones, so that the small part which gets backed up again will not be noticeable next to the bigger change. For my use cases of tarsnap this is true.
  3. if the changes in a directory-tree are much, much bigger than anything else, but the directory-tree has a medium change-frequency, put it even before less frequently changing stuff
    I do not want a small change to trigger a big backup, but a big backup can contain the remaining small part.
  4. if you back up home directories (even root’s) and they do not contain much data, put them before directory-trees which change a lot daily
    I do not want a login to trigger the transfer of data in other directory-trees which have not changed.
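
The four rules above could be sketched as a sort key. This is only an illustration under assumptions: the `Tree` class, its fields, the `DAILY` threshold and all example paths and numbers are hypothetical, and rule 3 is simplified to "goes first", which is one possible reading of it.

```python
from dataclasses import dataclass


@dataclass
class Tree:
    path: str
    changes_per_week: float     # rough estimate of how often the tree changes
    size_mb: float              # rough size of the tree
    huge_changes: bool = False  # rule 3: its changes dwarf everything else
    small_home: bool = False    # rule 4: home directory with little data


DAILY = 7.0  # changes per week that count as "changes a lot daily" (assumption)


def sort_key(tree: Tree):
    if tree.huge_changes:
        # Rule 3 (simplified): put it before everything else.
        return (0, 0.0, -tree.size_mb)
    freq = tree.changes_per_week
    if tree.small_home:
        # Rule 4: keep small home directories ahead of daily-changing trees.
        freq = min(freq, DAILY - 0.5)
    # Rule 1: rarely changing trees first; rule 2: bigger ones first on ties.
    return (1, freq, -tree.size_mb)


# Hypothetical example data, not taken from a real system.
trees = [
    Tree("/var/db", 7, 500),
    Tree("/etc", 0.5, 5),
    Tree("/usr/local/etc", 0.5, 20),
    Tree("/home", 7, 50, small_home=True),
    Tree("/var/mail", 3, 900, huge_changes=True),
]

order = [tree.path for tree in sorted(trees, key=sort_key)]
print(" ".join(order))
```

The resulting list would then be passed to tarsnap as the paths to archive, in this order.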

Nov
11

I have a VIP page now…

I went through all my posts and determined which ones provide a mid-term or long-term benefit to readers. I classified those for myself as Very Important Posts (for an appropriate definition of very and important…) and they are now listed on the VIP page.


Nov
04

Still having “fun” with Networker 7.5.1.4

Problems with the backups of some machines? Try shutting down Networker (server), removing the content of /nsr/tmp/, and restarting it. Unfortunately, this sometimes helps or is even needed.


Oct
07

The documentation of CUPS is not very good (CUPS client setup)

Yesterday evening I set up a CUPS server at home. It had been on my TODO list for years. Before, I just went downstairs and connected the printer via USB to the laptop/netbook for printing (to pick up the printout I have to go there anyway). It is not the first time that I have set up the server side of CUPS, but it was the first time that I wanted to use the CUPS command line utilities instead of the FreeBSD/Solaris print spooler and the native lpr/lp commands.

First I just had a look at some man-pages of the CUPS utilities, in the hope of finding some command to tell them that all printing should be done via a remote CUPS server. As I did not find anything, I went to the documentation page of CUPS to search there. To me this is a simple piece of configuration if you want to print from more than one machine, so I had a look at the “Getting Started” part. This was a total failure. I found nothing related to my problem. After that I went to the “Man Pages” part to search for a command which I might have overlooked. Again, a total failure. The FAQ also does not contain any useful information when you search for “client” or “remote”. In the end I stumbled over the client.conf entry in the References part. After I found this it was easy (and fast, I just added a line with “ServerName <server>” to client.conf and everything worked as I wanted it to).

The setup in Windows XP to use the CUPS server is easy: just add a network printer via http://<server>:631/printers/<printer> and use the correct printer driver for your printer model. Do not forget to enable application/octet-stream in the mime* config files and to allow remote printing on the server. No, I do not want to integrate it into Samba. The number of Windows systems is very limited (2 Windows machines against 2 Unix machines with 14 lightweight virtual Unix machines), so I do not need this.
