Con­tac­ted by a law­yer re­gard­ing MP3

A while ago (end of August 2009) I was contacted by a lawyer because of my participation in the LAME project. It was about the MP3 patents. They were looking for an expert witness for a case.

I had the impression that it was about the invalidation of at least parts of one of the patents. Maybe they have a client who was sued for infringement. Unfortunately for them I have absolutely no clue what is inside the MP3 patents (I am/was taking care of the “glue” in LAME), and the phone call we had was just some hours before I went on holiday. I referred him to two other developers of the LAME project who not only should have better knowledge of the parts the lawyer is interested in, but were also probably not on holiday.

We also had a little chat about patents in general, and my opinion was that software patents are not that useful. In the IT world 3 years is a lot of time; most of the time a technology has already been overtaken by new developments by then. Assuming that developing something new based on some technology seen elsewhere takes at least about 1 year (do not hit me for this rough estimate, which does not specify the size of the project or the quality requirements), a software patent lifetime of 5 years is more than enough in my opinion. Any company which was not able to make some money with it during this time did something wrong, and blocking the competition for longer is not really a good idea from my point of view as a user of technology. As a user I want advancements. And as an open source developer I try to produce my own advancements when I can not get them from somewhere else. In this light software patents are not doing any good for the “advancement of the human race”.

The lawyer did not try to convince me of the opposite. Either he was too polite, did not care about it, or he silently agrees. He told me he wants to stay in touch with me in some way regarding Open Source and patents. I did not object to this.

As I was curious about the state of this, I contacted the lawyer about it, and the current outcome is not bad. Previously a lot of attempts (by other lawyers in the same German court) to fight this particular patent failed. This time the court did not follow its previous rulings but said that this issue needs to be investigated again (at least this is how I understand it; beware, I am not a lawyer). Maybe we will see a result this year.

Mak­ing ZFS faster…

I am currently playing around a bit with my ZFS setup. I want to make it faster, but I do not want to spend a lot of money.

The disks are connected to an ICH5 controller, so an obvious improvement would be to either buy a controller for the PCI slot which is able to do NCQ with the SATA disks (a siis(4) based one is not cheap), or to buy a new system which comes with a chipset which knows how to do NCQ (this would mean new RAM, a new CPU, a new mainboard and maybe even a new PSU). A new controller is a little bit expensive for the old system which I want to tune. A new system would be nice, and reading about the specs of new systems makes me want to get a Core i5 system. The problem is that I think the current mainboard offers for it are far from good. The system should be a little bit future proof, as I would like to use it for about 5 years or more (the current system is somewhere between 5 and 6 years old). This means it should have SATA-3 and USB 3, but when I look at what is offered currently, it looks like there are only beta versions of hardware with SATA-3 and USB 3 support available on the market (according to tests there is a lot of variance in the maximum speed the controllers are able to achieve, there are bugs in the BIOS, or the controllers are attached to a slow bus which prevents using the full bandwidth). So there will not be a new system soon.

As I had a 1 GB USB stick around, I decided to attach it to one of the EHCI USB ports and use it as a cache device for ZFS. If someone wants to try this too, be careful with the USB ports. My mainboard has only 2 USB ports connected to an EHCI controller, the rest are UHCI ones. This means that only 2 USB ports are fast (sort of: 480 Mbit/s nominal, roughly 40 MB/s in practice), the rest are only usable for slow things like a mouse, a keyboard or a serial line.
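
For anyone who wants to try it, attaching the stick comes down to a single `zpool add <pool> cache <device>` call. Here is a minimal sketch that only builds and prints that command; the pool name `tank` and the device name `da0` are placeholders for illustration, not the names from my system.

```python
import shlex


def cache_add_command(pool, device):
    """Return the argv that adds `device` to `pool` as a ZFS cache device."""
    return ["zpool", "add", pool, "cache", device]


# "tank" and "da0" are hypothetical names; substitute your own pool
# and the device node of the USB stick.
cmd = cache_add_command("tank", "da0")
print(shlex.join(cmd))  # zpool add tank cache da0
```

The sketch deliberately does not run anything. A cache device can be taken out again with `zpool remove`, so the experiment is cheap to undo.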

Be warned, this will not give you a lot of bandwidth (even if you have a fast USB stick, the roughly 40 MB/s that EHCI delivers in practice are the limit, which prevents a big streaming bandwidth), but the latency of the cache device is great when doing small random IO. When I run gstat and look at how long a read operation takes for each involved device, I see something between 3 msec and 20 msec for the harddisks (depending on whether they are reading something at the current head position, or whether the harddisk needs to seek around a lot). For the cache device (the USB stick) I see something between around 1 msec and 5 msec. That is one third to one quarter of the latency of the harddisks.

With a “zfs send” I see about 300 IOPS per harddisk (3 disks in a RAIDZ). Obviously this is an optimal streaming case where the disks do not need to seek around a lot. You can see this in the low latency, which is about 2 msec in this case. In the random-read case, for example when you run a find, the disks can not sustain this number of IOPS, as they need to seek around. And here the USB stick shines. I have seen up to 1600 IOPS on it while running a find (if the corresponding data is in the cache, of course). This was with something between 0.5 and 0.8 msec of latency.
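
The latency and IO numbers above hang together by simple arithmetic. As a rough sketch, assuming only one request is outstanding per device at a time (no NCQ, no queueing; this is my simplification, not something gstat reports), a device with a service time of t milliseconds can complete at most 1000/t operations per second:

```python
def max_iops(latency_ms):
    """Upper bound on operations per second with one outstanding request."""
    return 1000.0 / latency_ms


# Latencies taken from the gstat observations described in the text.
cases = [
    ("harddisk, streaming (zfs send)", 2.0),
    ("harddisk, heavy seeking", 20.0),
    ("USB stick, slow end", 0.8),
    ("USB stick, fast end", 0.5),
]

for label, ms in cases:
    print(f"{label}: {ms} msec -> at most {max_iops(ms):.0f} IO/s")
```

The 1600 operations per second seen on the stick sit between the 1250 and 2000 that this bound gives for 0.8 and 0.5 msec, and the 300 per harddisk during “zfs send” stay below the 500 that 2 msec would allow. A seeking disk at 20 msec is limited to about 50.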

This is the machine at home which takes care of my mail (incoming and outgoing SMTP, IMAP and webmail), runs a squid proxy and acts as a file server. There are not many users (just me and my wife) and there is no regular usage pattern for all those services. Because of this I did not do any benchmark to see how much time I can gain with various workloads (and I am not interested in artificial performance numbers for my webmail session, as the browsing experience is highly subjective in this case). For this system a 1 GB USB stick (which was just collecting dust before) seems to be a cheap way to improve the response time for frequently used small data. When I use the webmail interface now, my subjective impression is that it is faster. I am talking about listing emails (subject, date, sender, size) and displaying the content of some emails. FYI, my maildir storage has 849 MB with 35000 files in 91 folders.

Bot­tom line is: do not ex­pect a lot of band­width in­crease with this, but if you have a work­load which gen­er­ates ran­dom read re­quests and you want to de­crease the read latency, it could be a cheap solu­tion to add a (big) USB stick as a cache device.

Show­ing off some num­bers…

At work we have some per­form­ance prob­lems.

One application (not off-the-shelf software) is not performing well. The problem is that the design of the application is far from good (auto-commit is used, and because of this the Oracle DB is doing too many writes for what the application is supposed to do). The vendor of the application claims that our hardware is not fast enough, so I had to provide some numbers to show that this is not the case and that they need to improve the software, as it does not comply with the performance requirements they got before developing the application. While helping our DBAs with their performance analysis, I noticed that the filesystem where the DB and the application are located (ZFS, if someone is interested) is sometimes doing 1,200 IO (write) operations per second (to write about 100 MB). Yeah, that is a lot of IOPS our SAN is able to do! Unfortunately it is too expensive to buy for use at home. 🙁
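
To get a feeling for those two numbers, one can divide them. A back-of-the-envelope sketch, assuming the 100 MB were written within the same second as the 1,200 operations (I did not measure the individual write sizes) and taking 1 MB as 10^6 bytes:

```python
def avg_write_size_bytes(total_bytes, operations):
    """Average payload per write operation."""
    return total_bytes / operations


# Assumption: about 100 MB spread over about 1,200 writes in one second.
size = avg_write_size_bytes(100 * 10**6, 1200)
print(f"about {size / 1000:.0f} KB per write")
```

That works out to roughly 83 KB per write on average.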

Another application (nagios 3.0) was generating a lot of major faults (caused by a lot of fork()s for the checks). It runs on a SunFire V890, and the highest number of major faults (MF) per second I have seen on this machine was about 27,000. It never went below 10,000. On average it was maybe somewhere between 15,000 and 20,000. My Solaris desktop (an Ultra 20) generates maybe several hundred MF per second if a lot is going on (most of the time it does not generate many). Nobody can say the V890 is not used… 🙂 Oh, yes, I suggested enabling the nagios config setting for large sites; now the major faults are somewhere between 0 and 10,000 and the machine is not that stressed anymore. The next step is probably to have a look at the ancient probes (migrated from the Big Brother setup which was there several years before) and reduce the number of forks they do.