Last week my ZFS cache device, a USB memory stick, showed xxxM write errors. I got this stick for free as a promo, so I do not expect it to be of high quality (or to have wear-leveling or similar life-saving features). The stick survived about 9 months, during which it provided a nice speed-up for access to the corresponding ZFS storage pool. I replaced it with another stick which I also got for free as a promo. This new stick survived… one long weekend. It now has 8xxM write errors and the USB subsystem is not able to talk to it anymore. 30 minutes ago I issued a “usbconfig reset” to this device, which has still not finished. This leads me to the question whether such sticks are really that bad, or whether some problem crept into the USB subsystem.

If this is a problem with the memory stick itself, I should be able to reproduce it on a different machine with a different OS. I could test this with FreeBSD 8.1, Solaris 10u9, or Windows XP. What I need is an automated test. This rules out the Windows XP machine for me: I do not want to spend time searching for a suitable test that is available for free and can be run in an automated way. For FreeBSD and Solaris it probably comes down to using some disk-I/O benchmark (I think there are enough to choose from in the FreeBSD Ports Collection) and running it in a shell loop.
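
As a rough illustration of what such an automated test could look like, here is a minimal Python sketch (not one of the benchmarks from the Ports Collection). It writes random data to a scratch file, reads it back, and compares checksums. The function names, sizes, and the mount point in the usage note are assumptions made for illustration. Note that the read-back may be served from the OS cache rather than from the stick, so a clean run is weaker evidence than a failing one.

```python
import hashlib
import os
import tempfile


def write_verify_pass(directory, size_bytes=4 * 1024 * 1024, chunk=1024 * 1024):
    """Write random data to a scratch file in `directory`, read it back,
    and return True if the SHA-256 digests of both streams match."""
    written = hashlib.sha256()
    fd, path = tempfile.mkstemp(dir=directory, prefix="usbtest-")
    try:
        with os.fdopen(fd, "wb") as f:
            remaining = size_bytes
            while remaining > 0:
                block = os.urandom(min(chunk, remaining))
                f.write(block)
                written.update(block)
                remaining -= len(block)
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        read_back = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                read_back.update(block)
        return written.digest() == read_back.digest()
    finally:
        os.remove(path)  # always clean up the scratch file


def torture(directory, passes=10):
    """Run several write/verify passes and return the number of failed ones.
    An OSError (for example an I/O error from a dying device) counts as a failure."""
    failures = 0
    for _ in range(passes):
        try:
            if not write_verify_pass(directory):
                failures += 1
        except OSError:
            failures += 1
    return failures
```

With the stick mounted at a hypothetical `/mnt/usbstick`, a call such as `torture("/mnt/usbstick", passes=1000)` would play the role of the shell loop.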

    1. This is with an Intel ICH5 chipset. And the first stick worked for 9 months without problems, so I doubt this is an Intel vs. AMD issue.

  2. I’m inclined to believe it’s bad sticks, and not a bad USB stack.

    I use a SanDisk U3 Cruzer Micro 2 GB USB stick for the OS at home (FreeBSD 8.1-STABLE). It has survived many installworlds, as the system started with 6.something, and I like to fiddle with kernel options, so a lot of writes go to it every 6 weeks or so. This stick was also used for documents before being co-opted for OS installs.

    I also use a 4 GB JetFlash for L2ARC and swap in the above system. It has been running fine for several months now, and sits at 90% full all the time. No problems so far.

    However, 2x 2 GB Kingston USB sticks died on me within weeks of each other, and within a month of being put into use on a server. Even as L2ARC devices, they couldn’t keep up and would just drop off the USB controller. Trying to use them as simple floppy replacements didn’t work too well, either.

    I have another 1 GB stick at home that worked fine for about 3 months, and then just stopped being detected in Windows XP, Kubuntu 9.something, or FreeBSD 8.x.

    It’s very much hit-and-miss whether a USB stick is going to be fast (read or write) and how long it’s going to last. Unfortunately, neither price nor brand name seems to be a reliable indicator of a USB stick’s quality, endurance, or speed. 🙁

  3. Quote from http://en.wikipedia.org/wiki/Flash_memory:

    > Memory wear

    > Another limitation is that flash memory has a finite number of program-erase cycles (typically written as P/E cycles). Most commercially available flash products are guaranteed to withstand around 100,000 P/E cycles, before the wear begins to deteriorate the integrity of the storage.[7]


    It’s how the technology works.

    1. And because of this I thought the first stick was “finished”. But the second stick, which did not even survive a long weekend, was “fresh”.

  4. I connected the first USB memory stick which was failing to a Solaris 10u9 machine. I created a ZFS pool on it (to be able to detect silent corruption), and am running some torture tests.

    In the morning I had fsx running on it, but it did not use much of the space on the stick, so I switched to postmark.

    Postmark has now been running on it for about 4 hours to test for problems. So far ZFS has not detected any corruption.

    I will let it run the whole night, and if there are still no problems tomorrow, I will assume the stick is OK (as it will have seen much more traffic during this time than it would have seen in an entire week as a ZFS cache device on my machine at home).

  5. It seems that at least the sticks I have are not that bad.

    I tested the stick that failed first on Solaris 10u9 and the second stick on another FreeBSD machine, and neither shows any trace of problems there. The first stick, attached back to the machine which exhibited the problems initially, shows problems again.

    To determine whether it is the USB hardware or the FreeBSD USB kernel subsystem, I will update the other FreeBSD machine, which did not exhibit the problems, step by step to more recent versions of FreeBSD (binary search) until I encounter a problem with the USB stick (or arrive at the same FreeBSD version as the machine with the problems).

  6. In FreeBSD 8.0 I was using a 250GB USB Seagate drive as my mirror in ZFS. When I installed 8.1 I found my device could no longer sync to the internal disk. It always had write errors and reset the sync, over and over, never finishing.

    The drive works fine on 8.0 and on Linux, so maybe you are on to something…

  7. Did you consider writing to usb@ with your problem? I have the impression that your problem is a different one (and can maybe be solved with a little quirk entry). If you have not written to usb@ yet, I suggest you do. Provide them with a copy&paste of your USB-related dmesg output and of the error messages.

    I only get write errors sometimes, not always. And I certainly do not get sync resets.

  8. After a lot of testing with two machines, I am now at a point where I think the EHCI part of the ICH5 chipset of this machine is dying (the USB memory sticks still work correctly when attached to another machine).

  9. I’ve used two 1GB Kingmax Super Sticks on my desktop as cache devices (I run a RAID-Z2 on it). They both gave up within a few days of each other after some 6–7 months. So used this way, I’d say they are bad enough.
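
The binary search over FreeBSD versions mentioned in comment 5 can be sketched in Python as follows. This is only an illustration under stated assumptions: `versions` is a list ordered from oldest to newest, and `shows_errors` is a hypothetical callback that, in practice, means "update the machine to this version, rerun the stick test by hand, and report whether write errors appeared". It also assumes that once the problem appears in a version, it stays in all later ones.

```python
def first_bad(versions, shows_errors):
    """Return the earliest entry of `versions` for which shows_errors(entry)
    is true, or None if even the newest entry is fine.

    Assumes `versions` is ordered oldest to newest and that the problem,
    once introduced, persists in every later version."""
    # If the newest version is fine, there is nothing to search for.
    if not versions or not shows_errors(versions[-1]):
        return None
    lo, hi = 0, len(versions) - 1  # invariant: versions[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if shows_errors(versions[mid]):
            hi = mid       # problem already present: look at older versions
        else:
            lo = mid + 1   # still fine: the problem came in later
    return versions[lo]
```

A result of `None` corresponds to the case of arriving at the same version as the problematic machine without seeing errors, which would point at the hardware rather than the USB kernel subsystem. Each call to `shows_errors` costs one update plus one test run, so the halving keeps the number of updates small.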

