Last week my ZFS cache device – a USB memory stick – showed xxxM write errors. I got this stick for free as a promo, so I do not expect it to be of high quality (or to have wear-leveling or similar life-saving features). The stick survived about 9 months, during which it provided a nice speed-up for access to the corresponding ZFS storage pool. I replaced it with another stick, which I also got for free as a promo. This new stick survived… one long weekend. It now shows 8xxM write errors and the USB subsystem is no longer able to talk to it. 30 minutes ago I issued a “usbconfig reset” to this device, which has still not finished. This leads me to the question whether such sticks are really that bad, or whether some problem has crept into the USB subsystem.
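For reference, the reset in question looks roughly like this. This is only a sketch: the device address ugen0.2 is a placeholder (the list subcommand shows the real one), and the guard just makes it safe to try on systems without usbconfig.

```shell
#!/bin/sh
# Sketch only: reset a single USB device on FreeBSD.
# ugen0.2 is a placeholder address, not the real device.
if ! command -v usbconfig >/dev/null 2>&1; then
    echo "usbconfig not available; skipping"
    exit 0
fi
usbconfig list                # show attached devices and their addresses
usbconfig -d ugen0.2 reset    # force a re-enumeration of that device
```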
If this is a problem with the memory stick itself, I should be able to reproduce it on a different machine with a different OS. I could test this with FreeBSD 8.1, Solaris 10u9, or Windows XP. What I need is an automated test. This rules out the Windows XP machine for me; I do not want to spend time searching for a suitable test which is available for free and can be run unattended. For FreeBSD and Solaris it probably comes down to picking some disk‑I/O benchmark (I think there are enough to choose from in the FreeBSD Ports Collection) and running it in a shell loop.
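A minimal version of such a shell loop could look like the following sketch. It is not any particular Ports benchmark, just a write-then-verify loop: TARGET is an assumed mount point of the stick under test (it defaults to a temp directory so the sketch is safe to try), and a real soak test would raise the pass count and file size considerably.

```shell
#!/bin/sh
# Write/verify torture loop (sketch). Point TARGET at the stick's mount
# point; it defaults to a temporary directory for a safe dry run.
TARGET=${TARGET:-$(mktemp -d)}
i=0
while [ "$i" -lt 5 ]; do                 # raise the bound for a real soak test
    # write 1 MB of random data to the device under test
    dd if=/dev/urandom of="$TARGET/t.bin" bs=1024 count=1024 2>/dev/null
    before=$(cksum "$TARGET/t.bin" | awk '{print $1}')
    sync                                  # push the data out to the device
    after=$(cksum "$TARGET/t.bin" | awk '{print $1}')
    if [ "$before" != "$after" ]; then
        echo "MISMATCH on pass $i"
        exit 1
    fi
    i=$((i + 1))
done
echo "completed $i passes without mismatch"
```

Running this overnight from cron (or just in a detached terminal) gives the kind of unattended test described above.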
Yes they are – in my experience, most sticks won’t survive that long.
Curious if you are running AMD or Intel? I have heard similar stories about several AMD chipsets.
This is with an Intel ICH5 chipset. And the first stick worked for 9 months without problems, so I doubt this is an Intel vs. AMD issue.
Try the fsx test from the src tools – it performs a random mix of various I/O operations.
I’m inclined to believe it’s bad sticks, and not bad USB stack.
I use a SanDisk U3 Cruzer Micro 2 GB USB stick for the OS at home (FreeBSD 8.1‑STABLE). It has survived many installworlds, as the system started with 6.something, and I like to fiddle with kernel options, so there are a lot of writes going to it every 6 weeks or so. This stick was also used for documents before being co-opted for OS installs.
I also use a 4 GB JetFlash for L2ARC and swap in the above system. Been running fine for several months now, and sits at 90% full all the time. No problems so far.
However, 2x 2 GB Kingston USB sticks died on me within weeks of each other, and within a month of being put into use on a server. Even as L2ARC devices, they couldn’t keep up and would just drop off the USB controller. Trying to use them as simple floppy replacements didn’t work too well, either.
I have another 1 GB stick at home that worked fine for about 3 months, and then just stopped being detected in Windows XP, Kubuntu 9.something, or FreeBSD 8.x.
It’s very much hit-and-miss whether a USB stick is going to be fast (at reading or writing) and how long it’s going to last. Unfortunately, neither price nor brand name seems to be a reliable indicator of a USB stick’s quality, endurance, or speed. 🙁
Quoting http://en.wikipedia.org/wiki/Flash_memory:
> Memory wear
> Another limitation is that flash memory has a finite number of program-erase cycles (typically written as P/E cycles). Most commercially available flash products are guaranteed to withstand around 100,000 P/E cycles, before the wear begins to deteriorate the integrity of the storage.[7]
Dudes!!
It’s how the technology works
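The quoted P/E limit also makes the observed failure pattern plausible with a quick back-of-envelope calculation. Both numbers below are idealized assumptions (the 2 GB capacity is just an example size): with perfect wear-leveling the writes are spread over the whole stick, but a cheap stick without wear-leveling keeps rewriting the same hot blocks, each of which is used up after about 100,000 rewrites – exactly the load a cache device produces.

```shell
#!/bin/sh
# Back-of-envelope only: idealized endurance, assuming perfect wear-leveling.
capacity_gb=2        # example stick size (assumption)
pe_cycles=100000     # guaranteed P/E cycles, per the quote above
total_tb=$((capacity_gb * pe_cycles / 1000))
echo "ideal endurance: ${total_tb} TB written"
echo "without wear-leveling, a hot block dies after ${pe_cycles} rewrites"
```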
And because of this I thought the first stick was simply “finished”. But the second stick, which did not even survive a long weekend, was “fresh”.
I connected the first failing USB memory stick to a Solaris 10u9 machine. I created a ZFS pool on it (to be able to detect silent corruption), and am running some torture tests.
In the morning I had fsx running on it, but it did not fill up much of the stick, so I switched to postmark.
Postmark has now been running on it for about 4 hours to test for problems. So far ZFS has not detected any corruption.
I will let it run the whole night, and if there are still no problems tomorrow, I will assume the stick is OK (by then it will have seen much more traffic than it would in an entire week as a ZFS cache device on my machine at home).
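The Solaris-side setup can be sketched as below. The device name c2t0d0 is a placeholder for the stick, and the guard makes the sketch safe to run on machines without ZFS. The point of putting ZFS on the stick is its end-to-end checksums: a scrub re-reads everything, so silent corruption shows up as non-zero CKSUM counters.

```shell
#!/bin/sh
# Sketch of the torture-test setup: ZFS on the stick so silent corruption
# is detectable. c2t0d0 is a placeholder device name, not the real one.
if ! command -v zpool >/dev/null 2>&1; then
    echo "zpool not available; skipping"
    exit 0
fi
zpool create stickpool c2t0d0     # pool on the whole stick
zfs create stickpool/torture      # filesystem for fsx/postmark to hammer
zpool scrub stickpool             # re-read and checksum-verify everything
zpool status -v stickpool         # non-zero CKSUM counters = silent corruption
```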
It seems that at least the sticks I have are not that bad.
I tested the first failed stick on Solaris 10u9 and the second stick on another FreeBSD machine, and neither shows any trace of problems there. Attached back to the machine which exhibited the problems initially, the first stick shows problems again.
To determine whether it is the USB hardware or the FreeBSD USB kernel subsystem, I will update the other FreeBSD machine (the one which did not exhibit the problems) step by step to more recent versions of FreeBSD, binary-searching until I either encounter a problem with the USB stick or arrive at the same FreeBSD version as the machine with the problems.
In FreeBSD 8.0 I was using a 250 GB USB Seagate drive as my mirror in ZFS. When I installed 8.1, I found the device could no longer sync to the internal disk. It kept getting write errors and restarting the sync, over and over, never finishing.
The drive works fine on 8.0 or on Linux, maybe you are on to something…
Did you consider writing to usb@ about your problem? I have the impression that your problem is a different one (and can maybe be solved with a little quirk entry). If you didn’t write to usb@, I suggest you do. Provide them a copy&paste of your USB-related dmesg output and of the error messages.
I only get write errors sometimes, not always. And I definitely do not see sync resets.
After a lot of testing with two machines, I am now at the point where I think the EHCI part of this machine’s ICH5 chipset is dying (the USB memory sticks still work correctly when attached to another machine).
I’ve used two 1 GB Kingmax Super Sticks on my desktop as cache devices (I run a raidz2 on it). They both gave up a few days apart after some 6–7 months. So, used this way, I’d say they are bad enough.