At work we are dealing with a slow application. The vendor of this custom application insists that ZFS (Solaris 10u8) and the Oracle DB are badly tuned for it. Part of their proposed tuning is to limit the ARC to 1 GB (our max size is 24 GB on this machine). One problem we see is that there are many write operations (rounded values: 1k ops for up to 100 MB) and the DB complains that the log writer is not able to write out the data fast enough. At the same time our database admins see a lot of commits and/or rollbacks, so the archive log grows very fast to 1.5 GB. The funny thing is… the performance tests are supposed to cover only SELECTs and small UPDATEs.
I proposed reducing zfs_txg_timeout from its default value of 30 seconds to a few seconds (since no reboot is needed, unlike for the max ARC size, this can be done quickly instead of waiting several minutes for the boot checks of the M5000). The first try was to reduce it to 5 seconds, and it improved the situation. The DB still complained about not being able to write out the logs fast enough, but less often than before. To make the vendor happy we reduced the max ARC size and tested again. At first we did not see any complaints from the DB anymore. This looked strange to me, because my understanding of the ARC (and the description of the max size setting in the ZFS Evil Tuning Guide) suggests that limiting it should not have this effect. The machine was also rebooted for this change, though, so there could be another explanation.
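As a rough sketch of the two settings discussed above: the values (5 seconds, 1 GB) are the ones from the text, while the exact /etc/system line and the mdb invocation follow the style used in the ZFS Evil Tuning Guide and are assumptions that should be checked against the guide for the Solaris release in use. The script only prints what would be set; it does not change anything.

```shell
#!/bin/sh
# Sketch only: prints the settings, it does not apply them.
arc_max_bytes=$((1 * 1024 * 1024 * 1024))   # 1 GB ARC cap requested by the vendor
txg_timeout=5                                # seconds, down from the default of 30

# Line for /etc/system (assumed syntax; takes effect only after a reboot):
printf 'set zfs:zfs_arc_max = 0x%x\n' "$arc_max_bytes"

# Command to change the txg timeout on the running kernel
# (assumed syntax; would have to be run as root on the Solaris host):
printf 'echo zfs_txg_timeout/W0t%d | mdb -kw\n' "$txg_timeout"
```

Printing the commands first makes it easy to review them before touching a production machine. The timeout change is the cheap experiment, because it needs no reboot.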
Luckily we found out that our testing infrastructure had a problem, so only a fraction of the performance test was actually run. This morning the people responsible for it made some changes, and now the DB is complaining again.
This is what I expected. To make sure I fully understand the ARC, I had a look at the theory behind it from the IBM research center. There are some papers which explain how to extend a cache that uses the LRU replacement policy into an ARC with a few lines of code. It looks like it would be worthwhile to check in which places FreeBSD uses an LRU policy and to test whether an ARC would improve the cache hit rate there. From reading the paper it looks like there are a lot of places where this should be the case. The authors also provide two adaptive extensions to the CLOCK algorithm (used in the VM subsystem of various operating systems), which indicates that such an approach could be beneficial for a VM system. I already contacted Alan (the FreeBSD one) and asked whether he knows about it and whether it could be beneficial for FreeBSD.
I just committed a patch which makes WITH_CTF usable now.
Yes, you could use it before, but you had to remember to specify it at each build. Now you can add it to your kernel config (via makeoptions), and then you can forget about it.
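A minimal sketch of what this could look like in a custom kernel config file. The DEBUG line is an assumption on my side (CTF data is generated from debug symbols), and any other DTrace-related options a kernel may need are left out here.

```
# Fragment of a kernel config file, e.g. a copy of GENERIC (sketch only)
makeoptions	DEBUG=-g	# assumption: build with debug symbols
makeoptions	WITH_CTF=1	# generate CTF data on every kernel build
```

With this in the config, WITH_CTF no longer has to be passed on the make command line for each build.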
Thanks to jhb and imp for review and suggestions.
After putting the disks of the 7-stable system which exhibited stability problems into a completely different system (it is a rented root-server, not our own hardware), the system has now survived more than a day (and still no trace of problems) with the UFS setup. Previously it would crash after some minutes.
The ZFS setup on the changed hardware had a problem during the previous night (as always after my ZFS-related changes on this machine). However, on this machine I had changed all locks in ZFS from shared locks to exclusive locks (this extended the uptime from 4–6 hours to “until I rebooted the morning after because of hanging processes”), so that may be the cause. I do not know yet whether we will test the ZFS setup with the pure 7-stable source we use now (the goal was to get back a stable system, not to play around with unrelated stuff).
It looks like some kind of hardware problem was uncovered by updating from 7.1 to 7.2 (and subsequently to 7-stable). The new machine has a completely different chipset, a new CPU, RAM, PSU and so on, so I do not really know what caused this. However, the fact that the previous system did not recognize the CPU after it was replaced with a bigger one, together with the observation that only shared locks with a specific usage pattern were affected, makes me suspect missing microcode updates.
During the last weeks I identified 64 patches for ZFS which are in 8-stable but not in 7-stable. I had a deeper look at 56 of them, and most of those are now committed to 7-stable. The ones I did not commit are not applicable to 7-stable (infrastructure differences between 8 and 7).
Unfortunately this did not solve the stability problems I have on a 7-stable system.
I also committed a diff reduction patch (between 8-stable and 7-stable) which also fixed some not so harmless mismerges (a memory leak and the same mutex being initialized twice in different places). No idea yet if it helps in my case.
I also want to merge the new ARC reclaim logic from head to 8-stable and 7-stable. Maybe I can do this tomorrow.
Currently I run a test with a kernel where the shared locks for ZFS are switched to exclusive locks.
Due to the problems with a 7-stable machine, I had a look at some unmerged fixes for ZFS (58 changes not merged).
I backported some of those changes from 8-stable to 7-stable and have them running on one 7-stable machine. I would like to get some more feedback on it (even an “it works for me” would be great). The main part of this change is that the FreeBSD taskqueue is now used instead of the OpenSolaris one (plus some other changes which may improve the ZFS experience).
It would also be nice if someone could have a look at the FIRST_THREAD_IN_PROC part. Can there be more than one thread at this place (I do not think so), and should I use FOREACH_THREAD_IN_PROC instead?
How to apply:
- cd /usr/src/
- fetch http://www.Leidinger.net/FreeBSD/test/releng7_zfs_merge3.diff
- fetch http://www.Leidinger.net/FreeBSD/test/opensolaris_taskq.c
- fetch http://www.Leidinger.net/FreeBSD/test/taskq.h
- mv taskq.h sys/cddl/contrib/opensolaris/uts/common/sys/taskq.h
- mv opensolaris_taskq.c sys/cddl/compat/opensolaris/kern/opensolaris_taskq.c
- patch -p 0 --quiet < releng7_zfs_merge3.diff
- ignore the 2 .rej files
- rm -f sys/cddl/compat/opensolaris/sys/taskq_impl.h*
- rm -f sys/cddl/compat/opensolaris/sys/taskq.h*
- rm -f sys/cddl/contrib/opensolaris/uts/common/os/taskq.c*
- rebuild kernel
I do not list all 16 of the 58 outstanding patches covered here; a detailed list can be found on the stable and fs mailing lists.