Linuxulator DTrace probes committed to current

A while ago I committed the linuxulator DTrace probes I talked about earlier. I waited a little before making this announcement to make sure I had not broken anything. Nobody has complained so far, so I assume nothing obviously bad crept in.

The >500 probes I committed do not cover the entire linuxulator, but they are a good start. Adding new ones is straightforward; if someone is interested in a junior-kernel-hacker task, this would be one. Just ask me (or ask on emulation@), and I can guide you through it.

Linuxulator progress

This weekend I made some progress in the linuxulator:

  • I MFCed the reporting of some linux-syscalls to 9-stable and 8-stable.
  • I updated my linuxulator-dtrace patch to a recent -current. I have already compiled it on i386, and arundel@ has compiled it on amd64. I counted more than 500 new DTrace probes. Now that DTrace rescans for SDT probes when a kernel module is loaded, there is no kernel panic anymore when the linux module is loaded after the DTrace modules and you want to use DTrace. I will try to commit this on the morning of a day on which I can fix things during the day, in case problems show up which I did not notice during my testing.
  • I created a PR for portmgr@ to repocopy a new linux_base port.
  • I set the expiration date of linux_base-fc4 (only used by 7.x, and upstream is way past its EoL) and all dependent ports. It is set to the EoL of the last 7.x release, which cannot use a later linux_base port. I also added a comment which explains that the date is the EoL of the last 7.x release.

DTrace probes for the Linuxulator updated

If someone had a look at the earlier post about DTrace probes for the Linuxulator: I updated the patch at the same place. The difference from the previous one is that some D-scripts are now fixed to do what I meant, especially the ones which provide statistics output.

New DTrace probes for the linuxulator

I forward-ported my DTrace probes for the FreeBSD linuxulator from a 2008-current to a recent -current. I have not covered the complete FreeBSD linuxulator yet, but a big part is already done. I can check the major locks in the linuxulator, trace futexes, and I have a D-script which yells at a lot of errors which could happen but should not.

Some of my D-scripts need some changes, as real-world testing showed that they do not really work as expected. They can get overwhelmed by the amount of speculation and dynamic variables (error message: dynamic variable drops with non-empty dirty list). For the dynamic variables problem I found a discussion on the net with some suggestions. For the speculation part I expect similar tuning possibilities.
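As a rough sketch of what such tuning could look like: dtrace(1) accepts run-time options via -x, among them dynvarsize and cleanrate for dynamic variables, and nspec and specsize for speculations. The wrapper below only assembles such a command line. The function names, the DRY_RUN switch and all values are illustrative assumptions, not settings I have verified against the linuxulator D-scripts.

```shell
#!/bin/sh
# Sketch only: run a D-script with larger dynamic-variable and speculation
# buffers. All default values below are guesses for illustration.

# Print the -x tuning options for dtrace(1).
#   dynvarsize: space available for dynamic variables
#   cleanrate:  how often freed dynamic variables are reclaimed
#   nspec:      number of speculation buffers
#   specsize:   size of each speculation buffer
dtrace_tuning_args() {
    printf '%s' "-x dynvarsize=${DYNVARSIZE:-16m} -x cleanrate=${CLEANRATE:-333hz} -x nspec=${NSPEC:-8} -x specsize=${SPECSIZE:-4m}"
}

# Usage: run_tuned script.d
# With DRY_RUN=1 the command is only printed, not executed.
run_tuned() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Unquoted on purpose: the options are split into separate words.
        echo dtrace $(dtrace_tuning_args) -s "$1"
    else
        dtrace $(dtrace_tuning_args) -s "$1"
    fi
}
```

Setting DRY_RUN=1 before calling run_tuned shows the resulting command line without touching the kernel, which is handy while experimenting with the values.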

Unfortunately the D-script which checks the internal locks fails to compile. It seems there is a little misunderstanding on my side of how the D language is supposed to work.

I will try to find some time later to have a look at those problems.

During my development I stumbled over some generic DTrace problems with the SDT provider I use for my probes:

  • If you load the Linux module after the SDT module, your system will panic as soon as you want to access some probes, e.g. “dtrace -l” will panic the system. Loading the Linux module before the SDT module prevents the panic.
  • Unloading the SDT module while the Linux module with the SDT probes is still loaded panics the system too. Do not unload the Linux module if you run with my patch.

According to avg@ those are known problems, but I think nobody is working on them. This is bad, because it means I cannot commit my current patchset.
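Until those SDT problems are fixed, the safe load order from the list above could be scripted roughly like this. It is a minimal sketch: the function name is made up, and I assume the modules are called linux and sdt and that kldstat(8) supports -q together with -n on your system.

```shell
#!/bin/sh
# Sketch only: make sure the Linux module is loaded before the SDT module.
# Assumed module files: linux.ko and sdt.ko.

load_in_safe_order() {
    if ! kldstat -q -n linux.ko; then
        # Loading linux after sdt is the order that panics, so refuse.
        if kldstat -q -n sdt.ko; then
            echo "sdt is already loaded; not loading linux after it" >&2
            return 1
        fi
        kldload linux || return 1
    fi
    # linux is present now, so sdt can follow (skip if already loaded).
    kldstat -q -n sdt.ko || kldload sdt
}
```

The function never unloads anything, since unloading is exactly what triggers the second panic described above.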

If someone wants to try the new DTrace probes for the linuxulator, feel free to go to http://www.Leidinger.net/FreeBSD/current-patches/ and download linuxulator-dtrace.diff. I do not offer a working hyperlink here on purpose: the SDT bugs can hurt if you are not careful, and I want to make the use of this patch a strong opt-in because of this. If the patch hurts you, it is your fault; you have been warned.

Understanding latency

Brendan Gregg of Sun/Oracle fame wrote a good explanation of how to visualize latency to get a better understanding of what is going on (and as such of how to solve bottlenecks). I have seen all this already in various posts in his blog and in the Analytics package in an OpenStorage presentation, but the ACM article summarizes it very well.

Unfortunately Analytics is AFAIK not available in OpenSolaris, so we cannot go out and adapt it for FreeBSD (which would probably require porting/implementing some additional DTrace stuff/probes). I am sure something like this would be very interesting to all those companies which use FreeBSD in an appliance (regardless of whether it is a storage appliance like NetApp, a network appliance like a Cisco/Juniper router, or anything else which has to perform well).