Com­plete net­work loss on Sol­ar­is 10u10 CPU 2012-​10 on vir­tu­al­ized T4-​2

The prob­lem I see at work: A T4-​2 with 3 guest LDOMs, vir­tu­al­ized disks and net­works lost the com­plete net­work con­nectiv­ity “out of the blue” once, and maybe “sporad­ic” dir­ectly after a cold boot. After a lot of dis­cus­sion with Or­acle, I have the im­pres­sion that we have two prob­lems here.

1st prob­lem:
Total network loss of the machine (no zone, no guest LDOM and not even the primary LDOM was able to receive or send IP packets). This happened once. No idea how to reproduce it. In the logs we see the message “[ID 920994 kern.warning] WARNING: vnetX: exceeded number of permitted handshake attempts (5) on channel xxx”. According to Oracle this is supposed to be fixed in 148677-01, which will come with Solaris 10u11. They suggested using a vsw interface instead of a vnet interface on the primary domain to at least lower the probability of this problem hitting us. They were not able to tell us how to reproduce the problem (it seems to be a race condition, at least that is my impression from the description of the Oracle engineer handling the SR). Only a reboot solved the problem. I was told we are the only client who reported this kind of problem; the patch is based upon an internal bug report from internal tests.

2nd prob­lem:
After cold boots some­times some ma­chines (not all) are not able to con­nect to an IP on the T4. A re­boot helps, as does re­mov­ing an in­ter­face from an ag­greg­ate and dir­ectly adding it again (see be­low for the sys­tem con­fig). To try to re­pro­duce the prob­lem, we did a lot of warm re­boots of the primary do­main, and the prob­lem nev­er showed up. We did some cold re­boots, and the prob­lem showed up once.

In case someone else sees one of those prob­lems on his ma­chines too, please get in con­tact with me to see what we have in com­mon to try to track this down fur­ther and to share info which may help in maybe re­pro­du­cing the prob­lems.

Sys­tem setup:

  • T4-​2 with 4 HBAs and 8 NICs (4 * igb on-​board, 4 * nxge on ad­di­tion­al net­work card)
  • 3 guest LDOMs and one io+control do­main (both in the primary do­main)
  • the guest LDOMs use SAN disks over the 4 HBAs
  • the primary do­main uses a mirrored zpool on SSDs
  • 5 vswitch in the hy­per­visor
  • 4 ag­greg­ates (aggr1 – aggr4 with L2-​policy), each one with one igb and one nxge NIC
  • each ag­greg­ate is con­nec­ted to a sep­ar­ate vswitch (the 5th vswitch is for machine-​internal com­mu­nic­a­tion)
  • each guest LDOM has three vnets, each vnet connected to a vswitch (1 guest LDOM has aggr1+2 only for zones (via vnets), 2 guest LDOMs have aggr3+4 only for zones (via vnets), and all LDOMs have aggr2+3 (via vnets) for global-zone communication; all LDOMs are additionally connected to the machine-internal-only vswitch via the 3rd vnet)
  • primary do­main uses 2 vnets con­nec­ted to the vswitch which is con­nec­ted to aggr2 and aggr3 (con­sist­ency with the oth­er LDOMs on this ma­chine) and has no zones
  • this means each entity (primary domain, guest LDOMs and each zone) has two vnets, and those two vnets are configured in a link-based IPMP setup (vnet-linkprop=phys-state)
  • each vnet has VLAN tag­ging con­figured in the hy­per­visor (with the zones be­ing in dif­fer­ent VLANs than the LDOMs)

The change proposed by Oracle is to replace the 2 vnet interfaces in the primary domain with 2 vsw interfaces (which means doing the VLAN tagging in the primary domain directly instead of in the vnet config). To have IPMP working, this requires vsw-linkprop=phys-state. We have two systems with the same setup; on one system we already changed this and it is working as before. As we don’t know how to reproduce the 1st problem, we don’t know whether it is fixed, or how likely we are to be hit by it again.

Ideas /​ sug­ges­tions /​ info wel­come.

In­com­pat­ible WP plu­gins

The geosmart plugin is incompatible with the one-time-password (OTP) plugin of WordPress. The problem is that the OTP plugin does not display the challenge on the login page anymore when the geosmart plugin is activated.

A workaround may be to make sure the geosmart plugin does not do anything on the login page, but this incompatibility could also cause problems somewhere else.

The problem could be related to the way the geosmart plugin uses jQuery. I found a bug report for OTP where the problem was the jQuery handling in another plugin. The specific problem mentioned there does not seem to be the same as in the geosmart plugin, at least from the very quick look I had.

So… for now I disabled the geosmart plugin. While it was active, most of the time I guessed the sequence number right, but sometimes I did not.

Email app from An­droid 3.1 in An­droid 3.2?

As pre­vi­ously re­por­ted, I tried the up­date to An­droid 3.2 on my Tab and was not happy about the new EMail app. At the week­end I had a little bit of time, so I tried to get the Email.apk from An­droid 3.1 in­to An­droid 3.2.

Long story short, I failed.

TitaniumBackup PRO was restoring for hours (the option to migrate from a different ROM version was enabled) until I killed the app, and it did not get anywhere (I just emailed their support to ask whether I did something completely stupid, or if this is a bug in TB). And a copy by hand into /system/apps did not work (the app fails to start).

Ideas wel­come.

Strange per­form­ance prob­lem with the IBM HTTP Serv­er (mod­i­fied apache)

Recently we had a strange performance problem at work. A web application was having slow response times from time to time and users complained. We did not see any unusual CPU/mem/swap usage on any involved machine. I generated heat-maps from performance measurements and there were no obvious traces of slow behavior. We did not find any reason why the application should be slow for clients, but obviously it was.

Then someone men­tioned two re­cent apache DoS prob­lems. Num­ber one – the cook­ie hash is­sue – did not seem to be the cause, we did not see a huge CPU or memory con­sump­tion which we would ex­pect to see with such an at­tack. The second one – the slow reads prob­lem (no max con­nec­tion dur­a­tion timeout in apache, can be ex­ploited by a small re­ceive win­dow for TCP) – looked like it could be an is­sue. The slow read DoS prob­lem can be de­tec­ted by look­ing at the server-​status page.

What you would see on the server-​status page are a lot of work­er threads in the ‘W’ (write data) state. This is sup­posed to be an in­dic­a­tion of slow reads. We did see this.

As our site is be­hind a re­verse proxy with some kind of IDS/​IPS fea­ture, we took the re­verse proxy out of the pic­ture to get a bet­ter view of who is do­ing what (we do not have X-​Forwarded-​For con­figured).

At this point we still noticed a lot of connections in the ‘W’ state from the rev-proxy. This was strange, it was not supposed to do this. After restarting the rev-proxy (while the clients went directly to the webservers) we still had those ‘W’ entries in the server-status. This was getting really strange. And to add to this, the duration of the ‘W’ state from the rev-proxy showed that this state had been active for several thousand seconds. Ugh. WTF?

Ok, next step: killing the offenders. First I verified in the list of connections in the server-status (extended-status is activated) that all worker threads with the rev-proxy connection of a given PID were in this strange state and no client request was active. Then I killed this particular PID. I wanted to repeat this until no such strange connections remained. Unfortunately I arrived at PIDs which were listed in the server-status (even after a refresh), but which did not exist in the OS. That is bad. Very bad.

So the next step was to move all cli­ents away from one web­serv­er, and then to re­boot this web­serv­er com­pletely to be sure the en­tire sys­tem is in a known good state for fu­ture mon­it­or­ing (the big ham­mer ap­proach).

As we did not know if this strange state was due to some kind of mis-​administration of the sys­tem or not, we de­cided to have the rev-​proxy again in front of the web­serv­er and to mon­it­or the sys­tems.

We survived about one and a half days. After that all worker threads on all webservers were in this state. DoS. At this point we were sure there was something malicious going on (some days later our management showed us a mail from a company which had offered security consulting 2 months before, to make sure we do not get hit by a DDoS during the holiday season… a coincidence?).

Next step, verification of missing security patches (unfortunately it is not us who decides which patches we apply to the systems). What we noticed is that the rev-proxy was missing a patch for a DoS problem, and that for the webservers a new fixpack was scheduled to be released in the near future (as of this writing: it is available now).

Since we ap­plied the DoS fix for the rev-​proxy, we do not have a prob­lem any­more. This is not really con­clus­ive, as we do not really know if this fixed the prob­lem or if the at­tack­er stopped at­tack­ing us.

From reading what the DoS patch fixes, we would assume we should see some continuous traffic going on between the rev-proxy and the webserver, but there was nothing when we observed the strange state.

We are still not al­lowed to ap­ply patches as we think we should do, but at least we have a bet­ter mon­it­or­ing in place to watch out for this par­tic­u­lar prob­lem (ac­tiv­ate the ex­ten­ded status in apache/​IHS, look for lines with state ‘W’ and a long dur­a­tion (column ‘SS’), raise an alert if the dur­a­tion is high­er than the max. possible/​expected/​desired dur­a­tion for all pos­sible URLs).
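The check described above could be sketched roughly like this. It is only an illustration, not what we actually run: it assumes the extended server-status page is an HTML table with header cells named “M” and “SS” (as in stock Apache; I have not verified every IHS version), and the function name, the threshold and the sample rows are made up.

```python
from html.parser import HTMLParser


class _TableRows(HTMLParser):
    """Collect every table row of a page as a list of cell texts."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one list of cell strings per <tr>
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th") and self.rows:
            self.rows[-1].append("")
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag in ("td", "th", "tr", "table"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self.rows[-1][-1] += data


def stuck_writers(status_html, max_seconds):
    """Return the rows in state 'W' whose 'SS' value exceeds max_seconds."""
    parser = _TableRows()
    parser.feed(status_html)
    parser.close()
    header = None
    hits = []
    for row in parser.rows:
        cells = [cell.strip() for cell in row]
        if "M" in cells and "SS" in cells:
            header = cells              # the row naming the columns
            continue
        if header is None or len(cells) != len(header):
            continue                    # legend tables and other rows
        record = dict(zip(header, cells))
        if (record["M"] == "W" and record["SS"].isdigit()
                and int(record["SS"]) > max_seconds):
            hits.append(record)
    return hits


# Made-up sample with fewer columns than a real status page.
SAMPLE = """
<table><tr><th>Srv</th><th>PID</th><th>M</th><th>SS</th><th>Client</th></tr>
<tr><td><b>0-0</b></td><td>4711</td><td><b>W</b></td><td>5123</td><td>10.0.0.5</td></tr>
<tr><td><b>1-0</b></td><td>4712</td><td><b>W</b></td><td>2</td><td>10.0.0.9</td></tr>
<tr><td><b>2-0</b></td><td>4713</td><td>_</td><td>9000</td><td>10.0.0.5</td></tr>
</table>
"""

for record in stuck_writers(SAMPLE, max_seconds=300):
    print(record["PID"], record["SS"], record["Client"])
```

In a real setup the HTML would be fetched from the server-status URL of each webserver, and max_seconds would be chosen per site from the longest request duration that is still considered normal.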

A phoronix bench­mark cre­ates a huge bench­mark­ing dis­cus­sion

The recent Phoronix benchmark which compared a release candidate of FreeBSD 9 with Oracle Linux Server 6.1 created a huge discussion on the FreeBSD mailing lists. The reason was that some people think the numbers presented there give a wrong picture of FreeBSD, partly because not all benchmark numbers are presented on the most prominent page (as linked above), but only at a different place. This gives the impression that FreeBSD is inferior in this benchmark, while it just puts the focus (for a reason, according to some people) on a different part of the benchmark (to be more specific, blogbench is doing disk reads and writes in parallel, FreeBSD gives higher priority to writes than to reads, FreeBSD 9 outperforms OLS 6.1 in the writes while OLS 6.1 shines with the reads, and only the reads are presented on the first page). Another complaint is that the article says the default install was used (in this case UFS as the FS), when it was not (ZFS was the FS).

The au­thor of the Phoronix art­icle par­ti­cip­ated in parts of the dis­cus­sion and asked for spe­cif­ic im­prove­ment sug­ges­tions. A FreeBSD com­mit­ter seems to be already work­ing to get some is­sues re­solved. What I do not like per­son­ally, is that the art­icle is not up­dated with a re­mark that some things presen­ted do not re­flect the real­ity and a retest is ne­ces­sary.

As there was much talk in the thread but not much ob­vi­ous activ­ity from our side to re­solve some is­sues, I star­ted to im­prove the FreeBSD wiki page about bench­mark­ing so that we are able to point to it in case someone wants to bench­mark FreeBSD. Oth­ers already chimed in and im­proved some things too. It is far from per­fect, some more eyes – and more im­port­antly some more fin­gers which add con­tent – are needed. Please go to the wiki page and try to help out (if you are afraid to write some­thing in the wiki, please at least tell your sug­ges­tions on a FreeBSD mailing­list so that oth­ers can im­prove the wiki page).

What we need too, is a wiki page about FreeBSD tun­ing (a first step would be to take the man-​page and con­vert it in­to a wiki page, then to im­prove it, and then to feed back the changes to the man-​page while keep­ing the wiki page to be able to cross ref­er­ence parts from the bench­mark­ing page).

I already said this in the thread about the Phoronix benchmark: everyone is welcome to improve the situation. Do not talk, write something. No matter if it is an improvement to the benchmarking page, tuning advice, or a tool which inspects the system and suggests some tuning. If you want to help in the wiki, create a FirstnameLastname account and ask a FreeBSD committer for write access.

A while ago (IIRC we have to think in months or even years) there was some framework for automatic FreeBSD benchmarking. Unfortunately the author ran out of time. The framework was able to install a FreeBSD system on a machine, run some specified benchmark (not many benchmarks were integrated), and then install another FreeBSD version to run the same benchmark, or reinstall the same version to run another benchmark. IIRC there was also some DB behind it which collected the results, and maybe there was even some way to compare them. It would be nice if someone could find some time to talk with the author to get the framework and set it up somewhere, so that we have a controlled environment where we can do our own benchmarks in an automatic and repeatable fashion with several FreeBSD versions.

An­droid Wish­list: The EMail-​App

Things I do not like with An­droid, and what I would ex­pect in­stead.

The EMail-​App

 This is about the nor­mal EMail App on a stock An­droid 3.1, not about GMail. The con­nec­tion is via IMAP.

  • By default the most recent EMail is on top. If I switch to oldest-first order, the oldest EMail is shown when entering a folder, but I want to see the most recent one. If I keep the most-recent-first order, I see the newest messages directly, but when I read the oldest unread message and delete it, the next older message is displayed instead of the next more recent one. KMail does this better (it is able to go “up” instead of only “down” to reach the more recent message when the most recent mail is sorted on top), but in KMail I can not select multiple messages as easily as with the normal EMail App (it seems I have to long-tap on a mail and then choose to select the EMail, and only then can I quickly tap on the very small area to the left of the subject to select more EMails).
  • Se­lect­ing mes­sages could also be im­proved. When I se­lect sev­er­al mes­sages, and then – by ac­ci­dent – enter a mes­sage and want to go back, the first “go back” un­se­lects all mes­sages and the second “go back” goes back to the folder-​view. I would like to go back dir­ectly to the folder-​view, in­stead of un­se­lect­ing all mes­sages.
  • The folder-view itself is also not nice. I have a lot of folders. IMAP subfolders are separated by a dot, e.g. FreeBSD.arch and FreeBSD.multimedia are the subfolders arch and multimedia in the folder FreeBSD. Each desktop EMail program I used so far was intelligent enough to create a folder-hierarchy out of this, with the possibility to collapse the display of all subfolders of FreeBSD (I have a lot there, not only those 2) into one entry. If I want to file something in my hierarchical folder structure, I have to scroll through a long list of folders, instead of just (auto-)opening (during drag&drop) the correct hierarchy.
  • I do not find an op­tion to tell that the App shall have a look at more than the de­fault in­box to look for new mes­sages. If I want to know if there is a new mes­sage in one of the oth­er folders, I must have a look in­to the folder(s).
  • It looks like I can only pro­duce TOFU-​replies, I have not found a way to do a prop­er interleaved-​style reply.
  • The er­ror mes­sages when set­ting up an out­go­ing (and I as­sume in­com­ing) serv­er are too brief for my taste. As a de­fault it is OK, but there should be an op­tion to en­able more verbose/​technical er­ror mes­sages for those which are able to un­der­stand them.
  • It does not optionally save sent mails to a specific (configurable) folder.
  • It does not allow signing messages cryptographically (at the moment I do not really care if it is via S/MIME or via PGP).
  • I also would like to tap with two fingers and have the text in-between selected (not only in the EMail-App).
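To illustrate the folder-hierarchy wish from the list above: turning dot-separated IMAP folder names into a collapsible tree takes very little code. This is just a sketch with made-up function names; it assumes “.” as the separator, as on my server (other IMAP servers may use a different hierarchy delimiter).

```python
def build_folder_tree(names, sep="."):
    """Turn flat names like 'FreeBSD.arch' into nested dictionaries."""
    tree = {}
    for name in names:
        node = tree
        for part in name.split(sep):
            node = node.setdefault(part, {})
    return tree


def render(tree, collapsed=frozenset(), prefix="", depth=0, sep="."):
    """Return display lines; folders listed in `collapsed` hide their children."""
    lines = []
    for name in sorted(tree):
        path = prefix + name
        children = tree[name]
        if children and path in collapsed:
            # One entry for the whole subtree, with the number of direct children.
            lines.append("  " * depth + "%s (+%d)" % (name, len(children)))
            continue
        lines.append("  " * depth + name)
        lines.extend(render(children, collapsed, path + sep, depth + 1, sep))
    return lines


folders = ["INBOX", "FreeBSD.arch", "FreeBSD.multimedia"]
tree = build_folder_tree(folders)
print("\n".join(render(tree)))
print("\n".join(render(tree, collapsed={"FreeBSD"})))
```

The second call shows the collapsed view: all FreeBSD subfolders become a single entry, which is what I am missing when scrolling through the folder list.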

For­cing a route in Sol­ar­is?

I have a little prob­lem find­ing a clean solu­tion to the fol­low­ing prob­lem.

A ma­chine with two net­work in­ter­faces and no de­fault route. The first in­ter­face gets an IP at boot time and the cor­res­pond­ing stat­ic route is in­ser­ted dur­ing boot in­to the rout­ing table without prob­lems. The second in­ter­face only gets an IP ad­dress when the shared-​IP zones on the ma­chine are star­ted, dur­ing boot the in­ter­face is plumbed but without any ad­dress. The net­works on those in­ter­faces are not con­nec­ted and the ma­chine is not a gate­way (this means we have a machine-​administration net­work and a production-​network). The stat­ic routes we want to have for the ad­dresses of the zones are not ad­ded to the rout­ing table, be­cause the next hop is not reach­able at the time the routing-​setup is done. As soon as the zones are up (and the in­ter­face gets an IP), a re-​run of the routing-​setup adds the miss­ing stat­ic routes.

Un­for­tu­nately I can not tell Sol­ar­is to keep the stat­ic route even if the next hop is not reach­able ATM (at least I have not found an op­tion to the route com­mand which does this).

One solu­tion to this prob­lem would be to add an ad­dress at boot to the in­ter­face which does not have an ad­dress at boot-​time ATM (prob­ably with the de­prec­ated flag set). The prob­lem is, that this sub­net (/​28) has not enough free ad­dresses any­more, so this is not an op­tion.

An­oth­er solu­tion is to use a script which re-​runs the routing-​setup after the zones are star­ted. This is a prag­mat­ic solu­tion, but not a clean solu­tion.
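Such a script could look roughly like the following sketch. It assumes that the routing-setup can be run as one command which exits with a non-zero status as long as a route could not be added. That is an assumption about the local setup, and the command path in the comment is only a placeholder.

```python
import subprocess
import time


def run_routing_setup(command):
    """Run the routing-setup command; True if it exited with status 0."""
    return subprocess.call(command) == 0


def retry_until_ok(action, attempts=30, delay=10.0, sleep=time.sleep):
    """Call action() until it returns True.

    Returns the number of the successful attempt, or None if all
    attempts failed. `sleep` can be replaced for testing.
    """
    for attempt in range(1, attempts + 1):
        if action():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None


# Example call (placeholder path, to be started after the zones are booted):
# retry_until_ok(lambda: run_routing_setup(["/path/to/routing-setup"]))
```

This does not make the approach any cleaner; it only keeps the re-run from depending on a fixed sleep after zone startup.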

As I understand the in.routed man-page, in.routed is not an option with the default config, because the machine shall not route between the networks, and shall not change the routing based upon RIP messages from other machines. Unfortunately I do not know enough about it to be sure, and I do not have the time to play around with this. I have seen some interesting options regarding this in the man-page, but playing around with them and sniffing the network to see what happens is not an option ATM. Anyone with a config/tutorial for this “do not broadcast anything, do not accept anything from outside”-case (if possible)?