The problem I see at work: a T4-2 with 3 guest LDOMs and virtualized disks and networks lost all network connectivity “out of the blue” once, and possibly also loses it sporadically directly after a cold boot. After a lot of discussion with Oracle, I have the impression that we have two problems here.
1. Total network loss of the machine (no zone, no guest LDOM and not even the primary LDOM was able to receive or send IP packets). This happened once, and I have no idea how to reproduce it. In the logs we see the message “[ID 920994 kern.warning] WARNING: vnetX: exceeded number of permitted handshake attempts (5) on channel xxx”. According to Oracle this is supposed to be fixed in patch 148677-01, which will come with Solaris 10u11. They suggested using a vsw interface instead of a vnet interface on the primary domain to at least lower the probability of this problem hitting us. They were not able to tell us how to reproduce the problem (it seems to be a race condition, at least that is my impression from the description given by the Oracle engineer handling the SR). Only a reboot solved the problem. I was told we are the only customer who reported this kind of problem; the patch is based upon an internal bug report from internal tests.
2. After cold boots, sometimes some machines (not all) are not able to connect to an IP on the T4. A reboot helps, as does removing an interface from an aggregate and directly adding it again (see below for the system config). To try to reproduce the problem, we did a lot of warm reboots of the primary domain, and the problem never showed up. We did some cold reboots, and the problem showed up once.
If someone else sees one of those problems on their machines too, please get in contact with me, so that we can see what we have in common, try to track this down further, and share info which may help in reproducing the problems.
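For anyone who wants to check their own logs for the handshake warning quoted above, here is a minimal sketch. The pattern is built only from the message text in this post; the function name `find_handshake_warnings` is made up for illustration, and the exact form of the interface name and channel id on other systems is an assumption.

```python
import re

# Pattern derived from the warning quoted in the post. The interface name
# (vnetX) and the channel id (xxx) vary, so both are captured.
# Assumption: the channel id contains no whitespace.
HANDSHAKE_RE = re.compile(
    r"WARNING: (?P<iface>vnet\d+): exceeded number of permitted "
    r"handshake attempts \((?P<attempts>\d+)\) on channel (?P<channel>\S+)"
)


def find_handshake_warnings(lines):
    """Return a list of (interface, channel) tuples, one per matching line."""
    hits = []
    for line in lines:
        match = HANDSHAKE_RE.search(line)
        if match:
            hits.append((match.group("iface"), match.group("channel")))
    return hits
```

The function takes any iterable of lines, so an open log file (on Solaris typically /var/adm/messages) can be passed in directly.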
The system config:
- T4-2 with 4 HBAs and 8 NICs (4 * igb on-board, 4 * nxge on additional network card)
- 3 guest LDOMs and one io+control domain (both roles in the primary domain)
- the guest LDOMs use SAN disks over the 4 HBAs
- the primary domain uses a mirrored zpool on SSDs
- 5 vswitches in the hypervisor
- 4 aggregates (aggr1 to aggr4, with L2-policy), each one with one igb and one nxge NIC
- each aggregate is connected to a separate vswitch (the 5th vswitch is for machine-internal communication)
- each guest LDOM has three vnets, each vnet connected to a vswitch (1 guest LDOM has aggr1+2 only for zones (via vnets), 2 guest LDOMs have aggr3+4 only for zones (via vnets), and all LDOMs have aggr2+3 (via vnets) for global-zone communication; all LDOMs are additionally connected to the machine-internal-only vswitch via the 3rd vnet)
- the primary domain uses 2 vnets, connected to the vswitches which are connected to aggr2 and aggr3 (for consistency with the other LDOMs on this machine), and has no zones
- this means each entity (primary domain, guest LDOMs and each zone) has two vnets, and those two vnets are configured in a link-based IPMP setup (vnet-linkprop=phys-state)
- each vnet has VLAN tagging configured in the hypervisor (with the zones being in different VLANs than the LDOMs)
The change proposed by Oracle is to replace the 2 vnet interfaces in the primary domain with 2 vsw interfaces (which means doing the VLAN tagging in the primary domain directly instead of in the vnet config). To have IPMP working, this requires vsw-linkprop=phys-state. We have two systems with the same setup; on one of them we already made this change, and it is working as before. As we do not know how to reproduce the 1st problem, we do not know if the problem is fixed or not, or how likely it is to hit us again.
Ideas / suggestions / info welcome.
The recent security incident triggered a discussion about how to secure ssh/gpg keys.
One way I want to focus on here (because it is the way I want to use at home) is to store the keys on a crypto card. I did some research for suitable crypto cards and found one which is called Feitian PKI Smartcard, and one which is called OpenPGP card. The OpenPGP card also exists in a USB version (basically a small version of the card is already integrated into a small USB card reader).
The Feitian card is reported to be able to handle RSA keys up to 2048 bits. It does not seem to handle DSA (or ECDSA) keys. The smartcard quick starter guide they have (the “Tuning smartcard file system” part) explains how to change the parameters of the card to store up to 9 keys on it.
The spec of the OpenPGP card says that it supports RSA keys up to 3072 bits, but there are reports that it is able to handle RSA keys up to 4096 bits (you need at least GPG 2.0.18 to handle keys that big on the crypto card). It looks to me like the card does not handle DSA (or ECDSA) keys. There are only slots for up to 3 keys on it.
If I go this way, I would also need a card reader. It seems a class 3 one (hardware PIN pad and display) would be the most “future-proof” way to go ahead. I found a Reiner SCT cyberJack secoder card reader, which is believed to be supported by OpenSC and seems to be a good balance between cost and features of the Reiner SCT card readers.
If anyone reading this can suggest a better crypto card (keys up to 4096 bits, more than 3 slots, and/or DSA/ECDSA support), or a better card reader, or has any practical experience with any of those components on FreeBSD, please add a comment.
I have a little problem finding a clean solution to the following problem.
A machine with two network interfaces and no default route. The first interface gets an IP at boot time, and the corresponding static route is inserted into the routing table during boot without problems. The second interface only gets an IP address when the shared-IP zones on the machine are started; during boot the interface is plumbed, but without any address. The networks on those interfaces are not connected and the machine is not a gateway (this means we have a machine-administration network and a production network). The static routes we want to have for the addresses of the zones are not added to the routing table, because the next hop is not reachable at the time the routing setup is done. As soon as the zones are up (and the interface gets an IP), a re-run of the routing setup adds the missing static routes.
Unfortunately I cannot tell Solaris to keep the static route even if the next hop is not reachable ATM (at least I have not found an option to the route command which does this).
One solution to this problem would be to add an address at boot to the interface which does not have an address at boot time ATM (probably with the deprecated flag set). The problem is that this subnet (/28) does not have enough free addresses left, so this is not an option.
Another solution is to use a script which re-runs the routing-setup after the zones are started. This is a pragmatic solution, but not a clean solution.
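To illustrate what such a re-run has to deal with, here is a minimal sketch of the decision involved: which static routes can be added with the addresses currently configured, and which have to wait until the zones are up. It assumes that “next hop reachable” simply means “the next hop lies in the subnet of an address configured on some interface”. The function name and all addresses in the usage note are made up for illustration; they are not from the real setup.

```python
import ipaddress


def split_routes(interface_addrs, routes):
    """Split static routes into (addable, deferred).

    interface_addrs: addresses with prefix length, e.g. ["192.0.2.10/24"]
    routes: (destination, next_hop) pairs given as strings

    Assumption: a next hop counts as reachable if it lies inside the
    subnet of at least one configured interface address.
    """
    networks = [ipaddress.ip_interface(addr).network for addr in interface_addrs]
    addable, deferred = [], []
    for destination, next_hop in routes:
        hop = ipaddress.ip_address(next_hop)
        if any(hop in net for net in networks):
            addable.append((destination, next_hop))
        else:
            deferred.append((destination, next_hop))
    return addable, deferred
```

With only a (hypothetical) administration address such as 192.0.2.10/24 configured, a route via 198.51.100.1 ends up in the deferred list; once a zone brings up 198.51.100.5/28 on the second interface, the same call reports it as addable.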
As I understand the in.routed man-page, in.routed is not an option with the default config, because the machine shall not route between the networks, and shall not change the routing based upon RIP messages from other machines. Unfortunately I do not know enough about it to be sure, and I do not have the time to play around with this. I have seen some interesting options regarding this in the man-page, but playing around with this and sniffing the network to see what happens is not an option ATM. Does anyone have a config/tutorial for this “do not broadcast anything, do not accept anything from outside” case (if it is possible at all)?
Tags: administration network, boot time, clean solution, default config, default route, network interfaces, pragmatic solution, routing table, static route, static routes
At the weekend a friend visited me. We had not seen each other for a long time. As we both studied computer science, parts of our discussion were of course technology related. Parts of the discussion were about current TVs and game consoles (he participated in the design of the PS3 CPU, so he is well aware of the technical limitations of the hardware the current game consoles use).
During our discussion we talked about the software limitations of such hardware.
Current TVs come, for example, with some predefined internet channels, but not with a real web browser. We think that for people who keep a TV for 10 years or longer (like our parents, and probably both of us too) this will result in a loss of features after some years, because those channels will get less attention or cease to exist at all. There is also no way to switch to alternatives then, except by buying a new TV (we expect that there will be no firmware update in such a case). With a real web browser this would not be an issue (it may be easier to enter URLs with a real keyboard than with a remote control, but let us take small steps here). Game consoles are a bit better in this regard, but there we have the problem that some websites are too memory-hungry (they do not put the user agent of the game console browsers into the same class as smart phones or tablet PCs… from the size aspect the consoles do not belong there, but from the memory and computing power aspect they are more similar).
I would expect that the TV stations do not want to have TVs with really good browsers, because then you may not need a TV station anymore. But this is what users would use if it were there.
Another deficit is that there is no mail program in game consoles and TVs. For writing mails you need a real keyboard, but for a quick check whether there is mail (e.g. X unread mails, or maybe even displaying the subject lines of the emails), or maybe for just reading without answering, a solution without a keyboard connected would already be enough.
I expect that console manufacturers do not want to spend money on something people are not willing to pay much for, or on something they cannot make money with (an email service from the console company would be another mail service in addition to the one for the PC and maybe the one for the smart phone… people do not need 10 email accounts, one is enough).
Another overlooked feature is some kind of VoIP+video feature (at least for the game consoles which optionally have a camera, but IMO this is also possible for the next generation of TVs with built-in webcams). At least the offerings from Sony and Microsoft are powerful enough to come with some kind of video conferencing software. It does not matter much if this is Skype or the Google version of this, or some other one (MS surely wants to use their own stuff); it just has to be one which is in widespread use to be adopted by the people. This does not need to be in HD, even a small video would already be much more than what is available ATM.
Basically I gave the answer to my question (the title of this posting) myself (except for the video conferencing stuff)… but on the other hand this would be something which could set a product apart from others. For the PS3 this may now be one of the things which could show up in the Homebrew scene, now that the security of the PS3 is compromised. For the Wii at least the email part could be easily done. The rest… would have to catch up in case something like this shows up for the PS3 and is used extensively.
Tags: buying a new tv, course technology, current tv, firmware update, game consoles, internet channels, mail program, small steps, software limitations, unread mails
After moving our secondary management site (our team is split up into 2 different locations) to a new building, we decided to clean up some things. One of those things involves moving the LDAP to a different machine (more or less a new server for the new site, it is independent regarding LDAP/homes/… from the primary site). While I am at it, I take the opportunity to move from DSEE5 to DSEE7 (my previous post about the DSEE6 migration was at the primary site). This time I took the package distribution instead of the zip distribution (the main reason is that I can get patch listings with an automatic tool, and the secondary management site has no disaster-recovery requirements for the applications… we will just set up a new secondary site somewhere else if necessary).
Here are my experiences with the installation instructions of DSEE7.
- The install instructions refer to the web interface for the DSEE7 management, but I have not seen anything which tells you that you first have to set up an application server (this was better in the DSEE6 instructions).
- When using the Glassfish application server which comes with Solaris 10 for the web interface, you will get an exception after deploying the dscc7.war, as it is using an outdated JVM. After some fighting and Googling, I found that I have to change the AS_JAVA value in /usr/appserver/config/asenv.conf to a more recent JVM, as it is pointing to the very outdated j2se 1.4.x. I pointed it to /usr/java (which is a symlink to the most recent version installed as a package). Instead of the original exception I now got another one (after a redirection in the web browser), saying that it cannot find the AntMain class (Glassfish uses ANT from /usr/sfw, this is the one which comes with Solaris 10 update 9). I tried with Java 5 instead of Java 6, but I get the same error. On the net there are some discussions about such errors (it is even a FAQ at the ANT site), but this Glassfish/DSEE7 thing is a black box for me, so what am I supposed to do here (I do not want to put the system into an unofficial state by installing my own ANT for Glassfish/DSEE7)?
It was not mentioned in the Appendix of the DSEE7 install instructions which explains how to install the .war in Glassfish that you have to change to a more recent JVM, and I still fight with the AntMain problem (hey Oracle, there is room for improvement in the product compatibility testing and documentation verification process).
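For the AS_JAVA part, the edit itself is a one-line change in asenv.conf. Below is a minimal sketch of how it could be scripted. It assumes the file uses shell-style `AS_JAVA="..."` assignments; the function name `set_as_java` and the sample content in the usage note are made up for illustration and are not copied from a real asenv.conf.

```python
import re

# Matches a shell-style AS_JAVA assignment line (assumed file format).
AS_JAVA_RE = re.compile(r"^([ \t]*AS_JAVA[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_as_java(conf_text, java_home):
    """Return conf_text with every AS_JAVA assignment pointing at java_home."""
    # A function is used as replacement so that backslashes in java_home
    # are not interpreted as regex escapes.
    new_text, count = AS_JAVA_RE.subn(
        lambda match: f'{match.group(1)}"{java_home}"', conf_text
    )
    if count == 0:
        raise ValueError("no AS_JAVA assignment found")
    return new_text
```

Given a text containing a line like `AS_JAVA="/usr/j2se"`, calling `set_as_java(text, "/usr/java")` returns the same text with only that line changed. This only covers the JVM setting; it does nothing about the AntMain problem.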
I will update this posting when I make some progress. For now I leave the web interface in the bad state it is in and concentrate on finishing the LDAP move to the new system (installing a DSEE on a backup system, configuring replication, switching the clients over to them). The web interface is independent enough to be handled later (hints welcome, that is the main reason why I write this posting in the middle of the work).
Tags: automatic tool, directory server, oracle directory, package distribution, solaris 10, web interface, zip distribution