Solaris: script to check whether various settings of a system comply with some pre-defined settings

Problem

If you set up a system, you want to make sure that it complies with a pre-defined config. You can do that with a configuration management system, but there are cases where it is useful to do that outside of this context.

Solution

I started writing the shell script below in 2008. Over time (until 2016) it grew into something which is able to output a report of over 1000 items. You can configure it via ${HOME}/.check_host.cfg and /etc/check_host.cfg (it checks both in this order; the first config found wins and the other is not read). Use option “-h” to see the usage text. Option “-n” suppresses the messages which help to fix issues, and “-a” prints simple HTML instead of text.
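
A minimal usage sketch (the file name check_host.sh is an assumption; the options are the ones described above):

./check_host.sh -h                  # show the usage text
./check_host.sh -n > report.txt     # text report without the fix-it hints
./check_host.sh -a > report.html    # report as simple HTML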

Solaris: script to create commands to set up LDOMs based upon output from “ldm ls”

Problem

You have an LDOM which you want to clone to somewhere else, and all you have to perform that is the ldm command on the target system.

Solution

Download the AWK script below. Use the output of “ldm ls -l -p <ldom>” as the input of this AWK script. The output will be a list of commands to re-create the config for VDS, VDISK, VSW and NETWORK.
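
A typical invocation could look like this (the AWK script file name ldm_to_cmds.awk and the LDOM name myldom are just placeholders):

ldm ls -l -p myldom | awk -f ldm_to_cmds.awk > recreate_myldom.sh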

I wrote this in 2013, so changes to the output of “ldm ls” since then are not accounted for.

Solaris: remove unusable SAN disks

Problem

Your Solaris system may have “lost” some SAN disks (for whatever reason). If you cannot get them back and arrive at the stage where you want to clean up (if the system cannot do it automatically), you want a solution which does not need much thinking about rarely executed tasks.

Solution

# force a LIP on every FC port to rediscover the devices
for i in $(luxadm -e port | cut -d : -f 1); do
  luxadm -e forcelip $i
  sleep 10
done

# unconfigure every LUN which cfgadm reports as unusable
for i in $(cfgadm -al -o show_FCP_dev | awk '/unusable/ {print $1}' | cut -d , -f 1); do
  cfgadm -c unconfigure -o unusable_SCSI_LUN $i
done

# remove the stale device nodes
devfsadm -Cv

If some of these devices still show up after this, you may have to reboot to release some locks or similar.

Solaris 10/11(.3) boot panic/crash after moving rpool to a new storage system

Situation

The boot disks of some Solaris LDOMs were migrated from one storage system to another one by ZFS-mirroring the rpool to the new system and then detaching the old LUN.
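
For reference, a hedged sketch of such a migration (the device names are assumptions; on SPARC the boot block also has to be installed on the new LUN):

zpool attach rpool c0d0s0 c0d1s0      # attach the LUN on the new storage as a mirror
# on Solaris 10 SPARC, also install the boot block on the new LUN:
# installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0d1s0
zpool status rpool                    # wait until the resilver has finished
zpool detach rpool c0d0s0             # detach the LUN on the old storage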

Issue

After a reboot on the new storage system, Solaris 10 and 11(.3) panic at boot.

Cause

  • rpool not on slice 0 but on slice 2
  • bug in Solaris when doing such a mirror and “just” doing a reboot <- this is the real issue; it seems Solaris cannot handle a change of the name of the underlying device for an rpool, as just moving the partitioning to slice 0 does not fix the panic.

Fix

# boot from the network (or from an alternate pool which was not yet moved), import/export the pool, boot from the pool
boot net -
# go to a shell
# if needed: change the partitioning so that slice 0 has the same values as slice 2 (i.e. make sure the rpool is on slice 0)
zpool import -R /tmp/yyy rpool
zpool export rpool
reboot
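
For the “if needed” step above, one possible way (the device name is an assumption) to give slice 0 the same start and size as slice 2:

prtvtoc /dev/rdsk/c0d0s2 > /tmp/vtoc      # dump the current VTOC
vi /tmp/vtoc                              # set slice 0 to the same start/size as slice 2
fmthard -s /tmp/vtoc /dev/rdsk/c0d0s2     # write the adjusted VTOC back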

 

New users in Solaris 10 branded zones on Solaris 11 not handled automatically

A colleague noticed that on a Solaris 11 system a Solaris 10 branded zone “gains” two new daemons which are running with UID 16 and 17. Those users are not automatically added to /etc/passwd, /etc/shadow (and /etc/group)… at least not when the zones are imported from an existing Solaris 10 zone.

I added the two users (netadm, netcfg) and the group (netadm) to the Solaris 10 branded zones by hand (copy&paste of the lines in /etc/passwd, /etc/shadow, /etc/group + run pwconv) for our few Solaris 10 branded zones on Solaris 11.
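
A minimal sketch of that manual fix, assuming the branded zone is called s10zone with zonepath /zones/s10zone (both names are placeholders) and that the entries are copied from the Solaris 11 global zone:

ZROOT=/zones/s10zone/root
grep -E '^(netadm|netcfg):' /etc/passwd >> ${ZROOT}/etc/passwd   # copy the user entries
grep -E '^(netadm|netcfg):' /etc/shadow >> ${ZROOT}/etc/shadow   # copy the (locked) shadow entries
grep '^netadm:' /etc/group >> ${ZROOT}/etc/group                 # copy the group entry
zlogin s10zone pwconv                                            # sync passwd/shadow inside the zone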