Solaris: remove unusable SAN disks

Problem

Your Solaris system may have “lost” some SAN disks (for whatever reason). If you cannot get them back and arrive at the stage where you want to clean up (if the system cannot do it automatically), you want a solution which does not require much thinking about rarely executed tasks.

Solution

# Force a LIP (loop initialization) on each FC port so the system rediscovers
# what is actually reachable on the SAN.
for i in $(luxadm -e port | cut -d : -f 1); do
  luxadm -e forcelip $i
  sleep 10
done

# Unconfigure all LUNs which cfgadm reports as unusable.
for i in $(cfgadm -al -o show_FCP_dev | awk '/unusable/ {print $1}' | cut -d , -f 1); do
  cfgadm -c unconfigure -o unusable_SCSI_LUN $i
done

# Remove the now dangling entries from the device tree.
devfsadm -Cv

If you still have some devices showing up here, you may have to reboot to release some locks or such.
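
If you want to verify the result, you can list what cfgadm still reports as unusable (the same output the loop above parses); an empty result means the cleanup worked:

cfgadm -al -o show_FCP_dev | grep unusable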

VMware: setting the storage device queue depth (HDS fibre channel disks)

Problem

In 2017 we replaced an IBM storage system with a Hitachi Vantara storage system (actually, we replaced the complete SAN infrastructure). We handled it by attaching both storage systems to VMware (v5.5) and migrating the datastores. A recommendation from Hitachi Vantara was to set the queue depth for fibre channel disks to 64.

Solution

Here is a little script which does that. Due to issues as described in a previous post, which caused HA/FT (High Availability / Fault Tolerance) reactions in VMware to trigger, we played it safe and added a little sleep after each change. The script also checks if the queue depth is already set to the desired value and does nothing in this case. It is small enough to just copy&paste it directly into a shell on the host.

SLEEPTIME=210   # 3.5 minutes  !!! only if all RDMs on the host are (perennially) reserved !!!
TARGET_DEPTH=64

# Iterate over the naa IDs of all Hitachi FC disks (the ID is the part of the
# display name in parentheses).
for LDEV in $(esxcli storage core device list | grep "HITACHI Fibre Channel Disk" | awk '{gsub(".*\\(",""); gsub("\\).*",""); print}'); do
  echo $LDEV
  # current value of "No of outstanding IOs with competing worlds"
  DEPTH="$(esxcli storage core device list -d $LDEV | awk '/outstanding/ {print $8}')"
  if [ "$DEPTH" -ne $TARGET_DEPTH ]; then
    echo "   setting queue depth $TARGET_DEPTH"
    esxcli storage core device set -d $LDEV -O $TARGET_DEPTH
    echo "   sleeping $SLEEPTIME"
    sleep $SLEEPTIME
  else
    echo "    queue depth OK"
  fi
done
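
To spot-check a single device afterwards, you can query its current value directly (a quick sanity check; the naa ID below is a placeholder for one of your devices):

esxcli storage core device list -d naa.1234567890abcdef12345c42000002a2 | grep -i outstanding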

VMware: blank performance graph issue

Problem / Story

In 2017 we replaced a storage system with another storage system (actually, we replaced the complete SAN infrastructure). We handled it by attaching both storage systems to VMware (v5.5) and migrating the datastores. In this process we stumbled upon issues which made some hosts unresponsive in vCenter (while the VMs were running without issues). Before the hosts went unresponsive, their performance graphs started to blank out. From the moment the issue appeared until it was resolved, any graph continued to advance, but showed no values in the corresponding timeframe (left = colorful lines, middle = white space, and after the issue was resolved the colorful lines appeared again). Sometimes the issue of the blank performance graph resolved itself, sometimes the hosts became unresponsive and vCenter greyed them out and triggered a HA/FT (High Availability / Fault Tolerance) reaction.

Root cause

On the corresponding hosts we had RDMs (Raw Device Mappings) which are used by the Microsoft Cluster Service (there is a knowledge-base article about this). The issues showed up when we did some SAN operations in VMware (like the (automatic) scanning of new disks after having presented new disks to VMware). VMware tried to do something clever with the disks (also during the boot of a host, so if you use RDMs and booting the host takes a long time, you are in the situation I describe here). If only a small number of changes happened at the same time, the issue fixed itself. A large number of changes caused a HA/FT reaction.

Workaround when the issue shows up

When you see that the performance graphs start to show blank space and your VMs are still working, go to the cluster settings and disable vSphere HA (High Availability): cluster -> “Edit Settings” -> “Cluster Features” -> remove the checkmark in front of “Turn On vSphere HA”. Wait until the graph shows some values again (for all involved hosts) and then enable vSphere HA again.

Solution

To not have this issue show up at all, you need to change some settings for the devices on which you have the RDMs. Here is a little script (small enough to just copy&paste it into a shell on the host) which needs the IDs of the devices used for the RDMs (attention, letters need to be lowercase) in the “RDMS” variable. As we did that on the running systems, and each change of the settings caused some action in the background which made the performance graph issue show up, there is a “little” sleep between the changes. The amount of sleep depends upon your situation: the more RDMs are configured, the bigger it needs to be. We had 15 such devices, and a sleep of 20 minutes between each change was enough to not trigger a HA/FT reaction. The amount of time needed in the end is much lower than in the beginning, but as this was more or less a one-off task, this simple version was good enough (it checks if the setting is already active and does nothing in this case).

For our use case it was also beneficial to set the path selection policy to fixed, so this is also included in the script. Your use case may be different.

SLEEPTIME=1200              # 20 minutes per LDEV!
# REPLACE THE FOLLOWING IDs   !!! lower case !!!
RDMS="1234567890abcdef12345c42000002a2 1234567890abcdef12345c42000003a3 \
1234567890abcdef12345c42000003a4 1234567890abcdef12345c42000002a5 \
1234567890abcdef12345c42000002a6 1234567890abcdef12345c42000002a7 \
1234567890abcdef12345c42000003a8 1234567890abcdef12345c42000002a9 \
1234567890abcdef12345c42000002aa 1234567890abcdef12345c42000003ab \
1234567890abcdef12345c42000002ac 1234567890abcdef12345c42000003ad \
1234567890abcdef12345c42000002ae 1234567890abcdef12345c42000002af \
1234567890abcdef12345c42000002b0"

for i in $RDMS; do
  LDEV=naa.$i
  echo $LDEV
  # current value of "Is Perennially Reserved"
  RESERVED="$(esxcli storage core device list -d $LDEV | awk '/Perennially/ {print $4}')"
  if [ "$RESERVED" = "false" ]; then
    echo "   setting perennially reserved to true"
    esxcli storage core device setconfig -d $LDEV --perennially-reserved=true
    echo "   sleeping $SLEEPTIME"
    sleep $SLEEPTIME
    echo "   setting fixed path"
    esxcli storage nmp device set --device $LDEV --psp VMW_PSP_FIXED
  else
    echo "    perennially reserved OK"
  fi
done
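
To verify the new settings for one device afterwards (again, the naa ID is a placeholder; replace it with one of yours):

esxcli storage core device list -d naa.1234567890abcdef12345c42000002a2 | grep -i perennially
esxcli storage nmp device list -d naa.1234567890abcdef12345c42000002a2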

Solaris 10/11(.3) boot panic/crash after moving rpool to a new storage system

Situation

The boot disks of some Solaris LDOMs were migrated from one storage system to another by mirroring the rpool via ZFS to the new system and detaching the old LUN.

Issue

After a reboot on the new storage system, Solaris 10 and 11(.3) panic at boot.

Cause

  • rpool not on slice 0 but on slice 2
  • a bug in Solaris when doing such a mirror and “just” doing a reboot <- this is the real issue; it seems Solaris cannot handle a change of the name of the underlying device for an rpool, as just moving the partitioning to slice 0 does not fix the panic.

Fix

# boot from network (or an alternate pool which was not yet moved), import/export the pools, boot from the pools
boot net -
# go to shell
# if needed: change the partitioning so that slice 0 has the same values as slice 2 (respectively make sure the rpool is in slice 0)
zpool import -R /tmp/yyy rpool
zpool export rpool
reboot
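
To see whether slice 0 and slice 2 have the same values, you can print the VTOC of the boot disk from the shell (an illustrative check; the device name is a placeholder for your rpool disk):

prtvtoc /dev/rdsk/c0t0d0s2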


iocage: HOWTO create a basejail from src (instead of from an official release)

Background

So far I have used ezjail to manage FreeBSD jails. I have been using jails for years to keep different parts of a software stack in some kind of container (with a ZFS dataset for the filesystem side of the container). On the one hand to not let dependencies of one part of the software stack influence other parts of the software stack, on the other hand to have the possibility to move parts of the software stack to a different system if necessary. Normally I run -stable or -current or, more generally speaking, a self-compiled FreeBSD on those systems. What I like in ezjail is the fact that all jails on a system have one common basejail underlying, so that I update one place for the userland and all jails get the updated code.

For a while now I have heard good things about iocage and how it integrates ZFS, so I decided to give it a try myself. As iocage does not come with an official way of creating a basejail (respectively a release) from a self-compiled FreeBSD (at least not documented in the places I looked; and yes, I am aware that I can create a FreeBSD release myself and use it, but I do not want to have to create a release in addition to the buildworld I use to update the host system), here is a short HOWTO on achieving this.

Invariants

In the following I assume the iocage ZFS parts are already created in the dataset ${POOLNAME}/iocage, which is mounted on ${IOCAGE_BASE}/iocage. Additionally, the buildworld in /usr/src (or wherever you have the FreeBSD source) should be finished.

Prerequisites

To have the necessary dataset infrastructure created for your own basejails/releases, at least one official release needs to be fetched first. So run the command below (if there is no ${IOCAGE_BASE}/iocage/releases directory) and follow the on-screen instructions.

iocage fetch

HOWTO

Some variables:

POOLNAME=mpool                           # ZFS pool where the iocage datasets live
SRC_REV=r$(cd /usr/src; svnliteversion)  # revision of the checked-out source tree
IOCAGE_BASE=""                           # prefix of the iocage mountpoint ("" = mounted at /iocage)

Creating the iocage basejail datasets for this ${SRC_REV}:

zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/boot
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/rescue
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/include
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib32
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libdata
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/share
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/src
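
If you prefer, the same datasets can also be created in a loop (a functionally identical sketch of the commands above, assuming a POSIX shell; the empty first item creates the top-level dataset):

# create the basejail dataset hierarchy in one go
for DS in "" /root /root/bin /root/boot /root/lib /root/libexec /root/rescue \
    /root/sbin /root/usr /root/usr/bin /root/usr/include /root/usr/lib \
    /root/usr/lib32 /root/usr/libdata /root/usr/libexec /root/usr/sbin \
    /root/usr/share /root/usr/src; do
  zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE${DS}
done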

Install from /usr/src (the executable “chown” is hardlinked across an iocage basejail dataset boundary; this fails in the normal installworld, so we have to ignore this error and install a copy of the chown binary to the place where the hardlink normally is):

cd /usr/src
# ">&!" and ">>&" are csh-style redirections (stdout+stderr); in a Bourne shell
# use "> logfile 2>&1" respectively ">> logfile 2>&1" instead.
make -i installworld DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >&! iocage_installworld_base.log
cp -pv ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin/chown ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin/chgrp
make distribution DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >>& iocage_installworld_base.log

While we are here, also create a release and not only a basejail:

zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE/root
make installworld DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >&! iocage_installworld_release.log
make distribution DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >>& iocage_installworld_release.log

And finally make this the default release which iocage uses when creating new jails (this is optional):

iocage set release=${SRC_REV}-RELEASE default

Now the self-built FreeBSD is available in iocage for new jails.
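
As a rough usage sketch (the jail name is hypothetical, and the exact create syntax differs between iocage versions, so check "iocage create --help" for yours), creating a new jail based on this release could look like:

iocage create -n mynewjail -r ${SRC_REV}-RELEASE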
