iocage: HOWTO create a basejail from src (instead of from an official release)

Background

So far I have used ezjail to manage FreeBSD jails. I have been using jails for years to keep different parts of a software stack in some kind of container (with a ZFS dataset for the filesystem side of the container): on the one hand so that the dependencies of one part of the software stack do not influence the other parts, on the other hand to have the possibility to move parts of the software stack to a different system if necessary. Normally I run -stable or -current on those systems, or more generally speaking, a self-compiled FreeBSD. What I like about ezjail is that all jails on a system have one common basejail underlying them, so that I update the userland in one place and all jails get the updated code.

For a while now I have been hearing good things about iocage and how it integrates ZFS, so I decided to give it a try myself. iocage does not come with an official way of creating a basejail (respectively a release) from a self-compiled FreeBSD, at least not documented in the places I looked. Yes, I am aware that I could create a FreeBSD release myself and use that, but I do not want to build a release in addition to the buildworld I use to update the host system. So here is a short HOWTO for achieving this.

Invariants

In the following I assume the iocage ZFS parts are already created in the dataset ${POOLNAME}/iocage, which is mounted on ${IOCAGE_BASE}/iocage. Additionally, the buildworld in /usr/src (or wherever you have the FreeBSD source) should be finished.

Prerequisites

To have the necessary dataset infrastructure created for your own basejails/releases, at least one official release needs to be fetched first. So run the command below (if there is no ${IOCAGE_BASE}/iocage/releases directory yet) and follow the on-screen instructions.

iocage fetch

HOWTO

Some variables:

POOLNAME=mpool
SRC_REV=r$(cd /usr/src; svnliteversion)
IOCAGE_BASE=""
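
The SRC_REV line assumes /usr/src is a Subversion checkout (svnliteversion comes with the base system). If your source tree is a Git checkout instead, a hypothetical equivalent would be the following (the "g" prefix and the short-hash format are just my naming choice, nothing iocage requires):

SRC_REV=g$(cd /usr/src; git rev-parse --short HEAD)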

Creating the iocage basejail datasets for this ${SRC_REV}:

zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/boot
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/rescue
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/include
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/lib32
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libdata
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/libexec
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/share
zfs create -o compression=lz4 ${POOLNAME}/iocage/base/${SRC_REV}-RELEASE/root/usr/src
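
If you prefer a loop over the long list above, here is a small sh sketch which creates the same datasets (assuming the variables from above are set):

BASE=${POOLNAME}/iocage/base/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${BASE}
zfs create -o compression=lz4 ${BASE}/root
for d in bin boot lib libexec rescue sbin usr usr/bin usr/include usr/lib usr/lib32 usr/libdata usr/libexec usr/sbin usr/share usr/src; do
    zfs create -o compression=lz4 ${BASE}/root/${d}
done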

Install from /usr/src. The chgrp executable is normally a hardlink to chown, but here the link would have to cross a dataset boundary of the basejail (usr/bin and usr/sbin are separate datasets), which makes a normal installworld fail. So we ignore this error (make -i) and afterwards install a copy of the chown binary at the place where the hardlink would normally be:

cd /usr/src
make -i installworld DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >&! iocage_installworld_base.log
cp -pv ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/sbin/chown ${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root/usr/bin/chgrp
make distribution DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root >>& iocage_installworld_base.log
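
The >&! and >>& redirections above are (t)csh syntax (send stdout and stderr to the log, overwriting respectively appending). If you work in a Bourne-type shell instead, the rough equivalent of the first make line would be:

make -i installworld DESTDIR=${IOCAGE_BASE}/iocage/base/${SRC_REV}-RELEASE/root > iocage_installworld_base.log 2>&1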

While we are here, also create a release and not only a basejail:

zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE
zfs create -o compression=lz4 ${POOLNAME}/iocage/releases/${SRC_REV}-RELEASE/root
make installworld DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >&! iocage_installworld_release.log
make distribution DESTDIR=${IOCAGE_BASE}/iocage/releases/${SRC_REV}-RELEASE/root >>& iocage_installworld_release.log

And finally make this the default release which iocage uses when creating new jails (this is optional):

iocage set release=${SRC_REV}-RELEASE default
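
To verify the result you can create a test jail from the new release. With the Python iocage something like the following should work (the jail name is a placeholder, and please check iocage help create whether your iocage version uses -b for basejails and -n for the name):

iocage create -b -n testjail -r ${SRC_REV}-RELEASE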

Now the self-built FreeBSD is available in iocage for new jails.

HOWTO: “Blind” remote install of FreeBSD via tiny disk image (ZFS edition)

In a past post I described how to install FreeBSD remotely via a tiny UFS-based disk image written out over a Linux system. In this post I describe how to do the same with a ZFS-based disk image.

Invariants

Given: a Unix-based remote system (in this case a Linux system) for which you know what kind of hardware it runs on (e.g. PCI IDs) and what the corresponding FreeBSD drivers are.

HOWTO

In the title of this post I wrote “via a tiny disk image”. This is true for a suitable definition of tiny.

What we have in the rootserver are two ~900 GB harddisks. They shall be used in a software mirror. The machine has 8 GB of RAM. I do not expect many kernel panics (= crash dumps) there, so we do not really need more than 8 GB of swap (forget the old rule of having twice as much swap as RAM; with the amount of RAM in a current machine you are in “trouble” already when you need even the same amount of swap as RAM). I decided to go with 2 GB of swap.

Pushing/pulling a 900 GB image over the network to install a system is not really something I want to do. I am OK with transferring 5 GB (that is about 0.5% of the entire disk) to get this job done, and this is feasible.

First let us define some variables in the shell; this way you just need to change the values in one place and the rest is copy & paste. I use the SVN revision of the source from which I install the system as the name of the sysutils/beadm compatible boot dataset in the rootfs; this way I also have the revision number available in a variable:

ROOTFS_SIZE=5G
ROOTFS_NAME=root
FILENAME=rootfs
POOLNAME=mpool
VERSION=r$(cd /usr/src; svnliteversion)
SWAPSIZE=2G

Then change your current directory to a place where you have enough space for the image. There we will create a container for the image, and make it ready for partitioning:

truncate -s ${ROOTFS_SIZE} ${FILENAME}
mdconfig -a -t vnode -f ${FILENAME}
# if you want to fully allocate
# dd if=/dev/zero of=/dev/md0 bs=1m
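
mdconfig prints the name of the attached md unit (md0 is assumed in all commands below). If you do not want to hardcode md0, you can capture the name in a variable, e.g.:

MD=$(mdconfig -a -t vnode -f ${FILENAME})
# then use /dev/${MD} instead of /dev/md0 in the gpart commands below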

Create the partition table and the rootfs (in a sysutils/beadm compatible way, as I install FreeBSD-current there) and mount it temporarily to /temppool:

gpart create -s GPT /dev/md0
gpart add -s 512K -t freebsd-boot -l bootcode0 /dev/md0
gpart add -a 4k -t freebsd-swap -s ${SWAPSIZE} -l swap0 /dev/md0
gpart add -a 1m -t freebsd-zfs -l ${POOLNAME}0 /dev/md0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 /dev/md0
# if not already the case, and you want the pool to have a 4k physical sector size
# sysctl vfs.zfs.min_auto_ashift=12
zpool create -o cachefile=/boot/zfs/zpool.cache_temp -o altroot=/temppool -O compress=lz4 -O atime=off -O utf8only=on ${POOLNAME} /dev/gpt/${POOLNAME}0
zfs create -o mountpoint=none ${POOLNAME}/ROOT
zfs create -o mountpoint=/ ${POOLNAME}/ROOT/${VERSION}
zfs create -o mountpoint=/tmp -o exec=on -o setuid=off ${POOLNAME}/tmp
zfs create -o mountpoint=/usr -o canmount=off ${POOLNAME}/usr
zfs create -o mountpoint=/home ${POOLNAME}/home
zfs create -o setuid=off ${POOLNAME}/usr/ports
zfs create ${POOLNAME}/usr/src
zfs create -o mountpoint=/var -o canmount=off ${POOLNAME}/var
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/audit
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/crash
zfs create -o exec=off -o setuid=off ${POOLNAME}/var/log
zfs create -o atime=on ${POOLNAME}/var/mail
zfs create -o setuid=off ${POOLNAME}/var/tmp
zfs create ${POOLNAME}/var/ports
zfs create -o exec=off -o setuid=off -o mountpoint=/shared ${POOLNAME}/shared
zfs create -o exec=off -o setuid=off ${POOLNAME}/shared/distfiles
zfs create -o exec=off -o setuid=off ${POOLNAME}/shared/packages
zfs create -o exec=off -o setuid=off -o compression=lz4 ${POOLNAME}/shared/ccache
zfs create ${POOLNAME}/usr/obj
zpool set bootfs=${POOLNAME}/ROOT/${VERSION} ${POOLNAME}
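
Before installing into the image, a quick read-only sanity check does not hurt; the datasets should be mounted under the altroot and bootfs should point to the boot environment:

zfs list -r ${POOLNAME}
zpool get bootfs ${POOLNAME}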

Install FreeBSD (from source):

cd /usr/src
#make buildworld >&! buildworld.log
#make buildkernel -j 8 KERNCONF=GENERIC >&! buildkernel_generic.log
make installworld DESTDIR=/temppool/ >& installworld.log
make distribution DESTDIR=/temppool/ >& distrib.log
make installkernel KERNCONF=GENERIC DESTDIR=/temppool/ >& installkernel.log

Copy the temporary zpool cache created above in the pool-creation part to the image (I have the impression it is not really needed and that it would work without, but I have not tried this):

cp /boot/zfs/zpool.cache_temp /temppool/boot/
cp /boot/zfs/zpool.cache_temp /temppool/boot/zpool.cache

Add the zfs module to loader.conf:

zfs_load="yes"
opensolaris_load="yes"

Now you need to create /temppool/etc/rc.conf (set the defaultrouter, the IP address via ifconfig_IF (and do not forget to use the right IF for it), the hostname, sshd_enable="YES" and zfs_enable="YES"), /temppool/boot/loader.conf (zfs_load="yes", opensolaris_load="yes", vfs.root.mountfrom="zfs:${POOLNAME}/ROOT/${VERSION}"), /temppool/etc/hosts and /temppool/etc/resolv.conf, and maybe /temppool/etc/sysctl.conf and /temppool/etc/periodic.conf.
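
As a minimal sketch of the rc.conf and loader.conf parts (the interface name em0, the addresses and the hostname are placeholders for illustration, not values from this setup):

cat > /temppool/etc/rc.conf <<EOT
hostname="freebsd.example.org"
ifconfig_em0="inet 192.0.2.10 netmask 255.255.255.0"
defaultrouter="192.0.2.1"
sshd_enable="YES"
zfs_enable="YES"
EOT
cat >> /temppool/boot/loader.conf <<EOT
vfs.root.mountfrom="zfs:${POOLNAME}/ROOT/${VERSION}"
EOT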

Do not allow password-less root logins in single-user mode on the physical console, create a resolv.conf, and create a user:

cd /temppool/etc
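# mark the console "insecure" so that single-user mode asks for the root password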
sed -i '' -e 's:console.*off.:&in:' ttys
cat >resolv.conf <<EOT
search YOURDOMAIN
nameserver 8.8.8.8
EOT
pw -V /temppool/etc groupadd YOURGROUP -g 1001
pw -V /temppool/etc useradd YOURUSER -u 1001 -d /home/YOURUSER -g YOURGROUP -G wheel -s /bin/tcsh
pw -V /temppool/etc usermod YOURUSER -h 0
pw -V /temppool/etc usermod root -h 0
zfs create ${POOLNAME}/home/YOURUSER
chown YOURUSER:YOURGROUP /temppool/home/YOURUSER
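
The -h 0 calls read the respective password from stdin, so you will be asked to type it. To check the result against the password database inside the image (and not the one of the build host), you can use:

pw -V /temppool/etc usershow YOURUSER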

Now you can make some more modifications to the system if wanted, and then export the pool and detach the image:

zpool export ${POOLNAME}

mdconfig -d -u 0

Depending on the upload speed you can achieve, it is beneficial to compress the image now, e.g. with bzip2. Then transfer the image to the disk of the remote system. In my case I did this via:

ssh -C -o CompressionLevel=9 root@remote_host dd of=/dev/hda bs=1M < /path/to/${FILENAME}
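
If you compressed the image as suggested, a variant which decompresses on the remote side saves most of the transfer volume (assuming bzcat is available in the remote Linux environment; the ssh -C wire compression is pointless for already compressed data):

ssh root@remote_host 'bzcat | dd of=/dev/hda bs=1M' < /path/to/${FILENAME}.bz2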

Then reboot/power-cycle the remote system.

Post-install tasks

Now we have a new FreeBSD system which uses only a fraction of the harddisk and is not resilient against harddisk failures.

FreeBSD will detect that the disk is bigger than the image we used when creating the GPT label, and will warn about it (corrupt GPT table). To fix this, and to let the zpool use the entire disk, we first partition the second disk at full size and mirror the zpool onto it; once the mirror is in sync we recover the GPT of the first (boot) disk and resize its partition as well (attention: you need to change the "-s" value in the following to match your disk size).

First back up the label of the first disk; this makes it easier to create the label of the second disk:

/sbin/gpart backup ada0 > ada0.gpart

Edit ada0.gpart (give different names to the labels, mainly change the number 0 in the label names to 1) and then use it to create the partitions of the second disk:

gpart restore -Fl ada1 < ada0.gpart
gpart resize -i 3 -a 4k -s 929g ada1
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1
zpool set autoexpand=on mpool

Fix the warning about the GPT label and resize the partition:

gpart recover ada0
gpart resize -i 3 -a 4k -s 929g ada0

Afterwards it should look similar to this:

gpart show -l
=>        40  1953525088  ada0  GPT  (932G)
          40        1024     1  bootcode0  (512K)
        1064     4194304     2  swap0  (2.0G)
     4195368         984        - free -  (492K)
     4196352  1948254208     3  mpool0  (929G)
  1952450560     1074568        - free -  (525M)

=>        40  1953525088  ada1  GPT  (932G)
          40        1024     1  bootcode1  (512K)
        1064     4194304     2  swap1  (2.0G)
     4195368         984        - free -  (492K)
     4196352  1948254208     3  mpool1  (929G)
  1952450560     1074568        - free -  (525M)

Add the second disk to the zpool:

zpool attach mpool gpt/mpool0 gpt/mpool1

When the mirror is in sync (check zpool status mpool), we can extend the size of the pool itself; since autoexpand is enabled, taking the resized partition offline and online again triggers the expansion:

zpool offline mpool /dev/gpt/mpool0
zpool online mpool /dev/gpt/mpool0
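
Afterwards the pool should report the full size (a read-only check):

zpool list mpool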

As a last step we can now add an encrypted swap (depending on the importance of the system maybe a gmirror-ed one, which is not explained here), and specify where to dump (textdumps) to.

/boot/loader.conf:

dumpdev="/dev/ada0p2"

/etc/rc.conf:

dumpdev="/dev/gpt/swap0"
crashinfo_enable="YES"
ddb_enable="yes"
encswap_enable="YES"
geli_swap_flags="-a hmac/sha256 -l 256 -s 4096 -d"

/etc/fstab:

# Device        Mountpoint      FStype  Options                 Dump    Pass#
/dev/ada1p2.eli none    swap    sw      0       0

Now the sys­tem is ready for some applications.