Installation of a FreeBSD 9.2 system with ZFS-on-root over GELI
This article explains how to install a FreeBSD 9.2 system on a remote server known as a dedibox. It is closely related to my previous HOWTO on 8.2 and HOWTO on 8.2/dedibox but with several changes due to FreeBSD 9.1 & 9.2.
!! Work has basically stopped on this one, we are at 11.1 now and I need a different setup !! Please go to this article on the new 11.1-based setup.
Table of contents
- Installation of a FreeBSD 9.2 system with ZFS-on-root over GELI
- Table of content
- Prerequisites
- Installed system
- Hardware
- Notes on ZFS, disks and all that
- Constraints
- Custom mfsbsd images
- Booting off the mfsbsd image
- Creation of the customized distribution
- Actual FreeBSD installation
- Finishing up
- Things to remember
- Resources
- Feedback
- History
- Credits
Prerequisites
As with many dedicated servers in a hosting datacenter, access to a console (such as iLO or iDRAC) is mandatory to manipulate BIOS settings, as is access to some kind of rescue mode where you can upload an ISO image and boot from it. The example used here is the Dedibox rescue system (as described there and used in this howto).
You must have a generated mfsbsd image or the ability to generate one (which means an entire /usr/src tree or the files from a release). See the mfsbsd URL above or below for details.
Installed system
- ZFS-only system with Root-on-ZFS
- two disks are in the machine, running with ZFS/mirror
- the main zpool is encrypted with geli(8) so that if a disk needs to be replaced, the data on it stays secure (like there)
- Zpool v28/5000 along with ZFS v5 to get deduplication and performance fixes (as standard in 9.1-9.2)
- two ZFS pools are defined: an unencrypted one containing the minimal booting system and the real, encrypted one that is mounted as /
Hardware
The hardware of choice is the Dedibox PRO R210 system (See there for reference), a rather powerful system with the following characteristics:
- L3426 Nehalem quad-core CPU running at 1.86 GHz
- 16 GB of RAM
- 2 disks of 2 TB each (Hitachi HUA72202 or WDC WD2003FYYS-1) on a LSI2008/H200 HBA
The mps driver that we will be using for the H200 HBA controller now supports the RAID1 option installed by default by the Online people. Still, I intend to keep using the disks in JBOD mode to get the most benefit from ZFS. The first step is to break that RAID1 setup and configure the BIOS in passthrough mode to get the drives “back” as separate devices (da0 and da1). You will have to open a ticket if you are using the same Online.net hosting provider as I do; the procedure may be completely different elsewhere, if not fully manual.
Do not be confused by the “SCSI” naming of the drives. The H200 controller, made by LSI and also called the LSI SAS 2008, is a SAS controller, but SATA2 drives are compatible and will appear as SAS drives (hence the da name).
NOTE: This is the old dedibox model; they now have the new HP-based DL-120G7 system with new CPUs, but I do not know how compatible they are with FreeBSD yet. From various discussions, the controller used by the HP boxes is a P410/P420 driven by the ciss driver and, if you want to use ZFS, you have to fiddle with the RAID settings. The main point is that the ciss driver is less versatile and you will probably have to create a single-volume RAID for each disk, adding a lot of overhead for nothing.
UPDATE: I got my hands on a new dedibox model and the NOTE above applies. The main interest of this new machine for us is that its CPU (Xeon E3-1220 @ 3.1 GHz), in addition to being faster (Sandy Bridge family), also has the AES-NI instructions! This means we need to load another driver module, called aesni, which allows geli(8) to use the hardware instructions for AES. Expect at least a 2x performance improvement, even more with another patch by jmg (soon to be committed to head).
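A quick way to check that the CPU advertises AES-NI and that the module attached is to look at the boot messages; a minimal sketch (the exact output lines will vary):

# Hedged sketch: confirm AES-NI is present and the aesni(4) driver attached.
grep 'Features2' /var/run/dmesg.boot | grep AESNI
kldload aesni          # fails harmlessly if it is already loaded
dmesg | grep -i aesni  # should show: aesni0: <AES-CBC,AES-XTS> on motherboard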
Here are the characteristics of the new dedibox model:
- HP DL120G7 1U system
- Intel Xeon E3-1220 quad-core CPU running at 3.1 GHz (up to 3.5 GHz in Turbo Boost mode)
- 16 GB of RAM
- 2 disks of 2 TB (no way to know the brand due to the note above)
Notes on ZFS, disks and all that
Please go read this article to find useful information on ZFS, disks and how to use them. It is not specific to FreeBSD's ZFS but most of it applies all the same.
Constraints
The fact that we want to use encryption to protect our data is a major constraint on the on-disk architecture and how we lay out/use partitions. /boot/loader has the ability to ask for encryption keys at boot time, but that also means that it must be unencrypted… The alternative would be to have a more complete system installed in the unencrypted pool and to use ssh to connect to the system and attach the encrypted pool.
The main choice to be made is whether we use a plain UFS boot partition and then mount everything from the ZFS pool, or ZFS pools for both. We choose two separate ZFS pools so we can use the other ZFS features such as snapshots, mirroring and so on.
Custom mfsbsd images
If you choose to build your own mfsbsd image (to add a missing driver or equivalent), please see this tutorial (which was originally part of this howto, but it makes sense to separate the two).
The regular mfsbsd generated from 9.2 should have more things than the one I used before but it will still lack the crypto stuff needed by geli(8).
Installation of the mfsbsd image
Put the dedibox server in “rescue mode”. Typically it will be running some form of Linux system like Ubuntu. Now get the image and install it:
sudo -s
wget -O - <url>/mfsimage.img | dd of=/dev/sda bs=1048576
An alternate way of booting the mfsbsd image uses qemu to run the mfsbsd image through the kvm system:
- boot in rescue mode
- install kvm (through apt-get(8) if you use Debian)
then
kvm -hda /dev/sda -cdrom mfsbsd.iso -boot d -curses
If you have the smaller version of the dedibox (named Dedibox SC), you’ll have to use yet another way because the Ubuntu-based rescue system would crash with plain kvm (suggested by iMil, based on this page):
- boot in rescue mode with Ubuntu
- apt-get update
- apt-get install qemu
then:
qemu-system-x86_64 -no-kvm -hda /dev/sda -cdrom FreeBSD-9.2-RELEASE-amd64-bootonly.iso -curses -boot d
Booting off the mfsbsd image
Reboot your dedibox normally by exiting the rescue mode; it should now boot off the mfsbsd image. A few minutes later you should be able to access the system through ssh.
Creation of the customized distribution
In parallel to what we are going to do on the target machine, you can generate your custom distribution (incl. your modified source and kernel configuration) by visiting the release directory and running the following command, after creating a big enough space somewhere to hold the result:
mkdir /data/work/release
make release EXTSRCDIR=/data/work/freebsd/9 EXTPORTSDIR=/usr/ports \
CHROOTDIR=/data/work/release NODOC=yes NOPORTSATALL=yes NOPORTS=yes
Now, you can find the result in /data/work/release under the snapshot’s name:
ls /data/work/release/R/cdrom
bootonly/ disc1/ disc2/ dvd1/ livefs/
and in the one we are most interested in (dvd1):
ls /data/work/release/R/cdrom/dvd1
.cshrc etc/ sbin/
.profile lib/ stand@
9.2-20130309-SNAP/ libexec/ sys@
COPYRIGHT media/ tmp/
bin/ mnt/ usr/
boot/ proc/ var/
rescue/ dev/ root/
cdrom.inf
The main distribution is located even further below, in the 9.2-* snapshot:
base/ doc/ kernels/ proflibs/
catpages/ games/ lib32/ src/
dict/ info/ manpages/
That will need to be copied over to your shiny new server somewhere.
From what I have been experiencing recently, you can even use a regular distribution disk like the -dvd1 or the -memstick one. Files have been moved around slightly in 9.2, so the base distribution files are now in /usr/freebsd-dist and are in tar/xz (aka .txz) format.
Actual FreeBSD installation
We will more or less follow the instructions there. The things we will change are not essential but reflect our special requirements.
We will later be using the dvd1 ISO image to install the system. For now, the mfsbsd image has a subset of the installation disk with enough commands to get you going.
If you have used the generic mfsbsd image described earlier, you will need to get the kernel modules out of the kernel.txz file mentioned earlier (they live under boot/ inside the archive):
cd /tmp
fetch ftp://ftp.fr.freebsd.org/pub/FreeBSD/releases/amd64/amd64/9.1-RELEASE/kernel.txz
tar xf kernel.txz
Now, inside /tmp/boot/kernel, you have all the modules from a standard 9.1 installation. You can now load the missing modules:
cd boot/kernel
kldload ./zlib.ko
kldload ./crypto.ko
kldload ./geom_eli.ko
kldload ./aesni.ko
NOTE: recent versions of mfsbsd (the so-called Special Edition, named -se) include a script called zfsinstall that will do most of what is explained here. The reason we are not using it is that it does not support encrypted partitions at all. If you do not need encryption, just use the script.
Partitioning the drives
As we will be using the two disks in mirror mode, we will replicate on da1 every command we run on da0. That way, if either disk breaks at some point, the system will still be able to find all the information it needs to boot.
Later on, when finished with the partitioning and encryption phases, we will transfer the dvd1 image into memory and mount it as a directory with the mdconfig command. In the meantime, let’s begin.
Before we install our own GPT partition table, we must wipe out the previous partition table installed by the Dedibox installation system.
dd if=/dev/zero of=/dev/da0 bs=512 count=10
dd if=/dev/zero of=/dev/da1 bs=512 count=10
then install our own:
gpart create -s gpt da0
gpart create -s gpt da1
scoite# gpart show
=> 34 3907029101 da0 GPT (1.8T)
34 3907029101 - free - (1.8T)
=> 34 3907029101 da1 GPT (1.8T)
34 3907029101 - free - (1.8T)
Now create the boot partition, the 1st freebsd-zfs partition, swap, then the 2nd freebsd-zfs partition. The first ZFS partition is not big because we only need what is necessary for booting (but we still need some space because of all the kernel modules and the symbols). As we will also be encrypting the swap (it makes no sense to encrypt the data and not the swap) and mirroring it for safety reasons, swap will be twice the RAM (32 GB in our case).
One issue to look out for: now that we have really big hard drives (2 and 3 TB today, soon more), they come with 4 KB sectors and run really slowly with 512-byte writes, so you want your partitions aligned on 4 KB boundaries. From now on, I will therefore be adding “-a 4k” to the gpart(8) command lines:
gpart add -s 64K -a 4k -t freebsd-boot da0
gpart add -s 2G -a 4k -t freebsd-zfs -l boot0 da0
gpart add -s 32G -a 4k -t freebsd-swap -l swap0 da0
gpart add -a 4k -t freebsd-zfs -l tank0 da0
Here we really only need the alignment part because we are using geli(8) on the disks below ZFS: geli uses 4 KB sectors anyway, so ZFS will pick that up at zpool creation time and set the right ashift value (12 in this case). For a zpool on plain disks, you would have to use the gnop(8) trick to present 4 KB sectors to ZFS regardless of the actual sector size.
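For reference, here is a minimal sketch of that gnop(8) trick on a plain (non-geli) pool, assuming a single-disk pool named tank built on gpt/tank0:

# Hedged sketch: gnop(8) advertises 4 KB sectors so the pool gets ashift=12.
gnop create -S 4096 /dev/gpt/tank0
zpool create tank /dev/gpt/tank0.nop
zpool export tank
gnop destroy /dev/gpt/tank0.nop
zpool import tank
zdb -C tank | grep ashift      # should report ashift: 12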
The -a 4k flag is probably only needed on the first call, as the rest will be aligned due to their sizes. You may also encounter smaller drives that protest when the partitions are not aligned. In these cases (recently seen on a Seagate Barracuda 500 GB), you can specify -b 40, which gives a 4k-aligned layout (the next partition then starts at 40+128 = 168, a multiple of 8 sectors, instead of 34+128 = 162, which is not).
We mirror that configuration on da1:
gpart add -s 64K -a 4k -t freebsd-boot da1
gpart add -s 2G -a 4k -t freebsd-zfs -l boot1 da1
gpart add -s 32G -a 4k -t freebsd-swap -l swap1 da1
gpart add -a 4k -t freebsd-zfs -l tank1 da1
NOTE: You cannot use something like gpart backup da0 | gpart restore -F da1 to copy the entire partition table in one go, because the labels would then be identical on both disks.
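If you prefer that route anyway, a hedged sketch (assuming the layout above) is to restore the table and then rename the duplicated labels with gpart modify:

# Hedged sketch: clone the table, then fix the duplicated labels on da1.
gpart backup da0 | gpart restore -F da1
gpart modify -i 2 -l boot1 da1
gpart modify -i 3 -l swap1 da1
gpart modify -i 4 -l tank1 da1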
You can check that the different partitions and labels now exist in /dev/gpt:
scoite# ls /dev/gpt
boot0 boot1 swap0 swap1 tank0 tank1
As we want to be able to boot from either disk, we mark the 2nd partition as a boot candidate:
gpart set -a bootme -i 2 da0
gpart set -a bootme -i 2 da1
You should end up with something like this:
=> 34 3907029101 da0 GPT (1.8T)
34 128 1 freebsd-boot (64K)
162 4194304 2 freebsd-zfs [bootme] (2.0G)
4194466 67108864 3 freebsd-swap (32G)
71303330 3835725805 4 freebsd-zfs (1.8T)
=> 34 3907029101 da1 GPT (1.8T)
34 128 1 freebsd-boot (64K)
162 4194304 2 freebsd-zfs [bootme] (2.0G)
4194466 67108864 3 freebsd-swap (32G)
71303330 3835725805 4 freebsd-zfs (1.8T)
We will put the bootcode in place on both disks; remember that if the first drive fails, you want to be able to boot from the second one.
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da1
Encrypting the disks
Create the keyfile for the partitions; we will use the same passphrase for both, for convenience.
mkdir /root/keys
dd if=/dev/random of=/root/keys/boot.key bs=128k count=1
Now, you have to choose a passphrase (as usual, not too short, not guessable; remember you only need it at boot time). You could choose a different passphrase for each disk, but I would not recommend it because that would give two ciphertexts for the same cleartext (as the two partitions are going to be mirrored and thus contain the exact same data).
geli init -b -K /root/keys/boot.key -s 4096 -l 256 /dev/gpt/tank0
geli init -b -K /root/keys/boot.key -s 4096 -l 256 /dev/gpt/tank1
You will find backups of the geli metadata in /var/backups; it is a good idea to copy them somewhere else as well, just in case.
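You can also generate metadata backups yourself and copy them off the machine; a minimal sketch (the destination host is a placeholder):

# Hedged sketch: take fresh geli metadata backups and store them off-host.
geli backup /dev/gpt/tank0 /root/keys/tank0.eli.meta
geli backup /dev/gpt/tank1 /root/keys/tank1.eli.meta
scp /root/keys/*.eli.meta you@backuphost:geli-backups/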
Attach both drives:
geli attach -k /root/keys/boot.key /dev/gpt/tank0
geli attach -k /root/keys/boot.key /dev/gpt/tank1
NOTE: remember that we loaded the aesni kernel module? If it is used, the above commands should result in “hardware” crypto being reported for the AES-XTS mode, like the following (run dmesg(8)):
cryptosoft0: <software crypto> on motherboard
aesni0: <AES-CBC,AES-XTS> on motherboard
GEOM_ELI: Device gpt/tank0.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
GEOM_ELI: Device gpt/tank1.eli created.
GEOM_ELI: Encryption: AES-XTS 256
GEOM_ELI: Crypto: hardware
Pool creation
We will also use a different way to create/mount the datasets to make the last part of the install (switching mountpoints) much easier.
Now, create the 1st ZFS partition in mirror mode for the unencrypted part:
zpool create zboot mirror gpt/boot0 gpt/boot1
and the encrypted also mirrored 2nd ZFS partition:
zpool create -o altroot=/mnt -O mountpoint=none tank mirror gpt/tank0.eli gpt/tank1.eli
NOTE: if you used a regular distribution boot disk (like -dvd1) instead of an mfsbsd one, you will find that /boot is read-only, meaning that you will not be able to create a proper /boot/zfs/zpool.cache file. In this case, add -o cachefile=/tmp/zpool.cache at zpool creation time. You will move this file to its proper place before rebooting.
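A minimal sketch of that variant (same pool layout as above; the copy happens later, once the target /boot is writable):

# Hedged sketch: only needed when /boot on the install media is read-only.
zpool create -o cachefile=/tmp/zpool.cache -o altroot=/mnt -O mountpoint=none \
    tank mirror gpt/tank0.eli gpt/tank1.eli
# ...later, before the first reboot:
cp /tmp/zpool.cache /tank/root/boot/zfs/zpool.cache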
The two pools should appear like this:
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
tank 1.78T 1.76G 1.78T 0% 1.01x ONLINE -
zboot 1.98G 92.5K 1.98G 0% 1.00x ONLINE -
When we have created some filesystems on the disks, we will set the bootfs property on both pools. I use a separate root filesystem on my pools; it makes changing the / filesystem much simpler and allows having different ones.
Now that the pools have been created, we switch the algorithm used to checksum disk blocks. “fletcher4” is only slightly slower but better (just like CRC16 vs CRC32).
zfs set checksum=fletcher4 zboot
zfs set checksum=fletcher4 tank
Encrypted swap
Swap is slightly different: we will pass the “onetime” command to geli(8) through the way we declare swap in /etc/fstab. That way we do not need to enter any passphrase, because there is no need to know it again once the partition has been attached (see geli(8) for details).
As I said earlier, we will use encrypted swap and geli has automatic setup for that by adding the .eli suffix in /etc/fstab. So let’s create the gmirror configuration for swap.
gmirror label swap gpt/swap0 gpt/swap1
The /etc/fstab entry will look like the following:
/dev/mirror/swap.eli none swap sw 0 0
In this schema, we do not encrypt each swap partition but the mirror itself (swap over geli over gmirror) to avoid doing things twice.
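After the first boot, you can check that the encrypted, mirrored swap is actually in use; a small sketch (assuming the setup above):

# Hedged sketch: verify the swap mirror and its geli layer after boot.
gmirror status swap   # both gpt/swap0 and gpt/swap1 should be listed
swapinfo              # should show /dev/mirror/swap.eli as the swap device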
Boot Environments (BE)
There is an interesting program called beadm, directly inspired by the Illumos utility of the same name, that is very handy for managing multiple versions of the OS (called boot environments) and upgrades. It will be interesting to see whether using beadm is possible in our two-pool environment.
What we will do is create our datasets according to the naming scheme of beadm, to ease a later migration to BEs.
Filesystems
We will globally follow the filesystem layout we used in HOWTO on 8.2 with more or less the same options.
Compression seems to create issues for kernel loading, so we will avoid it on tank/root. All other filesystems will inherit the compression property though.
As for the compression scheme, we can select different algorithms. lz4 is the fastest available (it replaces lzjb, which is still available) but gzip is better compression-wise. FreeBSD 9.2 has a new version of ZFS, using a feature-based numbering scheme instead of a single version number (see the New features in 10 page).
I have not done any benchmarking yet on this pool-wide compression. It may be too slow to use the default gzip compression (-6); please feel free to experiment there. You may wish to only enable compression on selected filesets. For now, I will use lz4 everywhere and disable it for specific cases like distfiles.
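If you do experiment, a small sketch to compare algorithms on an existing fileset (the choice of fileset is just an example):

# Hedged sketch: switch the algorithm on one fileset and watch the ratio.
zfs set compression=gzip-6 tank/usr/src   # or lz4, gzip-9, ...
# (re)write some data into the fileset, then:
zfs get compressratio tank/usr/src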
If you are installing 9.1 (which I do not really recommend), replace lz4 with lzjb.
zfs set compression=lz4 tank
zfs create -o compression=off tank/root
zfs create -o mountpoint=/tank/root/usr tank/usr
zfs create -o mountpoint=/tank/root/usr/obj tank/usr/obj
zfs create -o mountpoint=/tank/root/usr/local tank/usr/local
The reason why I create a separate /usr fileset is that I want different policies for compression, the atime property, snapshots and all that. You can also create another fileset for /usr/local, once again to be able to snapshot the base system and the ports you will be using separately.
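As an illustration of such per-fileset policies, a hedged sketch (these particular settings are examples, not part of the original setup):

# Hedged sketch: different policies on the filesets created above.
zfs set atime=off tank/usr/obj          # no access-time updates on build output
zfs set snapdir=visible tank/usr/local  # expose .zfs/snapshot on the ports area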
To complete what we want under /usr, we will create /usr/src to hold the system’s sources. We will need these to recompile a new, trimmed-down kernel.
zfs create -o mountpoint=/tank/root/usr/src tank/usr/src
Now /var and a few useful filesets with the special properties we care about, to avoid security issues. Do not set exec=off on /tmp, as it would prevent things like installworld from running properly.
zfs create -o mountpoint=/tank/root/var tank/var
zfs create -o exec=off -o setuid=off tank/var/empty
zfs create -o exec=off -o setuid=off tank/var/named
zfs create -o exec=off -o setuid=off tank/var/run
zfs create -o mountpoint=/tank/root/var/tmp tank/var/tmp
zfs set exec=off tank/var/tmp
zfs set setuid=off tank/var/tmp
chmod 1777 /tank/root/var/tmp
zfs create -o mountpoint=/tank/root/tmp tank/tmp
zfs set setuid=off tank/tmp
chmod 1777 /tank/root/tmp
I would also recommend putting users’ home directories in a separate fileset for the same reason, or even one fileset per user if you want to limit each user’s area to a specific size.
zfs create -o mountpoint=/tank/root/home tank/home
Later, you will want to create tank/usr/ports/{distfiles,packages} without compression as well. Properties like snapdir can be changed later on, so we are not forced to set them right now. If you are planning to use the new pkg(1) command to deal with binary packages (aka pkgng), then /usr/ports is not needed.
zfs create -o mountpoint=/tank/root/usr/ports -o setuid=off tank/usr/ports
zfs create -o mountpoint=/tank/root/usr/ports/distfiles -o compression=off -o exec=off -o setuid=off tank/usr/ports/distfiles
zfs create -o mountpoint=/tank/root/usr/ports/packages -o compression=off -o exec=off -o setuid=off tank/usr/ports/packages
If you plan to have jails on this machine, it is a good idea to create a /jails as well:
zfs create -o mountpoint=/tank/root/jails tank/jails
One of the nice things about ZFS is that for many things, you can use zfs create instead of mkdir. It won’t take that much disk space and will allow you to specify different policies for backups/snapshots/compression for every filesystem.
One thing you want to know about ZFS is that it uses the copy-on-write principle and never overwrites data in place. Any time you rewrite a block, a fresh one is written elsewhere and the pointers are updated (a very condensed summary; see the ZFS docs for more details). The main consequence is that when a fileset is completely full, you cannot remove files to make space, as doing so would itself require some free space. A way to mitigate that is to make sure you never fill up a fileset, and you can reserve some space in the “root” fileset for that purpose.
zfs set reservation=512m tank
Deduplication
ZFSv28 and later support an interesting feature (among many others) called deduplication (see deduplication on WP for more details). It needs to be enabled on every fileset you want deduplicated. Beware though that enabling deduplication will make ZFS use much more memory than before, and that you cannot really go back.
zfs set dedup=on tank/usr/src
Afterwards, when you have put some files in there, you can check the deduplication ratio with zpool list. On a 16 GB system with “only” 2 TB of data, deduplication can be enabled without too much trouble.
1008 [15:05] roberto@centre:~> zpool list
NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT
tank 1.78T 23.0G 1.76T 1% 1.09x ONLINE -
zboot 1.98G 390M 1.60G 19% 1.00x ONLINE -
In v5000 of ZFS, dedup can be achieved through byte-by-byte comparison or through the sha256 hash function; the latter is much faster.
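You can also combine the hash with verification and inspect the dedup table; a hedged sketch (the zdb output is illustrative):

# Hedged sketch: sha256 plus byte-by-byte verification on hash collisions.
zfs set dedup=sha256,verify tank/usr/src
# inspect the dedup table (DDT) statistics for the whole pool:
zdb -DD tank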
Installing the system
There are several ways to extract the various parts of the distribution. You can get the -memstick image, one of the cd9660 images, or just download the *.txz files from an FTP site.
Just like in HOWTO on 8.2, we will extract all distributions manually and fetch everything else from the ‘net.
In the example below, I have retrieved one of the -memstick versions, mounted it as an md device on /mnt, and therefore I can find the *.txz files in the /mnt/usr/freebsd-dist directory. Direct download from where you got the kernel.txz works fine as well, of course.
-rw-r--r-- 1 root wheel 782 May 5 07:34 MANIFEST
-rw-r--r-- 1 root wheel 66513440 May 5 07:34 base.txz
-rw-r--r-- 1 root wheel 1442744 May 5 07:34 doc.txz
-rw-r--r-- 1 root wheel 1117756 May 5 07:34 games.txz
-rw-r--r-- 1 root wheel 81061820 May 5 07:34 kernel.txz
-rw-r--r-- 1 root wheel 12291244 May 5 07:34 lib32.txz
-rw-r--r-- 1 root wheel 35504984 May 5 07:34 ports.txz
-rw-r--r-- 1 root wheel 98151876 May 5 07:34 src.txz
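For completeness, a small sketch of the attach/mount step used to obtain the listing above (the image filename is an example, and the exact partition device may differ):

# Hedged sketch: attach the memstick image and mount its UFS partition.
mdconfig -a -t vnode -f FreeBSD-9.2-RELEASE-amd64-memstick.img
mount /dev/md0a /mnt     # the partition may be md0 or md0a depending on the image
ls -l /mnt/usr/freebsd-dist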
Extract all distributions
By default, root uses /bin/csh as its login shell; you need to type sh now in order to cut&paste the examples below.
cd /mnt/usr/freebsd-dist
for i in base doc games kernel lib32 src; do \
xz -d -c $i.txz | tar -C /tank/root/ -xf - ; \
done
Install configuration variables in the proper places
We need to add variables to several files needed for the boot phase; you can use echo(1) or even vi(1).
In /boot/loader.conf:
# fs modules
zfs_load="YES"
geom_mirror_load="YES"
fdescfs_load="YES"
nullfs_load="YES"
# Crypto stuff
geom_eli_load="YES"
crypto_load="YES"
aesni_load="YES"
cryptodev_load="YES"
# pf stuff
pf_load="YES"
pflog_load="YES"
# tuning
vm.kmem_size="32G"
Using the following should not be necessary anymore, as the loader is able to find automatically which dataset will be used:
vfs.root.mountfrom="zfs:tank/root"
For the first boot, in order to verify everything is correct, you can use the following variable to make the passphrase visible:
kern.geom.eli.visible_passphrase="1"
to be removed in production of course.
Then you can add some tunables for your ZFS installation:
# http://lists.freebsd.org/pipermail/freebsd-stable/2011-February/061388.html
vfs.zfs.txg.timeout="5"
You also need to point geli(8) at the right bits for it:
geli_da0p4_keyfile0_load="YES"
geli_da0p4_keyfile0_type="da0p4:geli_keyfile0"
geli_da0p4_keyfile0_name="/boot/keys/boot.key"
geli_da1p4_keyfile0_load="YES"
geli_da1p4_keyfile0_type="da1p4:geli_keyfile0"
geli_da1p4_keyfile0_name="/boot/keys/boot.key"
Current recommendations for ZFS tuning include setting kmem_size to between 1.5x and 2x the available RAM. Be careful to use the right device names in the geli_* lines above or you will not be able to attach the encrypted partitions. Do not use geli_tank0_ with gpt/tank0 for example; that will NOT work.
Do not forget to add the following (or another value) to /etc/sysctl.conf or you will have issues at boot time with vnode depletion:
##-- tuning
kern.maxvnodes=260000
Your /etc/rc.conf should have some variables defined to boot properly:
zfs_enable="YES"
sshd_enable="YES"
hostname="hostname.example.com"
ntpd_enable="YES"
ntpd_sync_on_start="YES"
# or ifconfig_em0="DHCP"
ifconfig_em0="inet a.b.c.d netmask 0xffffff00" # or "DHCP"
geli_swap_flags="-e aes -l 256 -s 4096 -d"
(do not forget things like defaultrouter and all that).
If you were working in a chroot to make editing the previous files easier, exit it now.
Finishing up
There are several steps to follow before even rebooting for the first time (you do remember that every time you reboot, you have to log in on the console and enter the encryption passphrase, right?).
Generate zpool.cache
cd /
mkdir /boot/zfs
zpool export tank && zpool import tank
cp /boot/zfs/zpool.cache /tank/root/boot/zfs/
Copy the /boot bits into place for real boot
cp -pR /root/keys /tank/root/boot/
cd /tank/root
mkdir /zboot/boot
cp -Rp boot/* /zboot/boot/
Configuring the encrypted swap in /tank/root/etc/fstab
scoite# cat /tank/root/etc/fstab
/dev/mirror/swap.eli none swap sw 0 0
Another issue to look out for: by default, you won’t be able to get kernel crash dumps on a gmirror device (see gmirror(8) for the details and the solution). We use two scripts run during the boot process to work around that limitation (as we do not want to always use the prefer balance algorithm for mirrored swap):
echo 'gmirror configure -b prefer swap'>>/tank/root/etc/rc.early
echo 'gmirror configure -b round-robin swap'>>/tank/root/etc/rc.local
Fixing mount points
cd /
zfs umount -a
zfs set mountpoint=legacy tank
zfs set mountpoint=/jails tank/jails
zfs set mountpoint=/tmp tank/tmp
zfs set mountpoint=/var tank/var
zfs set mountpoint=/var/empty tank/var/empty
zfs set mountpoint=/var/named tank/var/named
zfs set mountpoint=/var/run tank/var/run
zfs set mountpoint=/var/tmp tank/var/tmp
zfs set mountpoint=/usr tank/usr
zfs set mountpoint=/usr/local tank/usr/local
zfs set mountpoint=/usr/obj tank/usr/obj
zfs set mountpoint=/usr/ports tank/usr/ports
zfs set mountpoint=/usr/ports/distfiles tank/usr/ports/distfiles
zfs set mountpoint=/usr/ports/packages tank/usr/ports/packages
…and so on for all the other filesets you added above, without forgetting to set the bootfs property on the right fileset:
zpool set bootfs=tank/root tank
Things to remember
When your system is up and working, you will at some point want to update it, either to stay close to the branch or because of a security issue or whatever.
Some things to keep in mind:
- zboot is the main source for the early stage of booting. If you update /boot (which lives in the other, encrypted pool tank), you will need to update zboot/boot with the new binaries, BUT do not replace it without keeping the GELI keys which are in /zboot/boot/keys, or you will not be able to boot (see the sketch after this list).
- Be careful when booting: most KVM devices have issues with some keyboards (like iDRAC6 with a Mac one), so when typing your GELI encryption passphrase, check beforehand that the keys you press are the right ones :)
- you still need to create users, install packages, create your jails (if needed) and so on. Do not forget to also assign a password to the root account (personally I almost never use that account, I prefer using my own Calife or sudo(8) for that).
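The zboot refresh mentioned in the first item above could look like this minimal sketch (paths as used throughout this howto; adapt to your layout):

# Hedged sketch: refresh the unencrypted boot pool after updating /boot,
# keeping the GELI key material it must carry.
cp -Rp /zboot/boot/keys /root/keys.save
cp -Rp /boot/* /zboot/boot/
cp -Rp /root/keys.save/* /zboot/boot/keys/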
Resources
I have begun writing this script to put everything into a single .sh.
NOTE: This is a work-in-progress, please check it regularly for updates.
After discovering Ansible as an automation tool, I’m writing an Ansible playbook to make everything mentioned above easier to configure.
You can find this work in progress on Github. All feedback is welcome, patches or pull requests even more so :)
Feedback
Please send any comments, additions or corrections to my FreeBSD mail or my personal mail. Thanks to all who have already done so.
History
1.0 Creation
1.1 Update for recent mfsbsd images
1.2 Update after recent experiment on 9.1/mfsbsd
1.3 Updated with boot environments and easier way to deal with initial mountpoints by Thomas Quinot
1.4 Mention the Ansible playbook as work in progress.
1.5 Mention the gnop(8) trick
Credits
Thanks to these people for input and corrections.
Stéphane “KingBug” Clodic
Paul Guyot paul@semiocast.com
Thomas Quinot thomas@quinot.org
iMil on Twitter