All things ZFS

Custom Samba Sizing

After reorganizing my ZFS datasets a bit, I suddenly noticed I couldn’t copy any file larger than a few MB. A bit of investigation later, I figured out why.

My ZFS datasets were as follows:

zfs list
 NAME                            USED  AVAIL  REFER  MOUNTPOINT
 Data                           2.06T   965G    96K  none
 Data/Users                      181G   965G    96K  none
 Data/Users/User1               44.3G  19.7G  2.23G  /Data/Users/User1
 Data/Users/User2               14.7G  49.3G   264K  /Data/Users/User2
 Data/Users/User3                224K  64.0G    96K  /Data/Users/User3

And my Samba share was pointing to /Data/Users/.

Guess what? The path /Data/Users was not backed by any dataset because the parent dataset Data/Users was not mounted. Instead it lived on the memory disk md0, which had just a few MB free. And Samba doesn’t check the full path when determining disk size, only the root of the share.

The easiest way to work around this would be to simply mount the parent dataset. But why go for easy?
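
For completeness, the easy route would be a single command along these lines (assuming the parent dataset should indeed be mounted at /Data/Users):

zfs set mountpoint=/Data/Users Data/Users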

A slightly more complicated solution is getting Samba to use a custom script to determine free space. We can then use this script to return the available disk space of the underlying dataset instead of relying on Samba’s built-in calculation.

To do this, we first create the script /myScripts/sambaDiskFree:

#!/bin/sh
DATASET=`pwd | cut -c2-`
zfs list -H -p -o available,used $DATASET | awk '{print $1+$2 " " $1}'

This script checks the current directory, maps its name to a dataset (in my case it is as easy as stripping the first slash character) and returns two numbers: the total disk space first, followed by the available disk space - both in bytes.

Once the script is saved and marked as executable (chmod +x), we just need to reference it in Services > CIFS/SMB > Settings under Additional parameters:

dfree command = /myScripts/sambaDiskFree

This will tell Samba to use our script for disk space determinations.

Adding Mirrored Disk to Existing ZFS Pool

A great thing about ZFS is that even with a single disk you get some benefits - data integrity being the most important. And all ZFS commands work perfectly well, for example status:

zpool status
   pool: Data.Tertiary
  state: ONLINE
 config:
         NAME                   STATE     READ WRITE CKSUM
         Data.Tertiary          ONLINE       0     0     0
           diskid/DISK-XXX.eli  ONLINE       0     0     0

However, what if one disk is not sufficient any more? It is clear that zpool add can be used to create a striped pool for higher speeds. And it is clear we can add another device to an existing mirror to make it a three-way mirror. But what if we want to convert a solo disk into a mirror configuration?
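
For contrast, a plain zpool add of a second disk would just stripe the pool - more space and speed, but no redundancy (a rough example; the second disk ID is a placeholder):

zpool add Data.Tertiary diskid/DISK-YYY.eli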

To convert to a mirror, we can get creative with the attach command, giving it both the existing and the new disk as arguments:

zpool attach Data.Tertiary diskid/DISK-XXX.eli diskid/DISK-YYY.eli

After a few seconds, our mirror is created with all our data intact:

zpool status
   pool: Data.Tertiary
  state: ONLINE
 status: One or more devices is currently being resilvered.  The pool will
         continue to function, possibly in a degraded state.
 action: Wait for the resilver to complete.
 config:
         NAME                     STATE     READ WRITE CKSUM
         Data.Tertiary            ONLINE       0     0     0
           mirror-0               ONLINE       0     0     0
             diskid/DISK-XXX.eli  ONLINE       0     0     0
             diskid/DISK-YYY.eli  ONLINE       0     0     0  (resilvering)

PS: Yes, I use encrypted disks from /dev/diskid/ as in my previous ZFS examples. If you want plain devices, just use ada0 and its companions instead.

Encrypted ZFS for My Backup Machine

I already wrote about my ZFS setup. For my new machine I made a few changes; however, the setup is still NAS4Free based.

The very first thing I forgot last time was randomizing the disks upfront. While it does not increase the security of new data, it does remove any old unencrypted bits you might have lying around. Even if the disk is fresh, you don’t want zeros showing where your data is. The dangerous utility dd comes in handy here (once for each disk):

dd if=/dev/urandom of=/dev/ada0 bs=1M
dd if=/dev/urandom of=/dev/ada1 bs=1M

This takes a while, but fortunately it is possible to see the current progress with Ctrl+T. Do use tmux to keep the session alive as this will take a long time (with a big disk, more than a day is not unexpected).
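
A rough sketch of how to keep such a long-running wipe alive (the session name is arbitrary):

tmux new -s wipe                         # named session survives a dropped SSH connection
dd if=/dev/urandom of=/dev/ada0 bs=1M    # Ctrl+T prints progress (SIGINFO)
# detach with Ctrl+B d and re-attach later with:
tmux attach -t wipe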

Next, instead of using glabel, I decided to use the whole disk. That makes it easier to move the disk to another platform later. No, I am not jumping the BSD ship, but I think having a setup that can change environments is really handy for emergency recovery.

While ZFS can handle device names like ada0 and ada1 and all the shenanigans that come with their dynamic ordering, I decided to rely on the serial number of the drive. Normally device labels containing the serial number are found under the /dev/diskid/ directory. However, NAS4Free doesn’t have them enabled by default.

To turn them on, we go to System, Advanced, and the loader.conf tab. There we add kern.geom.label.disk_ident.enable=1 and reboot. After this, we can use /dev/diskid/* for drive identification.
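
The setting is just a regular loader tunable; on a plain FreeBSD install the equivalent would be this line in /boot/loader.conf:

kern.geom.label.disk_ident.enable="1"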

I then encrypt those drives and attach each one:

geli init -e AES-XTS -l 128 -s 4096 /dev/diskid/DISK-WD-WCC7KXXXXXXX
geli init -e AES-XTS -l 128 -s 4096 /dev/diskid/DISK-WD-WCC7KYYYYYYY

geli attach /dev/diskid/DISK-WD-WCC7KXXXXXXX
geli attach /dev/diskid/DISK-WD-WCC7KYYYYYYY

Finally, I can create the pool. Notice that I put a quota at around 80% of the total pool capacity. Not only does this help performance, it also prevents me from accidentally filling the whole pool. Dealing with a CoW file system when it is completely full is something you want to avoid. And also, do not forget the .eli suffix.

zpool create -o autoexpand=on -m none -O compression=on -O atime=off -O utf8only=on -O normalization=formD -O casesensitivity=sensitive -O quota=3T Data mirror /dev/diskid/DISK-WD-WCC7KXXXXXXX.eli /dev/diskid/DISK-WD-WCC7KYYYYYYY.eli

Thanks to the 4K geli sector size, the pool gets created with an ashift of 12, which a quick zdb check confirms:

zdb | grep ashift
            ashift: 12

Once the pool was created, I snapshotted each dataset on the old machine and sent it over the network. Of course, this assumes your pool is named Data, you are working from the “old” machine, and the new machine is at 192.168.1.2:

zfs snapshot -r Data@Migration
zfs send -Rv Data@Migration | ssh 192.168.1.2 zfs receive -Fs Data

This step took a while (more than a day) as all datasets had to be recursively sent. The network did die a few times, but resumable send saved my ass.

First I would get the receive_resume_token property from the destination:

zfs get receive_resume_token

And resume sending with:

zfs send -v -t <token> | ssh 192.168.1.2 zfs receive -Fs Data/dataset

Unfortunately the resume token does not work with recursion, so each dataset has to be specified separately from that moment onward.
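
A rough way to script the per-dataset resume (dataset names here are placeholders; the token lives on the receiving side):

for DS in Data/Users Data/Media; do
    TOKEN=$(ssh 192.168.1.2 zfs get -H -o value receive_resume_token $DS)
    [ "$TOKEN" = "-" ] && continue       # nothing to resume for this dataset
    zfs send -v -t "$TOKEN" | ssh 192.168.1.2 zfs receive -Fs $DS
done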

Once the bulk of the migration was done, I shut down every single service on the old server. After that I took another (much smaller) snapshot and sent it over the network:

zfs snapshot -r Data@MigrationFinal
zfs send -Ri Data@Migration Data@MigrationFinal | ssh 192.168.1.2 zfs receive -F Data

And that is it - shut down the old machine and bring the services up on the new one.

PS: If the newly created machine goes down, it is enough to re-attach the GELI disks, followed by a restart of the ZFS service:

geli attach /dev/diskid/DISK-WD-WCC7KXXXXXXX
geli attach /dev/diskid/DISK-WD-WCC7KYYYYYYY
/etc/rc.d/zfs onestart

[2018-07-22: NAS4Free has been renamed to XigmaNAS as of July 2018]

Creating a ZFS Backup Machine

With my main ZFS machine completed, it is now time to set up a remote backup. Unlike the main server with two disks and an additional SSD, this one will have just a lonely 2 TB disk inside. The main desire is to have a cheap backup machine that we’ll hopefully never use for recovery.

The OS of choice is NAS4Free and I decided to install it directly on the hard drive, without a swap partition. Installing on a data drive is a bit controversial, but it does simplify the setup quite a bit if you move the drive from machine to machine. And the swap partition is pretty much unnecessary if you have more than 2 GB of RAM. Remember, we are just going to sync to this machine - nothing else.

After NAS4Free is installed (option 4: Install embedded OS without swap), the disk will contain a single boot partition with the rest of the space flopping in the breeze. What we want is to add a simple partition on a 4K boundary for our data:

gpart add -t freebsd -b 1655136 -a 4k ada0
 ada0s2 added

The partition start location was selected to be the first one on a 4 KB boundary after the 800 MB boot partition (1655136 is divisible by 8, so with 512-byte sectors it falls exactly on a 4 KB boundary). We cannot rely on gpart to pick it, as it would select the next available location and that would destroy performance on 4K drives (pretty much any spinning drive these days). And we cannot use freebsd-zfs for the partition type since we are playing with MBR partitions and not GPT.

To make the disk easier to reach, we label that partition:

glabel label -v disk0 ada0s2

And we of course encrypt it:

geli init -e AES-XTS -l 128 -s 4096 /dev/label/disk0
geli attach /dev/label/disk0

The last step is to actually create our backup pool:

zpool create -O readonly=on -O canmount=off -O compression=on -O atime=off -O utf8only=on -O normalization=formD -O casesensitivity=sensitive -O recordsize=32K -m none Backup-Data label/disk0.eli

To back up data we can then use zfs send/receive for the initial sync (REMOTE_HOST below stands for the backup machine’s address):

DATASET="Data/Install"
zfs snapshot ${DATASET}@$Backup_Next
zfs send -p $DATASET@$Backup_Next | ssh $REMOTE_HOST zfs receive -du Backup-Data
zfs rename $DATASET@$Backup_Next Backup_Prev

And a similar one for incremental syncs from then on:

DATASET="Data/Install"
zfs snapshot ${DATASET}@$Backup_Next
zfs send -p -i $DATASET@$Backup_Prev $DATASET@$Backup_Next | ssh $REMOTE_HOST zfs receive -du Backup-Data
zfs rename $DATASET@$Backup_Next Backup_Prev

There are a lot more details to think about (for example, the old Backup_Prev snapshot has to be destroyed before the rename can succeed), so I will share the script I am using - adjust at will.
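
In the meantime, here is a rough sketch of what such a script might look like (this is not the original script; the dataset list and the remote host are placeholders - adjust them to your own setup):

#!/bin/sh
# Rough sketch of the backup loop; adjust REMOTE_HOST and DATASETS to your environment.
REMOTE_HOST="192.168.1.2"
REMOTE_POOL="Backup-Data"
DATASETS="Data/Install Data/Family Data/Media"

for DATASET in $DATASETS; do
    zfs snapshot "${DATASET}@Backup_Next" || continue

    if zfs list -t snapshot "${DATASET}@Backup_Prev" >/dev/null 2>&1; then
        # a previous backup exists - send only the differences
        zfs send -p -i "${DATASET}@Backup_Prev" "${DATASET}@Backup_Next" \
            | ssh "$REMOTE_HOST" zfs receive -du "$REMOTE_POOL"
    else
        # first run for this dataset - full send
        zfs send -p "${DATASET}@Backup_Next" \
            | ssh "$REMOTE_HOST" zfs receive -du "$REMOTE_POOL"
    fi

    if [ $? -eq 0 ]; then
        # transfer went through - rotate the snapshots
        zfs destroy "${DATASET}@Backup_Prev" 2>/dev/null
        zfs rename "${DATASET}@Backup_Next" Backup_Prev
    else
        # transfer failed - keep Backup_Prev and drop the failed attempt
        zfs destroy "${DATASET}@Backup_Next"
    fi
done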


[2018-07-22: NAS4Free has been renamed to XigmaNAS as of July 2018]

Adding Cache to ZFS Pool

To increase the performance of a ZFS pool, I decided to use a read cache in the form of an SSD partition. As always with ZFS, a certain amount of micromanagement is needed for optimal benefits.

The usual recommendation is to have up to 10 GB of cache for each 1 GB of available RAM, since ZFS always keeps the headers for cached information in RAM. As my machine had a total of 8 GB, this pretty much restricted me to a cache size somewhere in the 60 GB range.

To keep things sane, I decided to use 48 GB myself. As sizes go, this is quite an unusual one and I doubt you can even get such an SSD. Not that it mattered, as I already had a leftover 120 GB SSD lying around.

Since I already had NAS4Free installed on it, I checked the partition status:

gpart status
  Name  Status  Components
  da1s1      OK  da1
 ada1s1      OK  ada1
 ada1s2      OK  ada1
 ada1s3      OK  ada1
 ada1s1a     OK  ada1s1
 ada1s2b     OK  ada1s2
 ada1s3a     OK  ada1s3

and deleted the last partition:

gpart delete -i 3 ada1
 ada1s3 deleted

Then we have to create the partition and label it (labeling is optional):

gpart add -t freebsd -s 48G ada1
 ada1s3 added

glabel label -v cache ada1s3

As I had an encrypted data pool, it only made sense to encrypt the cache too. For this it is very important to check the physical sector size:

camcontrol identify ada1 | grep "sector size"
 sector size           logical 512, physical 512, offset 0

Whichever physical sector size you see there is the one you should give to geli, as otherwise you will get a permanent ZFS error status when you add the cache device. It won’t hurt the pool, but it will hide any real errors going on, so it is better to avoid. In my case, the physical sector size was 512 bytes:

geli init -e AES-XTS -l 128 -s 512 /dev/label/cache
geli attach /dev/label/cache

The last step is adding the encrypted cache to our pool:

zpool add Data cache label/cache.eli

All that is left is to enjoy the speed. :)
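
If you want to confirm the cache device is actually being used, zpool iostat can show per-device statistics (here refreshing every 5 seconds):

zpool iostat -v Data 5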


[2018-07-22: NAS4Free has been renamed to XigmaNAS as of July 2018]

My Encrypted ZFS Setup

For my Nas4Free-based NAS I wanted to use full-disk encrypted ZFS in a mirror configuration across one SATA and one USB drive. While it might not be optimal for performance, ZFS does support this scenario.

On booting NAS4Free I discovered my disk devices were all over the place. To identify which one is which, I used diskinfo:

 diskinfo -v ada0
 ada0
         512             # sectorsize
         2000398934016   # mediasize in bytes (1.8T)
         3907029168      # mediasize in sectors
         4096            # stripesize
         0               # stripeoffset
         3876021         # Cylinders according to firmware.
         16              # Heads according to firmware.
         63              # Sectors according to firmware.
         S34RJ9AG212718  # Disk ident.

Once I went through all the drives (USB drives are named da*), I found my data disks were at ada0 and da2. To avoid any confusion in the future and/or potential re-enumeration if I add another drive, I decided to give them names. The SATA disk would be known as disk0 and the USB one as disk1:

glabel label -v disk0 ada0
 Metadata value stored on /dev/ada0.
 Done.

glabel label -v disk1 da2
 Metadata value stored on /dev/da2.
 Done.

Do notice that you lose the last sector of the drive to the label metadata. In my opinion, a small price to pay.

On top of the labels we need to create the encrypted devices. Be careful to use the labels and not the whole disk:

geli init -e AES-XTS -l 128 -s 4096 /dev/label/disk0
geli init -e AES-XTS -l 128 -s 4096 /dev/label/disk1

As initialization doesn’t really make devices readily available, both have to be manually attached:

geli attach /dev/label/disk0
geli attach /dev/label/disk1

With all that dealt with, it was time to create the ZFS pool. Again, be careful to use the inner device (ending in .eli) instead of the outer one:

zpool create -f -O compression=on -O atime=off -O utf8only=on -O normalization=formD -O casesensitivity=sensitive -m none Data mirror label/disk{0,1}.eli

While both the SATA and the USB disk are advertised as the same size, they do differ a bit. Due to this we need to use -f to force ZFS pool creation (otherwise we get a “mirror contains devices of different sizes” error). Do not worry about the data, as the maximum available space will simply be restricted to the smaller device.

I decided that the pool is going to have compression turned on by default, there will be no access time recording, it will use UTF-8, it will be case sensitive (yes, I know…) and it won’t be “mounted”.
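
An optional sanity check that all those properties took effect:

zfs get compression,atime,utf8only,normalization,casesensitivity,mountpoint Data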

Lastly I created a few logical datasets for my data. Yes, you could use a single dataset, but quotas make handling of multiple ones worth it:

zfs create -o mountpoint=/mnt/Data/Family -o quota=768G Data/Family
zfs create -o mountpoint=/mnt/Data/Install -o quota=256G Data/Install
zfs create -o mountpoint=/mnt/Data/Media -o quota=512G Data/Media

As I am way too lazy to log in after every reboot, I also saved my password into the password.zfs file on a TmpUsb self-erasable USB drive. A single addition to System->Advanced->Command scripts as a postinit step was needed to do all the necessary initialization:

/etc/rc.d/zfs onestop ; mkdir /tmp/TmpUsb ; mount_msdosfs /dev/da1s1 /tmp/TmpUsb ; geli attach -j /tmp/TmpUsb/password.zfs /dev/label/disk0 ; geli attach -j /tmp/TmpUsb/password.zfs /dev/label/disk1 ; umount -f /tmp/TmpUsb/ ; rmdir /tmp/TmpUsb ; /etc/rc.d/zfs onestart

All this long command does is mount the FAT12 drive containing the password (since the drive was recognized as da1, its first partition is at da1s1) and use the key file found there to attach the encrypted devices. A small restart of the ZFS subsystem is all that is necessary for the pool to reappear.
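
For readability, here is the same one-liner broken out step by step (functionally identical):

/etc/rc.d/zfs onestop                                        # stop ZFS while we fiddle with the disks
mkdir /tmp/TmpUsb                                            # temporary mount point
mount_msdosfs /dev/da1s1 /tmp/TmpUsb                         # mount the FAT12 key drive
geli attach -j /tmp/TmpUsb/password.zfs /dev/label/disk0     # attach both encrypted devices using the key file
geli attach -j /tmp/TmpUsb/password.zfs /dev/label/disk1
umount -f /tmp/TmpUsb/                                       # unmount and remove the key drive mount point
rmdir /tmp/TmpUsb
/etc/rc.d/zfs onestart                                       # start ZFS again so the pool reappears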

As I wanted my TmpUsb drive to be readable under Windows, it is not labeled and thus manual script correction might be needed if further USB devices are added.

However, for now, I had my NAS box data storage fully up and running.


Growing a ZFS Pool


As I was building my media server, I decided I must have support for both network file sharing and the DAAP protocol. That way I could (comfortably) listen to my media collection using both my PC and my mobile phone. That simple requirement and the fact that I had some experience with FreeNAS 0.7 drove me to NAS4Free.

I installed it in VirtualBox with two virtual drives: one 512 MB and another 2 GB. On the smaller one I did an embedded install of NAS4Free. The second one I formatted as ZFS (a.k.a. the one *nix file system that doesn’t suck on power loss) and assigned to a single ZFS virtual device, which was in turn assigned to a single ZFS pool. Both DAAP and Samba were then pointed to this share for their consumables.

Since I naively allocated only 2 GB, I soon stumbled upon the topic of this blog post: how do I increase my ZFS volume?

Increasing a virtual disk is really easy and there is probably a handful of tools for every disk type (VHD in my case). I decided to use VirtualBox’s built-in VBoxManage tool (found in VirtualBox’s directory; don’t forget to turn off the virtual machine first):

VBoxManage.exe modifyhd "D:\Media.vhd" --resize 8192
 0%...10%...20%...30%...40%...50%...60%...70%...80%...90%...100%

While this was simple enough, it didn’t resolve anything. The next boot-up showed that ZFS still assumed 2 GB as the size of the pool (at Disks, ZFS, Pools). If only there was a way for ZFS to recognize that the disk got bigger…

Fortunately there is just such a command. I went to Advanced, Command and executed the following line:

zpool online -e Media-Pool ada1

Media-Pool was the name of the ZFS pool I was increasing and ada1 was the actual disk the pool was located on. And that was it: my pool was grown painlessly and without any need for copying data around (except for backup purposes, of course). While it was nowhere close to the comfort of using a mouse to perform this task in Disk Management, it wasn’t too bad either.
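
As an optional extra (not part of the original steps), setting the autoexpand property lets the pool pick up additional space on its own the next time the device is reopened; otherwise zpool online -e does it on demand:

zpool set autoexpand=on Media-Pool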

PS: This post assumes that you know how to use NAS4Free. If you don’t use it, do try it out. You’ll love it.

PPS: Essentially the same procedure works for FreeNAS or any other situation where you have a virtualized disk drive with a ZFS pool on it.

[2018-07-22: NAS4Free has been renamed to XigmaNAS as of July 2018]