Recovering ZFS

Well, after using ZFS for years, it was only a matter of time before I encountered an issue. It all started with me becoming impatient with my Ubuntu 20.04 desktop. The CPU was at 100% and the system was actively writing to disk (courtesy of ffmpeg), but I didn’t want to wait. So, I decided to do a hard reset. What’s the worst that could happen?

Well, the boot process took a while and I was presented with a bunch of entries looking like this:

INFO: task z_wr_iss blocked for more than 122 seconds.
      Tainted: P      0E      6.8.0-35-generic #35-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task z_metaslab blocked for more than 122 seconds.
      Tainted: P      0E      6.8.0-35-generic #35-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
INFO: task vdev_autotrim blocked for more than 122 seconds.
      Tainted: P      0E      6.8.0-35-generic #35-Ubuntu
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
…

At first, I thought the system would recover on its own as this wasn’t the first time I had mistreated it. However, leaving it alone did nothing. So, it was time for recovery.

The first step was getting to the GRUB menu. Pressing <ESC> multiple times during boot simply dropped me to a prompt. And yes, you can load initramfs manually from there, but it’s a bit tedious. Instead, I just typed “normal” to get the menu, followed by <E> to edit the line starting with “linux”. There, I appended “break”, telling the system to drop me into the initramfs prompt so that I could manually load ZFS.

From here, there was another hurdle to bypass. While this stopped before ZFS was loaded, it also stopped before my disks were decrypted. Had I used native ZFS encryption, this wouldn’t be a problem, but I wanted LUKS, so now I had to load them manually. As I had a mirror, I used the following to open both:

DISK1=/dev/disk/by-id/nvme-<disk1>
DISK2=/dev/disk/by-id/nvme-<disk2>

cryptsetup luksOpen $DISK1-part4 ${DISK1##*/}-part4
cryptsetup luksOpen $DISK2-part4 ${DISK2##*/}-part4
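
With more than two disks, the same commands can be folded into a loop. This is only a minimal sketch, assuming the same -part4 layout as above; the disk IDs are placeholders and luks_name is a hypothetical helper, not a cryptsetup feature. It prints the commands instead of running them so they can be reviewed first.

```shell
# Hypothetical helper: mapper name is the by-id file name plus the partition suffix
luks_name() {
    echo "${1##*/}-part4"
}

# Placeholder disk IDs; substitute your own
for DISK in /dev/disk/by-id/nvme-EXAMPLE1 /dev/disk/by-id/nvme-EXAMPLE2; do
    # Dry run: remove "echo" to actually open the LUKS container
    echo cryptsetup luksOpen "$DISK-part4" "$(luks_name "$DISK")"
done
```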

Finally, I was ready to import the pool and decided to do it in read-only mode:

zpool import -o readonly=on <pool>

And surprisingly, it worked. That also meant that my recovery efforts didn’t need to go too far. So, I decided to try importing it again but in read/write mode:

zpool export <pool>
zpool import <pool>

And then I was greeted with an ominous message:

PANIC: zfs: adding existent segment to range tree

However, the import didn’t get stuck as such, and my data was still there. So, I decided to give it a good scrub:

zpool scrub <pool>
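
If you would rather not keep checking on the scrub by hand, the wait can be scripted. This is a minimal sketch, assuming OpenZFS 2.0 or newer (where zpool wait is available); scrub_and_check is a hypothetical helper name and the pool name is passed as an argument.

```shell
# Hypothetical helper: scrub a pool, block until done, then report problems
scrub_and_check() {
    pool="$1"
    zpool scrub "$pool" || return 1
    # Blocks until the scrub activity on this pool has finished
    zpool wait -t scrub "$pool" || return 1
    # Prints details only for a pool with problems, otherwise a short "healthy" note
    zpool status -x "$pool"
}

# Usage (placeholder pool name): scrub_and_check <pool>
```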

While the scrub didn’t find any errors, going over all data seemed to have resulted in data structures “straightening out” and thus everything looked as good as before.

One reboot later, and I got into my desktop just fine.


PS: If that failed, I would have probably gone with zpool import -F <pool>.

PPS: If that also failed, disabling replays would be my next move.

echo 1 > /sys/module/zfs/parameters/zil_replay_disable
echo 1 > /sys/module/zfs/parameters/zfs_recover

PPPS: You can also add those parameters to the “linux” GRUB line (zfs.zil_replay_disable=1 zfs.zfs_recover=1).

Rendering KiCAD PCB to PNG

I have pretty much automated creation of KiCAD export files thanks to its command line interface (https://docs.kicad.org/8.0/en/cli/cli.html). However, there was always one thing missing: an image for each PCB. Yes, it’s not really necessary for manufacturing, but it’s really handy when you want to quickly check things without starting KiCAD and/or for any inventory system.

Before KiCAD 8, getting a PCB image required exporting to a .step file and then converting stuff around. It got complicated enough with all the prerequisites that I had essentially given up. Fortunately, that’s not true anymore, and now we can get .wrl files that can often be used directly.

kicad-cli pcb export vrml --output "board.wrl" "board.kicad_pcb"

While exported files don’t include lights, this usually doesn’t matter to viewers, which add their own. However, for our processing, we want to add a light that shines from just slightly to the top-left (direction 0.1 -0.1 -1) of the camera (direction 0 0 -1).

head -1 "board.wrl" > "board.front.wrl"
cat <<EOF >> "board.front.wrl"
Transform {
    children [
        DirectionalLight {
            on TRUE
            intensity 0.63
            ambientIntensity 0.21
            color 1.0 1.0 1.0
            direction 0.1 -0.1 -1
        }
EOF
cat "board.wrl" >> "board.front.wrl"
echo "] }" >> "board.front.wrl"

And yes, this method is a bit crude but it does work.
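
The same steps can be wrapped into a small function so that the light direction becomes a parameter. This is only a sketch: add_light is a hypothetical name, and unlike the commands above it skips the header line when copying the original file so that the header doesn’t appear twice.

```shell
# Hypothetical helper: add_light <input.wrl> <output.wrl> <x> <y> <z>
add_light() {
    src="$1"; dst="$2"
    # The first line is the "#VRML ..." header and has to stay on top
    head -n 1 "$src" > "$dst"
    cat <<EOF >> "$dst"
Transform {
    children [
        DirectionalLight {
            on TRUE
            intensity 0.63
            ambientIntensity 0.21
            color 1.0 1.0 1.0
            direction $3 $4 $5
        }
EOF
    # Copy the rest of the model, then close "children [" and "Transform {"
    tail -n +2 "$src" >> "$dst"
    echo "] }" >> "$dst"
}

# Usage: add_light "board.wrl" "board.front.wrl" 0.1 -0.1 -1
```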

In order to reduce this 3D model into a 2D image, ray tracing comes in handy. However, for that we need an external tool. I found that RayHunter worked great from the command line. While you only need that tool, you might want to check Castle Model Viewer as it can show you the parameters in a more interactive way. Please note you also need the libpng-dev package for PNG output.

cd ~/Downloads
wget https://master.dl.sourceforge.net/project/castle-engine/rayhunter/rayhunter-1.3.4-linux-x86_64.tar.gz
tar xzvf rayhunter-1.3.4-linux-x86_64.tar.gz
sudo install -m 0755 ~/Downloads/rayhunter/rayhunter /usr/local/bin/rayhunter

sudo apt install --yes libpng-dev

With the prerequisites out of the way, we can finally export our board, looking directly at its front side. The camera position is a bit further away than I usually need for my boards, but the resolution of 4320x4320 pixels is large enough to crop the unneeded pixels later.

rayhunter classic 7 \
    4320 4320 \
    "board.front.wrl" \
    "board.front.png" \
    --camera-pos 0 0 6 \
    --camera-dir 0 0 -1 \
    --scene-bg-color 1 1 1

You can see what each parameter does in the rayhunter documentation.

If all went fine, this will give you a nicely rendered PCB but with quite a lot of whitespace. In order to remove those extra pixels, I like to use ImageMagick.

sudo apt install -y imagemagick

Using the convert command, I trim all the whitespace, resize the image to 1060x1060 pixels, add a 10-pixel border on each side, and finally extend it to the full 1080x1080 size.

convert \
    "board.front.png" \
    -trim \
    -resize 1060x1060 -normalize -density 600 \
    -bordercolor white -border 10 \
    -gravity center -extent 1080x1080 \
    "board.front.png"

Congrats, your PCB image should be looking good just about now.
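
The 1060 in the resize step is not arbitrary: it is the final size minus the border on both sides. Here is a small sketch of that arithmetic, in case you want a different output size (inner_size is a hypothetical helper):

```shell
# Hypothetical helper: inner_size <final size> <border> -> value for -resize
inner_size() {
    echo $(( $1 - 2 * $2 ))
}

SIZE=1080
BORDER=10
INNER="$(inner_size "$SIZE" "$BORDER")"
# Prints the matching ImageMagick options for the chosen size
echo "-resize ${INNER}x${INNER} -border ${BORDER} -extent ${SIZE}x${SIZE}"
```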


PS: You can do this from any PCB orientation by just adjusting camera and light position.
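
As an illustration, here is a sketch of the values for looking at the back of the board. It assumes the board sits around the origin in the XY plane (as the front camera at 0 0 6 suggests). The back values are simply the front ones mirrored and are not taken from any documentation, so treat them as a starting point. view_params is a hypothetical helper.

```shell
# Hypothetical helper: prints light direction, camera position, and camera direction
view_params() {
    case "$1" in
        # Values used for the front render above
        front) echo "light=0.1 -0.1 -1 pos=0 0 6 dir=0 0 -1" ;;
        # Assumption: mirrored along Z, with light X flipped to stay top-left of the viewer
        back)  echo "light=-0.1 -0.1 1 pos=0 0 -6 dir=0 0 1" ;;
        *)     echo "unknown side: $1" >&2; return 1 ;;
    esac
}

view_params back
```

The light value goes into the DirectionalLight direction field, while pos and dir go to --camera-pos and --camera-dir respectively.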

Supermicro IPMI Password Reset

For a long while I had issues with my AMD Epyc Supermicro board. While the issues “smelled” like memory, it wasn’t easy to pinpoint why exactly the system would get stuck a couple of times a week. It all pointed either toward the motherboard or, more likely, the processor itself. Either way, I decided to obtain an equivalent board.

Being a cheap bastard, I didn’t want to buy it new but decided to peruse local Craigslist (no luck) and eBay. It being my lucky day, I got a really good price on a similar but not exactly the same board. Yes, the new board was Intel, used slightly more watts, and had less memory bandwidth and fewer LAN ports. That said, it allowed me to use the same ECC memory I used on the AMD board, and those 2 LAN ports were of the 10G variant, which does make me want to buy a new switch. Suffice to say, I don’t consider it a downgrade.

But, as always when used boards are bought, one doesn’t necessarily have all the passwords. In my case, the missing password was for the remote management interface (IPMI). Clearing CMOS was of no help.

I did search the Internet for solutions but none worked exactly as written. After slight modifications, I managed to erase my IPMI configuration so I might as well share what worked for me in 2024. I am sure that in a couple of years, these instructions will be incomplete too but that’s a problem for a future me.

First of all, you need IPMI tools. I downloaded them directly from Supermicro as these low-level operations tend not to work properly when using generic tools you might have with your Linux distribution. And yes, these instructions will be Linux-based. If you have something else, check the tools regardless as they contain executables for other operating systems too.

For Linux, we want to extract the IPMICFG-Linux.x86_64 file and then allow for its execution by setting the x bit:

chmod +x ./IPMICFG-Linux.x86_64

An excellent way to check if the tool is working (and to double-check if you’re on the correct host) is to see what the current IPMI settings actually are. Run:

./IPMICFG-Linux.x86_64 -m

Finally, we can fully reset IPMI by completely erasing its configuration (-fde) followed by a factory reset of the actual users (-fd 3). Some guides would have you just perform a user reset but that didn’t work for me. I had to reset both and in this order:

./IPMICFG-Linux.x86_64 -fde -d
./IPMICFG-Linux.x86_64 -fd 3 -d

Finally, the usual ADMIN/ADMIN can be used to get into IPMI.

ZFS Encryption Speed (Ubuntu 24.04)

Well, another Ubuntu version, another set of encryption performance tests. Here are the results for Ubuntu 24.04 on kernel 6.8 using ZFS 2.2.2. As I’ve been doing this for quite a few versions now, you can find older tests for Ubuntu 23.10, 23.04, 22.10, 22.04, 20.10, and 20.04.

Testing was done on a Framework laptop with an i5-1135G7 processor and 64GB of RAM. Once booted into the installation media, I execute a script that creates a 42 GiB RAM disk that hosts all data for six 6 GiB files. Those files are then used in a RAIDZ2 configuration to create a ZFS pool. The process is repeated multiple times to test all the different native ZFS encryption modes in addition to a LUKS-based test. This whole process is then repeated with AES disabled. As before, the test is a simple dd copy of 4 GB files; however, this time I included FIO tests for sequential and random read/write. One thing absent for the 24.04 round is a 2-core run. Relative performance between a 2-core and 4-core setup remained about the same over the many years I’ve been doing this testing and thus it doesn’t really seem worth the effort.
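
To make the setup a bit more concrete, here is a rough sketch of how one such pool could be created. This is not the actual test script; the /mnt/ramdisk path, the testpool name, and the pool_commands helper are assumptions made for illustration, and the commands are only printed rather than executed.

```shell
# Hypothetical helper: print the commands for one native encryption mode
pool_commands() {
    mode="$1"                      # e.g. aes-256-gcm or aes-256-ccm
    dir="/mnt/ramdisk"             # assumed mount point for the RAM disk
    echo "mount -t tmpfs -o size=42G tmpfs $dir"
    files=""
    for i in 1 2 3 4 5 6; do
        # Six sparse 6 GiB files that act as the pool's "disks"
        echo "truncate -s 6G $dir/disk$i.img"
        files="$files $dir/disk$i.img"
    done
    echo "zpool create -O encryption=$mode -O keyformat=passphrase testpool raidz2$files"
}

pool_commands aes-256-gcm
```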

Illustration

Since I am testing on the same hardware as previously, I expected little to no difference in performance but I was pleasantly surprised as performance did significantly increase across the board by about 20%. Considering 23.10 decreased performance by 10%, it’s nice to see we have that performance recovered with a bit of improvement on top. If you need more disk performance out of your existing hardware, you should really consider upgrading to Ubuntu 24.04.

When it comes to the relative performance, nothing really changed. ZFS encryption is still more performant than LUKS on writes and LUKS exhibits slightly higher performance when it comes to reads. CCM modes are still atrocious but, if your processor doesn’t have AES support, they might be useful.

Illustration

As I plan to use FIO instead of a simple dd copy going forward, it’s as good a time as any to analyze those numbers too. Unsurprisingly, the sequential performance numbers are about the same as those of the simple dd copy. The only outlier seems to be read performance, which drops a bit more than the other readings. My best guess is that this is due to the higher parallel I/O demands FIO makes.

Illustration

Since I am using FIO, I decided to add random I/O too. I expected results to be lower but the numbers still surprised me. Write performance dropped to 50 MB/s without encryption. With encryption, performance drops even further to 30 MB/s. Fortunately, real loads are not as unforgiving as FIO so you can expect much better performance in real life.

In the future, there are a few things I plan to change. First of all, I plan to switch to using FIO instead of dd. While I will probably still collect dd data, it will just be there so one can compare it more easily to older tests and not as a main tool. Secondly, I plan to switch LUKS to 4K blocks and not bother measuring the 512-byte sector size at all. Most drives these days have 4K sectors and thus it makes sense that any proper LUKS installation would match that sector size. Making it the default just makes sense. Performance-wise, 4K blocks are not a huge improvement but they do bring LUKS numbers closer to the native encryption.


PS: Raw data is available in Google Sheets.

AMD processor temperature under Ubuntu 24.04

I often like to check my laptop’s temperature when I am doing something that requires a lot of power. I found that knowing the temperature really helps with understanding where the limits lie. However, my old scripts that worked on Intel systems don’t work on AMD. So I went to research it a bit.

After a bit of snooping around, I found that all the data lives under /sys/class/hwmon/. It’s there that we can find multiple _label files, each describing a temperature source. The one we’re after is Tctl. Once we look over all of these, the THERMAL_SOURCES variable should contain the file path (or more of them) for the temperature expressed in thousandths of ℃.

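THERMAL_SOURCES=""
# hwmon* (rather than hwmon?) also matches hwmon10 and above
for THERMAL_LABEL_FILE in $(find /sys/class/hwmon/hwmon*/ -type f -name "temp*_label" -print); do
    THERMAL_LABEL="$(cat "$THERMAL_LABEL_FILE")"
    if [ "$THERMAL_LABEL" = "Tctl" ]; then
        # Append instead of overwriting so that multiple Tctl sources are all kept
        THERMAL_SOURCES="$THERMAL_SOURCES $(echo "$THERMAL_LABEL_FILE" | sed 's/_label$/_input/')"
    fi
done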

Knowing which file contains a temperature is only the first part. What I like to do next is to fold all temperatures (if multiple sources exist) into a single figure by selecting the maximum value. Then, it’s just a matter of moving the decimal point around to get a whole number reading.

TEMP_ALL="$(cat $THERMAL_SOURCES | awk '{print $1}' | sort -n)"
TEMP_MAX="$(echo "$TEMP_ALL" | tail -n 1 | awk '{print int(($1 + 500) / 1000) }')"
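
Both parts can also be combined into a single function. This is a minimal sketch rather than the script I actually use: max_tctl is a hypothetical name, and the optional directory argument exists only so that the function can be tried against a fake hwmon tree.

```shell
# Hypothetical helper: print the highest Tctl reading in whole degrees Celsius
max_tctl() {
    root="${1:-/sys/class/hwmon}"
    max=""
    for label in "$root"/hwmon*/temp*_label; do
        [ -r "$label" ] || continue
        [ "$(cat "$label")" = "Tctl" ] || continue
        # temp1_label -> temp1_input
        input="${label%_label}_input"
        [ -r "$input" ] || continue
        value="$(cat "$input")"
        if [ -z "$max" ] || [ "$value" -gt "$max" ]; then
            max="$value"
        fi
    done
    # No Tctl source found at all
    [ -n "$max" ] || return 1
    # Round thousandths of a degree to the nearest whole degree
    echo $(( (max + 500) / 1000 ))
}

max_tctl || echo "No Tctl sensor found"
```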