Basic XigmaNAS Stats for InfluxDB

My home monitoring covered pretty much anything I wanted to see, with one exception - my backup NAS. You see, I use embedded XigmaNAS for my backup server, and getting the Telegraf client onto it is problematic at best. However, who needs the Telegraf client anyhow?

Collecting the stats themselves is easy enough. The basic stats the Telegraf client usually gathers can be read via command-line tools just as easily. As long as you keep the same tags and fields Telegraf usually sends, you can nicely mingle your manually collected stats with what a proper client sends.

And how do we send it? The protocol Telegraf speaks is essentially just a set of lines pushed using HTTP POST. Yes, if you have a bit more secure system, it’s probably HTTPS and it might even be authenticated. But it’s still a POST in essence.

And therein lies XigmaNAS’ problem. There is no curl or wget tooling available. And thus sending an HTTP POST on embedded XigmaNAS is not possible. Or is it?

Well, here is the beauty of HTTP - it’s just freaking text over a TCP connection. And the ancient (but still beloved) nc tool is good at exactly that - sending stuff over the network. As long as you can “echo” stuff, you can redirect it to nc and pretend you have a proper HTTP client. Just don’t forget to set the headers.
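Just to illustrate the idea, here is a minimal sketch of such a push. The host, port, and database name are placeholders, and I’m using the 1-minute load average as a stand-in metric:

#!/bin/sh
# Minimal sketch: push one InfluxDB line-protocol point using nothing but nc.
# Host, port, and database below are placeholders - adjust them to your setup.
HOST="influxdb.example.com"
PORT="8086"
DB="telegraf"

# One point in line protocol; FreeBSD's 1-minute load average as an example.
LOAD="$(sysctl -n vm.loadavg | awk '{print $2}')"
BODY="system,host=$(hostname) load1=${LOAD}"

# HTTP/1.0 with explicit Content-Length so the server knows when we're done.
printf 'POST /write?db=%s HTTP/1.0\r\nContent-Length: %s\r\n\r\n%s' \
    "$DB" "${#BODY}" "$BODY" | nc "$HOST" "$PORT"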

To cut the story short - here is my script using nc to push statistics from XigmaNAS to my Grafana setup. It’ll send basic CPU, memory, temperature, disk, and ZFS stats. Enjoy.

Drive Shucking

Diskar morghulis.

All drives must die. That was the thought going through my mind as I noticed the pending-remap SMART counter increasing for one of my backup NAS drives. And it wasn’t just SMART statistics; ZFS was showing checksum errors too. No doubt, it was time for a new drive.

Even though I was lucky enough to have an older-generation CMR Red drive, I was also unlucky enough to be out of warranty - by 2 months. Since my needs had increased in the meantime, I also didn’t want to just get the same drive again. Nope, I wanted to take the first step toward more capacity in my backup mirror.

I checked drive prices and saw that they were not where I wanted them to be. So, after giving it some thought, I went looking into alternatives. Finally, I decided to go the same route I went when I created my very first NAS, back in the USA. Shucking.

For those wondering, shucking drives is just a funny name for buying an external drive, removing it from its enclosure, and using it as you would any internal drive. The major advantage is cost. These drives are significantly cheaper than special NAS drives. I got 12 TB priced at $195 when the same-size Red was around $300. That’s significant savings.

The downside is that you have no idea what you’re gonna get. Yes, if you order a drive larger than 8 TB you can count on CMR, but anything else is an unknown. Most of the time you’re gonna end up with a “white label” drive. These seem to be enterprise-class drives with the power-disable feature (which causes issues with some desktop power supplies), spinning at 5,400 RPM instead of 7,200. Essentially, there is a good chance you got a drive that couldn’t pass internal tests at the full speed.

This is also reflected in the warranty. My drive came with only a 2-year warranty. Even worse, there is a decent chance the manufacturer will simply refuse service unless you send in the whole enclosure. If the enclosure gets damaged while you’re getting the drive out - you might be out of luck.

Regardless, the savings were too tempting to refuse, so I got myself one. It’s a backup machine, after all.

To minimize any risk of dead-on-arrival surprises, I first used it in its intended form - as a USB drive. The first step was to encrypt the whole drive, thus essentially writing over every byte. This took ages, reminding me why larger drives might not be the best choice. Once the whole disk was filled with random data, I placed it into my ZFS mirror and let resilvering do its magic.

Only once the drive was fully accepted into a 3-way mirror did I perform the shucking, as shown in a YouTube video. Once it was out of its enclosure, I powered off the server (no hot-swap) and swapped it in for the failing drive. ZFS was smart enough to recognize it was the same drive, and the only remaining task was to manually detach the old drive from the pool.
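For the curious, the whole procedure roughly boils down to the following. Device names and the pool name are placeholders, and geli is just one convenient way to put random-looking data onto every sector:

# burn-in: encrypt the USB drive (da0) and fill it, overwriting every byte
geli onetime /dev/da0
dd if=/dev/zero of=/dev/da0.eli bs=1M
geli detach da0.eli

# extend the existing mirror to 3-way and wait for resilver to finish
zpool attach tank ada1 da0
zpool status tank

# ...shuck the drive, swap it in for the failing one, power back on...

# finally, drop the failing drive from the mirror
zpool detach tank ada2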

No issues so far. Let’s see how it goes.


PS: I am not above buying a used drive on eBay either, but these days asking prices for used drives are just ridiculous…

Mikrotik SNMP via Telegraf

As I moved most of my home monitoring to Grafana/InfluxDB, I got two challenges to deal with. One was monitoring my XigmaNAS servers and the other was properly handling Mikrotik routers. I’ll come back to XigmaNAS in a later post, but today let’s see what can be done for Mikrotik.

Well, Mikrotik is a router, and essentially all routers are meant to be monitored over SNMP. So, the first step is going to be turning it on under System/SNMP. You want it read-only and you want to customize the community string. You might also want SHA1/AES authentication/encryption, but that has to be configured on both sides and I generally skip it for my home network.
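On the router itself that boils down to something like this (the community name is a placeholder, of course):

# rename the default community and keep it read-only
/snmp community set [find name="public"] name=mycommunity read-access=yes write-access=no
# turn the SNMP service itself on
/snmp set enabled=yes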

Once you’re done, you can turn on Telegraf’s SNMP input plugin and data will flow. But the data that flows will not include any Mikrotik-specific stuff. Most notably, I wanted simple queues. And, once you know the process, adding those is actually reasonably easy.
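For reference, the bare-bones plugin configuration is just a few lines in telegraf.conf; the router address and community string below are placeholders for your own:

[[inputs.snmp]]
  agents = ["192.168.1.1"]
  version = 2
  community = "mycommunity"

  # a standard field or two, just to confirm data is flowing
  [[inputs.snmp.field]]
    name = "uptime"
    oid = ".1.3.6.1.2.1.1.3.0"
  [[inputs.snmp.field]]
    name = "hostname"
    oid = ".1.3.6.1.2.1.1.5.0"
    is_tag = true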

At the heart of SNMP we have OIDs. Mikrotik is really shitty at documenting them, but they do provide a MIB so one can take a look. However, there is an easier approach. Just run print oid in any section, e.g.:

/queue simple print oid
 0
  name=.1.3.6.1.4.1.14988.1.1.2.1.1.2.1
  bytes-in=.1.3.6.1.4.1.14988.1.1.2.1.1.8.1
  bytes-out=.1.3.6.1.4.1.14988.1.1.2.1.1.9.1
  packets-in=.1.3.6.1.4.1.14988.1.1.2.1.1.10.1
  packets-out=.1.3.6.1.4.1.14988.1.1.2.1.1.11.1
  queues-in=.1.3.6.1.4.1.14988.1.1.2.1.1.12.1
  queues-out=.1.3.6.1.4.1.14988.1.1.2.1.1.13.1

This can then be converted into Telegraf’s configuration format. The fields live inside an inputs.snmp.table section and look something like this:

[[inputs.snmp.table]]
  # the enclosing table; the name is yours to pick
  name = "mtxrQueueSimple"

[[inputs.snmp.table.field]]
  name = "mtxrQueueSimpleName"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.2"
  is_tag = true
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimpleBytesIn"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.8"
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimpleBytesOut"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.9"
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimplePacketsIn"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.10"
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimplePacketsOut"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.11"
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimplePCQQueuesIn"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.12"
[[inputs.snmp.table.field]]
  name = "mtxrQueueSimplePCQQueuesOut"
  oid = ".1.3.6.1.4.1.14988.1.1.2.1.1.13"

Where did I get the names from? Technically, you can use whatever you want, but I usually look them up on oid-info.com. Once you restart the telegraf daemon, data will flow into InfluxDB and you can chart it in Grafana to your heart’s desire.

You can see my full SNMP input config for Mikrotik at GitHub.

Better Pseudorandom Numbers

Browsing the Internet out of boredom usually brings a lot of nonsense. However, it occasionally also brings a gem. This time I accidentally stumbled upon a family of pseudorandom number generators called xoshiro/xoroshiro.

Pseudorandom generators have fallen out of favor lately as proper cryptographically-secure algorithms became ubiquitous on modern computers (often supported by the processor’s RNG instructions). And for cases where pseudorandom generators are the better fit, most programming languages already include a Mersenne twister, allowing generation of reasonable randomness.

But that doesn’t mean research into better (pseudo)randomness has stopped. From that research comes a whitepaper named Scrambled linear pseudorandom number generators. The paper alone goes over the algorithms in detail, but the authors were also kind enough to provide a PRNG shootout page giving practical advice.
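To give a feel for how small these generators are, here is the core of xoshiro256** transcribed into C# from the authors’ public-domain reference code (seeding is left to the caller; the state must simply never be all zeros):

// Core step of xoshiro256**, following Blackman & Vigna's public-domain
// reference implementation.
public sealed class Xoshiro256SS {
    private ulong s0, s1, s2, s3;

    public Xoshiro256SS(ulong a, ulong b, ulong c, ulong d) {
        s0 = a; s1 = b; s2 = c; s3 = d;  // caller supplies a non-zero seed
    }

    private static ulong Rotl(ulong x, int k) => (x << k) | (x >> (64 - k));

    public ulong Next() {
        ulong result = Rotl(s1 * 5, 7) * 9;       // the "**" output scrambler
        ulong t = s1 << 17;
        s2 ^= s0; s3 ^= s1; s1 ^= s2; s0 ^= s3;   // the xoshiro linear engine
        s2 ^= t;
        s3 = Rotl(s3, 45);
        return result;
    }
}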

After spending quite a few hours with these, I decided the only thing missing was a C# variant of the same. So I created it.

Links to source and NuGet package are here.

Monitoring Home Network


While monitoring a home network is not something that’s really needed, I find it always comes in handy. If nothing else, you get to see a lot of nice colors and numbers flying around. For people like me, that’s all the encouragement needed.

Over time I tried many different systems, but lately I have fallen in love with Grafana combined with InfluxDB. Grafana provides a really nice and simple GUI while InfluxDB serves as the database for all the metrics.

I find Grafana hits just the right balance of being simple enough to learn the basics but powerful enough that you can get into advanced stuff if you need it. Even better, it fits right into a small network without any adjustments needed to the installation. Yes, you can make it more complex later, but the starting point is spot on.

InfluxDB makes it really easy to push custom metrics from the command line or literally anything that can speak HTTP, and I find that really useful in a heterogeneous network filled with various IoT devices. While version 2.0 is available, I actually prefer using 1.8 as it’s simpler to set up, lighter on resources (important if you run it in a virtual machine), and it comes without a GUI. Since I only use it as a backend, that actually means I have fewer things to secure.

Installing Grafana on top of Ubuntu Server 20.04 is easy enough.

sudo apt-get install -y apt-transport-https
wget -4qO - https://packages.grafana.com/gpg.key | sudo apt-key add -
echo "deb https://packages.grafana.com/oss/deb stable main" \
    | sudo tee -a /etc/apt/sources.list.d/grafana.list

sudo apt update
sudo apt --yes install grafana

sudo systemctl start grafana-server
sudo systemctl enable grafana-server
sudo systemctl status grafana-server

That’s it. Grafana is now listening on port 3000. If you want it on port 80, some NAT magic is required.

sudo apt install --yes netfilter-persistent
sudo iptables -t nat -A PREROUTING -p tcp --dport 80 -j REDIRECT --to-port 3000
sudo netfilter-persistent save
sudo iptables -L -t nat

With Grafana installed, it’s time to get InfluxDB onboard too. Setup is again simple enough.

wget -4qO - https://repos.influxdata.com/influxdb.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/ubuntu focal stable" \
    | sudo tee /etc/apt/sources.list.d/influxdb.list

sudo apt update
sudo apt --yes install influxdb

sudo systemctl start influxdb
sudo systemctl enable influxdb
sudo systemctl status influxdb

Once installation is done, the only remaining task is creating the database. In the example I named it “telegraf”, but you can select whatever name you want.

curl -i -XPOST http://localhost:8086/query --data-urlencode "q=CREATE DATABASE telegraf"
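A quick smoke test confirms the database accepts writes; the measurement and tag below are throwaway examples:

curl -i -XPOST 'http://localhost:8086/write?db=telegraf' \
    --data-binary 'smoketest,host=demo value=1'
curl -G 'http://localhost:8086/query' \
    --data-urlencode 'db=telegraf' \
    --data-urlencode 'q=SELECT * FROM smoketest'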

With both installed, we might as well install Telegraf so we can push some stats. Installation is again really similar:

wget -4qO- https://repos.influxdata.com/influxdb.key | sudo apt-key add -
echo "deb https://repos.influxdata.com/ubuntu focal stable" \
    | sudo tee /etc/apt/sources.list.d/influxdb.list

sudo apt-get update
sudo apt-get install telegraf

sudo sed -i 's*# database = "telegraf"$*database = "telegraf"*' /etc/telegraf/telegraf.conf
sudo sed -ri 's*# (urls = \["http://127.0.0.1:8086"\])$*\1*' /etc/telegraf/telegraf.conf
sudo sed -ri 's*# (\[\[inputs.syslog\]\])$*\1*' /etc/telegraf/telegraf.conf
sudo sed -ri 's*# (  server = "tcp://:6514")*\1*' /etc/telegraf/telegraf.conf

sudo systemctl restart telegraf
sudo systemctl status telegraf

And a minor update is needed for the rsyslog daemon in order to forward syslog messages.

echo '*.notice action(type="omfwd" target="localhost" port="6514"' \
    'protocol="tcp" tcp_framing="octet-counted" template="RSYSLOG_SyslogProtocol23Format")' \
    | sudo tee /etc/rsyslog.d/99-forward.conf
sudo systemctl restart rsyslog
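
To verify the pipeline, emit a notice-level test message; it should end up in the syslog measurement a few seconds later:

logger -p user.notice "Hello from rsyslog to InfluxDB"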

If you want to accept remote syslog messages, that’s also just a command away:

echo 'module(load="imudp")'$'\n''input(type="imudp" port="514")' \
    | sudo tee /etc/rsyslog.d/98-accept.conf
sudo systemctl restart rsyslog

That’s it. You have your metric server fully installed and its own metrics are flowing.

And yes, this is not secure and you should look into having TLS enabled at minimum, ideally with proper authentication for all your clients. However, this setup does allow you to dip your toes and see whether you like it or not.


PS: While creating graphs is easy enough, dealing with logs is a bit more complicated. NWMichl Blog has a link to a really nice dashboard for this purpose.