Why the BTRFS Information gap?

Krisbee · 22 March 2024

OMV7 has promoted BTRFS over MD RAID for multiple-device data arrays and consolidates the BTRFS support features from OMV6. Perhaps votdev has it on a features backlog list, but compared to MD RAID there seems to be an information gap for BTRFS.

With MD RAID there is a separate "details" page in the WebUI, and the "details" are also included in the "Diagnostic | Report". With BTRFS, unless a user "tags" a BTRFS filesystem, there is no "at-a-glance" indication of the BTRFS profile in use on the "Storage | Filesystems" page, no separate "details" page you can go to, and no BTRFS filesystem details in the diagnostic report. The output of "btrfs fi show ...", "btrfs fi us -T ..." and "btrfs dev stats ..." is not included.

I'd hope votdev would consider it a very useful enhancement to have the same level of detail as MD RAID on a dedicated WebUI page for BTRFS filesystems and in the diagnostic report. For example, something like this:

Code

================================================================================
= Detailed information about BTRFS filesystems
================================================================================

BTRFS filesystem mounted at /srv/dev-disk-by-uuid-0371eae9-24da-4de7-87d6-32cfd5f9f96b:
=======================================================================================
Label: none uuid: 0371eae9-24da-4de7-87d6-32cfd5f9f96b
Total devices 2 FS bytes used 2.60GiB
devid 1 size 10.00GiB used 5.28GiB path /dev/sda
devid 2 size 10.00GiB used 5.28GiB path /dev/sdb

--------------------------------------------------------------------------------
Overall:
Device size: 20.00GiB
Device allocated: 10.56GiB
Device unallocated: 9.44GiB
Device missing: 0.00B
Device slack: 0.00B
Used: 5.20GiB
Free (estimated): 7.12GiB (min: 7.12GiB)
Free (statfs, df): 7.12GiB
Data ratio: 2.00
Metadata ratio: 2.00
Global reserve: 5.50MiB (used: 0.00B)
Multiple profiles: no

Data Metadata System
Id Path RAID1 RAID1 RAID1 Unallocated Total Slack
-- -------- ------- --------- -------- ----------- -------- -----
1 /dev/sda 5.00GiB 256.00MiB 32.00MiB 4.72GiB 10.00GiB -
2 /dev/sdb 5.00GiB 256.00MiB 32.00MiB 4.72GiB 10.00GiB -
-- -------- ------- --------- -------- ----------- -------- -----
Total 5.00GiB 256.00MiB 32.00MiB 9.44GiB 20.00GiB 0.00B
Used 2.59GiB 2.86MiB 16.00KiB
--------------------------------------------------------------------------------
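A report section like the one above could be assembled from the three commands already mentioned. This is only a minimal sketch, not OMV's actual report code: the function name btrfs_report is made up for illustration, and it assumes findmnt (util-linux) and btrfs (btrfs-progs) are installed and that it is run as root.

```shell
#!/bin/sh
# Sketch only: print "show", "usage" and "device stats" for every mounted
# BTRFS filesystem. btrfs_report is a hypothetical helper name.

btrfs_report() {
    # List UUID and mount point of each BTRFS mount in raw format, then keep
    # one line per UUID so extra subvolume mounts are not reported twice.
    findmnt -rn -t btrfs -o UUID,TARGET | sort -u -k1,1 |
    while read -r uuid target; do
        echo "BTRFS filesystem mounted at ${target}:"
        btrfs filesystem show "$target"
        btrfs filesystem usage -T "$target"
        btrfs device stats "$target"
        echo "--------------------------------------------------------------------------------"
    done
}

# Call btrfs_report (as root) to print the report.
```

Sorting on the UUID column is a design choice to avoid repeating the same filesystem when several of its subvolumes are mounted.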


There is a second serious information gap that relates to drive failures in BTRFS RAID arrays.

With MD RAID, if a drive drops out of an array, a "Fail event on /dev/md.." notification rapidly follows and the MD RAID "details" in the WebUI will show the device as "removed". There is no equivalent for BTRFS, as it has no built-in daemon or utility to report such events. So there is nothing in OMV to monitor for a "missing device" on a BTRFS RAID filesystem and send a notification, and no separate "BTRFS" box to tick on the WebUI "System | Notification | Events" page.

What OMV does include is a cron job, scheduled to run daily, which checks for and reports any non-zero btrfs device error counts before, by default, resetting those cumulative counts to zero. But 24 hours is a long time to wait for feedback about a failed drive in a BTRFS RAID. If and when the notification prompts you to check your system, there are no details to view in the WebUI. BTRFS filesystems with "missing devices" may not mount rw on system reboot. Among other things, this generates a generic mount failure notification, and any subsequent hourly snapshot cleanup jobs (which should probably be disabled) also generate error notifications while the BTRFS filesystem remains unmounted. All of this could come too late if the BTRFS filesystem suffers more than a single drive failure.
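To make the two conditions concrete, here is a rough sketch of how they could be detected from btrfs-progs output. It is not OMV's cron job. The helper names and the /srv/data mount point are hypothetical, and it assumes "btrfs device stats" prints one "[device].counter value" pair per line and that "btrfs filesystem show" prints a "Some devices missing" marker when a device is absent.

```shell
#!/bin/sh
# Sketch only: helpers that read btrfs-progs output on stdin.
# Both function names are hypothetical.

# Print the non-zero counters from "btrfs device stats" output;
# exit status is 0 if at least one was found, 1 otherwise.
nonzero_btrfs_stats() {
    awk 'NF == 2 && $2 + 0 > 0 { print; found = 1 } END { exit !found }'
}

# Exit status is 0 if "btrfs filesystem show" output flags a missing device.
has_missing_devices() {
    grep -q 'Some devices missing'
}

# Hypothetical use against one mount point (run as root):
#   btrfs device stats /srv/data | nonzero_btrfs_stats && echo "device errors"
#   btrfs filesystem show /srv/data | has_missing_devices && echo "device missing"
```

Run from a timer, this is still polling, so it shares the delay problem described above unless it runs far more often than once a day.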

So, should OMV check for and notify "missing devices", and other errors, in BTRFS filesystems and how would it achieve this?

Reacting to logged BTRFS kernel messages seems an obvious choice and one way of doing this might be to use the lightweight "simple event correlator" (https://tracker.debian.org/pkg/sec) which runs as a systemd service (See for example https://marc.merlins.org/perso…fs-Filesystem-Repair.html and https://github.com/simple-evco…ts/blob/master/systemd.md and https://simple-evcorr.github.io/man.html).

The sec program takes traditional log files as input, i.e. syslog, messages, kern.log, etc., as found in Debian versions prior to Debian 12. As OMV7 has retained rsyslog for its remote logging function, all that is needed is to restore /etc/rsyslog.conf to the Debian 11 format. Superficially at least, sec appears to be a way of generating notifications for BTRFS error/warning messages in near real time, without relying on cron or systemd timers. Thus it may be a way to add BTRFS monitoring to the existing OMV "System | Notification | Events".

I can't claim to be conversant with the power of Perl regex pattern matching, but using grep -P on /var/log/kern.log to try out some simple regexes led to this simple sec rule as a brief test:

Code

type=SingleWithThreshold
ptype=RegExp
pattern=(?i)kernel.*btrfs.error
window=60
thresh=3
desc=Btrfs unexpected log
action=pipe '%t: $0' /usr/bin/mail -s "sec: %s" root

Which generated this email alert for logged BTRFS errors:

Code

*** MESSAGE CONTENTS deferred/D/DC9971FE45 ***
Received: by omv7vm.home.arpa (Postfix, from userid 0)
        id DC9971FE45; Fri, 22 Mar 2024 09:35:01 +0000 (GMT)
To: dummysmtp@gmail.com
Subject: sec: Btrfs unexpected log
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Message-Id: <20240322093501.DC9971FE45@omv7vm.home.arpa>
Date: Fri, 22 Mar 2024 09:35:01 +0000 (GMT)
From: root <dummysmtp@gmail.com>

Fri Mar 22 09:35:01 2024: Mar 22 09:35:01 omv7vm kernel: [ 1844.335331] BTRFS error (device sdf): bdev /dev/sdd errs: wr 171066, rd 54, flush 2, corrupt 0, gen 0
*** HEADER EXTRACTED deferred/D/DC9971FE45 ***
named_attribute: encoding=8bit
named_attribute: dsn_orig_rcpt=rfc822;dummysmtp@gmail.com
original_recipient: root
recipient: dummysmtp@gmail.com
*** MESSAGE FILE END deferred/D/DC9971FE45 ***
root@omv7vm:~#
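The grep -P check used to try out the pattern can be reproduced without a live kern.log. This sketch feeds the kernel line from the alert above through the same regex as the rule. It assumes a grep built with PCRE support (the -P option), and matches_rule is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: test the sec rule's regex against a sample log line.

# Kernel line copied from the alert above.
line='Mar 22 09:35:01 omv7vm kernel: [ 1844.335331] BTRFS error (device sdf): bdev /dev/sdd errs: wr 171066, rd 54, flush 2, corrupt 0, gen 0'

# Same pattern as in the sec rule.
pattern='(?i)kernel.*btrfs.error'

# Exit status is 0 if any line on stdin matches the pattern.
matches_rule() {
    grep -qP "$pattern"
}

printf '%s\n' "$line" | matches_rule && echo "match"
```

Note that the pattern only covers lines logged as "BTRFS error". Other kernel log levels would need their own pattern or a broader one.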


Is this idea viable and worth pursuing? Maybe votdev would like to comment.

votdev · 22 March 2024

Please open an issue on GitHub. I've often said that I do not process feature requests coming from the forum because it is not meant to be a feature/issue tracker.

votdev · 22 March 2024

sec seems to be a cool tool. Maybe I can use it to get rid of rsyslog completely. rsyslog is only used to monitor syslog and send emails if necessary.

Krisbee · 22 March 2024

Quote from votdev

Please open an issue on GitHub. I've often said that I do not process feature requests coming from the forum because it is not meant to be a feature/issue tracker.

I posted on the forum as I thought it would stimulate some discussion, and I don't have a github account. Nor was I sure whether the idea was a dead duck. But I see you've added a self-raised sec issue for other purposes.

ryecoaaron · 22 March 2024

Quote from Krisbee

don't have a github account.

I'm not sure why you don't, but if the issue tracker were on something like Codeberg (run by a non-profit), would you sign up for an account?

Quote from Krisbee

But I see you've added a self-raised sec issue for other purposes.

That doesn't cover your request for more info in the web interface for btrfs.

Krisbee · 22 March 2024

ryecoaaron Yes to your first question. But I've given in, just to raise a feature request. My second point was to acknowledge votdev has shown some interest in the "sec" program/project. Although eliminating rsyslog could make using "sec" for BTRFS notification unviable.

votdev · 22 March 2024

Quote from Krisbee

Although eliminating rsyslog could make using "sec" for BTRFS notification unviable.

Why? "sec" will monitor the systemd journal.

Krisbee · 22 March 2024

Quote from votdev

Why? "sec" will monitor the systemd journal.

Good news if it does. How do you specify the input in DAEMON_ARGS in the /etc/default/sec file for use with journald?

votdev · 23 March 2024

Quote from Krisbee

Good news if it does. How do you specify the input in DAEMON_ARGS in the /etc/default/sec file for use with journald?

You can check this here, but I've rejected the feature because rsyslog is required for syslog forwarding. Sending emails when a regex matches a syslog message is also possible with rsyslog, though. OMV uses that to watch for pam_faillock messages.

Krisbee · 23 March 2024

votdev Thank you for the links. As a non-developer I wouldn't have thought of using a pipe construct in that way, but it's a nice example.
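For readers without the links, one way a pipe could connect the journal to sec is sketched below. This is an assumption about the general idea and not a copy of the linked example: it relies on sec reading standard input when given --input=-, the rule file path /etc/sec/btrfs.rules is hypothetical, and watch_journal_with_sec is a made-up name.

```shell
#!/bin/sh
# Sketch only: feed kernel messages from the systemd journal to sec on stdin.
# /etc/sec/btrfs.rules is a hypothetical path for a rule file such as the
# test rule earlier in this thread.

watch_journal_with_sec() {
    # -k: kernel messages only, -f: follow new entries,
    # -n 0: skip entries already in the journal, -o short: syslog-style lines.
    journalctl -k -f -n 0 -o short |
        sec --conf="${1:-/etc/sec/btrfs.rules}" --input=-
}

# Call watch_journal_with_sec (as root); it runs until interrupted.
```

With the short output format the lines still contain "kernel:" followed by the message, so a pattern written for kern.log should in principle still apply, though that is untested here.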
