Beiträge von tophee

tophee · 9. März 2023

OK, I declared victory too fast. I now learned that the locked memory setting isn't the cause for rclone being killed (because rclone doesn't use locked memory). So, if you have any ideas, why rclone on OMV is being killed when it reaches about 4GB of memory usage, please let me know. (For more on this issue, see this thread in the rclone forum.

tophee · 8. März 2023

I am running rclone on my OMV to serve a cloud storage via SFTP and I have been struggling for months with rclone repeatedly being killed and I have been trying to trouble shoot this over on the rclone forum. We figured it had to do with memory usage, but since I have 24 GB of RAM in my OMV server, there isn't really any shortage of memory. So why does it get killed?

It looks like we've finally identified the reason:

Code

$ ulimit -a
real-time non-blocking time  (microseconds, -R) unlimited
core file size              (blocks, -c) 0
data seg size               (kbytes, -d) unlimited
scheduling priority                 (-e) 0
file size                   (blocks, -f) unlimited
pending signals                     (-i) 95063
max locked memory           (kbytes, -l) 3049767
max memory size             (kbytes, -m) unlimited
open files                          (-n) 1024
pipe size                (512 bytes, -p) 8
POSIX message queues         (bytes, -q) 819200
real-time priority                  (-r) 0
stack size                  (kbytes, -s) 8192
cpu time                   (seconds, -t) unlimited
max user processes                  (-u) 95063
virtual memory              (kbytes, -v) unlimited
file locks                          (-x) unlimited

Alles anzeigen

rclone is bumping into the max memory size. Even root has this limitation of around 2.9 GB.

So I'm wondering: why is OMV limiting memory usage in this way? And, more importantly: how do I change it?

tophee · 28. Februar 2023

I figured it out: I had it running as a user service, so I was able to stop it using systemctl --user stop rclone@pcloud.service...

I had triple-checked that there was no system service doing the restarting, but I had forgotten about user services.

So, what I don't quite understand is how the user service interacts with the scheduled task...

tophee · 28. Februar 2023

Here is what I found in the /var/log/syslog:

Code

Feb 28 18:49:07 server systemd[1275]: rclone@pcloud.service: start operation timed out. Terminating.
Feb 28 18:49:07 server systemd[1275]: rclone@pcloud.service: Failed with result 'timeout'.
Feb 28 18:49:07 server systemd[1275]: Failed to start rclone: make sure pcloud is served via sftp.
Feb 28 18:49:09 server monit[1187]: 'server' mem usage of 94.6% matches resource limit [mem usage > 90.0%]
Feb 28 18:49:10 server smbd[6312]: [2023/02/28 18:49:10.009073,  2] ../../source3/smbd/dosmode.c:137(unix_mode)
Feb 28 18:49:10 server smbd[6312]:   unix_mode(.) inherit mode 42770
Feb 28 18:49:13 server systemd[1275]: rclone@pcloud.service: Scheduled restart job, restart counter is at 578.
Feb 28 18:49:13 server systemd[1275]: Stopped rclone: make sure pcloud is served via sftp.
Feb 28 18:49:13 server systemd[1275]: Starting rclone: make sure pcloud is served via sftp...

tophee · 28. Februar 2023

Another thing that puzzles me is that the command shown in ps aux is not identical with the one I entered in the Scheduled Tasks UI, which is

export GOGC=50 && rclone serve sftp pcloud:Backup/ --addr :2022 --user ******* --pass *********** --log-file=/zfs/NAS/config/rclone/rclone.log --vfs-cache-mode writes --rc &

Notably, the password section is missing. Maybe ps just truncates long commands, I don't know, but even so, it also added a whole section: --config=/zfs/NAS/config/homedirs/christoph/.config/rclone/rclone.conf, which is weird. I have no idea where that comes from.

tophee · 28. Februar 2023

I have not created any cron job manually, but I have manually started the scheduled task shown in the OP. Does manually running a scheduled task have any particular side effects (such as the task becoming unstoppable?)

I tried killing cron/anacron but rclone keeps coming back:

Code

$ sudo killall anacron
anacron: no process found
$ sudo killall cron
$ ps aux | grep rclone
christo+  186545  0.2  0.1 761996 33252 ?        Ssl  17:55   0:00 /usr/bin/rclone serve sftp pcloud:Backup/ --config=/zfs/NAS/config/homedirs/christoph/.config/rclone/rclone.conf --addr :2022 --vfs-cache-mode minimal --log-level INFO --log-file /zfs/NAS/config/rclone/rclone-pcloud.log --user christoph
$ killall rclone
$ ps aux | grep rclone
christo+  187792  1.0  0.0 759820 21756 ?        Dsl  17:56   0:00 /usr/bin/rclone serve sftp pcloud:Backup/ --config=/zfs/NAS/config/homedirs/christoph/.config/rclone/rclone.conf --addr :2022 --vfs-cache-mode minimal --log-level INFO --log-file /zfs/NAS/config/rclone/rclone-pcloud.log --user christoph
christo+  187815  0.0  0.0   6216   624 pts/2    S+   17:56   0:00 grep rclone

Based on that there not being any anacron process, I assume that it can't be responsible for restarting rclone, right? And since killing cron didn't stop the madness, cron is not responsible either. So what is?

tophee · 28. Februar 2023

So how do I stop the rclone process from being restarted?

tophee · 28. Februar 2023

I have scheduled rclone to run at reboot via the OMV UI like this:

Now I'm trying to stop the process and it is impossible. Whenever I dokillall rclone I can see that the process is gone via ps aux | grep rclone but a second or two rclone is back. I don't understand what is going on.

I saw in the documentation that the UI doesn't write directly to the crontab but in essence, I would still expect that adding a command to the scheduler will do the same as adding the same command into crontab, i.e. if it is scheduled to be executed at reboot, it will execute at reboot and never again until the next reboot.

OMV is clearly not doing that and I fail to understand what it is doing or how I can prevent it from doing so. I have even disabled the scheduled task in the UI but rclone is still being restarted.

Could anyone explain?

tophee · 25. Februar 2023

Yesterday I upgraded to openmediavault 6.3.1-1. No problem.

Today, from one second to the next, the server lost connectivity so I eventually had to restart it. Once it was back, I found this in `/var/logs/message`:

Code

2628 Feb 25 08:15:44 server openmediavault-update-smart-drivedb: Updating smartmontools drive database ...
2629 Feb 25 11:30:21 server kernel: [51615.748213] ------------[ cut here ]------------
2630 Feb 25 11:30:21 server kernel: [51615.748216] NETDEV WATCHDOG: enp0s31f6 (e1000e): transmit queue 0 timed out
2631 Feb 25 11:30:21 server kernel: [51615.748230] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:477 dev_watchdog+0x277/0x280
2632 Feb 25 11:30:21 server kernel: [51615.748236] Modules linked in: wireguard curve25519_x86_64 libchacha20poly1305 chacha_x86_64 poly1305_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel tcp_diag udp_>
2633 Feb 25 11:30:21 server kernel: [51615.748292]  sysfillrect usbserial sysimgblt parport_pc parport mei_me mei intel_pch_thermal mac_hid acpi_pad zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znv>
2634 Feb 25 11:30:21 server kernel: [51615.748335] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P           O      5.15.85-1-pve #1
2635 Feb 25 11:30:21 server kernel: [51615.748338] Hardware name: FUJITSU /D3417-B2, BIOS V5.0.0.12 R1.14.0 for D3417-B2x                    10/24/2017
2636 Feb 25 11:30:21 server kernel: [51615.748339] RIP: 0010:dev_watchdog+0x277/0x280
2637 Feb 25 11:30:21 server kernel: [51615.748343] Code: eb 97 48 8b 5d d0 c6 05 26 82 4d 01 01 48 89 df e8 5e 53 f9 ff 44 89 e1 48 89 de 48 c7 c7 f0 a9 8a b7 48 89 c2 e8 cd 99 1c 00 <0f> 0b eb 80 e9 45 ca 25 00 0f 1f 4>
2638 Feb 25 11:30:21 server kernel: [51615.748345] RSP: 0018:ffffc28740114e70 EFLAGS: 00010282
2639 Feb 25 11:30:21 server kernel: [51615.748347] RAX: 0000000000000000 RBX: ffffa0f20d4e0000 RCX: ffffa0f746520588
2640 Feb 25 11:30:21 server kernel: [51615.748349] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffffa0f746520580
2641 Feb 25 11:30:21 server kernel: [51615.748350] RBP: ffffc28740114ea8 R08: 0000000000000003 R09: 0000000000000001
2642 Feb 25 11:30:21 server kernel: [51615.748352] R10: 0000000000ffff0a R11: 0000000000000001 R12: 0000000000000000
2643 Feb 25 11:30:21 server kernel: [51615.748353] R13: ffffa0f20174b680 R14: 0000000000000001 R15: ffffa0f20d4e04c0
2644 Feb 25 11:30:21 server kernel: [51615.748355] FS:  0000000000000000(0000) GS:ffffa0f746500000(0000) knlGS:0000000000000000
2645 Feb 25 11:30:21 server kernel: [51615.748357] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
2646 Feb 25 11:30:21 server kernel: [51615.748358] CR2: 00007f575572f024 CR3: 00000004ed410004 CR4: 00000000003726e0
2647 Feb 25 11:30:21 server kernel: [51615.748360] Call Trace:
2648 Feb 25 11:30:21 server kernel: [51615.748361]  <IRQ>
2649 Feb 25 11:30:21 server kernel: [51615.748364]  ? pfifo_fast_enqueue+0x160/0x160
2650 Feb 25 11:30:21 server kernel: [51615.748367]  call_timer_fn+0x29/0x120
2651 Feb 25 11:30:21 server kernel: [51615.748372]  __run_timers.part.0+0x1e1/0x270
2652 Feb 25 11:30:21 server kernel: [51615.748374]  ? ktime_get+0x43/0xc0
2653 Feb 25 11:30:21 server kernel: [51615.748376]  ? lapic_next_deadline+0x2c/0x40
2654 Feb 25 11:30:21 server kernel: [51615.748379]  ? clockevents_program_event+0xa8/0x130
2655 Feb 25 11:30:21 server kernel: [51615.748382]  run_timer_softirq+0x2a/0x60
2656 Feb 25 11:30:21 server kernel: [51615.748384]  __do_softirq+0xd6/0x2ea
2657 Feb 25 11:30:21 server kernel: [51615.748388]  irq_exit_rcu+0x94/0xc0
2658 Feb 25 11:30:21 server kernel: [51615.748390]  sysvec_apic_timer_interrupt+0x80/0x90
2659 Feb 25 11:30:21 server kernel: [51615.748393]  </IRQ>
2660 Feb 25 11:30:21 server kernel: [51615.748394]  <TASK>
2661 Feb 25 11:30:21 server kernel: [51615.748395]  asm_sysvec_apic_timer_interrupt+0x1b/0x20
2662 Feb 25 11:30:21 server kernel: [51615.748397] RIP: 0010:cpuidle_enter_state+0xd9/0x620
2663 Feb 25 11:30:21 server kernel: [51615.748401] Code: 3d 44 3f 3f 49 e8 77 ff 6d ff 49 89 c7 0f 1f 44 00 00 31 ff e8 b8 0c 6e ff 80 7d d0 00 0f 85 5e 01 00 00 fb 66 0f 1f 44 00 00 <45> 85 f6 0f 88 6a 01 00 00 4d 63 e>
2664 Feb 25 11:30:21 server kernel: [51615.748403] RSP: 0018:ffffc287400d7e38 EFLAGS: 00000246
2665 Feb 25 11:30:21 server kernel: [51615.748404] RAX: ffffa0f746530bc0 RBX: ffffe2873fd00000 RCX: 0000000000000000
2666 Feb 25 11:30:21 server kernel: [51615.748406] RDX: 0000000000001782 RSI: 000000002c13dcf6 RDI: 0000000000000000
2667 Feb 25 11:30:21 server kernel: [51615.748407] RBP: ffffc287400d7e88 R08: 00002ef1ba561902 R09: 00000000000c3500
2668 Feb 25 11:30:21 server kernel: [51615.748408] R10: 0000000000000007 R11: 071c71c71c71c71c R12: ffffffffb80d4420
2669 Feb 25 11:30:21 server kernel: [51615.748410] R13: 0000000000000006 R14: 0000000000000006 R15: 00002ef1ba561902
2670 Feb 25 11:30:21 server kernel: [51615.748413]  ? cpuidle_enter_state+0xc8/0x620
2671 Feb 25 11:30:21 server kernel: [51615.748417]  cpuidle_enter+0x2e/0x50
2672 Feb 25 11:30:21 server kernel: [51615.748420]  do_idle+0x20d/0x2b0
2673 Feb 25 11:30:21 server kernel: [51615.748423]  cpu_startup_entry+0x20/0x30
2674 Feb 25 11:30:21 server kernel: [51615.748425]  start_secondary+0x12a/0x180
2675 Feb 25 11:30:21 server kernel: [51615.748429]  secondary_startup_64_no_verify+0xc2/0xcb
2676 Feb 25 11:30:21 server kernel: [51615.748433]  </TASK>
2677 Feb 25 11:30:21 server kernel: [51615.748434] ---[ end trace 92789cc296e4fa0f ]---
2678 Feb 25 11:30:21 server kernel: [51615.839286] br0: port 1(enp0s31f6) entered disabled state
2679 Feb 25 11:30:24 server kernel: [51619.455212] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
2680 Feb 25 11:30:24 server kernel: [51619.455320] br0: port 1(enp0s31f6) entered blocking state
2681 Feb 25 11:30:24 server kernel: [51619.455324] br0: port 1(enp0s31f6) entered listening state
2682 Feb 25 11:30:29 server kernel: [51623.684311] br0: port 1(enp0s31f6) entered learning state
2683 Feb 25 11:30:33 server kernel: [51627.780289] br0: port 1(enp0s31f6) entered forwarding state
2684 Feb 25 11:30:33 server kernel: [51627.780294] br0: topology change detected, propagating
2685 Feb 25 11:43:12 server kernel: [52387.103542] br0: port 1(enp0s31f6) entered disabled state
2686 Feb 25 11:43:16 server kernel: [52390.756071] e1000e 0000:00:1f.6 enp0s31f6: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
2687 Feb 25 11:43:16 server kernel: [52390.756169] br0: port 1(enp0s31f6) entered blocking state
2688 Feb 25 11:43:16 server kernel: [52390.756172] br0: port 1(enp0s31f6) entered listening state
2689 Feb 25 11:43:20 server kernel: [52394.760120] br0: port 1(enp0s31f6) entered learning state
2690 Feb 25 11:43:24 server kernel: [52398.860229] br0: port 1(enp0s31f6) entered forwarding state
2691 Feb 25 11:43:24 server kernel: [52398.860233] br0: topology change detected, propagating
2692 Feb 25 11:43:26 server kernel: [52400.950832] br0: port 1(enp0s31f6) entered disabled state

Alles anzeigen

From what I can tell (which is almost nothing), ~~OMV started to update something (smartmontools?)~~ which somehow caused the network connection to break down.

Edit: sorry, I realized that the update was hours before the network issue. So I have no idea what cause the bridge to break down...

What happened? And what should I do about it?

tophee · 8. November 2022

OK, thanks so much for clarifying.

It looked like installing zerotier required more than just installing the package: https://www.zerotier.com/download/#downloadLinux

As for the warning about installing software: I assume that this applies regardless of whether I use the APT-tool plugin or the CLI, right?

tophee · 6. November 2022

A few weeks or so ago I installed ZeroTier on my OMV server via CLI (tried running it in a docker container, but couldn't get it to work). I installed it manually because my impression was that OMV doesn't support zerotier natively. But now I see a Zerotier update in my OMV updates

So, I realized I must be misunderstanding something because, to me, this looks like OMV does somehow support zerotier. But then it occurred to me that OMV might simply be showing all system updates, regardless of whether something was installed via OMV or not. Is that correct?

In that case, I just wonder whether there is anything wrong with installing something via cli and then updating it via the UI.

Related to that: I don't remember installing portainer och yacht via OMV (though it's possible I did) - I've been running OMV for some years now - but OMV does see the portainer and yacht instances in the UI. Does that mean that they will be kept up to date via the OMV update management? I'm asking because portainer has for some time been telling me in its UI that there is an update but there is none showing in the OMV update management. Perhaps it just takes some extra time? Or do I still have to do the somewhat tricky process of update portainer manually (because portainer can't update itself)?

tophee · 29. Oktober 2022

Excellent! Thanks for explaining. No, my VMs are not similar and I do have 24 GB of RAM, so whatever savings ksm produces, I can probably afford not to have them. Will try disabling ksm and hopefully forget about it,

tophee · 29. Oktober 2022

Zitat von ryecoaaron

But all the kvm plugin does is install it as a dependency. If you want to disable it, just stop and disable the service.

The problem is, I don't know if I want to disable it. What does it do? I have two KVM virtual machines running. Do these need the ksm service?

tophee · 13. Oktober 2022

A couple of weeks ago, some OMV system process started to use a lot of CPU:

I started looking into this today and it seems quite evident that the perpetrator is ksmd:

Code

 top - 20:46:54 up 63 days, 22:42,  1 user,  load average: 1.98, 2.29, 2.37
Tasks: 404 total,   2 running, 402 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3.4 us, 20.7 sy,  0.0 ni, 75.9 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  23826.3 total,    770.9 free,  19362.5 used,   3692.8 buff/cache
MiB Swap:   7974.0 total,   6864.6 free,   1109.3 used.   1206.1 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
     32 root      25   5       0      0      0 R  29.4   0.0  11308:54 ksmd
  82374 libvirt+  20   0 5701096   3.1g   4764 S   5.9  13.5   8083:01 qemu-sy+
1886799 508       20   0 3977720 716708      0 S   5.9   2.9   2:47.94 java
1895221 christo+  20   0  774788 119428   8428 S   5.9   0.5   2:56.07 io.shel+
1963879 201       39  19    2428   1456   1028 S   5.9   0.0   0:00.77 bash
1992206 root      20   0   12324   4132   3240 R   5.9   0.0   0:00.01 top
      1 root      20   0  168844  10180   5716 S   0.0   0.0  76:15.96 systemd
      2 root      20   0       0      0      0 S   0.0   0.0   6:32.81 kthreadd

Alles anzeigen

What can I do about this?

I found this: https://serverfault.com/a/1064801/399289 but I am somewhat reluctant to tweak these settings as I assume that OMV is using reasonable defaults...

tophee · 20. August 2022

Zitat von geaves

Patience

LOL. yes, thats probably true in general, but in this case I was unsure what is going on. and wanted to understand it. So thanks for clarifying. I now realize that releases don't seem to be published on github at all, since https://github.com/openmediavault/openmediavault/releases is empty...

Anyway: one question regarding the latest release:

Zitat

Adapted Samba vfs_fruit settings according to the wiki to better work with Mac OS X. The following environment variables have been introduced:

OMV_SAMBA_SHARE_FRUIT_VETOAPPLEDOUBLE (defaults to ‘no’)

OMV_SAMBA_SHARE_FRUIT_NFSACES (defaults to ‘no’)

OMV_SAMBA_SHARE_FRUIT_WIPEINTENTIONALLYLEFTBLANKRFORK (defaults to ‘yes’)

OMV_SAMBA_SHARE_FRUIT_DELETEEMPTYADFILES (defaults to ‘yes’)

I wonder wether there is anything I need to do with my SMB-settings which currently look like this:

I would assume that I can (should?) remove those settings that became redunant in the new release?

tophee · 20. August 2022

OK, I see. It's good to know that there are still use cases for AFP. I have had both AFP and SMB running side by side for some time now, but it caused some confusion because the shares had identical names (because they are sharing identical directories on the server) so I figured I should either rename them or at least only mount AFP on my macs. Then I read about SMB being faster etc, so I thought perhaps I should get rid of it alltogether. Hence my question.

I haven't gotten Spotlight to reliably index those SMB shares, which might actually have been the reason why I added AFP at some point. But since indexing still seems to fail, I guess it didnt help...

tophee · 20. August 2022

I see here that 6.0.36-1 i is out but it doesn't come up under "Update-Management". Since the release is marked as "stable", I'd exoect it to show up but perhaps only certain releases are pushed into the UI? Or am I missing something?

tophee · 20. August 2022

Any particular reason for using AFP? Even Apple recommends SMB...

tophee · 10. August 2022

Zitat von chente

look for the kernel you are running. If you don't reboot the running kernel is the same as before upgrading.

Yes, that's what the second screenshot above is about. I'm running 5.4.174-2-pve, which, I believe is a proxmox kernel. So the funny thing is that I was running a proxmox kernel on OMV6 without having the kernel plugin installed....

tophee · 10. August 2022

Zitat von chente

You should restart, if there are problems to solve the sooner the better

You have a point.

But the most acute argument is this one:

The kernel thing is a bit confusing, though. I believe I was running a proxmox kernel before the upgrade but after the upgrade the kernel-option wasn't even available anymore. Had to (re-?)install the kernel-plugin. But if using a new kernel requires a reboot, then I must still be running the (old) proxmox-kernel (despite the option having disappeared after the upgrade)... Indeed:

Funny... I'm almost afraid to reboot now...

Beiträge von tophee

How to increase max memory limit (ulimit)

How to increase max memory limit (ulimit)

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

Will the scheduler restart a scheduled task whenever it is killed (and how to prvent this)?

OMV 6.3.1-1 suddenly lost connectivity

Zerotier and OMV

Zerotier and OMV

ksmd using a lot of CPU

ksmd using a lot of CPU

ksmd using a lot of CPU

6.0.36-1 not showing in the OMV UI

AFP on OMV 6

6.0.36-1 not showing in the OMV UI

AFP on OMV 6

Which plugins are (not) supported in OMV 6?

Which plugins are (not) supported in OMV 6?