Posts by tony359

    I believe I might have resolved the original issue of the NAS crashing: my iDrive Backup app was using too much RAM. It turned out that I was somehow running the Desktop app and not the "headless server" app. I uninstalled it and reinstalled it and it seems to be fine now.


    HOWEVER, I am still timing out when trying to view Syslog - can someone point me in the right direction please?


    Thanks!

    votdev


    Does the above look ok? I cannot view Syslog anymore though. It times out.


    I keep noticing that RAM usage increases a lot when I use the UI btw. My other NAS, an old N40L, seems to be using much less RAM than this.


    Thanks!

    it points me to here 🙂



    In any case, I googled it and renamed start-stop-daemon.REAL to start-stop-daemon

    Now the output of the same file is this:


    Did it work? I see rrdcached started

    See below. Not sure why the output repeats this block many times but they look identical to me.


    Thanks!




    I'm checking a few things with some AI help


    rrdcached service is running

    If I run omv-firstaid and check RRD, I get the below


    Code
    Checking all RRD files. Please wait ...
    All RRD database files are valid.
    ERROR: Command '['monit', 'start', 'rrdcached']' returned non-zero exit status 1.
    root@enterprise-nas:/# 

    Following a previous thread on this forum I ran


    Code
    omv-salt deploy run rrdcached


    But omv-firstaid comes back with the same error.


    Any help?

    Today I experienced some delays when accessing the NAS and I also heard the fan spinning up so I quickly accessed the NAS and found the following:


    - RAM consumption was at 45% and within a few minutes went up to 58%

    - Syslog shows many of the lines below. The log is flooded with them.

    Code
    collectd[1401]: rrdcached plugin: Failed to connect to RRDCacheD at unix:/run/rrdcached.sock: Unable to connect to rrdcached: No such file or directory (status=2)

    - I ran top and it looked like Samba was using all the CPU. I cannot remember whether I was already trying to view Syslog at that point; that was taking quite some time and might explain journalctl?


    - The NAS remained accessible and working. But while writing this message, RAM consumption jumped to 62%. I guess I'll reboot it now.

    - Attached are some top screenshots with different sorting. Could iDrive (my backup software) be the issue here??
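
    To put a number on the collectd flood mentioned in the list above, a small shell sketch like this one can count the matching lines. It only looks for the error text quoted earlier; the function name `count_rrdcached_errors` and the default path `/var/log/syslog` are assumptions for illustration, not something OMV provides.

```shell
#!/bin/sh
# Count log lines that contain the collectd/rrdcached connection error.
# $1: log file to scan (assumed default: /var/log/syslog)
count_rrdcached_errors() {
    # -c prints the number of matching lines, -F treats the pattern as a fixed string
    grep -cF 'rrdcached plugin: Failed to connect' "${1:-/var/log/syslog}"
}
```

    Running `count_rrdcached_errors` before and after restarting rrdcached would show whether the count keeps growing.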


    Any useful information here?

    Thanks!

    I shall clone the USB drive, thanks.


    I was born with MS-DOS, I can deal with CLI. 🙂 It's Linux I don't like!😀


    AI can be great but way too often it gives inaccurate/outdated/non-relevant information so I am not optimistic. Unfortunately the ReadyNAS Pro 6 doesn't have an expansion slot so I am stuck with that NIC - or with a USB 2 Ethernet adaptor. It's a shame, it's a nice machine after all. Sure, it's very old but I only use the NAS for storage.


    Maybe what I can do is to clone the USB drive onto another one I don't care about and remove flashdrive, then wait until the NAS crashes and see what the logs say - if anything. Plus some memtest and maybe check VCORE with my scope. Or just re-cap the motherboard without thinking too much: I remove SMD caps for work, so it's not a big deal to me.


    I have another NAS, an old N40L I paid £30 for. That has an expansion slot if I needed it and it works just great. Maybe I should just find another one of those - though I'd lose one bay.


    Oh well, it's never easy.


    Thanks for all the help for now, I'll keep this thread posted for future reference.


    (I found the bug the user above filed with bugzilla and they seem to conclude that by enabling HW flow control on their switches, the issue is mitigated to the point that it becomes a non-issue.

    They still mention that the NAS stays "reachable" and I do remember it WAS my issue before so I don't think this is my issue to be honest. I'll keep looking!


    https://bugzilla.kernel.org/sh…format=multiple&id=219713. )

    I can try asking AI.


    The reason I am considering the option is that it took me about 6 months to figure out it was not a HW issue. And during those 6 months I had the NAS gutted, hard drives everywhere, wires, keyboard, monitor... I'm sure you understand!


    My NAS is old, nobody has that HW anymore and help from communities is limited.

    I'm planning to do a memtest for sure, thanks.


    I don't mean purging the logs manually, just instructing the OS to write them elsewhere: either on an external USB drive (I don't care if that one dies) or directly on the RAID?


    I forgot to bring this old conversation to the table.



    Basically my ReadyNAS became unusable with Netgear OS6 (unsupported on my model but... working) and that was why I moved to OMV. User adisor19 there discovered that the issue was the sky2 driver for the Marvell Yukon 88E8053 NIC - which was fixed at some point (but OS6 was never really kept up to date kernel-wise) and broke again with kernel 6.1/6.11.


    Now, the issue I used to have with OS6 was different: the NAS would disappear from the network indefinitely and once I wired a monitor and a keyboard to it I discovered that resetting the network interface would bring it back to life.


    When the NAS was dying yesterday, I could still ping it - but could not SSH or access the WebGUI.


    I'm wondering whether this could be related to that old issue though - and yes, I appreciate that OMV is not responsible for those drivers.

    Thanks again.

    2mV at the CPU - the CPU is not fed directly from the power supply though, this is not a 386! :)


    That said, I've recapped the PSU but never re-capped the motherboard. It's entirely possible that it is failing and that VCore is compromised. The thermal paste has been replaced more than once over the years and I know the CPU is not overheating. This NAS has been well looked after! I open it up and clean it with compressed air every now and then.


    Sensors is already monitoring HDDs and CPU, and I can hear when the fan spins up because the CPU is a bit warmer. That never happens for more than a few seconds every now and then.


    I have now tested with "stress": the CPU gets to no more than 60 °C and the fans become quite audible from my living room, so I'd hear it if overheating were the issue.


    The command you mentioned outputs this:


    Code
    root@enterprise-nas:/# journalctl -k | grep -i thermal
    Oct 08 23:08:12 enterprise-nas kernel: CPU0: Thermal monitoring enabled (TM2)
    Oct 08 23:08:12 enterprise-nas kernel: thermal_sys: Registered thermal governor 'fair_share'
    Oct 08 23:08:12 enterprise-nas kernel: thermal_sys: Registered thermal governor 'bang_bang'
    Oct 08 23:08:12 enterprise-nas kernel: thermal_sys: Registered thermal governor 'step_wise'
    Oct 08 23:08:12 enterprise-nas kernel: thermal_sys: Registered thermal governor 'user_space'
    Oct 08 23:08:12 enterprise-nas kernel: thermal_sys: Registered thermal governor 'power_allocator'


    I'm not a Linux wizard so I don't feel like fiddling with my backup. I know it's there and I count on a little help from this lovely community if I need it! You're not wrong, a proper backup strategy would require some backup drills for sure.


    Is there a way to tell the system to write those system files somewhere else? An external USB drive? The RAID partition?


    I appreciate a hardware failure is a possibility but the nature of the issue (getting slower and slower while still working and still pingable) makes my gut tell me that this is not a HW failure. I'd like to explore more options before starting HW tests (which require moving the NAS to where a monitor is located).


    Thanks!

    Thanks


    Thankfully it was late and I went to bed!


    The configurator is asking me the below. I'd imagine I want this installed on /dev/sdg (the Sandisk); USB_DISK should be the internal Netgear USB drive, which is no longer in use.


    Can you confirm /dev/sdg is the correct answer in my case?


    Thanks a lot!


    Thanks for your help!


    3mV? ATX 3.0 says 50mV on 5V.


    In any case I could check that, I have an oscilloscope. Caps were replaced with high quality, low ESR ones though (NOT from eBay/AliExpress) so I somehow doubt it could be that. But surely worth checking.


    syslog shows no errors


    Code
    root@enterprise-nas:/# cat /var/log/syslog | grep -i "I/O error"
    root@enterprise-nas:/# 


    dmesg shows quite some stuff but I'm not sure it's what we want? Attached.


    Good point about the USB drive of course. The drive was purchased brand new and I installed OMV on this NAS in July 2023.

    OMV is saving a backup of the OS partition on the data partition which is then automatically backed up to the cloud 🙂

    Hi,


    I went to update OMV to 7.7.18-1 and the logs showed this at the end. Should I worry about that? The NAS runs off a USB stick at the back.

    Thanks!


    It does have a VGA port, I made an adaptor which is still inside. However the NAS lives far away from a monitor so that might be a problem.


    Power is stable. The PSU was recapped a few years ago. I did check voltages last time when it crashed and they were ok though I didn't check the ripple with the oscilloscope. I took the opportunity to clean it with compressed air, re-seat and clean all the drives.


    The fact that it gradually slows down before rebooting seems to point to a resource issue, no? If it were a HW or power issue I'd expect the system to either freeze for good or just reboot immediately.


    Also, the fact that after years of no issues it started misbehaving shortly after updating to OMV7 (and the new kernel) is suspicious.


    Anything I could try to monitor to see what's going on?
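
    One thing that could be watched is memory use over time, written somewhere that survives the reboot. A minimal shell sketch, assuming a Linux /proc/meminfo that exposes a MemAvailable field; the function names `mem_used_percent` and `log_mem_sample` are hypothetical and not part of OMV.

```shell
#!/bin/sh
# Print used memory as a whole percentage, computed as
# (MemTotal - MemAvailable) / MemTotal from a meminfo-style file.
# $1: file to read (defaults to the live /proc/meminfo)
mem_used_percent() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", (total - avail) * 100 / total }' \
        "${1:-/proc/meminfo}"
}

# Print one timestamped sample, e.g. "2025-01-01 12:00:00 mem_used=45%".
log_mem_sample() {
    printf '%s mem_used=%s%%\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$(mem_used_percent "$@")"
}
```

    Appending the output of `log_mem_sample` once a minute to a file on the data disk (the location is up to the reader) would leave a trail showing whether RAM was climbing before the NAS stopped responding.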

    This has happened again with the same behaviour: the NAS becomes slow, eventually it stops responding (but I can ping it). Eventually it rebooted by itself and logs are chopped.


    One new data point: I plugged a USB keyboard into the front USB port and num-lock was unresponsive.


    Any help please?

    Hi all,


    Tonight my NAS became unresponsive: I could still ping it and reach the login page, but it wouldn't proceed any further. I tried to SSH into it and that wouldn't go anywhere either. At some point the NAS rebooted and came back as normal.


    I tried looking at Messages but there is a chunk of logs missing just before the reboot. Same for the Kernel log. Is there any info/logs I could post here that would give someone an idea of what happened?


    Thanks!


    P.S. I am running the latest version 7.7.17-1, the signature was updated after I posted.

    Thanks BernH


    Both my NASes had been misbehaving for a while - I suspect the recent, more severe issues were because I was transferring a file structure coming from a Linux machine, so I'd imagine macOS transferred the files to the NAS with some of the original permissions/owners.


    I really appreciate your detailed help.

    Understood, thanks BernH


    Is my understanding correct that by applying the "force user" mask I lose the ability to have different users for different shares? Not that I need that - just for my information.


    And slightly OT: if Apple is using Samba in their own way, is there a "better way" to share files when Apple products are involved? AFP was deprecated so I thought Samba was the only way - but they don't implement it in a standard way?

    Thank you BernH for taking the time to help me.

    Just minutes ago I found this thread



    And I realised that all my trouble started some time ago when I enabled Time Machine.

    I've now disabled it and all seems to be working fine again but I am keen to apply the extra tweaks if those are recommended for macOS.


    I've only recently switched to Apple but I still have Windows around: would the tweaks you recommend be OK with Windows access as well?


    On permissions: it's just me accessing the NAS. Do you think 6774 would be OK?
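
    For reference, the four octal digits of a mode such as 6774 can be decoded mechanically. This is a sketch in plain shell arithmetic; the helper names (`yesno`, `triplet`, `describe_mode`) are made up for illustration and nothing here touches any files.

```shell
#!/bin/sh
# Helper names are hypothetical; this only prints text, it changes no permissions.

# Print "yes" for a non-zero bit value, "no" otherwise.
yesno() {
    if [ "$1" -ne 0 ]; then printf 'yes'; else printf 'no'; fi
}

# Turn one octal digit (0-7) into an rwx triplet, e.g. 7 -> rwx, 4 -> r--.
triplet() {
    r=-; w=-; x=-
    if [ $(( $1 & 4 )) -ne 0 ]; then r=r; fi
    if [ $(( $1 & 2 )) -ne 0 ]; then w=w; fi
    if [ $(( $1 & 1 )) -ne 0 ]; then x=x; fi
    printf '%s%s%s' "$r" "$w" "$x"
}

# Describe an octal mode given as digits, such as 6774.
describe_mode() {
    m=$(( 0$1 ))              # the leading 0 makes the shell read the digits as octal
    s=$(( (m >> 9) & 7 ))     # special bits: setuid (4), setgid (2), sticky (1)
    o=$(( (m >> 6) & 7 ))     # owner digit
    g=$(( (m >> 3) & 7 ))     # group digit
    t=$(( m & 7 ))            # others digit
    printf 'setuid=%s setgid=%s sticky=%s owner=%s group=%s other=%s\n' \
        "$(yesno $(( s & 4 )))" "$(yesno $(( s & 2 )))" "$(yesno $(( s & 1 )))" \
        "$(triplet "$o")" "$(triplet "$g")" "$(triplet "$t")"
}
```

    `describe_mode 6774` reports setuid and setgid both set, rwx for owner and group, and read-only (no execute bit) for others.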


    Thanks so much!

    Tony