Posts by crashtest

    The user addition, for Raspberry Pis, is for SSH access only. The Web interface is always accessed with the user admin.

    _______________________________________________________

    If you look on page 18, there's a note that specifically mentions system accounts. admin is a system account. You can name your user anything that is not a system account.


    (**Note that the users admin, root, backup, and others are Linux system users. The add user dialog will reject attempts to configure a new user with the exact same name of an existing system user.**)

    The router, as mentioned, is an 802.11ax router. I have since the last post tried with a laptop, with an 802.11ax card installed, right next to the router, and maxing out at around 30-35 MB/s, which is quite poor for that standard, no?

    When using wireless and testing for throughput, there are a number of factors involved that won't have anything to do with Samba (SMB). If you're running wireless and trying to determine the best locations in your house for the best speed, etc., you might consider taking a look at -> iperf. Iperf works in a client / server arrangement. Install Iperf on two different clients for point-to-point testing: two wired connections to the router, one wired with the other wireless, etc.


    Iperf will give you an idea of what your best case transfer rate is, wired or wireless. You'll also get an indication of how well your wireless links are doing, the locations in your house that are best, local interference factors, etc.
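As a rough illustration of reading an Iperf result, here is a minimal Python sketch. It assumes iperf3 (not the older iperf) was run as a TCP test with its JSON output option (`-J`), and that the report carries the receiver-side rate under `end` -> `sum_received` -> `bits_per_second`. The sample report below is hypothetical and heavily trimmed, not real output. The conversion matters because wireless standards are quoted in bits per second while file copies are usually shown in bytes per second.

```python
import json


def received_rate_mb_per_s(report: dict) -> float:
    """Return the receiver-side TCP rate from an iperf3 JSON report, in MB/s."""
    bits_per_second = report["end"]["sum_received"]["bits_per_second"]
    # 8 bits per byte, 1,000,000 bytes per (decimal) megabyte.
    return bits_per_second / 8 / 1_000_000


# Hypothetical, trimmed report: 280 Mbit/s measured at the receiver.
sample = json.loads('{"end": {"sum_received": {"bits_per_second": 280000000.0}}}')
print(received_rate_mb_per_s(sample))  # 35.0
```

In other words, a file copy that tops out at 30-35 MB/s corresponds to a link carrying roughly 240-280 Mbit/s.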

    You've said that you don't live close to others but you might want to see if your router has features that will either auto select the best channel(s) with the least interference or a method of letting you see the radio spectrum to make manual channel choices. (Note that interference can be "local", in the 2.4GHz band. 5GHz is less susceptible, due to fewer consumer devices using it.)


    Once you have this information in hand, then looking at SMB transfer rates might be a bit more meaningful.


    I will say, as you're about to discover, theoretical maximums (802.11ax) are rarely seen in the real world.

    I wanted my server to be encrypted as I plan to store personal data on it.

    LUKS only protects you from the physical theft of your drives. This means, if a thief breaks in and steals your server, he won't be able to get anything from your hard drives. LUKS does nothing to prevent "over the wire" hacks, which represent the vast majority of compromises.

    Yes the Dockers are stored on the default location. The boot drive is a 30GB Flash drive.

    I forgot to mention I also have a 120GB SSD which I map the config files to.
    Eg my docker run file looks like this.

    That was the main concern. Since Docker uses overlayfs and the UnionFS plugin (mergerfs) is itself a union filesystem (FUSE-based), moving Dockers onto the mergerfs mount point can create problems. (Overlayfs on top of another union filesystem doesn't work well.)

    Do you mind my asking why you're using LUKS?
    ___________________________________________________________________


    Would changing my docker location to the 120GB SSD make a difference? I was considering doing this anyway.

    Since you have the SSD for utility purposes, I would move the Docker to the SSD. At a minimum, the Web interface should be more responsive and there won't be any worries about the growth of metadata.

    If you need a bit of guidance doing it, here's an example -> Move Dockers. You'll need to substitute the details for your install.

    I don't know exactly what but I replaced it and now I get 110MB/s READ and 60MB/s WRITE speeds.

    If you don't know exactly what you did, there's no way a forum user could figure out the issue from remote.

    __________________________________________________________________


    You had a lot of moving parts involved in this problem.

    A few examples:

    - NTFS (If going for top speed, using a native Linux file system is a given.) You fixed that.

    - Windows client hardware and the "Windows" OS in general. (It's nearly impossible to know the state of health of a Windows client. They slow down considerably after a few years of use, they may have malware installed, etc. This one item, a Windows client, is an enormous variable.)

    - Software RAID1 mirroring + LUKS (While minimal, Software RAID, along with LUKS, does incur a slight CPU penalty. Without knowing what else is running while speed tests are conducted, well, it's all guess work. Even among the faster models, ARM CPUs are relatively easy to stress.)

    - The "X" factor. The chipset used to bridge SATA to USB3. There's considerable performance variation, depending on the bridge. Some are good, others are not.


    But you fixed it so, as they say, all's well that ends well. (Whatever it was.) :)

    _________________________________________________________


    I can say this. If I was you, I'd dump LUKS and RAID1. LUKS provides physical protection, from hard drive theft, ONLY. If you're not worried about a thief breaking in and stealing your hard drives, LUKS is not doing you any good.

    RAID1 is not doing you any good either. You have two disks in RAID1 where, for home use, nothing is being accomplished. The second disk would be far better utilized for backup, as opposed to RAID1.

    Solution: The router needs to be rebooted.


    After re-flashing my card with Raspbian buster lite I still couldn't SSH to the Pi. Restarted the router and SSH connected. Then I installed OpenMediaVault and couldn't SSH after the installation completed and the Pi had rebooted. Restarted the router again and the web interface came up.

    Interesting. Who makes the router and what model is it?

    Given the errors, the question I'd ask is, would you want to take a chance? What I'm getting at is, the above are problems and errors that you're aware of. The underlying reasons for these errors are not known and there may be more problems present that you haven't discovered yet.

    To be certain that all is healthy again, the safest path would be to rebuild. Most likely, a rebuild would be far faster than trying to analyze and fix an installation that may not, at the end of the day, be fixable.


    But that's just one opinion. Others may have different advice.

    __________________


    Along the lines of advice, when you're back up, clone your SD-card.

    Something changed permissions to many folders and files. I tried to solve it by modifying ACL permissions and I didn't get anything. Is it possible that this ends up affecting the access of a disk?

    I can tell you why this happens. Since you're pulling files from another machine, the files are coming from a foreign volume. The local server's root account must take control of what are new files (that have never been on the local machine before), so the default local permissions create mask is used. (Usually root:root 0644)

    I correct that by changing permissions back to what I want them to be, using Extra options at the bottom of the Rsync job.



    When files come in, from an outside server, the options above change ownership to the local root account and the local users group.


    In this case, this rsync job creates a backup of a music share on another server.

    ____________________________________________________



    To "decode" what permissions are allowed by 0775, you can use WinSCP (WinSCP is in the guide) to look at files and folders in the share. Following is an example song in the music share. Look at the "octal". This number changes depending on permissions assigned.
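For readers who would rather decode an octal value without opening WinSCP, here is a minimal Python sketch using the standard library's `stat.filemode`. The helper name `describe_mode` is made up for this example; the two values shown are the ones mentioned in this thread (0775 and the default 0644).

```python
import stat


def describe_mode(octal_text: str) -> str:
    """Turn an octal permission string such as '0775' into 'rwxrwxr-x'."""
    mode = int(octal_text, 8)
    # stat.filemode() needs a file type bit, so treat the mode as a regular
    # file and drop the leading type character from the result.
    return stat.filemode(stat.S_IFREG | mode)[1:]


print(describe_mode("0775"))  # rwxrwxr-x
print(describe_mode("0644"))  # rw-r--r--
```

The nine characters read as three groups: owner, group, and others. With 0775, the owner and group can read, write, and execute, while everyone else can read and execute only.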


    This manual covers many topics but without going into depth and often in technical language.

    The wiki was, originally, for Linux experts and developers. What I added to it, the "New User Guide", doubled the size of the wiki (+/-). In a word processing document, it's over 80 pages. I didn't go into depth in any topic for a couple of reasons. Partly, to keep the length reasonable and, truthfully, because most beginners are not interested in nuts and bolts. I did a walk through format for the purpose of getting a server up and running, with just a bit more added to give users some understanding of what they are doing. I did assume Windows clients would be used because Windows still has the greatest percentage of market share.


    and google translates it for me on the screen without doing anything.

    Great. That was the intent behind merging it onto the wiki which was finally finished a few days ago. It's good to know language translation is working. Tell me, did the translation make sense?


    When he says that a long test is an "off line" test, it is as if they spoke to me in Chinese.

    Off line means you can't access the drive while the test is running, so the drive is "off line". And a long test takes much longer than a short test. How long, I can't answer. That depends on spindle speed (rotation speed), the size of the drive, and the drive's condition. A long test does all the diagnostics of a short test AND it tests the surface of the platters. If the test finds an error or a bad spot, the time the test takes is considerably longer as it attempts to reallocate bad sectors.


    If time is an issue, run the short test. That's a couple minutes. Given that you're experiencing errors on start up, a short test might be enough to reveal a problem.


    To find the answer I have to start googling what that means ... and in the end it takes three hours to understand everything. Or ask here continuously and bother for nonsense.

    This is life. Live and learn. I try to help where I can, but trying to troubleshoot from remote is not a perfect science.


    Or the issue of file permissions, which drives me crazy. Finding the detailed explanation is the difficult thing. And the procedures.

    You and everyone else, to include me. I've been thinking of trying to write something for basic permissions, to try to simplify it for new users, but there are so many scenarios. I'm still thinking about doing something very narrow, for controlling access to OMV server network shares only. That might be useful.

    Usually once a month or so, I log into the webUI, enable the delete trigger, run the job manually to bring the two drives completely in sync, then turn it off again.


    It may not be the best or most practical practice, but I've done it forever and it has never caused me an issue and has in fact saved me a few times.

    I like that method and it's mentioned in the User Guide. It allows for some accidental delete protection.

    I'm using ZFS Snapshots, providing versioned protection for up to a year. Still, that's local and there's no point in closing off options when replicating to the backup server.

    When using Rsync or similar backup tech that doesn't do incremental backups or versioning:


    The longer the time interval between automated backups is, the more time you'll have to
    [1] Discover there is a data problem (failing hard drive, accidental delete, etc.)
    AND
    [2] Stop any automated backup
    BEFORE
    [3] The problem is replicated to the backup device.


    Just a thought

    First, look at your SMART stats for the drive with problems:

    Under Storage, SMART, and the Devices tab: Click on a drive. Then, using the Edit button, enable SMART monitoring. Do this for all drives.



    _________________________________________________________________


    Look at your SMART stats. In the Devices tab, select the drive with problems and click the i Information button. Another window will pop up. These are the stats for the drive.




    You're interested in the raw counts for the following which have been associated with drive failure.


    SMART 5 – Reallocated_Sector_Count.

    SMART 187 – Reported_Uncorrectable_Errors.

    SMART 188 – Command_Timeout.

    SMART 197 – Current_Pending_Sector_Count.

    SMART 198 – Offline_Uncorrectable.

    Anything more than 2 or 3 counts in any of the above is cause for concern.


    SMART 199 - UltraDMA CRC errors

    (This stat is worth a look if a drive is malfunctioning. It's usually related to hardware or a bad / loose cable.)
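The check described above can be sketched in Python for anyone who prefers the command line. This is only an illustration, under these assumptions: the text comes from `smartctl -A` in its usual ten-column table layout, the raw count is a plain integer in the last column, and "more than 2" is used as the cutoff (the `max_ok` parameter is adjustable). The sample table is invented for the example and is not from a real drive.

```python
# Attribute IDs listed above as associated with drive failure.
WATCHED_IDS = {5, 187, 188, 197, 198}


def flag_attributes(smartctl_output: str, max_ok: int = 2) -> dict[int, int]:
    """Return {attribute_id: raw_count} for watched attributes above max_ok."""
    flagged = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, raw = int(fields[0]), fields[9]
        if attr_id in WATCHED_IDS and raw.isdigit() and int(raw) > max_ok:
            flagged[attr_id] = int(raw)
    return flagged


# Invented sample in the layout of `smartctl -A` (not real drive data).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       24
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       3
"""

print(flag_attributes(SAMPLE))  # {5: 24, 197: 8}
```

Attribute 199 is deliberately left out of the watched set, since it usually points at cabling or hardware rather than the disk surface.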



    _________________________________________________________________________


    Setting up drive tests:



    In the Scheduled Tests Tab, click the ADD+ button.

    Select a drive, in the device line, and select a test type, Long or Short. The following is an example that would run an automated short test every Sunday at 01:00AM. I do this for all spinning drives.
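For the curious, scheduled tests like this are typically carried out by smartd, which takes a schedule pattern of the form T/MM/DD/d/HH (test type, month, day of month, day of week with 1 = Monday through 7 = Sunday, and hour). The Python sketch below builds that pattern for a weekly test. It assumes the web interface hands the schedule to smartd in this form; the function name is hypothetical and nothing here is taken from OMV's own code.

```python
def smartd_schedule(test_type: str, weekday: int, hour: int) -> str:
    """Build a weekly smartd '-s' pattern, e.g. 'S/../../7/01'."""
    codes = {"short": "S", "long": "L"}
    if test_type not in codes:
        raise ValueError("test_type must be 'short' or 'long'")
    if not 1 <= weekday <= 7:
        raise ValueError("weekday must be 1 (Monday) through 7 (Sunday)")
    if not 0 <= hour <= 23:
        raise ValueError("hour must be 0 through 23")
    # '..' matches any month and any day of the month.
    return f"{codes[test_type]}/../../{weekday}/{hour:02d}"


# Short test every Sunday at 01:00, as in the example above.
print(smartd_schedule("short", 7, 1))  # S/../../7/01
```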




    ______________________________________________________________________________________

    It's important to note that drive tests can be set up, but NOT enabled.

    Tests can still be run manually by clicking on a drive and the RUN button.


    Set up a Long test for the suspected drive, don't enable it, and run it manually. (With the run button)



    After the test is finished, advise what the stats show.

    Enough.
    This forum is about helping users, not regulating them.
    ___________________________________________________

    Cheroot

    If you need more help, please start a PM.

    In the menu bar above, click on

    Then click on +

    Put my username crashtest in participants.


    If the information provided doesn't work, I'll be glad to assist you with this issue, in your language, if that will help.

    It occurs to me to remove that disk, format it, and recover the information with SnapRaid. Would it fix the problem?

    I don't know for sure. But, think about it. You can't fix a hard drive if it's dying. There's no way to repair them.


    DON'T run another SNAPRAID SYNC. You'll be backing up corrupted data which means you won't be able to recover completely.
    _____________________________________________

    Again, test the drive with at least a short test. If it was me, I'd do a LONG test. If the drive has problems, order a new one the same size or larger and do a SNAPRAID recovery on that drive.

    I have a copy from a week ago. It won't do, I made a lot of changes this week.

    I would try to use it and bring it up to date, if possible.


    It won't do, I made a lot of changes this week.

    I just made a new one.

    If there's something wrong with the boot drive you're using, you duplicated the problem in the new copy.
    __________________________________________________________

    Here's the logic - this is either a software problem or a hardware problem. (It's starting to look like a hardware problem.)

    Booting up on a known good backup would give some indication if there is a software issue. If you don't have the above messages in the startup log, when using the older backup thumbdrive, that might indicate that software is corrupted.

    __________________________________________________________


    The following indicates that something is going on with /dev/sda

    Code
    EXT4-fs warning (device sda1): ext4_end_bio:349: I/O error 10 writing to inode 118489917 starting block 792619264)

    /dev/sda needs to be checked with at least a short test. Sometimes SMART doesn't show statistic counts if something is wrong. A drive test will cause SMART statistics to update.

    Buffer I/O error on device sda1, logical block 792526848

    /dev/sda is producing this error on different blocks.


    EXT4-fs (sda1): Delayed block allocation failed for inode 118489917 at logical offset 385024 with max blocks 2048 with error 30
    EXT4-fs (sda1): This should not happen!! Data will be lost
    EXT4-fs error (device sda1) in ext4_writepages:2797: Journal has aborted

    More indications that /dev/sda may be failing or there's a motherboard problem on the SATA port.
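When a log is long, it can help to count how many of these messages point at each device. The following Python sketch does that for the three message shapes quoted above. The patterns are written only against those quoted lines, so other kernel versions or file systems may word things differently; the function name is made up for the example.

```python
import re
from collections import Counter

# Patterns based only on the messages quoted in this thread.
DEVICE_PATTERNS = [
    re.compile(r"EXT4-fs (?:warning|error) \(device (\w+)\)"),
    re.compile(r"Buffer I/O error on device (\w+)"),
    re.compile(r"EXT4-fs \((\w+)\): Delayed block allocation failed"),
]


def count_io_errors(lines) -> Counter:
    """Count error messages per device name (e.g. 'sda1')."""
    counts = Counter()
    for line in lines:
        for pattern in DEVICE_PATTERNS:
            match = pattern.search(line)
            if match:
                counts[match.group(1)] += 1
                break  # count each line once
    return counts


LOG_LINES = [
    "EXT4-fs warning (device sda1): ext4_end_bio:349: I/O error 10 writing to inode 118489917 starting block 792619264)",
    "Buffer I/O error on device sda1, logical block 792526848",
    "EXT4-fs (sda1): Delayed block allocation failed for inode 118489917 at logical offset 385024 with max blocks 2048 with error 30",
    "EXT4-fs (sda1): This should not happen!! Data will be lost",
    "EXT4-fs error (device sda1) in ext4_writepages:2797: Journal has aborted",
]

print(count_io_errors(LOG_LINES))  # Counter({'sda1': 4})
```

If every hit lands on the same device, as it does here, that device (or its cable or port) is the place to start.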


    Jan 24 11:43:07 sotano kernel: [ 0.092175] MDS CPU bug present and SMT on, data leak possible. See https://www.kernel.org/doc/htm…in-guide/hw-vuln/mds.html for more details.

    I've never seen this before. It can't be good news.

    ____________________________________________________________________________________________


    Run at least a short drive test on /dev/sda.
    I hope you have a backup of data you don't want to lose.

    Quote

    Service: filesystem_srv_dev-disk-by-uuid-336149ff-95b4-4cfc-ad09-ab37c7a469a2

    Event: Filesystem flags changed

    Description: filesystem flags changed to ro,relatime,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group

    The above looks like the flags for a drive entry in /etc/fstab


    Removing ro, (readonly), by back spacing three times to remove the characters might make the drive readable again.
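To make the flag string easier to read, here is a small Python sketch that splits it into individual options, checks for ro, and shows what the list looks like without it. It works on a text string only and does not touch /etc/fstab; the function names are hypothetical. Splitting on commas matters because a plain text search for "ro" would also hit unrelated options such as errors=remount-ro.

```python
def parse_mount_options(options: str) -> list[str]:
    """Split a comma-separated mount option string into a list."""
    return [opt for opt in options.split(",") if opt]


def is_read_only(options: str) -> bool:
    """True only if 'ro' appears as an option of its own."""
    return "ro" in parse_mount_options(options)


def without_read_only(options: str) -> str:
    """Return the option string with a standalone 'ro' removed."""
    return ",".join(opt for opt in parse_mount_options(options) if opt != "ro")


# Flags taken from the notification quoted above.
flags = "ro,relatime,jqfmt=vfsv0,usrjquota=aquota.user,grpjquota=aquota.group"

print(is_read_only(flags))  # True
print(without_read_only(flags))
```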


    Did you back up your USB boot drive?