Posts by 1activegeek

    Well, while trying to quickly discover the UUID of the new disk via the UI, I found the culprit!! Turns out my SnapRAID config has exclusion rules that call out the individual disks instead of the main pool; I believe this was suggested somewhere as the best way to go. So it seems these leftover rules were what was being seen as the reference! Now that I've updated those rules to apply to Disk4, Disk3 is listed as Not Referenced and I can unmount and remove it properly. Just wanted to share this here in case anyone else stumbles on the same issue.

    Checked both. Good ideas, but still coming up short. I just really don't want something to get messed up if I pull the disk and it's still referenced from some config in OMV. I had an issue a long while back when I first started out that caused me havoc: a config had to be "redone" and removed again to get rid of the errors I kept getting when making changes to other, unrelated things. Long story short, I know OMV can be temperamental in some instances if the config has a reference to something that doesn't exist.


    Any other ideas? All I found in fstab is the regular mount entry for the drive itself, so I'm pretty sure that's supposed to be there at the moment, and it should still give me the option to unmount the filesystem if it weren't referenced somewhere else.


    One hunch I'll ask about - is it possible that the UFS function is causing this? Since Disk3 was at one point referenced as part of the Storage1 pool, could it somehow still be linked through all the places Storage1 is used? This could be extremely problematic if that's the case, since Storage1 is used ALL over my system. But I've already switched out Disk3 for Disk4 in the pool setup.

    So I'm running into this same issue and was hoping for some direction. I've got a disk that went bad, and I've rsynced all the data to a replacement disk. I've swapped out where this disk was used in the MergerFS setup and the SnapRAID setup. Everything else references either Disk1 (not the one that went bad) or Storage1, which is the MergerFS pool.


    I've dropped down to the CLI and grepped through config.xml looking for entries for disk3, sdc1, or the UUID of the filesystem on the disk in question. I'm not finding any traces. Any other places to look? I'm hoping to properly get this bad boy unmounted/deleted from the system before I pull the physical disk.
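
    For reference, this is roughly the kind of search I've been doing (just a sketch - substitute your own label/UUID, and the snapraid.conf path is an assumption based on the default):

    Code
    # search the OMV config for leftover references to the old disk
    grep -i -n -E 'disk3|sdc1|xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx' /etc/openmediavault/config.xml
    # the generated SnapRAID config is also worth a look (path assumed)
    grep -i -n 'disk3' /etc/snapraid.conf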

    So I'll be updating this weekend (I do bi-weekly system updates on Sundays), and I will check out all the enhancements worked on for this new version. I'll also report back on the transition between the .9 and .10 versions and the CE repo functionality.


    Something I wanted to bring up though, from some testing I was doing today with the Elasticsearch, Logstash, and Kibana containers: I'm finding it more and more likely that I'm going to want containers attached to networks other than the stock bridge network (docker0). Currently, it seems I can't select an alternate network. With the changes being made to support MACvlan, I'm thinking this support should perhaps be extended to all 4 types in the dropdown. Specifically, I created a secondary bridge network called bridge2. When you create a user-defined network, container name resolution works, but it does not in the default bridge. For this reason I think it's becoming more necessary to enable the same secondary config section as for MACvlan, to allow choosing other bridge-type networks, and likely the None option for other use cases. Host won't ever need a choice.
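
    As a quick illustration of what I mean from the CLI (a sketch - the image/container names are just examples):

    Code
    # create a user-defined bridge network
    docker network create bridge2
    # containers on bridge2 can resolve each other by name
    docker run -d --name elasticsearch --net=bridge2 elasticsearch
    docker run -d --name kibana --net=bridge2 kibana
    # from inside kibana, "elasticsearch" resolves by name; on the default bridge it would not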


    At the moment, I've gotten around this by choosing the bridge network, setting everything up as usual, and then down at the bottom adding the --net=bridge2 option in the extra args. The problem is, any time I try to modify or alter the current container, it interestingly shows bridge2 in the network options (not in the dropdown, but it populates the field with it). If I leave everything as is or try removing the --net option, it will wipe out my network port mappings and/or reset to the original bridge network. So if I'm careful I can get it right, but it's not perfect. If it helps I can try to do some screenshots, or short screen cap clips.


    Lastly, more of a long-shot nice-to-have for the future (certainly not a need right now): an option to attach multiple networks. This isn't possible with the core Docker setup in a single run command. Since this plugin runs scripts to perform its actions, though, I figure it could run the "run" command to create the container first, sleep (very briefly, just to let the container come up), then run a docker network connect command to add additional networks to the container. Like I said, not needed/required ATM and not possible in a normal docker run, but it would be a cool benefit to how we're allowing it to be used here.
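
    Something like this is what I had in mind (a rough sketch, not tested against the plugin's scripts; the network/container/image names are made up):

    Code
    # create the container on the first network
    docker run -d --name mycontainer --net=bridge2 myimage
    # give it a moment to come up, then attach a second network
    sleep 2
    docker network connect bridge3 mycontainer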


    PS - thanks for all the work @subzero79 @ryecoaaron and @nicjo814 - loving this capability easily managed inside OMV! :thumbsup:

    Yeah, so it looks like I will need to search out a different solution. I'm glad I discovered this now at least, with a disk that has less on it, and I have a working copy to move from. I'm in the process of a long rsync right now. It's almost halfway through after about 3 hours, so I'd say it's in good shape to finish pretty quickly overall, considering.


    Thanks for helping me discover this, glad I know now.

    Well, now it seems I can't manage to get the snapraid command to run locally on the system. This was my concern with running it manually: when I recently attempted to use SnapRAID to restore a few pictures that I deleted by accident, I ended up with wacky permissions, likely because I hit this same permission denied error when running the command. I just used sudo and had no issue running it, but the end result was mangled permissions that wouldn't let me access the content.



    Code
    >> snapraid -d disk3 -l fix.log fix
    Self test...
    Error creating the lock file '/media/97974d61-46e4-43fa-a535-54a31b4faec2/snapraid.content.lock'. Permission denied.

    The media mount mentioned is not MergerFS; it is the local individual disk mount that is part of the pool.
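
    For what it's worth, checking the ownership on that path and just prefixing with sudo is what got me past the error (a sketch - the mount path is from my own box):

    Code
    # see who owns the content/lock files on the data disk
    ls -la /media/97974d61-46e4-43fa-a535-54a31b4faec2/snapraid.content*
    # running with elevated privileges avoids the lock file error
    sudo snapraid -d disk3 -l fix.log fix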


    In doing a bit of reading, am I to understand now that SnapRAID will unfortunately NOT restore the permissions for this content? If that's the case, I think I need to rethink my WHOLE strategy, as I do have staggered permissions on content. Having to sort out all those permissions may be a dealbreaker for relying on SnapRAID.


    Since the disk hasn't completely failed yet, is there any easy alternative for moving the content onto the new disk? Perhaps just using rsync, then letting the SnapRAID index build again against the fresh content (assuming I keep using it)?
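
    If I go the rsync route, something like this is what I'd plan on, to keep ownership and permissions intact (a sketch - the mount paths are placeholders for the old and new disks):

    Code
    # archive mode keeps permissions, ownership and timestamps;
    # -H/-A/-X also carry over hard links, ACLs and extended attributes
    rsync -aHAX --progress /media/OLD-DISK-UUID/ /media/NEW-DISK-UUID/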

    Interestingly, you make me think of a different point on restore speed. While the disk is a 4TB disk, it only has about 800GB of content on it. Perhaps for that reason alone, a repair operation using SnapRAID could produce a faster result, as it only has to reconstruct less than 25% of the data, whereas a clone will be attempting to read/write empty blocks (give or take, as I know advances reduce it from a 100% copy of empty blocks) - so it should be a bit faster overall. And as you mentioned, it avoids the need to read the bad sector data.


    Perhaps since I haven't gotten very far, I'll attempt stopping it now and testing that method instead.

    Well, I was figuring that using Clonezilla to get an actual replica of my drive would be faster and more efficient, assuming the SMART-detected errors have not fully evolved (aka they just hit and I reacted right away) and I can get full copies of the data. Even if not, I'd assume it should skip through those bad sectors pretty quickly and leave me with an almost perfect replica. In my case I basically haven't lost the disk; I just got the SMART errors back and am proactively replacing the disk.


    If this seems foolish, perhaps I will go that route. At this point, the disk seems to be copying over at a slower and slower rate, declining since the start of the clone. It has only made it to 8.8% of data blocks and 2.11% of total blocks in 1hr15min, and the rate has dropped from 9.25GB/min to 886MB/min.

    So I've started down this path with the non-destructive processes. Specifically, I've rebooted into Clonezilla. It seems the bad sectors are being detected there as well; during an early portion of the clone process, I received an error from bad sectors. As suggested, I've restarted in expert mode and enabled the -rescue option.


    I would assume at this point that I may be missing some data blocks, but the filesystem should be intact? If that's the case, then a fix process from SnapRAID should restore my missing files, assuming I haven't overwritten anything or re-run a SnapRAID sync that could no longer read the blocks in the bad sectors.

    I'm looking to get some validation or confirmation of my plan, given the setup and situation below.


    Current Setup:

    • SnapRAID pool

      • 3x 4TB Data Disks
      • 1x 4TB Parity Disk
    • MergerFS

      • /storage with multiple sub-folders and content
    • SMART Test

      • Running every Wednesday night (corrected to Wed)
      • Recent report flagged 3 bad sectors, then within the last 3 days or so bumped up to 7 bad sectors
      • Already filed RMA with WD and new drive is here

    Plan:

    • Install the new physical disk alongside the others
    • Reboot into Clonezilla
    • Clone the current failing disk to the new disk
    • Shut down and remove the failing drive
    • Start up and label the new disk the same as the failed disk
    • Add the new (replacement) disk to the SnapRAID/MergerFS pools under the same name


    Does this logically make sense, and will it actually achieve what I'd like? It seems cloning will be the quickest/easiest way to get this done and avoid stupid snafus by me on the command line with permissions and other potential problems. My only real concern is ensuring that SnapRAID doesn't have problems continuing on like nothing happened, that the MergerFS volume /storage isn't affected, and that cloning a drive with bad sectors doesn't cause some other kind of copied-over issue.


    If I lost the 7 bad sectors, I'd honestly be OK with it. I'd rather salvage the rest of the data and then be able to do a SnapRAID fix operation.
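
    To frame it, the rough order I'm picturing after the clone is something like this (a sketch, going off the SnapRAID manual as I understand it):

    Code
    # see what SnapRAID thinks changed after swapping in the cloned disk
    snapraid diff
    snapraid status
    # if anything was lost to the bad sectors, try to rebuild just that disk
    sudo snapraid -d disk3 -l fix.log fix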

    It may not be that straightforward. There are multiple layers here: the user running the container in OMV, the user running inside the container, and the core users/groups on the system. At a base level, you may want to do an ls -la on the /dev/dvb directory to understand what permissions are set on it. That will give you an idea of whether the dialout group will help. Then you need to understand what user the process inside the container runs as. The user inside the container needs permissions that match the local permissions to gain access to the device. By default, I don't think it will have access unless it's running as root, and even then, in some instances that may not map through properly - and shouldn't, for security reasons.
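
    As a rough illustration of the kind of checks/options I mean (a sketch - the device path is from your post, the run flags are the generic Docker ones rather than anything plugin-specific, and the group/image names are just examples):

    Code
    # check what owns the adapter nodes and which group has access
    ls -la /dev/dvb
    # pass the device into the container and add the matching group
    # (match the group to whatever ls showed above)
    docker run -d --device /dev/dvb --group-add dialout your-image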

    @indigo, how the card is mounted locally is what really matters. I don't know how you set it up on your system or what the default is. I believe something like this needed to be done for a home automation platform I was testing at one point: it required ensuring that the UID/GID the container runs as were part of the dialout group in order to get access to the device. That may have been specific to the nature of the dongle attached to the host, but it may be similar here. I'd suggest asking around in the forums that support your specific card; they may have more detail on what permissions are needed where, or on alterations to the mounting that will ensure other apps can access it.


    I will say the hardest thing about Docker is troubleshooting permissions. Outside of that, it's a breeze. And once you finally get permissions figured out, it becomes a lot easier to solve these issues in the future.

    The method called out here is the Watchtower container. I use it and it's fantastic. Specifically for the same case you mention, weekly updates to LS.io containers are seamless.
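
    If it helps, the way I run it is roughly like this (a sketch - the image name and flag are from memory, so double-check the Watchtower docs):

    Code
    # Watchtower watches running containers and pulls/recreates them on updates
    docker run -d --name watchtower \
      -v /var/run/docker.sock:/var/run/docker.sock \
      v2tec/watchtower --cleanup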


    Keep in mind as well that if you stop a running container, update the image (aka pull the new one), and then restart the machine, to my knowledge that will also get you the updates. I did this previously, where I'd pull the updates, then delete the older image (it shows up with a funky name) and move on. Alternatively, you can just hit the Modify button, then Save, and it will re-create the container fresh with your settings and definitely use the latest image.
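
    Roughly, the manual version of that from the CLI looks like this (a sketch - the container/image names are examples, and the plugin's Modify/Save can handle the re-create step for you):

    Code
    docker pull linuxserver/plex   # grab the updated image
    docker stop plex               # stop the running container
    docker rm plex                 # remove it (bind-mounted data stays on the host)
    # re-create it with the same settings, or use Modify/Save in the plugin to do the same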


    And you can do the pull in the GUI. Right on the first screen there is a Pull Image button; I believe you just have to enter the repository name properly, and then it will check for updates.


    My opinion is biased since it’s how I do it, but Watchtower is the way to go.

    ;) No worries, happens to all of us sometimes. Glad you got it sorted.


    The second part is more of a security advisory. The reality is that if someone can manipulate or exploit a vulnerability in a container, running the UID/GID as root could let a malicious attack go further and exploit the WHOLE OMV system.

    • First - you shouldn't be trying to map the permissions to the root user. You should be creating a less privileged user than root to store things with (see the sketch after this list).
    • Second - as the container is pointing out to you, you have set the mappings in the Volumes section to read-only (the R/O button is lit green). This will stop the container from being able to write to those directories, which will very likely cause issues for almost any application.
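
    Something along these lines is what I mean by a less privileged user (a sketch - the username and path are made up, adjust to your setup):

    Code
    # create a regular user to own the container's data (name is an example)
    sudo useradd -M -s /usr/sbin/nologin dockeruser
    # note the UID/GID - these are what you map into the container's settings
    id dockeruser
    # give that user ownership of the host folders you are mapping in
    sudo chown -R dockeruser:dockeruser /path/to/appdata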

    Sorry my coding skills are not good enough to make such a change.

    @no_Legend - mine are not great either, though I'd encourage you to dig in and learn a bit about it. In this case, I've already made the appropriate PR for SubZero. When they update and push the new plugin (I believe after a few other updates are ironed out and properly merged), it should include a service indicator for the Docker service, as you were expecting.

    @subzero79 - I think code adjustments to make this easier/possible are obviously always the best option, but your idea of outputting the run command could be extremely useful! I just recently ran into this: I was testing a new container and got sidetracked, Sunday came, and my script ran to clean up non-running containers. I ended up having to rebuild everything in the console again. If it were possible to output the run command based on a currently running container, and/or to spit it out on demand from a config you've created in the console, that would be fantastic!! It could have saved me, since I could easily copy down the run command, and if I start the container again via the CLI, the plugin always picks up the details as well. So it would be easy to run it again later and re-manage it from the plugin.
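
    In the meantime, a poor man's version can be pieced together with docker inspect (a sketch - the container name is an example, and these format strings only pull a few of the fields you'd need for a full run command):

    Code
    # dump the pieces needed to rebuild the run command by hand
    docker inspect --format '{{.Config.Image}}' mycontainer
    docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' mycontainer
    docker inspect --format '{{json .HostConfig.PortBindings}}' mycontainer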


    Just my two cents, but I could see this being very useful for more than just this port range scenario. The more extensible it is, the easier it becomes for everyone to use.