Posts by trapexit

    It's not likely mergerfs per se but how it's being interacted with. Over the years people have reported drastically different performance results from what on the surface appear to be similar setups but clearly aren't. The problem has been tracking down exactly what the differences are. If you look at the benchmarking examples in the mergerfs docs you can see how drastic an effect the per read/write payload size has on throughput. Unfortunately, mergerfs isn't really in a position to help with apps that use smaller than ideal sizes. Perhaps I could read more than the app asks for and cache writes for a time to limit the trips into the kernel but that would increase complexity. It's on my experimentation todo list. Another thing that impacts throughput is latency. If the latency is higher and the app is dispatching requests serially (which is the norm) then throughput suffers.
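
    To put rough numbers on the payload size and latency point, here is a toy model rather than a measurement. It assumes each request pays one fixed round-trip latency and then moves its payload at the raw bandwidth. The function name and the bandwidth and latency figures are made up for illustration and are not taken from the mergerfs benchmarks.

```python
# Toy model, not a measurement: every request pays a fixed latency,
# then its payload moves at the raw bandwidth. Requests are serial.

def serial_throughput(payload_bytes, latency_s, bandwidth_bytes_per_s):
    """Bytes per second when requests are dispatched one at a time."""
    time_per_request = latency_s + payload_bytes / bandwidth_bytes_per_s
    return payload_bytes / time_per_request

BANDWIDTH = 1_000_000_000  # assumed: 1 GB/s raw bandwidth
LATENCY = 0.0001           # assumed: 100 microseconds per request

for size in (4 * 1024, 64 * 1024, 1024 * 1024):
    mb_s = serial_throughput(size, LATENCY, BANDWIDTH) / 1_000_000
    print(f"{size // 1024:>5} KiB per request -> {mb_s:7.1f} MB/s")
```

    With small payloads the fixed per-request cost dominates, which is why two setups that look alike can perform very differently if the apps use different read/write sizes.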

    Anyway... wrt this situation... could be a number of things. You've got TCP with its complex behaviors, the network filesystem and its quirks, mergerfs and the underlying drives. If any of the "rhythms" get out of sync it could lead to this.

    Tracking it down: if you're able to provide me with some trace logs (using strace) of mergerfs in the middle of a transfer for both the 1Gbps and 10Gbps cases I can look at the interactions between Samba? (NFS?) and mergerfs.

    How are you testing *exactly*? Try to remove any and all variables. If you have a network filesystem mounted on the client use something like "dd if=/mnt/cifs/file of=/dev/null bs=1M" to remove the local disk from the equation and play with the per block read size to see if that makes a big difference. On the server you can put mergerfs into nullrw mode to remove the underlying drives from the situation. And take a look over the performance tweaking section. If you play with a few settings and see a noticeable difference it might suggest what the true cause is. Also I'd suggest trying a different protocol (FTP, scp, etc.), if you haven't already, to rule that out as either a cause or catalyst.
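
    If you'd rather script the block size experiment than rerun dd by hand, something along these lines might do. It is only a sketch: the helper names are mine, and the demo reads a local temp file (so it mostly hits the page cache and shows per-call overhead only). To test the real thing you'd point timed_read at a file on the network mount.

```python
import os
import tempfile
import time

def timed_read(path, block_size):
    """Read the whole file in block_size chunks.

    Returns (bytes_read, data_returning_reads, seconds).
    """
    total = 0
    calls = 0
    start = time.perf_counter()
    # buffering=0 so Python does not batch the reads behind our back
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
            calls += 1
    return total, calls, time.perf_counter() - start

def make_test_file(size_bytes):
    """Create a throwaway file of zeros and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\0" * size_bytes)
    return path

path = make_test_file(4 * 1024 * 1024)
try:
    for block_size in (4 * 1024, 128 * 1024, 1024 * 1024):
        total, calls, seconds = timed_read(path, block_size)
        mb_s = total / max(seconds, 1e-9) / 1e6
        print(f"bs={block_size:>8}: {calls:>5} reads, {mb_s:.0f} MB/s")
finally:
    os.remove(path)
```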

    If you (or anyone else) happen to be in the Manhattan/NYC area I've a few of those ICYCubes and an eSATA SansDigital I'm not using and am willing to part with on the cheap.

    Regarding other chipsets... in my journeys it always seemed like every single enclosure had its haters. Very hard to suss out which were systemically bad and which were just one-offs.

    My own tool which is similar to badblocks.…ki/Real-world-deployments

    Those are the enclosures I used to use. They worked OK but I would have occasional issues that I simply don't have now with my HBA. I was using them over eSATA however so I can't speak to their USB stability. I know this isn't ideal but a 1 to 1 drive to controller setup is probably better when using USB. Almost all SATA -> USB bridges will reset the whole set of drives if something acts up. That was why I used eSATA.

    It *could* be the OS drivers as well being flaky but that's going to be harder to show without having the exact same device to compare against. When I used USB3/eSATA enclosures I had 4 of the same so it was clear that one of them was physically bad vs a software bug. Though even then... it's plausible that the hardware issue could be worked around in software.

    It's complicated is what I'm getting at :)

    I'd never say 100% but, yes, very very likely. Preferably you'd be able to check the drives outside the suspect enclosure but I've had individual ports on controllers go bad so if the same drive is the one having issues try changing the port it's on (changing the physical drive order in the bay). If the same bay keeps acting up no matter which drive is in it then it's the controller, but just that one port. I've seen this a number of times.

    If SMART checks come back OK, as well as badblocks or bbf (self promotion), then while the drive could still be the problem it is less likely.

    I don't see how without a way to narrow down what the issue is. Clearly the USB device being reset isn't a good sign. It could be a bad controller, a bad USB cable, or a bad drive. To find out which you need to be able to swap each out.

    Don't know if those are related given the time difference between them but certainly could be. USB3 - SATA controllers are not known for being the most stable and it's totally possible that the drive is fine and the controller is flaky. Does this happen often enough that if you put the drive in another enclosure you'd expect to reproduce this in a short time?

    Also ensure your Pi power supply and cable are up to spec. I'm less familiar with the 4 but previous versions could behave poorly if the power was not delivered up to spec.

    I don't know anything about the plugin but the first column of /proc/mounts is not unique. You can't be relying on that for anything. In mergerfs that is configurable by the fsname option.
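
    A quick way to see the non-uniqueness for yourself is to group mountpoints by that first column. The sample text below is invented for illustration (it is not output from a real system) and the function names are mine. On a live box you could feed it the contents of /proc/mounts instead.

```python
from collections import defaultdict

# Hypothetical /proc/mounts excerpt, invented for illustration.
SAMPLE_MOUNTS = """\
pool /srv/pool-a fuse.mergerfs rw,relatime 0 0
pool /srv/pool-b fuse.mergerfs rw,relatime 0 0
/dev/sda1 /mnt/disk1 ext4 rw,relatime 0 0
"""

def mountpoints_by_source(mounts_text):
    """Map the first column (source/fsname) to every mountpoint using it."""
    by_source = defaultdict(list)
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        by_source[fields[0]].append(fields[1])
    return dict(by_source)

def ambiguous_sources(mounts_text):
    """Sources that appear on more than one mount."""
    grouped = mountpoints_by_source(mounts_text)
    return {src: mps for src, mps in grouped.items() if len(mps) > 1}

print(ambiguous_sources(SAMPLE_MOUNTS))
```

    Anything that shows up in that result can't be used as a key on its own.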

    I'm guessing you're having problems because you can't trace back to which mounts are created by which plugin? If that's the case I suspect you'd need to find a way to indicate which is which some other way.

    I don't understand why that would be better. Just turn off auto scanning. Much easier and less complex. Why put in *another* level of caching (the OS already does this) just to keep drives from spinning which is 1) totally in the user's control in the first place 2) not that common of a need and 3) severely increases the complexity of the project. If you just have mergerfs ignore what data is actually in the pool and make it nothing more than a metadata cache that has to be forcefully or on a schedule refreshed... why not just not look at the pool? Plex doesn't *have* to scan. And when it does it usually is reading files. Not just scanning the directories. Some software looks at extended attributes. That's a lot of random data of unknown purpose to copy around with the hope that some time in the future an app might ask for the same data. That works fine when the OS does it. It doesn't work fine in this usecase. If you put lvmcache or bcache in front of your drives it wouldn't stop them from spinning up either. It might limit it from time to time but wouldn't guarantee anything.

    You are hardly the first person to ask for such a feature but it's simply not practical. I built a prototype years ago... it doesn't work in many cases. mergerfs can't even reliably check if a drive is spun up or not. Drives would literally spin up when querying if they were.

    There are the mergerfs-tools, which include a tool to balance drives. You'd install the drive, run the balance tool, then use as normal. Or you use the rand policy.

    What data do you propose to cache on this SSD? Plex is not just reading filesystem info (stat and readdir, basically a **ls**). It's reading file data. Also... when would that data be cached? Would mergerfs or another tool have to read the entire pool and try to figure out which data might be needed? What few blocks of every file *might* just have the metadata some random app will want? If it caches on demand then the drives would still need to be spun up for mergerfs to know if new data was available. Plex scanning is configurable. Mine is once a day. If mergerfs' timeout was shorter than that then it'd spin up the drives more than once a day.

    This problem is not really solvable. People underestimate what's going on and what is possible. Many people work on drives under mergerfs directly. It's not practical to watch those behaviors so caches would get out of sync more easily. The best way to deal with keeping drives from spinning is to not use them.

    Quote from nightrider

    I did set up the pool with 2 drives and it did not work for me with "Existing path, least free space", I could not continue write files to that pool. I did try with the same relative path on both drives, only thing was that one of the drives was full and already reached the minimum free space. I do not know, maybe it was only a temporary bug.

    Did you create the *full* relative paths on both drives and try creating something *in* that directory?

    Quote from nightrider

    I like the idea of having all the files relatively in order on the drives, I just add a new drive when the pool starts to be filled up.

    Then why not just use **ff** with an appropriate **minfreespace**? Also... filling up drives one at a time, if you've not already filled N-1 drives in your collection, would be wasteful and increase data risk or time to recover. That may not be your situation but if you have less than minfreespace on multiple drives mfs or lus are generally the best option.
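
    For anyone unsure how ff, mfs and lus differ, here is a deliberately simplified model. It is not mergerfs's code: the Branch class, the arbitrary size units and the "return None when nothing qualifies" behavior are all my assumptions for the sake of the sketch.

```python
from dataclasses import dataclass

@dataclass
class Branch:
    name: str
    total: int  # capacity in arbitrary units
    used: int

    @property
    def free(self):
        return self.total - self.used

def eligible(branches, minfreespace):
    """Branches that still have at least minfreespace available."""
    return [b for b in branches if b.free >= minfreespace]

def ff(branches, minfreespace):
    """First found: the first eligible branch in list order."""
    ok = eligible(branches, minfreespace)
    return ok[0] if ok else None

def mfs(branches, minfreespace):
    """Most free space among eligible branches."""
    ok = eligible(branches, minfreespace)
    return max(ok, key=lambda b: b.free) if ok else None

def lus(branches, minfreespace):
    """Least used space among eligible branches."""
    ok = eligible(branches, minfreespace)
    return min(ok, key=lambda b: b.used) if ok else None

pool = [
    Branch("disk1", total=4000, used=3990),  # nearly full
    Branch("disk2", total=4000, used=1000),
    Branch("disk3", total=8000, used=2000),
]
for policy in (ff, mfs, lus):
    print(policy.__name__, "->", policy(pool, minfreespace=50).name)
```

    In this model ff keeps filling drives in order and only moves on once a drive drops below minfreespace, while mfs and lus spread new files around.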

    What does "relatively in order" mean? Order of when you created them? I'm only familiar with two reasons for that 1) someone wants to be able to hot remove the drive so they can take it elsewhere and want sets of data on that drive. Like taking a drive on vacation for watching a whole TV show. That's a super niche case given most would stream or would transfer to another device. and 2) You don't have any backup and would rather lose everything written around the same time rather than the random'ish layout of mfs, lus, etc.

    Besides those niche cases using ff in a general setup only has negatives.

    Quote from nightrider

    I store my files for long term use, I mean I write it once and then leave it there and I do not have to spin up all the drives unless I need to access that specific file. Power saving and HDD endurance at its best.

    Most everyone using mergerfs has that pattern and most use mfs or lus create & mkdir policies.

    You're mistaken thinking drives won't spin up or that the endurance will be the best. Drives will spin up if data from them is necessary. That includes any metadata. The OS does cache some data but on the whole it will often not have the data needed when a directory listing happens or whatnot. Many pieces of software in the media space must scan the file to pull metadata, file format, etc. so any scan they do will spin up all drives even if the metadata was cached. mergerfs can't control how software behaves. It can't know what it is looking for. If "foo" happens to be on the last drive and the app is searching for "foo" then every drive before the last will have to be active to give the kernel the entries for them. It's extraordinarily difficult to limit spinup if you have any sort of activity. Torrents, Plex, etc. If you stage your data you can limit it but I find few can do so practically.
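
    The "foo on the last drive" case can be mimicked with plain directories. This is a toy union listing, not how mergerfs is implemented, and the helper names and the temp directory layout are made up. The point is only that building one merged listing means asking every branch.

```python
import os
import tempfile

def merged_listdir(branches, relpath):
    """Union of the entries under relpath across all branches.

    Also returns the branches that had to be accessed, which is all of
    them: nothing tells us in advance which branch holds which name.
    """
    names = set()
    touched = []
    for root in branches:
        touched.append(root)  # accessed even if the directory is missing
        try:
            names.update(os.listdir(os.path.join(root, relpath)))
        except FileNotFoundError:
            pass
    return sorted(names), touched

def make_demo_branches():
    """Three temp dirs standing in for drives; 'foo' lives on the last."""
    roots = [tempfile.mkdtemp(prefix=f"disk{i}_") for i in (1, 2, 3)]
    os.makedirs(os.path.join(roots[0], "media"))
    os.makedirs(os.path.join(roots[2], "media"))
    open(os.path.join(roots[0], "media", "bar"), "w").close()
    open(os.path.join(roots[2], "media", "foo"), "w").close()
    return roots

branches = make_demo_branches()
names, touched = merged_listdir(branches, "media")
print(names, "after touching", len(touched), "of", len(branches), "branches")
```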

    As for endurance there is very mixed data on how power cycling affects drives. I've seen some reports that said it had no obvious effect and others that said it significantly impacted them. If I had to bet I'd say the latter is more likely to be true because like starting a car the starting of a drive is a more jolting and energy intensive process. The physical and electrical stress is higher. It's not uncommon to fear restarting a system when the drive is acting up due to the possibility of it not starting back up.

    Quote from nightrider

    How much performance do I loose really by using it like I do? I mean I write it once and then leave it there?

    I'm not sure what your usage patterns are so it's impossible to comment. You're only as fast as the slowest part. If you colocate data on a drive and then access that data in parallel then that will perform worse than if the data was on 2 different drives.

    Path preservation is working just fine. What you described is exactly the behavior expected.

    I'm not sure I understand what you expect. Path preservation preserves the paths. As the docs mention it will only choose from branches where the relative base path of the thing being worked on exists. If you only have 1 drive with that 1 directory then it will only ever consider that drive. If it runs out of space you should rightly get out of space errors. The "change" referenced on that website was a bug fix. If you "fall back" to another drive... what's the point of path preservation in the first place? If you don't care what drive your data is on why would you reduce your speed and reliability by putting everything on one drive while the others sit around unused?
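
    Roughly, and only as a toy model of an "existing path, least free space" style policy: the dict layout, the function names and the choice of error codes here are my assumptions for illustration, not mergerfs internals.

```python
import errno

def ep_candidates(branches, parent_relpath):
    """Only branches where the relative parent path already exists."""
    return [b for b in branches if parent_relpath in b["dirs"]]

def create_with_path_preservation(branches, parent_relpath, minfreespace):
    """Pick a branch for a new file under parent_relpath, never falling
    back to a branch that lacks the path."""
    candidates = ep_candidates(branches, parent_relpath)
    if not candidates:
        # assumption for this sketch: report the path as missing
        raise OSError(errno.ENOENT, "path exists on no branch")
    usable = [b for b in candidates if b["free"] >= minfreespace]
    if not usable:
        raise OSError(errno.ENOSPC, "branches with the path are full")
    # "least free space" among the branches that qualify
    return min(usable, key=lambda b: b["free"])

pool = [
    {"name": "disk1", "free": 10, "dirs": {"media/tv", "media/movies"}},
    {"name": "disk2", "free": 500, "dirs": {"media/tv"}},
]

# media/tv exists on both drives, so the full disk1 is skipped.
print(create_with_path_preservation(pool, "media/tv", 50)["name"])

# media/movies exists only on the full disk1: out of space, no fallback.
try:
    create_with_path_preservation(pool, "media/movies", 50)
except OSError as e:
    print("error:", errno.errorcode[e.errno])
```

    The second call is the situation described above: the only drive with the path is full, so an out of space error is the expected result.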

    Path preservation is a niche feature for people who want to *manually* manage their drives but have them appear as one pool.