Posts by kokodin

    Hello. I am here to ask for more help; typical of me.

    I am currently trying to migrate my low-budget school file server, with Docker applications running on it, from a RAID 0 array to RAID 5.

    The problem is that the only available SATA port I have left to connect another drive is an eSATA port.

    I would rather not mix eSATA with internal SATA ports in an array, so I thought I would move the system drive to that port instead. But since this motherboard was made by Intel with some crazy ideas, the eSATA port always works in AHCI mode, while I installed my server in IDE mode long ago.

    Switching the controller's mode of operation ends up with a non-bootable system right after GRUB, so I presume the same will be true for the external SATA port.

    Is there a way to fix this for a non-Linux user like me? I would rather not downsize the array; it is small as it is.

    I know how to do it for Windows, but that won't help much here.
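
    From what I could find, on a Debian-based system like openmediavault the usual recipe seems to be baking the AHCI driver into the initramfs before flipping the BIOS switch; something like this, though I have not tested it myself:

    Code
    echo ahci >> /etc/initramfs-tools/modules   # make sure the AHCI driver gets included
    update-initramfs -u                         # rebuild the initramfs for the running kernel
    # then reboot and switch the controller from IDE to AHCI in the BIOS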

    Sorry, not 45G but G45; I always get those mixed up. Also, I think the Pentium 4 era was 845G to 945G; there was no 45G on its own.

    Either way, it is the fourth and last generation of LGA 775 chipsets made by Intel, and the last DDR2 board they made for the desktop.

    It was a pain to find a BIOS version openmediavault would not complain about, but that is an entirely different story.


    Anyway, I think the topic has come to its own natural conclusion and should be wrapped up. The RAID 0 was restored, and soon it will be gone due to upgrades. Thanks for the help, and I hope we all have a stress-free experience from now on.

    It is an Intel-made desktop motherboard based on the G45 chipset, so technically I could put in a PCIe controller; two x1 slots and one x16 slot are open, plus one PCI. I might have a PCI SATA 150 controller somewhere, or could bodge IDE>SATA over an adapter. But financially my hands are tied for non-emergency stuff until April. I did find the eSATA>SATA cable on local eBay for $2, so I just bought it out of my own pocket.

    The board itself has 5 SATA 300 ports with some software/hardware RAID hybrid in the BIOS, but only for RAID 0 and 1, and one eSATA port on the back I/O shield, which is on a separate SATA channel, I hope. I also upgraded the southbridge heatsink 3 years ago, since I didn't trust the sensor reading of how hot it runs, but now I have to remove it whenever I'm swapping drives :] it is just 3 mm too tall.


    I did find one more matching WDC drive, but with bad sectors, so it is slated to be scrapped along with some other drives I keep for no good reason.

    I also dug out three 1 TB drives, each different and each overworked. If one tests OK, I might dump my daily-driver Samsung just to have a fifth drive, or I would have to dig deeper until I find out where the missing Samsung and one WDC went :] because those would surely be lower-hour drives. The 1 TB 3.5-inch drives are mostly from 2014, with 5 years online in security cameras, so they are slow and reliable, SMART test OK, but also old and even more worn out than my RAID drives, with the exception of the broken one.


    I did clean up the server case though: some cable management, some new thermal paste, relocated the system boot drive away from the array drives, and made room for 3 more drives on the sled. Ironically, the new living space is away from the air duct, so it might be warmer; the Samsungs idle at 26 °C, the WDCs at 27 °C, and the relocated boot drive shows 29 °C, so it should be in the low 40s during summer.

    Well, I do use some of those drives as a mobile disk image repository for Clonezilla, but that is 2 out of 15; the rest sit dormant in a humidity-controlled room (where the 3D printer filament lives), all pulled from fairly new Dell laptops when we discovered you can't really use them with a slow HDD and Windows 10. The laptops were bought in late 2019 and the switch to SSDs happened around summer 2021 (the laptops were barely used because they were infuriatingly slow), so I wouldn't even check them, to be honest, other than confirming run time and SMART health with CrystalDiskInfo. They have on average 9 days (200-250 hours) of use, and most of it is from Dell's factory side.


    Although using CrystalDiskInfo is scary in itself, since my work computer has another one of those Samsung drives with much more uptime than the server ones, to the point that I thought Crystal was confusing hours with minutes (around 30,500 hours with 5,000 power cycles), but then I think about how I use my computer and it checks out. The failed Samsung, though, had twice that, while the rest of the RAID drives hover between 17,000 and 26,000 hours, so that one must have been salvaged from another desktop computer used as a domain controller around 2010-2013. It also had a different firmware revision than all the other Samsung drives I have of this model, so I suspect I was its first owner, and it never left my "server room" (more of a storage space for various electrical things). It is funny what you find out when you investigate failed computer components.


    Currently the RAID runs on 2 Samsung drives with ~26,000 and 19,000 hours and 2 identical WDCs with 17,000 and 900 hours on the clock. They all run at SATA 3 speed, have the same capacity and cache size, and are of the same firmware revision in pairs.

    All the data has been pulled over to external laptop HDDs overnight.

    I think if I can hack the eSATA port of the motherboard, I could run the system drive from it and possibly rebuild the RAID as RAID 5, if I find another 500 GB drive in my used-computers pile. That isn't unlikely; I have a surprising amount of junk from retired computers and related electronics, mostly 80-250 GB, though I have some oddball 1 TB drives too. I am convinced there were 4 identical Samsungs plus the one that failed, and there should be 4 of those WDC drives too; I just need to remember where I put them, because I currently don't know where 3 of them are (I have a strong suspicion, though). Living with RAID 5 would be so much less stressful until I can get something new. For obvious reasons RAID 1 or 6 is currently out of the question; I would need a motherboard swap.
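
    If I read the mdadm docs right, the RAID 0 to RAID 5 conversion can even be done in place, something along these lines, with /dev/sdf as a stand-in name for whatever the fifth drive ends up being (untested by me, and a backup first is mandatory):

    Code
    mdadm /dev/md0 --grow --level=5   # the 4-disk raid0 becomes a degraded 5-disk raid5
    mdadm /dev/md0 --add /dev/sdf     # add the fifth 500 GB drive (placeholder name)
    cat /proc/mdstat                  # watch the rebuild progress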

    Well, the ddrescue has been attempted. And live hard, die fast, or something; no image, just a plain disk-to-disk clone.

    The drives are identical in size, down to a single byte.

    Initially ddrescue designated around 8 MB as problematic, of which it recovered 6 MB; slightly less than 2 MB was in bad sectors, therefore I think gone. How much of it was in free space and how much in the data-holding area I can't tell. But most wear was between 70 and 85% of the drive, where the read error count jumped from 30 to 130, and the first pass finished with only one read error more. (After the advanced data recovery passes, ddrescue reported around 5,000 read errors through the whole process.)
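
    For reference, the disk-to-disk clone was roughly this, with /dev/sdf standing in for the healthy WDC target; the mapfile is what lets ddrescue come back and retry only the bad spots:

    Code
    ddrescue -f -n /dev/sdc /dev/sdf rescue.map     # first pass, skip scraping the bad areas
    ddrescue -f -d -r3 /dev/sdc /dev/sdf rescue.map # retry bad sectors with direct reads, 3 tries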


    That being said, we don't care how many errors it had, but whether it worked, and yes, it did.

    Initially the array self-assembled with the copy but the volume remained missing; this time, though, running fsck on it and letting it do its job brought the filesystem back, mounted, and the files seem to be on it.
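
    In case it helps anyone else, the sequence that worked was more or less this (the mount point here is only an example, not my real openmediavault path):

    Code
    mdadm --assemble --scan       # the array picked up the cloned drive by itself
    fsck.ext4 -y /dev/md0         # let it repair the filesystem unattended
    mount /dev/md0 /mnt/magazyn   # example mount point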

    How many of them are corrupted I will only know when I find one, but there is some redundancy, and I don't really need everything.

    There is one conspicuously empty folder, but it might have been empty for a while; there are no random extra files and no silly names.

    Either fsck unceremoniously removed all the corrupted files, or I was lucky.


    Since schools in my region have a very tight budget for non-essential equipment, I will probably have no budget for new drives until December, but I will make regular backups from now on.

    The only other option to replace the drives I have now is to swap them with the barely used laptop Toshiba MQ01ABD100s, of which I have 15 laying around, but that is a bad idea in itself, even in a RAID 1+0 configuration (the server only has 5 SATA ports and one is for the system boot drive, which is even older, so the size would still be only 1.8 TB). I don't think running laptop drives 24/7 would work long term, and they might not even live until December, but maybe I am biased. Nothing is stopping me from using them as USB backup drives, though.
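
    The backup plan with them would be nothing fancy; my guess at the rsync invocation, with both paths as placeholders:

    Code
    rsync -a --delete /srv/magazyn/ /media/usb-backup/   # mirror the share onto a USB drive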


    For now I think the RAID is saved, and there is not much more to be done to improve it. I will pull the data off next through the network, since there is no other way. And if you think a RAID 1+0 of 2.5-inch 1 TB slightly used laptop drives is better than running a RAID 0 of ancient desktop drives, I might do that.


    As for scheduled SMART checks, I had no idea that was a thing until recently, so of course none were ever scheduled or performed. The server wasn't even planned to exist. It was hastily assembled 3 years ago from a junked Vista computer and got 2 extra gigabit NICs; I think at one point it even had 5, but the network is bottlenecked at 1 Gbps by the government program "new network equipment", and we ended up feeding 3 separate networks at 1 Gbps each instead of bundling the bandwidth into one, so the hard drives never even utilized the full RAID 0 speed over the network. And the server just kind of remained in operation, since one of the teachers asked me to keep it. It probably gobbles power like a champ for a job a Raspberry Pi would do faster, too.
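
    For anyone as late to the party as me: a long self-test can be started by hand with smartctl, and smartd can run them on a schedule; something like this, if I read the manual right:

    Code
    smartctl -t long /dev/sdc   # start one long self-test right now
    # line for /etc/smartd.conf: long self-test every Sunday at 2am, mail root on trouble
    /dev/sdc -a -s L/../../7/02 -m root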


    Thanks Krisbee, you really saved my bacon this time. If you think my dodgy RAID 1+0 of laptop drives is a better option, let me know, because otherwise they will just rot in the box.

    Code
    mdadm -E /dev/sdc


    I will try that rescue CD, but I don't have an identical drive. I do, however, have the matching Western Digital drive with 900 hours online and working fine, so it may be a good option. Since they are all striped equally, the valid data block should be the same size on each drive as on the smallest one in the array. But they are both exactly the same size, judging by the SMART report.
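
    The byte-exact size claim is from the SMART report, but it should be easy to confirm from Linux directly; something like this, with /dev/sdf as a placeholder name for the spare WDC:

    Code
    blockdev --getsize64 /dev/sdc   # failing drive, exact size in bytes
    blockdev --getsize64 /dev/sdf   # clone target must be at least as large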


    Here is some information I can gather from the SMART page in the GUI (part of it, because the message goes over the character limit of a single post).

    I am pretty sure the data is bye-bye then, because fsck thinks it is ext4, as it should, but then spits out an input/output error while trying to open /dev/md0. I think if there is no way to safely probe the drive outside the RAID, or clone it raw to another drive, it will simply refuse access due to being difficult :], some Samsung self-defense, or simply the drive being broken. It may just be that SMART makes it slow (some kind of recovery mode) and unable to sync up with the remaining 3, while it still works as a basic storage device, so the RAID sees it as working.
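
    If a purely read-only poke at the drive is safe, I would guess at something like this to tell whether it is the drive slowing down or the array giving up (just my speculation):

    Code
    smartctl -a /dev/sdc | grep -iE "reallocat|pending|uncorrect"   # the scary counters
    dd if=/dev/sdc of=/dev/null bs=1M status=progress               # read-only surface scan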


    I totally agree it is stupid to make a RAID volume like this, and I knew the risks while making it. But it was made to test structural network throughput across the school and just kind of remained active due to Docker serving some other functions on the same server. It mostly works as a Clonezilla short-term backup server for images of other computers and a file server for providing installation images. So I won't lie, I am a bit salty about it not working, but it might be time to build recovery images from scratch after 5 years of junk accumulation. The only thing that might have been valuable is the stuff that some teacher might have put there for safekeeping, ironically.


    Thanks for the help anyway.

    I will remain open to any ideas till Thursday if there is anything else we can do with it, like a special blend of Clonezilla or something to a single clone drive if possible, but I think on Friday the case will be closed and I will have to put something else in place.

    Hello, I'm just a stupid admin / electrician, so what I say might be incoherent to an actual tech person. Please forgive my lack of technical language.


    After a power loss, my homebrew file server at work has a missing filesystem. I am pretty sure it might have something to do with the SMART status of one drive being "few bad sectors", but the RAID does assemble and work with "clean" status and no drives missing; only the filesystem is missing.


    Required command outputs:

    Code
    root@sejf:/# cat /proc/mdstat
    Personalities : [raid0] [linear] [multipath] [raid1] [raid6] [raid5] [raid4] [raid10]
    md0 : active raid0 sdc[0] sdb[3] sdd[2] sde[1]
          1953017856 blocks super 1.2 512k chunks
    
    unused devices: <none>


    Code
    root@sejf:/# blkid
    /dev/sde: UUID="fa9d9871-e08e-0991-28b9-1686674acb8f" UUID_SUB="29e259c0-8d5c-d4fe-ec7d-2377526a6590" LABEL="sejf.local:magazyn" TYPE="linux_raid_member"
    /dev/sdb: UUID="fa9d9871-e08e-0991-28b9-1686674acb8f" UUID_SUB="4bdc5a68-0a08-e92b-acbd-54f80eac2716" LABEL="sejf.local:magazyn" TYPE="linux_raid_member"
    /dev/sda1: UUID="390b1de1-90ce-4894-95a4-90e34d2aec8c" TYPE="ext4" PARTUUID="6e05c9d1-01"
    /dev/sda5: UUID="4d880235-05ac-4272-b613-34b130d10b07" TYPE="swap" PARTUUID="6e05c9d1-05"
    /dev/sdc: UUID="fa9d9871-e08e-0991-28b9-1686674acb8f" UUID_SUB="9ff6f7d0-f639-6510-3928-587cd931b21c" LABEL="sejf.local:magazyn" TYPE="linux_raid_member"
    /dev/sdd: UUID="fa9d9871-e08e-0991-28b9-1686674acb8f" UUID_SUB="3b7daa3f-36ed-50f8-b00b-feae1dff8b4c" LABEL="sejf.local:magazyn" TYPE="linux_raid_member"




    Code
    root@sejf:/# mdadm --detail --scan --verbose
    ARRAY /dev/md0 level=raid0 num-devices=4 metadata=1.2 name=sejf.local:magazyn UUID=fa9d9871:e08e0991:28b91686:674acb8f
       devices=/dev/sdb,/dev/sdc,/dev/sdd,/dev/sde


    It is a simple 4-drive striped volume I wanted to pull the data from and rebuild with a new set of drives, but if I can't mount the filesystem, the data is as good as lost. It is not critical storage, but some data may need to be online by next week for a local computer backup, and I have no idea what users put there.

    It is used as a file server for a small school, to share files between computer classrooms, so students or teachers might have put something up there without me knowing, and then there is no backup.


    One thing that might be important to add: there are no error messages in the terminal window, but on the local monitor there is one repeated message:

    Code
    blk_update_request: I/O error, dev sdc, sector 264192 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
    blk_update_request: I/O error, dev sdc, sector 264192 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
    Buffer I/O error on dev md0, logical block 0, async page read

    I assume that is bad, since this RAID has no redundancy, but I am not a Linux-educated person, so there might be a simple fix like scandisk, pray, copy, and run away.
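
    If Linux has an equivalent of that scandisk step, I assume it would look something like this, run read-only first so it cannot make anything worse (purely my guess):

    Code
    fsck.ext4 -n /dev/md0   # -n: check only, change nothing on disk
    dumpe2fs -h /dev/md0    # try to read the ext4 superblock header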