That shows the array as unmounted. If you select it and then click Mount on the menu, does it mount?
Raid 5, 3 HDD's, clean degraded
-
- solved
- Darcu
-
-
-
but I didn't reboot. Should I?
TBH I'm at a loss as to what is going on, so a reboot might resolve it. There are also options that can be run to correct fstab and the entry in mdadm.conf.
-
After reboot:
The good news: the RAID is shown again (it was mounted automatically).
The bad news: when I try to recover /dev/sdd via the web GUI, I get the same error as in post #1 --> Failed to write metadata
-
The bad news: when I try to recover /dev/sdd via the web GUI, I get the same error as in post #1 --> Failed to write metadata
What's the output of cat /proc/mdstat?
-
Code
root@MEDIASERVER:~# cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md127 : active raid5 sda[0] sdb[1]
      27344500736 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      bitmap: 28/102 pages [112KB], 65536KB chunk

unused devices: <none>
-
That's active, which should allow you to add the other drive. But does it show in blkid?
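For anyone reading along: in the mdstat output, [3/2] means the array expects 3 members and has 2 active, and [UU_] marks the third slot as missing. Below is a minimal Python sketch of reading that out of the text, assuming the layout shown in this thread. The function name degraded_arrays and the MDSTAT sample variable are made up for illustration; on a live system you would read /proc/mdstat instead.

```python
import re

# Sample copied from the output posted in this thread.
MDSTAT = """\
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0] [raid1] [raid10]
md127 : active raid5 sda[0] sdb[1]
      27344500736 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      bitmap: 28/102 pages [112KB], 65536KB chunk

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return {array: (expected, active, slots)} for arrays missing members."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        # An array section starts with a line like "md127 : active raid5 ..."
        header = re.match(r"^(md\S+)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The status line carries "[expected/active] [UU_]"
        counts = re.search(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]", line)
        if current and counts:
            expected, active = int(counts.group(1)), int(counts.group(2))
            if active < expected:
                result[current] = (expected, active, counts.group(3))
    return result
```

Called on the sample, this reports md127 with 3 expected and 2 active members, which matches the "clean, degraded" state in the thread title.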
-
no
Code
root@MEDIASERVER:~# blkid
/dev/sdc1: UUID="4A9E-C1A4" TYPE="vfat" PARTUUID="edd760ec-6ae2-49a6-9057-4bac06cfb8fc"
/dev/sdc2: UUID="7951e886-abd2-4a07-b708-e3bab6e33a8f" TYPE="ext4" PARTUUID="0984dc8b-69b0-4b80-aace-860f4418e579"
/dev/sdc3: UUID="b988c23d-7d64-42d5-9eee-7b63be103b2e" TYPE="swap" PARTUUID="b224a268-4fb6-4511-b89c-556bdf4c8d29"
/dev/sdb: UUID="d1b4a55e-a6b8-12ae-41fe-b7af1e41a9ca" UUID_SUB="1dd37406-fd44-9c3e-b179-316da379a6ca" LABEL="MediaServer:RAID" TYPE="linux_raid_member"
/dev/md127: LABEL="RAID5" UUID="6eb513cb-8151-4ec9-8f71-b3e518b31255" TYPE="ext4"
/dev/sda: UUID="d1b4a55e-a6b8-12ae-41fe-b7af1e41a9ca" UUID_SUB="fc59dfda-4bd3-4f3d-4b60-9fe2cf1572dc" LABEL="MediaServer:RAID" TYPE="linux_raid_member"
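The blkid output lists only two devices of type linux_raid_member (sda and sdb), and /dev/sdd does not appear at all. A small Python sketch of that check follows, assuming blkid's usual KEY="value" line format. The names raid_members and BLKID_OUTPUT are hypothetical, and the sample is an excerpt of the lines posted here.

```python
import re

# Excerpt of the blkid output posted in this thread.
BLKID_OUTPUT = '''\
/dev/sdc2: UUID="7951e886-abd2-4a07-b708-e3bab6e33a8f" TYPE="ext4" PARTUUID="0984dc8b-69b0-4b80-aace-860f4418e579"
/dev/sdb: UUID="d1b4a55e-a6b8-12ae-41fe-b7af1e41a9ca" UUID_SUB="1dd37406-fd44-9c3e-b179-316da379a6ca" LABEL="MediaServer:RAID" TYPE="linux_raid_member"
/dev/md127: LABEL="RAID5" UUID="6eb513cb-8151-4ec9-8f71-b3e518b31255" TYPE="ext4"
/dev/sda: UUID="d1b4a55e-a6b8-12ae-41fe-b7af1e41a9ca" UUID_SUB="fc59dfda-4bd3-4f3d-4b60-9fe2cf1572dc" LABEL="MediaServer:RAID" TYPE="linux_raid_member"
'''


def raid_members(blkid_text):
    """Return the sorted device paths whose TYPE is linux_raid_member."""
    members = []
    for line in blkid_text.splitlines():
        # Split on the first colon only; LABEL values may contain colons.
        device, sep, fields = line.partition(":")
        if not sep:
            continue
        tags = dict(re.findall(r'(\w+)="([^"]*)"', fields))
        if tags.get("TYPE") == "linux_raid_member":
            members.append(device.strip())
    return sorted(members)
```

With the sample above the result is /dev/sda and /dev/sdb, one short of the three members the array expects.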
Should I try to change the SATA cable? Or switch SATA ports on my motherboard? (I have only 4 SATA ports for my four drives.)
-
Should I try to change the SATA cable? Or switch SATA ports on my motherboard?
It's either the cable or the port. I take it the drive doesn't show in the GUI under Storage -> Disks?
-
-
I changed all three SATA cables for the HDDs, but it made no difference. I still get the same error when I try to recover /dev/sdd in the web GUI.
Maybe I should try adding the disk over SSH. Is it:
?
The next step would be to change SATA ports on the mainboard, but now I am struggling:
For example
sda --> SATA port 1 (currently works --> OS)
sdb --> SATA port 2 (currently works)
sdc --> SATA port 3 (currently works)
sdd --> SATA port 4 (does not work)
E.g. I would move sdb to SATA port 4 and sdd to SATA port 2:
Maybe SATA port 4 is broken, so OMV no longer recognizes sdb but does recognize sdd on SATA port 2. Then sdc would be the only disk left in the RAID, and OMV would try to connect sdd to the RAID automatically, although the data on the RAID has changed over the last weeks using only sdb & sdc.
I hope you know what I mean.
Is it dangerous? My RAID is not a backup. If I lose the RAID, I will lose a lot of data.
Another option: buy a new motherboard?
Regards
-
Is it dangerous? My RAID is not a backup. If I lose the RAID, I will lose a lot of data.
Yes you will
OK, the issue could be related to one of the SATA ports, so you need to test them. But why are you booting from a hard drive when a USB flash drive is good enough? That is what I use.
Disconnect the data drives (your 3 RAID drives); at least then the data should be OK. Use the OMV OS drive to test each port. It will take a little time because OMV will try to locate the RAID, but it should still boot. If it is SATA port 4, then you have 3 options: a) a new motherboard, b) run OMV from a USB flash drive (that would require 2 or 3, one working and one as a backup; I use 3 in rotation), c) a PCIe SATA card, and run the RAID drives off that.
Option b is the cheapest, but it means your board would only have 3 workable SATA ports. Option c would not be too expensive, but you need to be careful what you buy; we've had users run into trouble with cheap Chinese cards.
-
Strange.
I tested all ports using only the OMV OS drive. Every port works. I reached the web GUI and ran apt-get update & apt-get upgrade via SSH. It works fine with every port.
Before I tested the ports, I ran a S.M.A.R.T. test on sdd:
Code
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-5.10.0-0.bpo.9-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     WDC WD140EFFX-68VBXN0
Serial Number:    9RHZARBC
LU WWN Device Id: 5 000cca 264dbe2e2
Firmware Version: 81.00A81
User Capacity:    14,000,519,643,136 bytes [14.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Thu Dec 30 11:23:39 2021 CET
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is:   Unavailable
APM level is:     127 (intermediate level with standby)
Rd look-ahead is: Enabled
Write cache is:   Disabled
DSN feature is:   Unavailable
ATA Security is:  Disabled, NOT FROZEN [SEC1]
Write SCT (Get) Feature Control Command failed: scsi error medium or hardware error (serious)
Wt Cache Reorder: Unknown (SCT Feature Control command failed)

Read SMART Thresholds failed: scsi error medium or hardware error (serious)

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x80) Offline data collection activity was never started.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed without error
                                        or no self-test has ever been run.
Total time to complete Offline data collection: (  101) seconds.
Offline data collection capabilities: (0x5b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine recommended polling time:    (   2) minutes.
Extended self-test routine recommended polling time: (1436) minutes.
SCT capabilities:              (0x003d) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAGS    VALUE WORST THRESH FAIL RAW_VALUE
  1 Raw_Read_Error_Rate     PO-R--   100   100   ---    -    0
  2 Throughput_Performance  --S---   135   135   ---    -    108
  3 Spin_Up_Time            POS---   092   092   ---    -    107 (Average 220)
  4 Start_Stop_Count        -O--C-   099   099   ---    -    403
  5 Reallocated_Sector_Ct   PO--CK   100   100   ---    -    0
  7 Seek_Error_Rate         -O-R--   100   100   ---    -    0
  8 Seek_Time_Performance   --S---   133   133   ---    -    18
  9 Power_On_Hours          -O--C-   100   100   ---    -    5070
 10 Spin_Retry_Count        -O--C-   100   100   ---    -    0
 12 Power_Cycle_Count       -O--CK   095   095   ---    -    360
 22 Unknown_Attribute       PO---K   100   100   ---    -    100
192 Power-Off_Retract_Count -O--CK   097   097   ---    -    21770
193 Load_Cycle_Count        -O--C-   097   097   ---    -    21770
194 Temperature_Celsius     -O----   044   044   ---    -    37 (Min/Max 18/46)
196 Reallocated_Event_Count -O--CK   100   100   ---    -    0
197 Current_Pending_Sector  -O---K   100   100   ---    -    0
198 Offline_Uncorrectable   ---R--   100   100   ---    -    0
199 UDMA_CRC_Error_Count    -O-R--   100   100   ---    -    0
                            ||||||_ K auto-keep
                            |||||__ C event count
                            ||||___ R error rate
                            |||____ S speed/performance
                            ||_____ O updated online
                            |______ P prefailure warning

Read SMART Log Directory failed: scsi error medium or hardware error (serious)

General Purpose Log Directory Version 1
Address    Access  R/W   Size  Description
0x00       GPL     R/O      1  Log Directory
0x03       GPL     R/O      1  Ext. Comprehensive SMART error log
0x04       GPL     R/O    256  Device Statistics log
0x07       GPL     R/O      1  Extended self-test log
0x08       GPL     R/O      2  Power Conditions log
0x0c       GPL     R/O   5501  Pending Defects log
0x10       GPL     R/O      1  NCQ Command Error log
0x11       GPL     R/O      1  SATA Phy Event Counters log
0x12       GPL     R/O      1  SATA NCQ Non-Data log
0x13       GPL     R/O      1  SATA NCQ Send and Receive log
0x15       GPL     R/W      1  Rebuild Assist log
0x21       GPL     R/O      1  Write stream error log
0x22       GPL     R/O      1  Read stream error log
0x24       GPL     R/O    256  Current Device Internal Status Data log
0x25       GPL     R/O    256  Saved Device Internal Status Data log
0x2f       GPL     -        1  Set Sector Configuration
0x30       GPL     R/O      9  IDENTIFY DEVICE data log
0x80-0x9f  GPL     R/W     16  Host vendor specific log
0xe0       GPL     R/W      1  SCT Command/Status
0xe1       GPL     R/W      1  SCT Data Transfer

ATA_READ_LOG_EXT (addr=0x03:0x00, page=0, n=1) failed: scsi error medium or hardware error (serious)
Read SMART Extended Comprehensive Error Log failed

Read SMART Error Log failed: scsi error medium or hardware error (serious)

ATA_READ_LOG_EXT (addr=0x07:0x00, page=0, n=1) failed: scsi error medium or hardware error (serious)
Read SMART Extended Self-test Log failed

Read SMART Self-test Log failed: scsi error medium or hardware error (serious)

Read SMART Selective Self-test Log failed: scsi error medium or hardware error (serious)

SCT Status Version:                  3
SCT Version (vendor specific):       256 (0x0100)
SCT Support Level:                   0
Device State:                        Active (0)
Current Temperature:                 37 Celsius
Power Cycle Min/Max Temperature:     27/37 Celsius
Lifetime    Min/Max Temperature:     18/46 Celsius
Under/Over Temperature Limit Count:  0/0
SMART Status:                        0xc24f (PASSED)
Minimum supported ERC Time Limit:    70 (7.0 seconds)

Write SCT Data Table failed: scsi error medium or hardware error (serious)
Read SCT Temperature History failed

Write SCT (Get) Error Recovery Control Command failed: scsi error medium or hardware error (serious)
SCT (Get) Error Recovery Control command failed

Device Statistics (GP/SMART Log 0x04) not supported

Pending Defects log (GP Log 0x0c) supported [please try: '-l defects']

SATA Phy Event Counters (GP Log 0x11)
ID      Size     Value  Description
0x0001  2            0  Command failed due to ICRC error
0x0002  2            0  R_ERR response for data FIS
0x0003  2            0  R_ERR response for device-to-host data FIS
0x0004  2            0  R_ERR response for host-to-device data FIS
0x0005  2            0  R_ERR response for non-data FIS
0x0006  2            0  R_ERR response for device-to-host non-data FIS
0x0007  2            0  R_ERR response for host-to-device non-data FIS
0x0008  2            0  Device-to-host non-data FIS retries
0x0009  2            5  Transition from drive PhyRdy to drive PhyNRdy
0x000a  2            6  Device-to-host register FISes sent due to a COMRESET
0x000b  2            0  CRC errors within host-to-device FIS
0x000d  2            0  Non-CRC errors within host-to-device FIS

=== START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
Sending command: "Execute SMART Short self-test routine immediately in off-line mode".
Command "Execute SMART Short self-test routine immediately in off-line mode" failed: scsi error medium or hardware error (serious)
If the ports are okay, it has to be the disk?
-
I enclose the system log from rebooting the system.
Edit:
I could remove all four disks and put them in my desktop PC. It would be a lot of work, but maybe it helps to move forward.
-
If the ports are okay, it has to be the disk? [...] Works fine with every port
The drive appears to be OK: attributes 5, 10, 196, 197 and 198 are not returning any error counts.
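The attribute check described here can be sketched in a few lines of Python. This is only an illustration, assuming the smartctl attribute table layout posted above (ID in the first column, raw value in the eighth). The names WATCHED_IDS, SMART_ROWS and watched_raw_values are made up. It looks only at these five counters and says nothing about the "scsi error" lines elsewhere in the same output.

```python
# Attribute IDs named in the reply: reallocated sectors, spin retries,
# reallocation events, pending sectors, offline uncorrectable.
WATCHED_IDS = (5, 10, 196, 197, 198)

# Rows copied from the smartctl output posted in this thread.
SMART_ROWS = """\
  5 Reallocated_Sector_Ct   PO--CK   100   100   ---    -    0
 10 Spin_Retry_Count        -O--C-   100   100   ---    -    0
194 Temperature_Celsius     -O----   044   044   ---    -    37 (Min/Max 18/46)
196 Reallocated_Event_Count -O--CK   100   100   ---    -    0
197 Current_Pending_Sector  -O---K   100   100   ---    -    0
198 Offline_Uncorrectable   ---R--   100   100   ---    -    0
"""


def watched_raw_values(attribute_rows, watched=WATCHED_IDS):
    """Return {attribute_id: raw_value} for the watched attribute IDs."""
    values = {}
    for line in attribute_rows.splitlines():
        fields = line.split()
        # Data rows start with a numeric ID and have at least 8 columns.
        if len(fields) < 8 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        if attr_id in watched:
            values[attr_id] = int(fields[7])
    return values


def all_zero(values):
    """True when every watched raw counter is zero."""
    return all(raw == 0 for raw in values.values())
```

For the rows posted in this thread, all five watched counters come back as 0.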
So if the ports are OK and the drive /dev/sdd is seen in Storage -> Disks but not in blkid, my only suggestion would be to wipe that drive and then add it to the array. BUT be VERY VERY CAREFUL: ensure you are selecting the right drive to wipe. If you wipe one of the others, your data is toast!!!
Look at your post 31 and the screenshots of Storage -> Disks; these will show the serial number for each drive. If you have to, write down the serial number and the port it's connected to. Check and double check, satisfy yourself that what you have written is correct, then check again when it's booted up.
I can't help you recover a toasted array.
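As a cross-check for the serial numbers, Linux normally exposes symlinks under /dev/disk/by-id whose names end in the drive serial (for SATA disks typically ata-MODEL_SERIAL). The Python sketch below maps serials to device names on that assumption. The naming convention can differ between systems, and the function names are hypothetical, so treat the result as a second opinion next to what the GUI shows, not as proof.

```python
import os
import re


def serials_from_links(links):
    """Map drive serial -> kernel device name.

    links is an iterable of (link_name, link_target) pairs as found in
    /dev/disk/by-id, e.g. ("ata-MODEL_SERIAL", "../../sdd").
    Assumes the serial is the part after the last underscore.
    """
    mapping = {}
    for name, target in links:
        # Only whole-disk ATA entries; skip partitions and wwn-* aliases.
        if not name.startswith("ata-") or re.search(r"-part\d+$", name):
            continue
        serial = name.rsplit("_", 1)[-1]
        mapping[serial] = os.path.basename(target)
    return mapping


def read_by_id(by_id_dir="/dev/disk/by-id"):
    """Read (name, target) pairs from the by-id directory of a live system."""
    return [
        (name, os.readlink(os.path.join(by_id_dir, name)))
        for name in sorted(os.listdir(by_id_dir))
    ]
```

On the NAS itself, serials_from_links(read_by_id()) would give a serial-to-device table to compare against the serial numbers written down from Storage -> Disks before anything is wiped.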
-
Hey,
I am not sure if I understand you correctly:
I have written down the serial numbers from the web GUI. Do I look for the serial number of my faulty disk sdd physically and disconnect this drive from the SATA port and power?
What do I do next? I am not sure what you mean by wiping the drive. Formatting? Format it with my desktop PC, then put it back in the NAS and add it to the array?
-
No, you wipe the drive in OMV. If you're satisfied you have the correct one, go to Storage -> Disks, select the drive and click Wipe on the menu. You can select Short. Do nothing until the wipe has finished, then add the drive to the array as you attempted before, using Recover.
-
-
I am not sure why the serial number is important for your suggested step. Sorry, I am a bit nervous.
I assumed you had the drives disconnected.
But going back to your post 32, you could try that command from the CLI (SSH) and post any errors. If there are no errors, then run cat /proc/mdstat; it just might show the RAID rebuilding.
-