Sorry for the cross-post; I realized I posted my initial info in the wrong forum.
I will admit this is my first full-size server and I may be in a bit over my head. I am using a Dell C2100, which has 12 bays and all kinds of hardware inside (backplane, cabling, etc.) to make the 12 drives work. I bought it used, so my fear is that I have some bad hardware in here (the drive itself seems to pass all tests).
Three times now, while performing a copy operation to this drive, the system has frozen and the drive has errored out and dropped offline. The log looks mostly Greek to me, but I suspect it may be pointing at a failing piece of hardware (backplane, etc.) and could tell me which component is the culprit. I just can't make heads or tails of it.
I have the full log at the exact spot it happens, from the moment things go "bad", and am hoping someone might see something in the messages here that spells out what I should be looking for.
This admitted server newb would sincerely appreciate any insight at all! If this points to a hardware failure and there are specific components I should be checking, knowing that would give me a great head start. This used server has a 30-day warranty, so if there is a faulty component somewhere in the drive array I may be able to get it replaced.
LOG in 3 PARTS, Part 1:
[42931.787502] sd 0:0:2:0: attempting task abort! scmd(ffff8800597421c0)
[42931.787508] sd 0:0:2:0: [sdd] CDB: Read(10): 28 00 c3 c0 08 00 00 00 08 00
[42931.787520] scsi target0:0:2: handle(0x000c), sas_address(0x500065b36789abe2), phy(2)
[42931.787524] scsi target0:0:2: enclosure_logical_id(0x500065b36789abff), slot(2)
[42935.560074] sd 0:0:2:0: task abort: SUCCESS scmd(ffff8800597421c0)
[42935.560080] sd 0:0:2:0: attempting task abort! scmd(ffff8805fef3b580)
[42935.560084] sd 0:0:2:0: CDB: Test Unit Ready: 00 00 00 00 00 00
[42935.560114] sd 0:0:2:0: task abort: SUCCESS scmd(ffff8805fef3b580)
[42935.560127] scsi target0:0:2: attempting device reset! scmd(ffff8800597421c0)
[42935.560130] sd 0:0:2:0: [sdd] CDB: Read(10): 28 00 c3 c0 08 00 00 00 08 00
[42935.560228] sd 0:0:2:0: device reset: SUCCESS scmd(ffff8800597421c0)
[42935.560239] scsi target0:0:2: attempting target reset! scmd(ffff8800597421c0)
[42935.560243] sd 0:0:2:0: [sdd] CDB: Read(10): 28 00 c3 c0 08 00 00 00 08 00
[42935.560305] scsi target0:0:2: target reset: SUCCESS scmd(ffff8800597421c0)
[42935.560312] mpt2sas0: attempting host reset! scmd(ffff8800597421c0)
[42935.560315] sd 0:0:2:0: [sdd] CDB: Read(10): 28 00 c3 c0 08 00 00 00 08 00
[42935.560338] mpt2sas0: sending diag reset !!
[42936.509239] mpt2sas0: diag reset: SUCCESS
[42936.741630] mpt2sas0: LSISAS2008: FWVersion(20.00.07.00), ChipRevision(0x03), BiosVersion(00.00.00.00)
Dec 1 10:12:01 OMV kernel: [42936.741634] mpt2sas0: Protocol=(Initiator,Target), Capabilities=(TLR,EEDP,Snapshot Buffer,Diag Trace Buffer,Task Set Full,NCQ)
Dec 1 10:12:01 OMV kernel: [42936.741692] mpt2sas0: sending port enable !!
[43057.617433] mpt2sas0: port enable: SUCCESS
[43057.617608] mpt2sas0: search for end-devices: start
[43057.618629] scsi target0:0:0: handle(0x000a), sas_addr(0x500065b36789abe0), enclosure logical id(0x500065b36789abff), slot(0)
[43057.618708] scsi target0:0:1: handle(0x000b), sas_addr(0x500065b36789abe1), enclosure logical id(0x500065b36789abff), slot(1)
[43057.618786] scsi target0:0:5: handle(0x000c), sas_addr(0x500065b36789abe6), enclosure logical id(0x500065b36789abff), slot(6)
[43057.618790] handle changed from(0x000f)!!!
[43057.618866] scsi target0:0:3: handle(0x000d), sas_addr(0x500065b36789abeb), enclosure logical id(0x500065b36789abff), slot(11)
[43057.618945] scsi target0:0:4: handle(0x000e), sas_addr(0x500065b36789abfd), enclosure logical id(0x500065b36789abff), slot(255)
[43057.619020] mpt2sas0: search for end-devices: complete
[43057.619023] mpt2sas0: search for expanders: start
[43057.619110] expander present: handle(0x0009), sas_addr(0x500065b36789abff)
[43057.619183] mpt2sas0: search for expanders: complete
[43057.619190] mpt2sas0: search for end-devices: start
[43057.620186] scsi target0:0:0: handle(0x000a), sas_addr(0x500065b36789abe0), enclosure logical id(0x500065b36789abff), slot(0)
[43057.620265] scsi target0:0:1: handle(0x000b), sas_addr(0x500065b36789abe1), enclosure logical id(0x500065b36789abff), slot(1)
[43057.620342] scsi target0:0:5: handle(0x000c), sas_addr(0x500065b36789abe6), enclosure logical id(0x500065b36789abff), slot(6)
[43057.620419] scsi target0:0:3: handle(0x000d), sas_addr(0x500065b36789abeb), enclosure logical id(0x500065b36789abff), slot(11)
[43057.620496] scsi target0:0:4: handle(0x000e), sas_addr(0x500065b36789abfd), enclosure logical id(0x500065b36789abff), slot(255)
[43057.620571] mpt2sas0: search for end-devices: complete
[43057.620574] mpt2sas0: search for expanders: start
[43057.620658] expander present: handle(0x0009), sas_addr(0x500065b36789abff)
[43057.620731] mpt2sas0: search for expanders: complete
[43057.620738] mpt2sas0: host reset: SUCCESS scmd(ffff8800597421c0)
[43062.781308] INFO: task kworker/u:5:330 blocked for more than 120 seconds.
[43062.781338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[43062.781368] kworker/u:5 D ffff88043fc127c0 0 330 2 0x00000000
[43062.781374] ffff88042c131840 0000000000000046 ffff88082d0b22e0 ffff88042d89b800
[43062.781381] 00000000000127c0 ffff88042d661fd8 ffff88042d661fd8 ffff88042c131840
[43062.781386] ffff88042d661930 ffffffff81152619 0000000000000000 ffff88042d8912e0
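In case it helps anyone skimming the log, here is a minimal Python sketch of how the "attempting ... !" lines above could be pulled out to show how far the error handling escalated (task abort, then device reset, target reset, and finally host reset). The function name, the regular expression, and the SAMPLE text (copied from the lines above) are my own illustration and not part of any tool; it only assumes the line format seen in this log.

```python
import re

# Matches lines such as:
#   [42931.787502] sd 0:0:2:0: attempting task abort! scmd(ffff8800597421c0)
#   [42935.560312] mpt2sas0: attempting host reset! scmd(ffff8800597421c0)
# Assumption: the format is exactly what appears in the log above.
STEP_RE = re.compile(
    r"\[\s*(\d+\.\d+)\]\s+(.+?): attempting "
    r"(task abort|device reset|target reset|host reset)!"
)

def escalation_steps(log_text):
    """Return a list of (timestamp, device, step) tuples in log order."""
    steps = []
    for line in log_text.splitlines():
        match = STEP_RE.search(line)
        if match:
            timestamp, device, step = match.groups()
            steps.append((float(timestamp), device, step))
    return steps

# A few lines copied from the log above.
SAMPLE = """\
[42931.787502] sd 0:0:2:0: attempting task abort! scmd(ffff8800597421c0)
[42935.560074] sd 0:0:2:0: task abort: SUCCESS scmd(ffff8800597421c0)
[42935.560080] sd 0:0:2:0: attempting task abort! scmd(ffff8805fef3b580)
[42935.560127] scsi target0:0:2: attempting device reset! scmd(ffff8800597421c0)
[42935.560239] scsi target0:0:2: attempting target reset! scmd(ffff8800597421c0)
[42935.560312] mpt2sas0: attempting host reset! scmd(ffff8800597421c0)
"""

if __name__ == "__main__":
    for timestamp, device, step in escalation_steps(SAMPLE):
        print(f"{timestamp:.6f}  {device:<18} {step}")
```

Running it over the full log (all three parts) should list every escalation in order, which makes it easier to see whether the trouble always starts at the same SCSI address (0:0:2:0, slot 2, in the part shown here).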