Hard Drive Failure and Data Recovery

  • Sorry about the duplicates. I sometimes get "An error was encountered" and then the next time I reply, it repeats.


    After tonight, I'll be away from my server until next weekend (Dec 28th). So have a great holiday, and I'll talk with you then.


    Thanks,
    Steve

  • Hello,


    I am really surprised... The drive seems to be empty.
    Are you sure you never created an empty btrfs filesystem on this disk? Otherwise I would suspect that the ddrescue failed and we are just seeing the empty filesystem you created manually.
    We can do two things now (it is just a question of order):


    Option A)
    put the original failed drive back in and do


    Code
    mount /dev/sda /srv/test



    And

    Code
    dmesg | grep -i btrfs | tail


    Please show the output of

    Code
    mount


    And

    Code
    ls /srv/test


    Option B)
    On the current (copied) drive. It must be mounted for that.

    Code
    btrfs scrub start /srv/test


    This will take a long time (24h?)
    You can monitor the progress with


    Code
    btrfs scrub status /srv/test
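
    If you want to keep an eye on it without retyping the command, something like the following should also work (just a convenience sketch, assuming the watch utility is installed):

    Code
    # re-run the status command every 60 seconds until you stop it with Ctrl+C
    watch -n 60 btrfs scrub status /srv/test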


    Greetings and happy holidays,
    Hendrik

  • Option A)
    put the original failed drive back in and do

    and do ???


    I'm not back yet, but thought I would "check in". In Option A, what am I supposed to "do"? Also, if the drive appears to be blank, should I maybe redo the ddrescue (with a nice tight set of commands so I don't goof up anything)? I'm pretty sure I did all the commands you describe in Option A. Would redoing them be appropriate?


    Thanks
    Steve

  • Hello Steve,


    sorry, that was a copy&paste problem.
    I have edited that post.


    You can do B) first (because you do not have to change the drives), and if that does not bring back the data, you do A).



    Also, I would like to see

    Code
    btrfs filesystem show /dev/sda


    For both drives. You can do that during the course of A and B. It does not matter when you do that.



    You once posted the output for one of the drives (I think the copy):


    Code
    Label: 'sdadisk1' uuid: fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16
    Total devices 1 FS bytes used 384.00KiB
    devid 1 size 931.51GiB used 2.04GiB path /dev/sda


    If you properly copied the drive with ddrescue, the output for the original and the copy should be the same, if I am not mistaken.
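
    In case we have to redo the copy later: a typical ddrescue run looks roughly like this. This is only a sketch, not the exact commands from before; /dev/sdX stands for the failing original, /dev/sdY for the target drive, and the map file path is just an example, so double-check the device names before running anything.

    Code
    # first pass: copy everything that reads cleanly and skip the slow retries on bad areas (-n)
    ddrescue -f -n /dev/sdX /dev/sdY /root/rescue.map
    # second pass: go back and retry the bad areas a few times (-r3), reusing the same map file
    ddrescue -f -r3 /dev/sdX /dev/sdY /root/rescue.map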


    Greetings,
    Hendrik

  • Hi. I'm back from my hiatus. Ready to get back into it.

    On the current (copied) drive. It must be mounted for that.


    btrfs scrub start /srv/test

    I tried that first command from option B, but it errored with "not a btrfs filesystem: /srv/test".
    Then I realized I needed to mount the "copy" drive, which I tried to do, but got a series of:


    BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    BTRFS error (device sda1): failed to read chunk root
    BTRFS error (device sda1): open_ctree failed


    In the OMV File Systems tab, it says that sda1 is BTRFS, but shows no Total or Available capacity.


    Here's the output of btrfs filesystem show /dev/sda:


    Label: 'sdadisk1' uuid: fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16
    Total devices 1 FS bytes used 384.00KiB
    devid 1 size 931.51GiB used 2.04GiB path /dev/sda


    I'm guessing that somewhere along the line, I didn't get this thing properly formatted and/or ddrescued or something. Seems like I might need to start that process over, although I can see from the "show" that 2.04GiB have been used on that disk. Seems like I did something wrong along the line. I know it would stress the original drive to do another ddrescue, but I'll wait for your assessment.


    Staying tuned ...

  • Label: 'sdadisk1' uuid: fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16


    Total devices 1 FS bytes used 384.00KiB
    devid 1 size 931.51GiB used 2.04GiB path /dev/sda


    If you do the same with the original: Is the output the same?


    I'm guessing that somewhere along the line, I didn't get this thing properly formatted and/or ddrescued or something. Seems like I might need to start that process over, although I can see from the "show" that 2.04GiB have been used on that disk.

    Why do you think so? The copy does not work, just like the original. That's expected.
    Is 2.04GiB used less than expected?
    As the filesystem is damaged, we do not know how reliable this value is.

    Seems like I did something wrong along the line. I know it would stress the original drive to do another ddrescue, but I'll wait for your assessment.


    Staying tuned ...

    Let's see what we can fix here.


    I assume that the mount did not work?
    You can check by executing

    Code
    mount | grep sda


    If it is mounted, it will show output. If not, the output will be empty.
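
    As an alternative check (assuming the findmnt utility is available, which it usually is on Debian-based systems):

    Code
    # prints a line if something is mounted at /srv/test, nothing otherwise
    findmnt /srv/test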


    If you cannot mount successfully, then try:

    Code
    mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sda /srv/test


    or


    Code
    mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sda /srv/test

    First try the one, then the other.


    Greetings,
    Hendrik

  • If you do the same with the original: Is the output the same?

    Didn't do it to the original yet. Haven't reinstalled the original into the system.


    Is 2.04GiB used less than expected?

    I would say, considerably less than expected.


    mount | grep sda

    Yep. Returned nothing.


    mount -t btrfs -o rootflags=recovery,nospace_cache /dev/sda /srv/test

    Returned:
    mount: wrong fs type, bad option, bad superblock on /dev/sda,
    missing codepage or helper program, or other error


    In some cases useful info is found in syslog - try
    dmesg | tail or so.


    mount -t btrfs -o rootflags=recovery,nospace_cache,clear_cache /dev/sda /srv/test

    mount: wrong fs type, bad option, bad superblock on /dev/sda,
    missing codepage or helper program, or other error


    In some cases useful info is found in syslog - try
    dmesg | tail or so.


    So I ran "dmesg | tail" and here are the results:


    [ 1479.577153] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577350] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577395] BTRFS error (device sda1): failed to read chunk root
    [ 1479.593620] BTRFS error (device sda1): open_ctree failed
    [76455.155515] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76455.155592] BTRFS error (device sda): open_ctree failed
    [76756.435464] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76756.435539] BTRFS error (device sda): open_ctree failed
    [76787.189105] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76787.189184] BTRFS error (device sda): open_ctree failed


    Should I change to the original drive and try the "btrfs filesystem show /dev/sda"?

  • In some cases useful info is found in syslog - try
    dmesg | tail or so.

    dmesg | tail does no harm. You can always do that.


    We used bad syntax: rootflags= is a kernel boot parameter, not a mount option, which is why btrfs did not recognize it.


    Try

    Code
    mount -t btrfs -o recovery,nospace_cache /dev/sda /srv/test


    and if that does not work


    Code
    mount -t btrfs -o recovery,nospace_cache,clear_cache /dev/sda /srv/test


    Then

    Quote from curious1

    Should I change to the original drive and try the "btrfs filesystem show /dev/sda"?

    Yes.
    You can also try the two commands above with the original.
    But we will continue working on the copy.
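
    Side note: on newer kernels the 'recovery' option is deprecated in favour of 'usebackuproot' (dmesg will warn you if that is the case). If the commands above complain about the option, the equivalent would roughly be the following, with the same device and mount point assumed:

    Code
    mount -t btrfs -o usebackuproot,nospace_cache /dev/sda /srv/test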


    Greetings,
    Hendrik

  • mount -t btrfs -o recovery,nospace_cache /dev/sda /srv/test


    and if that does not work


    mount -t btrfs -o recovery,nospace_cache,clear_cache /dev/sda /srv/test

    The first mount command just came back to the prompt.


    The second mount command produced the following:
    "mount: /dev/sda is already mounted or /srv/test busy
    /dev/sda is already mounted on /srv/test"


    So then I went back and looked at previous instructions that didn't seem to work. I ran "mount" and got this:
    "sysfs on /sys type sysfs (rw,nosuid,nodev,noexec,relatime)
    proc on /proc type proc (rw,nosuid,nodev,noexec,relatime)
    udev on /dev type devtmpfs (rw,nosuid,relatime,size=10212176k,nr_inodes=2553044,mode=755)
    devpts on /dev/pts type devpts (rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000)
    tmpfs on /run type tmpfs (rw,nosuid,noexec,relatime,size=2046264k,mode=755)
    /dev/sdc1 on / type ext4 (rw,relatime,errors=remount-ro)
    securityfs on /sys/kernel/security type securityfs (rw,nosuid,nodev,noexec,relatime)
    tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
    tmpfs on /run/lock type tmpfs (rw,nosuid,nodev,noexec,relatime,size=5120k)
    tmpfs on /sys/fs/cgroup type tmpfs (ro,nosuid,nodev,noexec,mode=755)
    cgroup on /sys/fs/cgroup/systemd type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/systemd/systemd-cgroups-agent,name=systemd)
    pstore on /sys/fs/pstore type pstore (rw,nosuid,nodev,noexec,relatime)
    cgroup on /sys/fs/cgroup/cpu,cpuacct type cgroup (rw,nosuid,nodev,noexec,relatime,cpu,cpuacct)
    cgroup on /sys/fs/cgroup/blkio type cgroup (rw,nosuid,nodev,noexec,relatime,blkio)
    cgroup on /sys/fs/cgroup/cpuset type cgroup (rw,nosuid,nodev,noexec,relatime,cpuset)
    cgroup on /sys/fs/cgroup/freezer type cgroup (rw,nosuid,nodev,noexec,relatime,freezer)
    cgroup on /sys/fs/cgroup/net_cls,net_prio type cgroup (rw,nosuid,nodev,noexec,relatime,net_cls,net_prio)
    cgroup on /sys/fs/cgroup/memory type cgroup (rw,nosuid,nodev,noexec,relatime,memory)
    cgroup on /sys/fs/cgroup/perf_event type cgroup (rw,nosuid,nodev,noexec,relatime,perf_event)
    cgroup on /sys/fs/cgroup/rdma type cgroup (rw,nosuid,nodev,noexec,relatime,rdma)
    cgroup on /sys/fs/cgroup/devices type cgroup (rw,nosuid,nodev,noexec,relatime,devices)
    cgroup on /sys/fs/cgroup/pids type cgroup (rw,nosuid,nodev,noexec,relatime,pids)
    systemd-1 on /proc/sys/fs/binfmt_misc type autofs (rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=1455)
    mqueue on /dev/mqueue type mqueue (rw,relatime)
    hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime,pagesize=2M)
    debugfs on /sys/kernel/debug type debugfs (rw,relatime)
    sunrpc on /run/rpc_pipefs type rpc_pipefs (rw,relatime)
    tmpfs on /tmp type tmpfs (rw,relatime)
    /dev/sdb1 on /srv/dev-disk-by-label-NewDrive2 type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)
    /dev/sda on /srv/test type btrfs (rw,relatime,nospace_cache,subvolid=5,subvol=/)"


    So, given that the drive is mounted, I went back to what you wanted me to try earlier, after mounting the copy drive. So I ran "btrfs scrub start /srv/test". It came back with this:
    "scrub started on /srv/test, fsid fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16 (pid=11445)"
    So it looks like we have successfully started the scrub that you wanted me to do back on December 22nd. You had said that I could check the progress with "btrfs scrub status /srv/test", and that it could run for quite a while, maybe 24 hours. Here is the status:


    "scrub status for fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16
    scrub started at Sat Jan 11 11:51:03 2020 and finished after 00:00:00
    total bytes scrubbed: 512.00KiB with 0 errors"


    So after 10 minutes, this is the output of btrfs scrub status /srv/test:
    "scrub status for fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16
    scrub started at Sat Jan 11 11:51:03 2020 and finished after 00:00:00
    total bytes scrubbed: 512.00KiB with 0 errors"


    It would appear to me that it's finished. Or does the "finished after 00:00:00" indicate it's still working?


    If I do a "ls /srv/test", I get no results.


    So, I await your command :) ...


    Steve

  • dmesg | grep -i btrfs

    [ 2.780953] Btrfs loaded, crc32c=crc32c-intel
    [ 2.794045] BTRFS: device label NewDrive2 devid 1 transid 16 /dev/sdb1
    [ 2.794691] BTRFS: device fsid c81bf277-6ded-4e03-8bce-d4b25a690e27 devid 1 transid 9 /dev/sda1
    [ 11.574160] BTRFS: device label sdadisk1 devid 1 transid 11 /dev/sda
    [ 12.646572] BTRFS info (device sdb1): disk space caching is enabled
    [ 12.646573] BTRFS info (device sdb1): has skinny extents
    [ 1340.396263] BTRFS info (device sda1): disk space caching is enabled
    [ 1340.396264] BTRFS info (device sda1): has skinny extents
    [ 1340.402929] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1340.406046] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1340.406097] BTRFS error (device sda1): failed to read chunk root
    [ 1340.425705] BTRFS error (device sda1): open_ctree failed
    [ 1479.576285] BTRFS info (device sda1): disk space caching is enabled
    [ 1479.576287] BTRFS info (device sda1): has skinny extents
    [ 1479.577153] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577350] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577395] BTRFS error (device sda1): failed to read chunk root
    [ 1479.593620] BTRFS error (device sda1): open_ctree failed
    [76455.155515] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76455.155592] BTRFS error (device sda): open_ctree failed
    [76756.435464] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76756.435539] BTRFS error (device sda): open_ctree failed
    [76787.189105] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76787.189184] BTRFS error (device sda): open_ctree failed
    [317998.891158] BTRFS warning (device sda): 'recovery' is deprecated, use 'usebackuproot' instead
    [317998.891163] BTRFS info (device sda): trying to use backup root at mount time
    [317998.891170] BTRFS info (device sda): disabling disk space caching



    Try unmounting and then try the Second command and a scrub.

    Uhhh ... I'm still not that adept. Do I unmount using the Web client? And which second command do you mean? Should I assume the following sequence?


    1. Unmount /dev/sda using the web client
    2. Perform: "mount -t btrfs -o recovery,nospace_cache,clear_cache /dev/sda /srv/test"
    3. Perform: "btrfs scrub start /srv/test"


    Is that what you want me to do?


    Thanks
    Steve

    1. umount /dev/sda entered on the command line
    2. Perform: "mount -t btrfs -o recovery,nospace_cache,clear_cache /dev/sda /srv/test"
    3. Perform: "btrfs scrub start /srv/test"
    4. dmesg as before

    1 through 3 performed successfully.
    Scrub returned: "scrub started on /srv/test, fsid fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16 (pid=11761)"


    dmesg | grep -i btrfs:
    "[ 2.780953] Btrfs loaded, crc32c=crc32c-intel
    [ 2.794045] BTRFS: device label NewDrive2 devid 1 transid 16 /dev/sdb1
    [ 2.794691] BTRFS: device fsid c81bf277-6ded-4e03-8bce-d4b25a690e27 devid 1 transid 9 /dev/sda1
    [ 11.574160] BTRFS: device label sdadisk1 devid 1 transid 11 /dev/sda
    [ 12.646572] BTRFS info (device sdb1): disk space caching is enabled
    [ 12.646573] BTRFS info (device sdb1): has skinny extents
    [ 1340.396263] BTRFS info (device sda1): disk space caching is enabled
    [ 1340.396264] BTRFS info (device sda1): has skinny extents
    [ 1340.402929] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1340.406046] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1340.406097] BTRFS error (device sda1): failed to read chunk root
    [ 1340.425705] BTRFS error (device sda1): open_ctree failed
    [ 1479.576285] BTRFS info (device sda1): disk space caching is enabled
    [ 1479.576287] BTRFS info (device sda1): has skinny extents
    [ 1479.577153] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577350] BTRFS error (device sda1): bad tree block start, want 20987904 have 0
    [ 1479.577395] BTRFS error (device sda1): failed to read chunk root
    [ 1479.593620] BTRFS error (device sda1): open_ctree failed
    [76455.155515] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76455.155592] BTRFS error (device sda): open_ctree failed
    [76756.435464] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76756.435539] BTRFS error (device sda): open_ctree failed
    [76787.189105] BTRFS info (device sda): unrecognized mount option 'rootflags=recovery'
    [76787.189184] BTRFS error (device sda): open_ctree failed
    [317998.891158] BTRFS warning (device sda): 'recovery' is deprecated, use 'usebackuproot' instead
    [317998.891163] BTRFS info (device sda): trying to use backup root at mount time
    [317998.891170] BTRFS info (device sda): disabling disk space caching
    [423681.254949] BTRFS warning (device sda): 'recovery' is deprecated, use 'usebackuproot' instead
    [423681.254954] BTRFS info (device sda): trying to use backup root at mount time
    [423681.254961] BTRFS info (device sda): disabling disk space caching
    [423681.254964] BTRFS info (device sda): force clearing of disk cache"


    Does it run for 24 hours or so?


    Thanks Hendrik,
    Steve


    p.s. I'm curious about that "unrecognized mount option 'rootflags=recovery'". Do we still have bad syntax?

  • Just ran the scrub status. Don't think anything is happening.


    btrfs scrub status /srv/test


    scrub status for fdce5ae5-fd6d-46b9-8056-3ff15ce9fa16
    scrub started at Sun Jan 12 17:09:53 2020 and finished after 00:00:00
    total bytes scrubbed: 256.00KiB with 0 errors


    "ls /srv/test" produced no results
