Non-existent Device Mapper Volumes Causing I/O Errors?

A couple of days ago I started getting these errors whenever I ran anything that scanned for logical volumes (Linux LVM2):

Buffer I/O error on device dm-6, logical block 0
Buffer I/O error on device dm-7, logical block 0
Buffer I/O error on device dm-8, logical block 0
Buffer I/O error on device dm-9, logical block 0
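For the record, "anything that scanned" means the usual LVM2 commands, each of which probes every visible block device for LVM labels, so I'd expect any of these to trigger it:

pvscan
vgscan
lvscan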

My first reaction was panic, as I initially believed my HDD was failing, but after some investigation I realised that the above devices simply didn't exist.

Yes, that is strange. Why would device mapper suddenly think there were devices there that ... well, weren't?

I had a look in the /sys/block/ directory, and sure enough there were entries for dm-6, dm-7, dm-8 and dm-9, but looking in their respective slaves/ directories revealed the problem ... the soft links to the actual block devices were broken. Broken links to non-existent device nodes? It gets stranger.
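If you want to check whether you have the same problem, the broken links are easy to spot from a shell (the dm-N numbers are from my system; yours will differ):

ls -l /sys/block/dm-6/slaves/
# a healthy entry is a symlink to a real device directory;
# an orphaned one points at a path that no longer exists

find /sys/block/dm-*/slaves/ -xtype l    # lists only the dangling symlinks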

So then I thought I'd just try to delete those broken links, after all, they pointed to non-existent hardware (for a reason that hadn't yet occurred to me), but alas /sys/ is a virtual filesystem, and you can't delete entries from it, even as root. Hmm, what now?

Then I suddenly remembered that a couple of days previously I'd inserted a USB thumb-drive, copied some files off it, then unplugged it. I did make sure that I'd unmounted it first, but I'd completely forgotten that logical volumes need to be explicitly deactivated (using "lvm vgchange -an {volume group}") before you remove the device, and I hadn't done that.
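For anyone wanting the procedure I should have followed, it looks something like this (the mount point and the volume group name "usbvg" are made up for illustration):

umount /media/usb            # unmount any filesystems on the drive's volumes
lvm vgchange -an usbvg       # deactivate all logical volumes in the group
# now the drive can be unplugged safely

Note that running "vgchange -an" with no volume group name deactivates every volume group it can, which is rather more than you want on a server, so name the group explicitly.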

Oops.

Unfortunately the lvm command simply returned a "device busy" error, so I found myself back at square one.

Although the error messages were not fatal (no actual hardware was damaged, and data loss was unlikely), it was still very annoying to see these Buffer I/O errors every time I did anything related to LVM. Rebooting would have fixed the problem, of course, but I'm deeply averse to using Windows-style solutions on a Linux system that should be repairable without a reboot. Also, this is a server, and I hated the thought of losing uptime, then having to restart everything and check that all the services were working properly, just to solve some stupid "non-existent logical volumes" problem.

Sigh! It looked like I'd have to solve this problem the really old-fashioned way ... by going back to RTFM, or in my case several FMs.

Some time later...

I'd never played around with device mapper using the dmsetup command directly before, since I'd never needed to use anything other than the higher level vg* and lv* commands, but there's a first time for everything, I suppose, and this was one of those times.

Apparently, stubborn device mapper entries can be forcibly removed using the "dmsetup remove {logical volume}" command, and that's exactly what I did. To discover the errant volume names, I first ran "dmsetup info" and compared its output with the devices actually present on the system. It was immediately obvious which ones were orphaned (there were four in total), so I ran "dmsetup remove {logical volume}" on each in turn, and ... finally the problem was solved.
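Reconstructed from memory, the session looked something like this (the volume name is invented for illustration; use whatever "dmsetup info" reports on your system):

dmsetup info -c              # one compact line per mapped device
# compare the Name column with the volumes that should exist;
# the orphans are the ones whose backing device is gone
dmsetup remove usbvg-lvdata  # repeat for each orphaned mapping

If a mapping still refuses to die, "dmsetup remove" also takes a --force option, though forcing device mapper operations on a live server is the kind of thing worth reading the man page about first.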

At last!

Well, that'll teach me to RTFM before I start messing around with something new again, although frankly I think there should be a simpler way of handling removable devices that happen to have LVM volumes on them.

Maybe I should start a bugzilla entry...

Anyway, I hope someone finds this info useful, and that it saves them from the panic and frustration I endured.