Yesterday I ran yum update on most of my servers. It was a large update; I guess it was CentOS 6.2. At any rate, part of the update was a new kernel, /vmlinuz-2.6.32-220.el6.x86_64. After the update, 3 of my servers became unresponsive, which has never happened before. On the console I found strange error messages:
‘rejecting I/O to offline device’
and
‘task jbd2/sda3-8 blocked for more than 120 seconds’
Scary stuff! Googling around turned up a lot of mentions of serious hardware issues. Bummer! I figured it would take a while to sort out. I would probably need to get new drivers for the LSI SAS controller and build the new module.
Then I wondered if it was maybe the new kernel from the update. I went into /boot/grub/grub.conf and changed the default from 0 to 1, which selects the prior kernel.
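For anyone who wants to try the same thing, here is a rough sketch of what that change looks like. It is abridged and illustrative, not a copy of my actual file: the previous kernel's version is left as a placeholder, and the entry order assumes the update put the new kernel first, which is what yum normally does. GRUB counts entries from 0 in the order their title lines appear.

```
# /boot/grub/grub.conf (abridged, illustrative sketch)
# Was default=0 after the update; 1 selects the second title entry.
default=1

# Entry 0: the new kernel installed by the update
title CentOS (2.6.32-220.el6.x86_64)
        # root, kernel and initrd lines omitted

# Entry 1: the kernel that was running before the update
title CentOS (<previous kernel version>)
        # root, kernel and initrd lines omitted
```

Check the order of the title lines in your own file before editing, since the index you need depends on how many kernels are installed.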
Reboot. Problem solved. Going home.
I am seeing this too: the RAID controller locks up during or soon after reboot.
Has this bug been reported?