A timeline of my early Thanksgiving morn
1:30 AM: I notice that my shell session on my OpenBSD router has stopped responding and I can no longer SSH in, although the machine is still forwarding packets.
1:45 AM: I try to hard reset the machine. It does not come back up.
1:50 AM: After attaching a keyboard and monitor, I quickly discover that the problem is a failing hard drive: the /usr and /var partitions are corrupted to the point that the kernel panics when trying to mount either of them. The debug messages point to physical disk damage as the root cause.
2:05 AM: I swap the network cards into a spare machine that had been running an old FreeBSD 7 snapshot. VERY fortunately, I had purchased a complete OpenBSD 3.9 install set for just this eventuality, so I have an actual CD with all the sets. I commence the OpenBSD install.
2:25 AM: After wasting a few minutes remembering how to run disklabel, the install is finished. Note: I have started running ntpd, so I set it to start on boot.
2:27 AM: First boot. I notice I have the network cards in backwards. I swap the Ethernet cable and all is well.
2:30 AM: I create my own account so I can stop using root. For a change, I remember to put myself in the wheel group.
2:31 AM: After a keyboard swap, I ssh to my main workstation and grab the most recent backup of this host’s /etc directory. I swap in the necessary config files to get the PPPoE working properly, remembering to change the interface names to reflect the now swapped cards.
2:32 AM: Success #1, I can ping hosts on the Internet.
2:33 AM: Forgot to change the interfaces in pf.conf, so my first attempt to start PF freezes my ssh connection. Back to the console to unlock the machine and change the offending setting.
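For next time, the interface rename in restored config files could be scripted rather than done by hand. The following is only a rough sketch under stated assumptions: `swap_ifnames` is a hypothetical helper, `fxp0` and `rl0` are made-up interface names, and it does a plain text substitution, so it assumes neither name appears inside some longer token.

```shell
#!/bin/sh
# Hypothetical helper: swap two interface names in config text read on stdin.
# A temporary placeholder keeps the second substitution from undoing the first.
# Assumption: the placeholder string @@SWAP@@ does not occur in the config.
swap_ifnames() {
    sed -e "s/$1/@@SWAP@@/g" \
        -e "s/$2/$1/g" \
        -e "s/@@SWAP@@/$2/g"
}

# Example use (file names are illustrative):
#   swap_ifnames fxp0 rl0 < pf.conf.backup > pf.conf.new
```

Writing to a new file and reviewing it before loading it into PF seems safer than editing in place, given how easily a wrong interface name locks out an ssh session.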
2:35 AM: PF is up but strangely packets aren’t being forwarded.
2:42 AM: After several minutes of head scratching, I realize I am a dolt and forgot to use sysctl to turn on packet forwarding in the kernel. This accomplished, I am back online!
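The sysctl involved here is presumably net.inet.ip.forwarding, which OpenBSD leaves off by default, so a fresh install will not route until it is set. Below is a small sketch of a check that could go on a post-install checklist. `forwarding_configured` is a hypothetical helper name, and the commands in the comments are from memory, so check sysctl(8) for the release in use.

```shell
#!/bin/sh
# Turn forwarding on immediately (as root on OpenBSD):
#   sysctl net.inet.ip.forwarding=1
# Make it survive a reboot by adding this line to /etc/sysctl.conf:
#   net.inet.ip.forwarding=1

# Hypothetical helper: succeed if the given sysctl.conf-style file contains
# an uncommented net.inet.ip.forwarding=1 line. Defaults to /etc/sysctl.conf.
forwarding_configured() {
    grep -Eq '^[[:space:]]*net\.inet\.ip\.forwarding[[:space:]]*=[[:space:]]*1([[:space:]]|#|$)' \
        "${1:-/etc/sysctl.conf}"
}

# Example use:
#   forwarding_configured || echo "IP forwarding is not enabled at boot"
```

This only inspects the boot-time configuration. The running kernel's value would still need to be read with sysctl itself.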
There: an hour and 12 minutes from hardware failure to back up and running on a new host. This proved to me that I know my OpenBSD, although having a backup of /etc, so that I didn’t have to rewrite config files from scratch, was a huge help. Always make backups. Frankly, given the beat-up old hardware I use for these tasks, I know these situations will come up (this has happened once before and caused a lot more pain).