Cool Work Story

Written by Michael Cole - September 19th, 2015

Since my first migration from Windows to a BSD has been going kind of horribly, I thought I would mention something that happened at work instead. I don't really want to talk much about my job, so I'm going to tell this story without many details about it.

So firstly, I currently work as a Unix System Administrator. At work I focus purely on the operating system; I do not deal with the hardware, networking or storage, at least not the parts that don't live directly in the operating system. For example, I would handle multipath and LVM on a Linux box. We also have a policy that any software we get needs to work on all of our systems, which includes Windows and Unix-like operating systems. Of course all of this is decided above my head and outside of my control. Needless to say, this gets us into some interesting situations.

This particular story is about Solaris (version 11). Solaris is pretty heavy on virtualization; it has LDOMs and zones, for example. We had two LDOMs that needed to be moved from an internal host to a DMZ host. This fell outside of our group's responsibilities, so we let our virtualization specialists handle the move. Somehow they deleted the operating system and setup completely…

We had ZFS set up on the Solaris servers, but we only had one drive/LUN passed to the server, so we really couldn't use any of the ZFS features that would have helped with this issue. And even though ZFS has great snapshots, we could only really use them locally, so any snapshots we had were gone as well. We did, however, have file-level backup software (I won't go into which). So the brilliant idea our virtualization team had was to set up a new LDOM with just the base operating system and the backup software. They wanted us to restore the root and other important file systems (like var) over the currently running ones.

Of course, as any decent admin knows, even though things are in memory, new processes will still try to read files from disk as they spawn. Just to humor everyone, we tried it their way. The system started restoring, but as soon as it hit libraries like libc it stopped and broke.

What we came up with for a bare metal restore was to use a feature that Solaris and FreeBSD share. They both have ZFS, and as a result they both have boot environments. So our idea was to create a boot environment and restore into that environment instead of over the running one. This worked pretty well. The system booted the new environment with minimal errors, and we worked through them.
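
For anyone who wants to picture the steps, here is a rough sketch of the order of operations, written as a small Python script that only builds and prints the commands. It is not the exact procedure we ran. The beadm subcommands (create, mount, unmount, activate) are the standard ones, but the boot environment name, the mount point and the restore command are all made up for illustration, since I'm not naming our backup software.

```python
import shlex


def plan_be_restore(be_name, mountpoint, restore_cmd):
    """Return the ordered commands for restoring into a new boot environment.

    be_name and mountpoint are hypothetical values chosen by the caller.
    restore_cmd is a placeholder for the backup software's restore command;
    any "{target}" in it is replaced with the mount point.
    """
    # Point the (hypothetical) restore command at the mounted boot environment.
    restore = [arg.replace("{target}", mountpoint) for arg in restore_cmd]
    return [
        # Create a new, inactive boot environment next to the running one.
        ["beadm", "create", be_name],
        # Mount it so files can be written into it; the mount point is
        # assumed to be an existing, empty directory.
        ["beadm", "mount", be_name, mountpoint],
        # Restore into the inactive environment, not the live root.
        restore,
        # Unmount it once the restore has finished.
        ["beadm", "unmount", be_name],
        # Make it the environment used on the next boot.
        ["beadm", "activate", be_name],
    ]


if __name__ == "__main__":
    # "restore-tool" and its option are invented stand-ins, not a real CLI.
    plan = plan_be_restore(
        "restore-be",
        "/mnt/restore",
        ["restore-tool", "--target", "{target}"],
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

The point of doing it this way is that the running system never has its own libraries overwritten underneath it. Everything lands in the inactive environment, and the switch only happens on the next reboot.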

So without a couple of BSD users on our team, we may have ended up redoing the complete operating system setup. That would have pushed out dates, since we were just about to turn the systems over to the customer, and that ultimately costs money, customer confidence, etc. Instead we were able to hand them right over without a hitch.