Server Virtualization Blog - A SearchServerVirtualization.com blog


A server virtualization blog covering virtual machine (VM) management and administration, VMware, Xen, Microsoft, server consolidation and hardware, backup and disaster recovery, VDI (virtual desktop infrastructure) and more.

OEM + old machine + P2V migration = Murphy’s law

OK, I had a foul-up in my last post about my physical-to-virtual (P2V) migration journey. I used Vizioncore’s vConverter in my file server P2V migration, and it didn’t work. I then posted PlateSpin’s product name with Vizioncore’s product name (i.e., PlateSpin vConverter) as if there were some merger from the great beyond, rather than doing the simple thing and actually editing my own posts for accuracy. It has all been corrected now, but for full disclosure purposes, there was indeed a company name and product name mix-up.

Now, that said, I’ve used both products on other OEM boxes and they went just fine, so take it for what it is: a singular experience and the nature of blogging (and working with editors… that “no thanks to” line was not mine).

I seem to have a hard time with company names… I previously used incorrect capitalization for eGenera (it’s actually Egenera) some time back, and I often refer to Openfiler as OpenFiler.

Back to the tale… The domain controller from hell - a Windows 2000 server with OEM disk drivers, OEM RAID controller management tools, OEM disk management tools, and OEM network management tools. A machine nearly as old as my oldest child, one that shipped with Windows 2000 before there were service packs. It has had patches, service packs, driver roll-ups, OEM driver updates, and (probably) chocolate sauce slathered onto it in its lifetime.

It has also had so many applications added and removed that I think I could actually hear grinding from worn spots in the registry and creaking from the drive platters. Needless to say, this DC was spiraling down into the vortex at the end of usefulness. That’s not to say it wasn’t a great machine for a long time, but with full disks and a gurgling CPU, the poor thing was about done. It still doesn’t beat the teenagers I decommissioned early in 2007 - a pair of Novell 4.11 boxes that were old enough to drive (well, with a Learner’s Permit anyway). Gotta love longevity.

First order of business: a little cleanup. Backup. Backup again to a second location. Remove IIS (it’s not used anymore). Remove an extraneous antivirus management console (a competing product is in use, this one’s just dead weight). Dump the temp files. Wipe out randomly saved setup files for applications like Acrobat Reader. Compress what I can. Defragment.

Finally, enough free space is there to support the VMware Converter agent. Let’s try that and see how it goes (often, it’s the only tool I need). Failure. Hours of waiting are gone, even though the conversion hit 100% and claimed success. Turns out there’s an invisible OEM partition sitting out there that the OEM tools don’t show, and said partition has hosed the boot device order on the new virtual machine (VM). What do I see - the Blue Screen of Death (BSOD), pointing me to INACCESSIBLE_BOOT_DEVICE.

Not a huge deal, right? Just edit the boot.ini, right? No need to worry about that missing partition, right? Nope. Sure, I try to repair it by mounting the disk on a good VM and editing the ini file. No luck. OK, let’s get rid of the driver. Now we can see the partition. Done. Now let’s try again.

Failure. Are we sensing a pattern here? Same BSOD. Just like the first time, Converter goes 100% and the box BSODs on boot again. So, now that the disk management tools are no longer hiding the OEM partition, I edit boot.ini to get rid of the partition, make sure that the partition is unchecked in Converter, and try again. It succeeds!

Kind of. It’s slower than molasses in the Minnesota winter, that kind of winter where all you want to do is sit inside by the fire and let the good folks out at — sorry, for a minute there I was channeling my inner Garrison Keillor. I’m back. It’s drivers.

The OEM RAID drivers are still in there, but they are easy to strip. And it even boots up again and runs. There’s no network though. OEM NIC drivers get the strip next, but still no network (as expected). Reinstalling the VMware tools to replace the drivers doesn’t help. Next step is to shut down the VM, remove the NIC, boot again, and then add in a new NIC.

Hosed. Now the machine boots up, but it won’t let me log on. The OS is toast, and I’m not whipping out the recovery CD.

Time to pull out another product. Vizioncore’s vConverter, an acquisition from Invirtus that’s stable, robust, and more feature-rich than VMware’s offering. Redo the P2V with this tool. Same problems with the boot screen.

And there it sits for a day, in limbo, while I spend some time on Google, TechTarget, and VMware’s websites.

Finally, it’s all together… AD is corrupt. Somewhere in all that stripping of drivers, I’ve whacked Active Directory. OK, let’s fix that: Start over. Whack the VM. Re-backup. Run DCPROMO and demote the server so that it’s no longer a Domain Controller.

Time to P2V - I used vConverter again, but edited the VM before boot so that there’s no NIC. I boot it, remove all the OEM drivers, and then add in the NIC. It boots. It runs. It flies. No need to robocopy. All the apps are in place and running. It just hums along happily and serves its purpose.

Murphy’s law: Whatever can go wrong, will.

P2V migration success, thanks to Robocopy

…and no thanks to VMware Converter Enterprise or Vizioncore’s vConverter. 

The situation: a very successful physical-to-virtual (P2V) migration, with only two servers to go. Both original equipment manufacturer (OEM) boxes.

One is a Windows 2003 Server File/Print/VMware Server. One is a Windows 2000 domain controller, with accounting and payroll software. The owners have been very reluctant to migrate from stable boxes, which have run reliably, backed up successfully, and (until recently) performed decently.

However, disk space is at an all-time low and prompting alerts from the system’s management console so often that it’s been put on the exclusion list, complete with note taped to the ops board. There’s a plan to upgrade to Exchange 2007 and thus get out of Windows 2000 Native mode in Active Directory.

The players: me, VMware VI3.5, VMware Converter Enterprise (of course), VMware tech support, and Vizioncore’s vConverter.

The end result:  Less than stellar migration with VMware Converter and Vizioncore’s vConverter. On the file server, it went the easiest. After the first (Converter) P2V attempt failed, and vConverter came up empty, I took a hint from the ITIL playbook and implemented the workaround (check the many writings on ITIL and Change Management for more meaty details than I care to post here). That workaround? Robocopy, IP changes, and host name changes.

Robocopy

Robocopy is your friend. It is your dear, dear friend that loves you. It’s a tool of similar functionality to the *nix rsync command, in that it can mirror a directory structure, survive the occasional network interruption, etc. It has fundamental differences, but it comes from the same root - an improved version of the copy command that exists in every operating system ever designed. My favorite part? /SEC, which copies NTFS permissions from host to host (normally, these are destroyed by being replaced by inherited permissions at the target). So, it’s just a simple batch script. That’s right… batch. That old beast of burden, come back to ride high once more.

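@ECHO OFF

SETLOCAL

REM Source and target are the administrative shares on each host.

SET _rcsource=\\SOURCEHOST\d$\shared

SET _rctarget=\\TARGETHOST\d$\shared

REM /COPYALL copies all file info (it already includes /SEC); /E includes empty subdirectories;

REM /ZB uses restartable mode with backup-mode fallback; /TBD waits for share names to be defined.

SET _rcaction=/COPYALL /TBD /ZB /E /SEC

REM Retry 20 times, one second apart, and write a log file.

SET _rcopts=/R:20 /W:1 /LOG:FSMIGRATE.log

ROBOCOPY.EXE %_rcsource% %_rctarget% %_rcaction% %_rcopts%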

The end result is a complete copy of all directories from the source to the target that survives network outages, copies NTFS security, retries failed copies up to 20 times with a one-second delay, and logs it all.
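One thing worth checking after a run like this is Robocopy's exit code. It is a set of bit flags rather than a plain "0 means success" value, and codes below 8 mean that no copy failed. What follows is a minimal Python sketch of decoding it, not part of the original batch; the function names and the wording of the messages are illustrative assumptions, while the flag values follow Microsoft's documented return codes.

```python
# Robocopy exit codes are bit flags (values per Microsoft's documentation).
# The message wording and function names below are illustrative, not Robocopy's own.
ROBOCOPY_FLAGS = {
    1: "one or more files were copied",
    2: "extra files or directories exist at the target",
    4: "mismatched files or directories were found",
    8: "some files or directories could not be copied",
    16: "fatal error, nothing was copied",
}


def describe_exit_code(code):
    """Return the list of conditions encoded in a Robocopy exit code."""
    if code == 0:
        return ["nothing needed copying; source and target already match"]
    return [text for bit, text in ROBOCOPY_FLAGS.items() if code & bit]


def copy_succeeded(code):
    """Treat any code below 8 as success (no failed copies, no fatal error)."""
    return 0 <= code < 8


# Example: code 3 means files were copied and extra files exist at the target.
print(copy_succeeded(3), describe_exit_code(3))
```

In a batch file the equivalent check would be something like IF %ERRORLEVEL% GEQ 8 right after the ROBOCOPY.EXE line.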

I’ve long since lost the source for that batch, but I’ve used it on countless file servers. After that it was very simple to swap IP addresses and host names, remove the old shares on the source server, and share out the appropriate directories on the target server. World’s easiest P2V not done via P2V tool - mostly because a file server is simple.

Next post, the Windows 2000 Server domain controller, a.k.a. my private OEM hell.

Antivirus management issues in physical-to-virtual migrations

I am well into my company’s physical-to-virtual (P2V) migration for most general purpose server systems, and it’s been pretty successful. But as our environment grew, we experienced a problem involving our virtual systems and our antivirus management system. In this blog post, I’ll explain the problem and tell you the solution so you can avoid a similar situation.

For most server systems, regardless of whether they are physical or virtual, maintaining a centrally managed antivirus package is a good strategy. This strategy includes regular definition updates, engine updates, policies for exclusions and scheduled full scans.

Let’s talk about the scheduled full scan. Historically, we regularly ran a full antivirus scan of the local file system on both the physical and virtual servers during off hours. This became a problem as the virtual environment became more populated.

We use the VKernel capacity analyzer and chargeback virtual appliance to monitor the performance of our virtual environment. What I noticed is that during the off hours, we had an incredible spike in CPU utilization across all hosts and virtual machines. This spike was about 300% of our average CPU use for about two hours. We initially wanted to blame it on the full backup that happens close to this timeframe, but closer investigation led us elsewhere.

We had noticed that the CPU spike also occurred on guest systems that are in isolated networks for stage-configure or isolation test roles. Because the isolated systems were not able to communicate with the backup mechanism, the full backup could not be the cause of the spike.

Avoiding the CPU spike

Once we determined that the scheduled full antivirus scan on the local file system was the culprit, we decided that a staggered set of full scans was required to avoid this massive spike. On physical systems with local processing, this is not a big issue, as they are generally idle. But applied to the virtual environment, this may cause unnecessary virtual machine migrations or performance alerts. So, in your migration strategies, be sure to consider any centrally scheduled activity like this and how it may affect your entire infrastructure, both physical and virtual.
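As a rough illustration of what staggering can look like, here is a minimal Python sketch that spreads scan start times evenly across an off-hours window. It only computes a schedule and does not talk to any antivirus console; the VM names, the window start, and the window length are all made-up assumptions.

```python
from datetime import datetime, timedelta


def stagger_scans(vm_names, window_start, window_hours):
    """Spread scan start times evenly across the off-hours window."""
    if not vm_names:
        return {}
    # Gap between consecutive scan starts, so the starts never coincide.
    step = timedelta(hours=window_hours) / len(vm_names)
    return {
        name: window_start + i * step
        for i, name in enumerate(sorted(vm_names))
    }


# Hypothetical VM names and a hypothetical four-hour window starting at 22:00.
schedule = stagger_scans(
    ["web01", "file01", "dc01", "sql01"],
    datetime(2008, 1, 5, 22, 0),
    4,
)
for name, start in schedule.items():
    print(name, start.strftime("%Y-%m-%d %H:%M"))
```

An even spread is the simplest choice. If individual scans run longer than the gap between starts, they will still overlap, so a real schedule would also need to account for scan duration and for which VMs share a host.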

Don’t omit the VMware Tools!

So, you are very proud of yourself because you can roll out Windows virtual systems like popcorn, right? Well, don’t forget to ensure that you are using the correct version of VMware Tools. This is important because VMware Tools provides an optimized inventory of hardware for the guest operating system. On a Windows guest operating system (OS), take a look at the Device Manager and see how many devices have VMware, Inc. listed as the manufacturer for the device. VMware Tools will apply the correct drivers to the SCSI and RAID controllers, network interfaces, video display adapters, and many more.

Why Does This Matter?

The presence of VMware Tools is good, but just as important is the version of VMware Tools. Each VMware product has its own version of VMware Tools, and if you migrate via VMotion or VMware Converter, your version of the tools may be out of date. Some devices will be recognized natively with obsolete versions of VMware Tools, while others may not be identified in the Device Manager at all. The most likely candidate here is the network interface. For example, say you have a virtual machine hosted on VMware ESX 2.5.4 and you wish to migrate this guest to your newer VMware ESX 3.0.2 system. Your migration via your tool of choice will proceed correctly enough, but you may soon discover an issue with the VM.

How do I Install VMware Tools?

Installing VMware Tools is quite easy, and VMware has provided a knowledge base article for each VMware product (ESX, Workstation, etc.) Click here to view the knowledge base article.

Once upon a time, there was a VI3 migration…

I found another blog post-worthy blog. Rightfully called “Documenting a virtualization project”, it’s pretty darn cool. Read about one company’s experience with virtualizing their servers from the start. Most recently, the author (Martin?) reported that the company (which remains nameless) has 75 servers virtualized at approximately a 20:1 ratio, and 25 servers to go. They seem to be doing a lot with VDI and VMware, so if that’s your forte I highly suggest becoming a frequent visitor to this blog (after ours, of course).

Throughout the blog, he talks about migrating Oracle servers, their VDI project, their first production HA failover:

“A quite unexpected event yesterday was the very first HA failover in production.”

The day the Oracle servers froze:

“I shoudn’t be writing that all is well on the Oracle front.

“Just now two of the Oracle servers froze with database problems. The DBA tells me that they have had block corrupts which he hasn’t seen in five years of running the things.”

Then he goes on to blog about what they learned from the Oracle freeze:

“…memory settings turned out to be highly critical in relation to the performance of the VM.”

…I guess he should have been paying more attention to his SearchServerVirtualization.com Virtualization Advisor e-newsletters. ;)

Physical-to-virtual (P2V) migration blooper

Since I wrote the post, P2V wins and losses, I’ve been hearing from IT managers who’ve done the P2V deed. One migration was halted by a too-proactive IT guy. Here’s the story, but I’ll let the parties in question remain anonymous.

So, this company was moving from physical to virtual servers to reinstall its Oracle ODBC drivers and a number of third-party support applications. During the migration of one physical server, the process failed. Why? Says the IT manager:

“The first attempt failed because our support personnel noticed the server being replaced was down and rebooted it in the middle of the P2V process.”

Oops.

After that, migrations went smoothly, and no more mistakes were made. However, performance issues arose when this particular server’s virtual machine went into production. The IT manager explained:

“The original server had been a two-processor hyper-threaded system, and it was moved to a single processor VM. After adding a second virtual processor, it ran at an acceptable level.”

This company has done successful P2V migrations since then and plans to do more.

Here’s hoping your P2V moves are blooper-free. If they aren’t, share your goofs with us by commenting or emailing me at jstafford@techtarget.com. We can all learn from your bloopers and chuckle at the same time.

P2V wins, losses: VMware Converter

The road from physical to virtual servers isn’t a freeway…yet. There are some potholes that hold up P2V migrations. Here are a couple of views from those who’ve taken a trip with VMware Converter.

Language support issues on VMware Converter caused several P2V mishaps for Robert Sieber of SHD System-Haus-Dresden GmbH in Dresden, Germany. Responding to my blog entry on physical-to-virtual (P2V) migration mishaps, he wrote:

“Mainly all of our tries to migrate from physical to virtual or virtual to virtual failing at 97%. It looks like if there is an issue with the language of the underlying OS. We used German OSes for installing VMware converter and now we are trying to use only English ones. Since we switched to English success rate is somewhat better.

“I really hope that the people who developed ESX server are much better than the one who developed Converter and Capacity Planner.”

Blogger Scott Lowe had mostly good experiences with VMware Converter. Check out his trip through online and CD-boot experiments on his blog. Lowe liked VMware’s network throughput of about 8-9GB per hour and its ability to import directly to VMFS on an ESX Server farm with no need to use a helper VM or vmkfstools. On the other hand, he had trouble logging in to VirtualCenter and had to connect to a back-end ESX server instead. Booting up seemed to take more time on VMware Converter than on VMware’s older tool, P2V Assistant; but, he says, “this is a very subjective assessment.”

What are your objective or subjective opinions about the state of P2V migration tools? Please comment here, or email me at jstafford@techtarget.com.

P2V blues: Physical-to-virtual migration mishaps

While researching P2V migrations, I came across some discussions of problems encountered during the process.

Dave Mast, Geek at Large, learned this lesson about P2V migrations and shared it on his blog:

“This Tuesday we attempted a P2V (Physical-to-Virtual) conversion on our domain controller.  It worked, but not as well as I expected.  We wound up losing the SYSVOL share (where all your group policy stuff is kept) during the conversion.  We still had our physical machine present, so we fired it back up and all was well.”

Mast is prepared for his next P2V experience. “I set up a second domain controller and stuck it in one of the IDFs.  Hopefully the domain data will replicate from the new DC to the old, and if not, we can always start from scratch with a new VM.”

Here’s a thread about Exchange problems in P2V migrations. It’s a tale of many crashes.

Another informative P2V migration tale comes from Scott Lowe’s blog. He talks about using VMware Converter. Here’s a tidbit, but you should check out the whole entry:

“The only odd thing we ran into was that Converter refused to log in to VirtualCenter. We tried short hostname, fully qualified hostname, and IP address, and still had zero luck getting Converter to connect to VC. Fortunately, connecting directly to one of the ESX servers using the root account worked without any problems whatsoever, and the overall conversion process took about 1 hour and 20 minutes to move the 12 to 13 gigabytes of data on the source server.”

I’m still looking for P2V migration stories — successes and failures — to help me build a best practices and “beware” guide. Got any tales to tell? Add a comment, please, or send me an email at jstafford@techtarget.com.