December 19, 2011

VMware ESXi 4.1 Can’t See Dell MD3200 After Reboot – A Story About Bad Customer Service And The Solution To The Problem

Author: Marc Champoux - Categories: Storage

This Was Weird… and Bad For Dell!

 

My colleagues in Europe had an oh-so-fun time today: they added memory to their ESXi servers that were connected to their Dell MD3200 SAN… and after booting the ESXi servers, they couldn’t see the SAN anymore!

 

Here’s what it meant for them: no VMs booted up. No vCenter. No emails. Nothing… it was back to typewriters and pagers for everybody!

 

What followed was a complete failure by Dell to provide support to my colleagues in Europe (and to me later)… here’s what they ran into:

 

  • Trouble opening the ticket via the Dell Europe support number.
  • 4 dropped calls while trying to open the ticket.
  • A Dell support rep who said the problem was on the VMware side.
  • A call with VMware… and the support rep said the storage unit was the problem and that we should call Dell back.

 

In the end, my European colleagues called me in the Americas and asked me to open up a call with Dell in the Americas. They were desperate and were unable to reach their European Dell tech.

 

Over the years with my MD3000 in the Americas region, I have received what I consider great service. I had one or two bad calls but they are mere “bumps” on the road compared to the great support I’ve received time-and-time again when calling about my storage arrays.

 

Unfortunately, I Didn’t Have A Good Experience Either…

 

So, I started a conference call with Dell and my European colleagues and, right then and there, I knew I was in trouble when the Dell support rep kept asking me (and my colleagues) what the Dell Service Tag for our MD3200 was and where it was physically located.

 

In the end, that particular Dell support tech that we worked with:

 

1. Asked us for the firmware version of the SAN… once we gave him the information, he said it was 2 revisions old and needed to be upgraded.

 

2. However… he did not even want to do a remote control session with me -or- my European colleagues to double check the config of the SAN and to help upgrade the firmware of the SAN.

 

3. Basically: he was very uncooperative! In the end, he said he was not authorized to help us because the law (???) prevented him from working on a European system! Because he had no way to transfer us to the European Dell Support line and no method to contact a Dell Support rep in Europe, we hung up.

 

I was a little bit baffled by that particular tech… I hope I’ll get one of those surveys via email in a few days just to be able to voice my concerns.

 

The Solution…

 

Ok, so out of desperation (and maybe on a “hunch”?) one of my European colleagues decided to test a theory of his that the configuration of the “Host Group” of the MD3200 was now corrupted.

 

So, my European colleague did the following:

 

1. He went in the “Mappings” tab of the Dell Storage Manager software.

 

2. He removed the “Host Group” and re-created it (you have to right-click on the Host Group to remove it and then re-add it).

 

3. He then went back to his VMware vSphere client and in the “Configuration” tab used the “Rescan All” button to rescan his storage HBAs…

 

4. Presto: the storage array was visible again! The VMs on the storage array could be powered on without any problems after that.
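If you would rather script step 3 than click “Rescan All”, here is a minimal sketch in Python. It assumes you already have a host object from pyVmomi (VMware's Python SDK, which is newer than this post, so whether it works against a 4.1 environment is an assumption on my part). The function name is made up for illustration, and the connection code is left out.

```python
def rescan_and_list_datastores(host):
    """Rescan storage on one ESXi host and return the datastore names it sees.

    `host` is assumed to be a pyVmomi vim.HostSystem object (or anything
    exposing the same attributes); connecting to vCenter is not shown here.
    """
    storage = host.configManager.storageSystem
    storage.RescanAllHba()  # look for new storage devices on every HBA
    storage.RescanVmfs()    # look for new VMFS volumes on those devices
    # Sorted so the result is easy to compare before and after the fix.
    return sorted(ds.name for ds in host.datastore)
```

Running it before and after re-creating the Host Group should show whether the datastores on the MD3200 came back. It only reports what the host sees; it does not touch the array's configuration.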

 

Here’s a screenshot of the “Mappings” tab in our Dell Storage Manager (version 10.80):

[Screenshot: “Mappings” tab in Dell Storage Manager 10.80]

Before trying this little “remove-and-recreate” routine, they had tried several times to power off and power on the ESXi servers and the MD3200 without any luck.

 

Conclusion?

 

You *will* have issues with your SANs… regardless of the brand, regardless of your support coverage and regardless of how much you paid for it! I’m not being pessimistic but realistic: something will eventually go wrong with it.

 

And when it happens… I just hope you’ll get better support from your vendor than we just did!

 

I just hope this tiny blog post helps someone, somewhere!

 

Thanks for reading…

 

Marc

December 6, 2011

vCenter 4.1 Update 2: Your License Server Settings Will Be Wiped Out During Your Upgrade

Author: Marc Champoux - Categories: VMware

This Will Be A Short Post…

 

I recently upgraded our vCenter server from vCenter 4.1 Update 1 to vCenter 4.1 Update 2. The upgrade was very smooth and I cannot find anything to complain about with regard to the upgrade process: just start the installer and click next until your finger falls off. Rinse and repeat for every vCenter component, reboot and enjoy!

 

However, a few weeks down the road, one of our LAN Admins rebooted one of the old ESXi 3.5 servers that we still have lying around the server farm (because they have 32-bit CPUs, we can’t upgrade them to ESXi 4.1).

 

After the reboot, he could no longer access the server via the vSphere client. This is the error he saw:

 

The license server is not configured to perform the operation

 

And this is a screenshot of the error:

[Screenshot: the “license server is not configured” error in the vSphere client]

What Was The Root Cause?

 

Basically speaking, any “license server” errors should point you toward the “VMware License Server” service on the vCenter server.  If that service is not running or not configured properly, you’ll get some weird errors after rebooting a “Pre ESXi 4.x” server.

 

In my case, the service was running fine. I restarted it but that didn’t resolve the problem. Since I had recently upgraded the vCenter server from vCenter 4.1 Update 1 to vCenter 4.1 Update 2, I decided to check the configuration…

 

And my settings were gone:

[Screenshot: empty license server settings in vCenter Server Settings]

The Simple Solution

 

Well, to resolve this, here’s what I simply had to do:

 

  1. I started my vSphere client and logged into my vCenter server.
  2. I clicked on the menu “Administration → vCenter Server Settings…”.
  3. In the “Licensing” section, I typed “vcenter.uu.com:27000” and clicked on OK (replace the hostname with the one of your own licensing server).
  4. I waited 5 minutes.
  5. I right-clicked on the host in the inventory that was disconnected and selected “Reconnect”.
  6. Presto, the server got reconnected…
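Before waiting those 5 minutes and reconnecting the host, it can help to confirm that the license server actually answers on its port. Here is a minimal Python sketch, assuming the license server listens on TCP port 27000 as in step 3. The function name is made up for illustration, and the check only proves that something accepts connections on that port, not that licensing itself is healthy.

```python
import socket


def license_port_reachable(host, port=27000, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds, else False.

    This is only a reachability check; it does not speak the license
    protocol or validate any license.
    """
    try:
        # create_connection resolves the name and opens the TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

You would call it with your own license server's hostname, for example `license_port_reachable("vcenter.uu.com")`. If it returns False, the problem is more likely the “VMware License Server” service or a firewall than the vCenter setting.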

 

In fact, you can check the messages in the “Recent Tasks” view to see what’s cooking:

[Screenshot: “Recent Tasks” view in the vSphere client]

Conclusion?

 

Well, that was an easy thing to troubleshoot… I just wanted to share in case it helps someone somewhere.

 

Thanks for reading!

 

Marc

April 26, 2011

VMware ESXi 4.1: Solution to the Error “Number of virtual devices exceeds the maximum for a given controller” When Cloning a Virtual Machine

Author: Marc Champoux - Categories: Cloning, VMware

This Is Probably Nothing New For Seasoned Pros…
 

But for a new VMware admin like me, when I cloned a virtual machine (my vCenter actually – to make a DR copy of it as per a suggestion from the book “VMware ESXi: Planning, Implementation, and Security” by Dave Mishchenko), I received a weird error when the process was about 78% done. The error said the following:
 

Number of virtual devices exceeds the maximum for a given controller
 

And the error appeared in the messages and events area at the bottom of my vSphere client:

[Screenshot: the error in the vSphere client]

So, what was the solution? It’s quite simple actually…
 

Read it all..

April 20, 2011

Book Review: “Mastering VMware vSphere 4” by Scott Lowe

Author: Marc Champoux - Categories: VMware

That Was Tough…
 

As I became a VMware Administrator “by default” last year, I purchased a few books on the topic of VMware in October of 2010. Due to various time constraints and other projects, I did not have the time to sit down each and every day to read one of the books I had purchased back then, namely, “Mastering VMware vSphere 4” by Scott Lowe.
 
 

 
I started reading that particular 673-page behemoth in October of 2010 with the intention of reading it from cover to cover. By December of 2010, I had read about 250 pages when I hit the 6th chapter of the book, titled “Creating and Managing Storage Devices”. It was during that particular chapter that I stopped reading the book, and I waited until March of 2011 to continue.
 

Why did the 6th chapter hurt so much? Read on to learn!

Well, continue reading to find out why…

March 17, 2011

The Short Road to Becoming a VMware Administrator …

Author: Marc Champoux - Categories: VMware

Last Year Was An Interesting Year … I Became a VMware Administrator “By Default” …
 

If you rack your brains for a few minutes (and if you ask around), you’ll probably find that not many people have fond memories of 2009 and 2010. The recession hit everybody pretty hard and, despite the fact that business at our company did not enter into an unrecoverable death spiral toward the ground, there were some layoffs.
 

What those layoffs meant to us was that the IT team shrank beyond anything we had ever seen before. As with every other company out there, we saw job cuts in the programming team, in the admin team and even some managers were laid off!
 

In plain English: we lost *a lot* of people!
 

And then a funny thing happened: nothing crashed. Most systems kept humming along. No cataclysmic event brought the entire infrastructure down… and I personally think everybody “left behind” let out a collective “sigh of relief” when we saw that our precious servers and applications were still up and running after a few days and weeks. Do you wonder what happened later?
 

Click here to Continue Reading this Story… and learn what happened in the aftermath…

March 16, 2011

Welcome!

Author: Patrick - Categories: General

Hi there,

Following the success of thenewdominoadmin.com, Marc asked that we start a blog site around the latest work we have done with VMware. At our employer, whose name we will not mention on this site due to a policy in place, we have just completed the first of three phases of virtualization deployment. This first phase fell under Marc’s control in late 2010, and here we will publish the good, the bad and the ugly of what we have learned and are still learning every day. Please feel free to email us, add us as a contact on linkedin.com or just comment on what is posted here.

Patrick