Dana Gardner's BriefingsDirect: Expert Chat on how HP ecosystem provides holistic support for VMware virtualized IT environments

Listen to the podcast. Find it on iTunes/iPod. Read a full transcript or download a copy. Sponsor: HP.

Redefine the potential of your virtualization investments.

View the full Expert Chat presentation on VMware support best practices.

Advanced and pervasive virtualization and cloud computing trends are driving the need for a better, holistic approach to IT support and remediation.

And while the technology to support and fix virtualized environments is essential, it’s the people, skills, and knowledge to manage these systems that are the most decisive determinants of ongoing performance success.

In a special BriefingsDirect sponsored podcast, created from a recent HP Expert Chat discussion on best practices for VMware environment support, HP experts explain how they have made the service and support of global virtualization market leader VMware a top priority.

For example, Cindy Manderson, Technical Solutions Consultant for Complex Problem Resolution and Quality for VMware Products at HP, provides case studies of how managed escalation and multi-vendor support around the globe can reduce downtime by 70 percent, with large ROI benefits as well.

Other HP experts in the discussion include Pat Lampert, Critical Service Senior Technical Account Manager and Team Leader, as well as Sumithra Reddy, HP Virtualization Engineer. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: HP and VMware are both sponsors of BriefingsDirect podcasts.]

Here are some excerpts:

Gardner: Virtualization isn’t just server-by-server, but really impacts the entire data center. You need to think about it more holistically, particularly in regard to things like security, performance and how your brands and businesses are perceived across the globe. Many of the companies that I deal with day in and day out are up at 80 percent and even 90 percent virtualized.

When they think about virtualization, they go beyond just server virtualization. It’s really now looking at storage, applications, networks and even the end-user desktop experience, or desktop as a service (VDI).

Successful virtualization is no longer just about servers; it’s about managing complexity when you get beyond the 20 percent or 30 percent level and expand into converged infrastructure virtualization without failures.

So how to take advantage of the best things about virtualization? Part of that means allowing your IT team to have access to other experienced support teams, from HP and VMware, around the world, 24×7, to help keep systems up and running. Such support also allows your IT team to progress, to learn as they go, and to be able to take advantage of more virtualization benefits over time.

Expert panel

So how do you go about attaining such benefits? How do you keep the positive side of virtualization on track? And how do you put in place an insurance policy around service and support?

Manderson: We have several different packages. Our highest level is the mission-critical. In this particular process, you're assigned a team that are across the technology that you have in your environment. But you also get a set of folks who would actually look at not just the reactive support and even some of the proactive, but how actually your entire business is running according to the ITIL standard.

That is coupled with keeping you up and running, and we also can work with you on a type that would be best suited for your environment.

Our critical and independent support includes onsite resources from HP that also include a lot of proactive support. In addition, they're more focused on specific management, but that would be more of an ITSM technology. We can look at that for you.

... We also have the hardware and software support. One of the cool things we have with our hardware support is support automation, our Insight for remote support. That can notify HP that you're having a disk drive failure. Or we will call you and say that we know that disk drive is failing, or that something on a server or storage system is about to fail.

You can even take that a step further to look inside at the Windows operating system. We're hardware agnostic on that operating system. We don't care about the vendor -- and I believe we are looking at expanding that automation to other operating systems. We have installation and startup services that we can actually go out and set up and configure the hardware and software at a site.

So we definitely integrate across all the multi-vendor services. We run the gamut between all the x86 operating systems, as well as our proprietary operating systems, our servers and storage. Again, we're no stranger to multi-vendor support and keeping the entire environment up and running.

... One of our most creative services would be Proactive Select, a core product series of credits. You can use these credits for maybe planning on migration and upgrade. You can say you need some consulting time. You can use these credits and work with upgrade and migration. You may need some performance or you may need some type of environmental assessment, and these credits can be used for that.

Gardner: When people do employ these services, how do they measure what the payoff is, the value of these services?

IDC study

Manderson: In 2010, IDC did a study. They went out and looked at the methodology, and this is out on our website. They saw that customers who have the mission-critical services reduced their downtime by over 70 percent and saw a high return on investment (ROI), over 400 percent. The main benefit was in problem management as well as help desk calls, because these were alleviated due to the proactive nature, a lot of looking at the entire environment, and looking at the business processes.

So take a look at the study. It shows IDC's methodology. So looking at things proactively and these support processes can certainly help you reduce that downtime.

... I've been in the multi-vendor space for many, many years -- from applications to operating systems -- all with HP.

In 2002, when VMware came on the scene, HP actually became alliance partners with them. In 2003, we became a reseller, and thus began our support partnership with them. More recently, in 2005, we also became an OEM.

We have thousands of trained and certified Microsoft engineers and Linux professionals, too.

But we have the largest number of VMware-certified professionals. We also have the largest global VMware off-site training center. So HP does education on these technologies as well. We’ve trained over 20,000 students in the VMware space alone.

And we have had this very strong collaboration with VMware for many years and have support teams around the globe. In addition, we also receive the same level of training that VMware support engineers do. We actually go to their facilities and train right alongside them, too.

We further do this training virtually. The training is then recorded and made available on demand for reference, for folks who are not able to attend a scheduled course. There's definitely a very strong partnership, and as you see from our history with the other vendors as well as VMware, we are no strangers to multi-vendor support.

With all of the VMware products that HP sells, we do provide support across them all. It runs the gamut from the vSphere operating system that will install on the x86 server, through the enterprise management to the vCenter, and virtual desktop infrastructure products like VMware ThinApp. We also support the converter product getting into vCloud Director.

In addition to that, we have the ability to access our peers on the other teams across HP hardware support. This includes servers and storage, and our networking chain. We are quickly able to collaborate with them and pull together a virtual team to focus on the customer's whole environment, to provide a one-stop shop.

Expertise across technologies

Additionally, you saw that we’ve been in this multi-vendor support business for so many years, with many experts across the other technologies, such as Microsoft and Linux. Of course, the virtual machines (VMs) are running these operating systems. So if the contract is also with them, we can easily pull them in to help us work an end-to-end solution and support it.

Gardner: Let’s think about what happens when there are different levels of support at work. How does that shake out?

Manderson: We're in a reactive support business. If the customer has a problem, they can either call in at their local region telephone number -- whether they are in America, Europe, or Asia Pacific. There are different phone numbers for them to call.

They can also log in via the web, and they'll get to our next available Level 1 engineer. They're a great organization and have solved over 85 percent of their cases.

If they have issues where they have to escalate, first they will be collaborating with us. We also have an online chat tool, where we are all in a virtual room, the Level 1 engineers, Level 2 engineers, etc. So we’ll be consulting and collaborating with them before they even get to a point of escalation.

If the case does end up needing escalation, chances are the person they're already collaborating with will end up taking the case. That saves a lot of information transfer, as far as what type of server you have, what’s the firmware, what build level, and what’s the problem there, etc.

Once it reaches Level 2 support, as far as we can continue to collaborate, we can reach our teammates and the hardware teams, too, so we can look at the server and make sure that the environment is what we need it to be. If we can't resolve it, we can also go to Level 3 with VMware at an offline service-partner level.

We have a great relationship with the folks at VMware that we work alongside and would escalate calls to. We’re obviously not going into Level 1 at VMware because we’ve already done all that work, and we are a service partner. They'll go right up to our peers over at VMware and then we work together, while always owning the solution that we provide back to the customer.

Another part of our infrastructure-as-a-support-organization is that we have a single customer database. I can give an example. A call came into our Level 1 French engineer. When this call came in, for the European folks, it was already the end of their day, and the French engineer could not speak English. It was a critical down, their VMs were offline.

HP Virtual Room

So we worked in a virtual room and they talked to us, and brought the case to us here in America’s time zone. We worked with this case and another tool called HP Virtual Room, where we could actually all look at the customers' desktops in real time. They happened to have EVA storage, and we quickly got an EVA engineer engaged. Of course, we had to find a resource in the Americas because the European folks had already left. So we're all looking in real-time at the customer’s environment and found out that they had locked the storage.

The EVA engineer helped to get it back online, while we all watched and the French engineer was translating in French for the customer in order to get it all resolved. We got it back online, and the customers were ready to go home.

We gave instructions on getting log files and we placed a call for follow-up for the daytime hours in Europe the next day. So our counterparts in European support teams picked that up and worked with the customers to resolution, to analyze exactly what happened and prevent it in the future.

We have another process in HP that can actually go with top organizations, our escalation manager process. I was lead source for a particular case where we had a field team assisting a customer deploying a virtual desktop infrastructure (VDI) design. They had a third-party VDI vendor. They had HP hardware, servers, and virtual connects. They had our storage, and we didn’t quite know where the bottleneck was. They were having performance issues by trying to have this VDI at two different locations with the hardware at one site.

The escalation manager was able to get the local office to borrow equipment, and then try to get performance and network traces. They had the Engineering Problem Management Resource (EPMR) lab in Houston trying to duplicate the problems.

Our escalation manager was able to drive the issue to completion across not only the solution standards, but the local office, to owning the actual escalation with all the action items to keep this all on track. We knew where we were going to go. That was about a six-month case, but what we did finally find was that the customer was on the technological edge, and the "pipe" to have that performance just did not exist.

Site visits

Pat Lampert is a technical account manager and does site visits. The technical account managers do go out on site. So we’re aware of the environment. We have the information of your environment documented into the database. When you call, we’re not saying, "Now what kind of server is this? What’s the firmware?" We know this because we already have it documented. We could be calling them to say, "Server 3 is running a little off." We already know which VMware version this is on, because we have that information.

And because we have that, we can also offer proactive advice. We can know that there's a new firmware update, or VMware just came out with a new build, and we have a place where you can go find the latest that's specific to your environment. So this helps to reduce further incidents, because we can be more proactive to help you maintain your business.

Gardner: What are some of the most frequent questions you receive from the field?

Reddy: I'll address two questions that are frequently showing up. One is, what is the difference between the VMware ESXi image and an HP ESXi image?

Basically, HP takes the same ESXi image that VMware provides to the customers. It then adds HP thin components for hardware management, and it also adds any latest fibre channel and network drivers. Once it's tested and certified, it's available for download both from HP and VMware websites.

Major differences

And one of the major differences between the two images is that the VMware image is disk installable only, whereas the HP image can be installed on a disk, a USB key, or an SD card.

The other question we're getting nowadays is how to upgrade from vSphere 4 to vSphere 5. As with any major upgrade, planning helps. The first thing I would do is understand the difference between ESX 4 and ESX 5, because starting with ESX 5, we have no service console. So we need to understand what the architectural differences are.

Also learn about the new licensing policies. Then, use the System Analyzer that VMware provides to evaluate the current environments, and download, check, and complete the checklist. Once this is done, hopefully the upgrade will go smoothly.

Lampert: Another question that has come up from customers has to do with the added value of getting support directly from HP. It was partly addressed during the presentation we just gave. First of all, VMware does have a fine support organization. I have a couple of friends who work in VMware Support, and they do a good job of supporting their product.

HP, in addition to a similar level of expertise in the product, also offers our expertise in HP hardware, especially if you have systems based on HP Blades. The infrastructure behind that often is tied very closely to the performance and availability of your ESX host. So when you call us, you will have not only someone who is very familiar with the VMware product, but who is also familiar with the HP hardware and able to pull in the proper resources to resolve problems you might encounter, especially with running vSphere on HP hardware.

In addition to that, we have a partnership agreement with VMware, and when you call in for support through HP, you're getting that same level of service when we have to go to VMware to get answers to questions or fixes.

One other question that has come up is about our lab's ability to reproduce problems. We have two global labs, one in India and one in the United States. We have several static vSphere cluster configurations with a number of different types of servers already in those configurations, and the ability, when needed, to add specific models, if there is a problem that’s specific to a particular Blade or rack-mounted server model, or a particular card or something like that. So we're quite able to reproduce most problems that come in. We even have some Dell and IBM equipment in our lab.

Gardner: What other issues are users grappling with?

Reddy: One question I can answer is how to troubleshoot server crashes. When something goes wrong in ESX, we call it the "Purple Screen of Death." Often, these are the result of hardware failure, but we still need to rule out the software. So we collect all the logs, and look at them to see if it's a software issue. If it's not a software issue, then we engage the hardware team to see how we can get to the root cause and fix the issue.

Lampert: To dovetail with Sumithra’s comment there, one of the questions I get frequently is what to do if you don’t have a dump. Say the host hangs, and that seems to be almost more common than the Purple Screen of Death. Some customers aren't aware that through HP’s Integrated Lights-Out Management, there is the ability to generate a non-maskable interrupt (NMI) just by pressing a button, and by saving a certain environment variable ahead of time in your ESX host.

KB article

There is a KB article on this, by the way, if you just search on NMI and core dumping in VMware. But with that setup, you can force a dump while a system is in a hung state, and that will assist us usually in troubleshooting and isolating what caused the hang, whether it be hardware or a problem with the ESX host software.

One question that came up ahead of time is what HP suggests as far as getting a handle on our inventory of VMs? I happened to be involved in field testing some new tools from HP that will be available in January and February regarding vSphere.

One of them is a Holistic Blade and Firmware Analysis that takes into account the VMware environment on our Blade systems which we are working on having ready soon. We have just completed field tests.

And the second is a really nifty Inventory Report HP has just put together. We're just completing field tests on that now. It will be available soon. Basically, we install a small Perl script in the customer environment on any machine that has access to the vCenter host and has a vSphere CLI installed.

This Perl script crawls through the VMware environment and builds an XML file, which we then feed into a report generator here at HP. We can use this to gather information on customers, so we have a clear picture of the environment ahead of time. But it will also be sold as a service to customers.

The report is really quite nice, with all sorts of charts and showing availability of machines and availability of memory and also disk space. It's a very nice report.
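
The collect-then-report flow Lampert describes (a script gathers VM data into an XML file, and a separate generator turns that file into a report) can be illustrated with a minimal sketch. This is not HP's tool, which is a Perl script working through the vSphere CLI. The sketch below is Python, does not connect to vCenter, and uses made-up sample records; the element names, field names, and both function names are assumptions chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical inventory records. A real collector would query vCenter;
# these values are invented for the example.
SAMPLE_VMS = [
    {"name": "web01", "host": "esx-a", "memory_mb": 4096, "disk_gb": 40, "powered_on": True},
    {"name": "db01", "host": "esx-a", "memory_mb": 16384, "disk_gb": 200, "powered_on": True},
    {"name": "test01", "host": "esx-b", "memory_mb": 2048, "disk_gb": 20, "powered_on": False},
]


def build_inventory_xml(vms):
    """Serialize VM records into an XML string (element names are illustrative)."""
    root = ET.Element("inventory")
    for vm in vms:
        node = ET.SubElement(root, "vm", name=vm["name"], host=vm["host"])
        ET.SubElement(node, "memory_mb").text = str(vm["memory_mb"])
        ET.SubElement(node, "disk_gb").text = str(vm["disk_gb"])
        ET.SubElement(node, "powered_on").text = "true" if vm["powered_on"] else "false"
    return ET.tostring(root, encoding="unicode")


def summarize_inventory(xml_text):
    """Parse the XML and aggregate per-host totals, as a report generator might."""
    summary = {}
    for node in ET.fromstring(xml_text).iter("vm"):
        totals = summary.setdefault(
            node.get("host"),
            {"vms": 0, "powered_on": 0, "memory_mb": 0, "disk_gb": 0},
        )
        totals["vms"] += 1
        if node.findtext("powered_on") == "true":
            totals["powered_on"] += 1
        totals["memory_mb"] += int(node.findtext("memory_mb"))
        totals["disk_gb"] += int(node.findtext("disk_gb"))
    return summary


if __name__ == "__main__":
    report = summarize_inventory(build_inventory_xml(SAMPLE_VMS))
    for host_name, totals in sorted(report.items()):
        print(host_name, totals)
```

Keeping collection and reporting as separate steps joined by an XML file mirrors the split described above: the collector runs inside the customer environment, while the report can be generated elsewhere from the file alone.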