The latest BriefingsDirect discussion focuses on one of the toughest
balancing acts in seeking the best of cloud computing benefits. This
balance comes from obtaining the proper degree of centralization or
"common good" for infrastructure efficiency, while preserving a sufficient
culture of decentralization for agility, innovation, and departmental-level control.
The requirement for this kind of empowering centralization is nowhere more evident than in a large university setting, where support and consensus must be preserved among such constituencies as faculty, staff, students, and researchers -- across an expansive educational community.
But the typical IT model does not support localized agility when it takes weeks to spin up a server, when online services lack automation, or when manual processes hold back efficient ongoing IT operations. Too much IT infrastructure redundancy also means weak security, high costs, lack of agility, and slow upgrades.
We're joined by an IT executive from the University of New Mexico (UNM) to learn more about moving to a streamlined and automated private cloud model to gain a common-good benefit, while maintaining a vibrant culture of innovation. We're also joined by a VMware executive to learn more about the latest ways to manage cloud architectures and processes to attain the best of cloud efficiencies, while empowering improved services delivery and process agility.
They are:
Brian Pietrewicz, Director of Computing Platforms at the University of New Mexico in Albuquerque, and
Kurt Milne, Director of Product Marketing in the Management Business Unit at VMware. The discussion is moderated by me,
Dana Gardner, Principal Analyst at
Interarbor Solutions.
Here are some excerpts:
Gardner: Tell us about your IT organization at the university
and how you've been able to do change, but at the same time not
alienate your users, who are, I imagine, used to having things their
way.
Pietrewicz: The University of New Mexico is a highly decentralized organization. In most cases, the departments are responsible for their own IT, and that often means they don't have the resources to effectively run IT -- in particular, things like data centers, servers, storage, disaster recovery (DR), and backups.
What we're doing to improve the process is providing
infrastructure as a service (IaaS)
to those groups so that they don’t have to worry about the heavy
lifting of the infrastructure pieces that I mentioned before. They can
stay focused on their core mission, whether that’s physics, or
psychology, or who knows what.
So we offer IaaS. We're running a VMware stack, and we're also running
vCloud Automation Center (vCAC).
We've deployed the
Self-Service Portal. We give departments, faculty
members, or departmental IT folks the ability to go into the portal and
deploy their own machines at will.
Then, they are
administrators of that machine. They also have additional management
features through the vCAC console so that they can effectively do
whatever they need to do with the server, but not have to worry about
any of the underlying infrastructure.
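To make that self-service flow concrete, here is a minimal sketch, in Python, of what a portal-driven VM request can look like behind the button. The endpoint, field names, and token handling are hypothetical placeholders for illustration only, not the actual vCAC/vRealize API.

```python
import requests

# Hypothetical self-service endpoint; the real vCAC/vRealize catalog API differs.
PORTAL_API = "https://cloud.example.edu/api/catalog/requests"

def request_vm(department, owner, token, template="linux-small"):
    """Submit a catalog request for a new VM on behalf of a department."""
    payload = {
        "catalogItem": template,      # a blueprint sized and approved by central IT
        "requestedFor": owner,        # the faculty member or departmental IT requester
        "businessGroup": department,  # used later for show-back/charge-back reporting
    }
    resp = requests.post(
        PORTAL_API,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    # The requester becomes the administrator of the resulting machine;
    # central IT keeps control of what the button actually does.
    return resp.json()["requestId"]
```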
Gardner:
That sounds like the best of both worlds. In a sense, you're a service
provider in the organization, getting the benefits of centralization and
efficiency, but allowing them to still have a lot of hands-on control,
which I assume that they want.
Pietrewicz:
Correct. The other part is the agility -- the ability to react quickly, to consume infrastructure on demand as they need it, and to have the benefit of all the things that virtualization brings: redundant infrastructure, lower cost of ownership, and those sorts of things.
New expectations
Milne:
It’s an interesting time to be in the IT
space, because there's this new set of expectations being imposed on IT
by the business to be strategic, to quickly adopt new technology, and
boost innovation.
At the same time, IT still has the full set of
responsibilities they've always had -- to stay secure, to avoid legacy
debt, to drive operational excellence so they maintain uptime, security,
and quality of service for transactional systems and business-critical
systems.
It’s really an interesting paradox. How do you do these two
things that are seemingly mutually exclusive -- go fast, but at the same
time, stay in control?
Brian’s approach is what I call "push-button IT," where you give folks a button to push and they get what they need when they want it. But if IT controls the button and controls what happens when the user pushes the button, IT is able to maintain control. It’s really the best of both worlds.
Gardner: Brian, tell us a little bit about how long you've been there and what it was like before you began this journey.
Pietrewicz:
I've been at UNM for about two-and-a-half years, and I can tell you the number one complaint. We suffer from a lot of the same problems that other large IT shops have, with funding and things like that. But the primary issue when I walked in the door was customers being upset because we didn't have clearly defined services, even though we had sold those services to them.
We had sold
virtual machines (VMs) with database backups, and all kinds of interesting things, with no
service-level agreements (SLAs), no processes, nothing wrapped around it. The delivery of these services was completely inconsistent.
So
I started out down the new path. The first thing that we did was to
make
the services more consistent. Just to give you an example, take deploying a virtual machine for a customer. When I got here, a ticket came into the service desk and went to a single technician, and whichever technician got that ticket figured out their own way of getting that machine deployed.
As the next step in that process, instead of just having it done a different way by whoever received the ticket, we went through and identified all the steps associated with deploying a VM. In looking at all of those steps, we identified more than 100 manual steps that went through six completely separate groups inside our organization.
Those included the operating system, storage, virtualization, security, and networking for firewall changes. Each of those groups deployed its individual piece of the puzzle, and it was being done differently every time. Our deployment times were taking as long as three weeks. You can imagine how painful that is when it takes 20 minutes to spin up a VM -- but it was taking three weeks to deploy it to a customer.
We
identified all the steps and defined the process very, very clearly;
exactly what it takes to deploy a VM. The interesting thing that came
out of that was that it gave us the content necessary to be able to
start developing a true service description and an SLA.
Ticketing system
It also made the process consistent. We did a few things after we did the process development. We generated workflows within our ticketing system, so that when a single ticket was put in, it auto-generated all the necessary tickets to deploy the VM, and it happened in a very consistent way.
That dropped the deployment time from three weeks down to about three days, because it still had to go through certain approval processes and things like that with security.
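As a rough illustration of that workflow, a single incoming request can be expanded into a consistent set of per-group tickets. The group names come from the process described above; the ticketing interface itself is a hypothetical stand-in.

```python
# Groups involved in the VM-deployment process described above (a partial list).
DEPLOYMENT_GROUPS = [
    "operating-system",
    "storage",
    "virtualization",
    "security",
    "networking-firewall",
]

def expand_vm_request(request_id, create_ticket):
    """Fan a single VM request out into one ticket per group.

    `create_ticket` stands in for whatever the ticketing system exposes
    (REST call, email gateway, etc.); it is a placeholder, not a real API.
    """
    tickets = []
    for group in DEPLOYMENT_GROUPS:
        tickets.append(create_ticket(
            queue=group,
            summary=f"VM deployment step for request {request_id}",
            parent=request_id,
        ))
    return tickets
```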
For the next step we said, "Okay, how can we do this better?" We looked at all of those steps that we had put in place and found that they were all repetitive, manual steps that could be easily automated. Enter VMware vCAC.
After we had them clearly defined, we took all the steps and automated all the ones that we could. We couldn’t automate all of them -- for example, sending information to our billing system to bill the customer back. From vCAC we shoot an email over to our ticketing system, which generates a ticket. Then, the billing information is still entered manually, and we're working on an upgrade to that.
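A hedged sketch of that hand-off: once the automated provisioning finishes, the workflow notifies the ticketing system by email so a billing ticket is opened for the manual charge-back entry. Addresses, host names, and the subject convention are made up for illustration.

```python
import smtplib
from email.message import EmailMessage

def notify_billing(vm_name, department, smtp_host="smtp.example.edu"):
    """Email the ticketing system so it opens a billing ticket for a new VM."""
    msg = EmailMessage()
    msg["From"] = "cloud-automation@example.edu"   # the automation workflow's address
    msg["To"] = "ticketing@example.edu"            # ticket-by-email gateway
    msg["Subject"] = f"[billing] New VM provisioned: {vm_name}"
    msg.set_content(
        f"VM {vm_name} was provisioned for {department}.\n"
        "Please enter the charge-back record in the billing system."
    )
    with smtplib.SMTP(smtp_host) as smtp:
        smtp.send_message(msg)
```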
UNM has approximately 45,000 faculty, staff, and students. We have about 100 departments or affiliates, and today we're running about 660 VMs for our organization. For central IT, we're between 98 percent and 99 percent virtualized.
When I first got here,
the services were not defined and the processes were not defined. Since
then, we have clearly defined the processes, narrowed those down into
the very specific processes and tasks that had to be done, and then we
automated. We're going through the process of automating every step in
that process.
Now, we have a thing we call Lobo Cloud -- our mascot is the Lobo. Customers can now go online and deploy a machine within 20 minutes. So basically everything has transformed from an extremely inconsistent service that took as long as three weeks to deploy, to the equivalent of going into McDonald’s and ordering a Big Mac. It’s extremely consistent, and it's down from three weeks to 20 minutes.
Gardner:
I assume Brian that you've adopted some industry-standard methods,
perhaps a framework, that gave you some guidance on this. How does your
service delivery policy adhere to an industry standard like
ITIL?
Pietrewicz:
That’s what we use. We follow ITIL and we're at varying levels of
maturity with it. ITIL is very challenging to implement, but it's
extremely helpful, because it gives you a framework to work within, to
start narrowing down these processes, defining services, and setting SLAs. It
gives you a good overarching framework to work within.
The
absolute hardest part of all of this is implementing the ITIL
framework, identifying your processes, identifying what your service is,
and identifying your SLA. Walking through all of that is exponentially
harder than putting the technology in place.
Gardner:
It seems to me that not only are you going to get faster server deployment, faster response times, and automation, but there are some other significant benefits to this approach. I'm thinking about security, disaster recovery, the ability to budget better through an OPEX model, and ultimately reduced total costs.
Is it too soon, or have you seen some of the other benefits that I typically hear about when people move to a more automated cloud approach? How is that working for you?
Less expensive
Pietrewicz:
We don’t really have good statistics on it. For the folks that had
machines sitting underneath their desks and in closets before, we don’t
have a lot of the statistics to know exactly the cost and the time they
were spending on that.
Anybody who works with
virtualization quickly learns that once you hit a certain size, it
becomes significantly less expensive. You become far more agile and you
get a huge number of benefits. Some of them are things that you mentioned -- the deployment time, DR, the ability to automate, and taking advantage of economies of scale.
Instead of
deploying one $10,000 server per application, you're now loading up 70
machines on a $15,000 server. All of those things come into play. But we
really don’t have good statistics, because we didn’t really have any
good processes before we started.
What’s interesting
now is that our next step in the process is to automate our billing
process. Once we do that, we're going to have everything from our
virtual infrastructure deployed into our billing system and either a
charge-back or a show-back methodology.
So we'll have complete detailed
costs of all of our infrastructure associated with every department and
every application that is using our service. We'll be able to really
show the
total cost of ownership (TCO).
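As a minimal sketch of what that show-back could look like, once every VM record carries a department and a monthly rate, the per-department totals are a simple aggregation. The record format and figures are invented for illustration.

```python
from collections import defaultdict

def show_back(vm_records):
    """Aggregate monthly VM cost per department for a show-back report."""
    totals = defaultdict(float)
    for vm in vm_records:
        totals[vm["department"]] += vm["monthly_cost"]
    return dict(totals)

# Example with made-up figures:
# show_back([{"department": "physics", "monthly_cost": 85.0},
#            {"department": "psychology", "monthly_cost": 42.5},
#            {"department": "physics", "monthly_cost": 120.0}])
# -> {"physics": 205.0, "psychology": 42.5}
```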
Milne:
Brian, it sounds like you're on a path that a lot of our customers are
on. What we see typically is that there is a change in consumption
behavior when your customers know that they can get IaaS
on demand. They
stop hoarding resources. The same kind of tools and processes that can
automate the delivery of those services can also automate tearing down
those services when they're done.
Virtualization by
itself increases capacity utilization quite a bit, but then going to
this kind of services delivery, service consumption for infrastructure,
actually further increases utilization and drives down
over-provisioning.
Adding that cost transparency to the service will further change your consumers' behavior. The ability to get it when you need it, and to pay only for what you use, drives down the amount of resources that you have to keep in your data center.
Pietrewicz: Absolutely. It’s amazing what happens when you have to pay for something and it’s very visible.
Milne:
I always feel that if IT is free, that really changes the supply-and-demand equation, if you study economics. People don’t know what to do with free. They typically take too much.
Economic behavior
Pietrewicz:
Right. This really starts driving basic economic and social behavior into the equation in IT. It’s a difficult thing for organizations to get their heads around, and they're sort of getting it here at the university. It’s not completely in place. The way that we look at it is as a "we'll build it, and they'll come" kind of thing.
Most
folks have figured out that they can really save that money. Instead of
going out and buying a $10,000 server, they can buy a $1,000 VM from us
that does the exact same thing. If they don’t want it any more, they
can turn it off and not pay any more. All of those things come into
play.
Another piece on that is that the university has been experimenting with a thing called responsibility centered management (RCM), which is a budgeting process that works toward the bottom line of a particular organization. That means that people have to be transparent and make clear decisions about where they're spending their money. That's also starting to drive adoption.
Ancillary benefits
Gardner:
We talked about some of the ancillary benefits of your approach, but
there are some direct benefits when you go to a cloud model, which gives
you more options. You can have your private cloud. You can look to
public cloud and other hosting models, and then you can start to see a path or a vision towards a
hybrid cloud
environment, where you might actually move workloads around based on
the right infrastructure approach for the right job at the right time.
Any thoughts about where your cloud goals are vis-à-vis the hybrid potential?
Pietrewicz: We have a few things in play that we're actively working on. Today, we have people using various cloud providers. The interesting part is that they're just paying for it with a credit card out of their department, and the university doesn’t have any clear way of knowing exactly what’s out there. We don’t really have any good security mechanisms in place for determining whether there's any sensitive data being stored out there inadvertently.
We're working with a lot of the cloud providers that we're already spending money with to develop consolidated accounts. One, we can save money through economies of scale. Two, we can get some visibility into what folks are actually using the cloud for. And three, IT would like to act as an adviser, able to point out that, among the various cloud providers out there, this particular provider is good at functionality or that particular provider is good at security.
The first step is to corral the use
of public cloud for UNM and create an escorting process to the cloud.
The second step is going to be a hybrid cloud that we'll set up from our
private cloud here on site. We envision setting up hybrid cloud
services with those public cloud providers to be able to move the
workloads back and forth when necessary.
The other major benefit that we very much look forward to is being able to do DR in the cloud -- taking advantage of the ability to replicate data and then spin up systems as you need them, rather than having a couple of million dollars in equipment sitting, waiting, and hoping you never use it. That's equipment you have to refresh every four years just to keep a viable DR plan.
Gardner: Is vCloud Automation
Center something that will be useful in moving to this hybrid model? The
one button to push, as it were, on the private cloud, will that become a
one button to push in the hybrid model as well?
Pietrewicz: It will. I mentioned those various cloud service providers. Most of them are compatible with the vCloud Connector, so you can simply connect up that hybrid cloud service and, with a little bit of work, massage your portal.
We can have a menu option of public cloud providers through our portal, so that they could just select it and say that they want vCHS, Amazon, or Terremark, and then potentially move workloads back and forth. So vCAC and vCloud Connector are at the center of it.
The
other interesting piece that we're working on and going to try to
figure out as part of this is that we really want to start looking into
NSX and/or
VIX to be able to provide very clear security boundaries, basically
multi-tenancy,
and then potentially be able to move those multi-tenant environments
back and forth in the cloud or extend them from public to private cloud
as well.
Software-defined networking
Gardner:
Brian, you mentioned multi-tenancy earlier, and of course, there is a
lot going on with software-defined data center, networking, and storage.
What is it about it that’s interesting to you and why is this a
priority for you,
software-defined networking (SDN), for example?
Pietrewicz:
SDN is the next step in being able to truly automate your IaaS and your virtual environment. If you want to be able to dynamically deploy systems and have them be in a sandbox that is multi-tenant by customer, you really need to have an SDN-type solution, or at least it’s extremely helpful for doing that.
One of the things that we're looking at next is implementing something like NSX, so that we can deploy the equivalent of a virtual wire -- a multi-tenant environment -- to individual customers, so that they can see only their own stuff and can’t see their neighbors', and vice versa.
The key is the ability to orchestrate that on demand and not have to deal with the legacy
VLAN and firewall kind of issues that you have with the legacy environment.
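To make the contrast with manual VLAN and firewall tickets concrete, here is a rough sketch of what on-demand tenant isolation might look like from a provisioning workflow. The controller endpoint and payload are hypothetical stand-ins, not the actual NSX API.

```python
import requests

# Hypothetical SDN controller endpoint; the real NSX API is different.
SDN_API = "https://sdn.example.edu/api/logical-networks"

def create_tenant_network(tenant, token):
    """Request an isolated, virtual-wire-style network for one tenant.

    In the legacy model this would mean VLAN requests and firewall-change
    tickets across several teams; here it is a single call the
    provisioning workflow can make on demand.
    """
    payload = {
        "name": f"tenant-{tenant}",
        "isolation": "per-tenant",  # only this tenant's VMs attach to it
    }
    resp = requests.post(
        SDN_API,
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["networkId"]
```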
Gardner:
It’s interesting how a lot of these major trends -- service delivery,
cloud, private cloud, DR, and SDN -- are interrelated. It’s a complex
bundle, but the payoffs, when you do this inclusively, are pretty
impressive.
Pietrewicz:
Whenever you get to the point of abstracting things to the software
level, you provide the ability to automate. When you have the ability to
automate, you get tremendous flexibility. That sometimes can be an
issue in and of itself, just making decisions on how you want to do
something. But along with that flexibility, you get the ability to
automate just about anything that you want or need to be able to do.
The
second piece to that is that we're really excited about figuring out,
when we build the hybrid cloud model, how we might be able to extend
those tenants into the cloud, either as active running workloads or in a
DR model, so that the multi-tenancy is retained.
Milne:
From VMware’s perspective, that kind of network virtualization
capability is critical for our hybrid cloud service. It’s that
capability that NSX provides that creates that seamless experience from
your data center out to the hybrid cloud.
As you said, Brian, that kind of network configuration, allocation, and reallocation of
IP addresses,
when you are moving things from one data center to another, is not
something you want to do on a manual basis. So NSX is a key component of
our hybrid cloud vision. It’s something that a lot of the other cloud providers just don’t have.
Pietrewicz: I see it
as the next frontier in IT. I think that when SDN starts taking off,
it’s going to be a game changer in ways that we are not even recognizing
yet, and that’s one example. Moving a workload from one network to
another network is extremely powerful.
Cloud broker
Gardner:
Kurt, this sounds as if not only is Brian transitioning into being a
service provider to his constituencies, but now he's also becoming a
cloud broker. Is this typical of what you're seeing in the market as
well?
Milne: It is. Some of our customers will take a step to try to get their arms around shadow IT -- users going around IT -- by offering that provisioning option through the IT portal. So it’s like, "You're using Amazon? That’s fine. We can help you do that." Putting a button in the service catalog deploys the kind of work that they've been doing in a public cloud like Amazon, but it has to come through IT. Then, IT is aware of it.
There's a saying I like. It’s called the "cloud boomerang." A lot of times, the IT customers will put things out in the public cloud, but like a boomerang, it seems to always come back. The customer wants to integrate it with an existing system, or they realize that they have to support it up in the cloud. A lot of times, those rogue deployments make their way back to the IT organization. So putting an Amazon service in the vCAC portal and not changing anything else is a nice first step in corralling that.
Pietrewicz:
That is exactly what we're seeing. At a university, because there isn’t
really governance, it’s more like build a good service and hope they
come. We take the approach of trying to enable it. We want to make it
very transparent and say that they can use Amazon or vCHS, but there's a
better way to do it. If you do it through the portal, you may be able
to move those workloads back and forth.
We are actually
seeing exactly what you mentioned, Kurt. Folks are reaching the
limitations of using some of the cloud providers, because they need to
get access to data back here at UNM and are actually doing the boomerang
approach. They started out there and now they're migrating their
machines into our IaaS so that they can get access to the data that they
need.
Gardner: Kurt, we heard some very
interesting things at VMworld recently around the cloud-management
platform. Why don’t you tell us a little bit about that and how that
fits into what we've been discussing in terms of this ongoing maturity
and evolution that a large organization like the University of New
Mexico is well into?
Milne: We recently announced the vRealize Suite, which is a cloud management platform. So we're moving our management product strategy to a common platform.
Over
the years, VMware has either built or acquired quite a few different
management products. We've combined those products into a number of
suites, like our automation, operations, and our business management
suites. Now, we're taking that next step and combining a lot of those
capabilities into a single platform.
There are a couple of guiding ideas there. What we see in organizations like Brian’s is that the lines between the automated provisioning of workloads and the ongoing operations, maintenance, and support of those workloads are really starting to blur.
So you have automation tasks that might happen when
you're doing a support call. Maybe you want to provision some more
resources, and there are operations tasks like checking system health
that you might want to do as a step in an automation routine.
Shared services
Our product strategy change is to move toward a shared-services model, similar to a service-oriented architecture. The different services underlying our management products would be executable through a tool like vCAC, through a command-line interface, or through a REST API. There's kind of a mix-and-match opportunity to execute those services in different ways.
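As a hedged illustration of that mix-and-match idea, the same shared health-check service might be called over REST as one step inside an automation routine. The service path and response shape below are assumptions, not a documented VMware API.

```python
import requests

# Hypothetical shared-services endpoint for illustration only.
MGMT_API = "https://mgmt.example.edu/api"

def healthy_enough_to_provision(cluster, token):
    """Call a shared health-check service as one step in an automation routine.

    The same underlying service could just as well be driven from a portal
    or a command line; the point is one service with several front ends.
    """
    resp = requests.get(
        f"{MGMT_API}/operations/health/{cluster}",
        headers={"Authorization": f"Bearer {token}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json().get("status") == "green"
```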
To build that platform with the shared-services model on top, we need to start re-architecting some of our products on the back end, so that we have a common orchestration engine, a common DR backup, and a common policy engine. You don’t want one tool to undo the work that another tool did yesterday. You can’t have conflicting robots going out and doing automated tasks.
The general idea is to try to
further consolidate these different management functions into a single
platform. The overall goal is to try to help organizations maintain
control, but then also increase flexibility and speed for their business
users.
Gardner: Brian, is that something that
you think is going to be on your radar? Is management so distributed now
that you're looking for a more consolidated approach that’s inclusive?
Pietrewicz: That would be wonderful. We're doing things many different ways. If you take the example of orchestration, we are using
Orchestrator,
PowerShell,
Perl, and starting to experiment with
Puppet.
It
would be really good if you could have one standardized way that you
approach orchestration, as an example, and how that might tie into all
the other pieces for back-end management, rather than handling it several
different ways. As Kurt was mentioning, one part starts to step on
another part. Having that be consolidated and consistent would be a huge
value.
Milne: The other part of the strategy is
also to make that work across environments. So the same tools and
services would be available if you are provisioning up to Amazon or to
your private cloud or hybrid cloud service, and even different
hypervisors.
We're fully aware of the heterogeneous nature of the modern data center. So we're shifting to try to create that kind of powerful common management stack, with a unified management experience across all of those environments. It’s kind of a nirvana. When we talk to people, they say that’s exactly what they want. So our vision is to march toward delivering on that.
Gardner: Kurt, I am trying to recall from VMworld whether this was offered on-premises, as a service from a cloud, or some combination?
Service offerings
Milne:
That’s the other interesting part of this. We're starting to go down the path of offering a number of our management products as a service. For example, at VMworld, we announced the availability of a beta for our vCAC product as software as a service (SaaS), so you can, without installing any software, get a service portal, get that workflow and policy engine, and deploy infrastructure services across different environments.
We'll be rolling out
betas for our other products in subsequent quarters over the next year
or so. Then potentially we could have the SaaS services interact with
and combine with the services that are available through the products
that are installed on-premise. Our goal is to get these out there and
then understand what the best use cases are, but that kind of mix and
match is part of the vision.
Gardner: It’s interesting. We might have a reverse boomerang when it comes to the management of all of this. Does that sound appealing, Brian? Is that something you would look to as a cloud service -- comprehensive management?
Pietrewicz: Absolutely, but it’s largely dependent on
return on investment (ROI).
It’s that balance of, when you get to a certain level in an IT shop,
it’s sometimes cheaper to do things in-house than it is to outsource it,
and sometimes not. You have to do the analysis on the ROI on what makes
more sense to bring it in or to use a SaaS.
As an
example, we completely outsourced all of our email, because it’s a lot
of work. It's very simple and easy to do as a SaaS solution, but it’s a
lot more work to do in-house. It’s definitely something that we would
look into.
Milne: In a mid-sized organization
that might have 300 different applications that the IT organization
supports, maybe 50 of those are IT tools. Already we've seen progress
with companies like
ServiceNow
that have a SaaS-based service desk. It makes sense to start to turn
more of those management products into a SaaS delivery model.
Gardner: Brian, any thoughts about others who are starting to move in your direction -- perhaps building their own Lobo Cloud, their own portal rationalizing these services, and being able to measure them better? With 20/20 hindsight, what could you recommend for them as they go about this? Any lessons learned you could share?
Process orientation
Pietrewicz:
The biggest lesson learned, without a doubt, is the focus on the
process orientation, the ITIL model. The technology is really not that
hard. It’s determining what your service is, what are you trying to
deliver, and then how do you build that into a consistently delivered
service, complete with SLAs and service descriptions that meet the
customer needs. That's the most difficult part.
The
technical folks can definitely sling the technology. That doesn’t seem
to be that big of a deal. The partners and providers do a very good job
of putting together products that make it happen, but the hard part is
defining the processes and defining the services and making sure that
they are meeting the customer needs.
Gardner:
Kurt, any thoughts in reaction to what Brian said in terms of getting
started on the right path around cloud rationalization of your IT
organization?
Milne: One of the things that I've
seen is a lot of organizations go through this process that Brian has
described, trying to clearly define their services and figure out which
parts of those services they're going to automate.
A lot of organizations start that service definition effort from an inside-out perspective: get a bunch of IT guys together and try to define what they do on a daily basis as a service. That's hard.
The easier approach is just to go talk to your customers and users and ask, "If I were going to give you a button you could click to get what you need, what would you put behind the button?" Then, you define your services more from an outside-in perspective. That seems to be where companies end up anyway, and you shortcut a lot of teeth-gnashing and internal meetings when you do it that way.
Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: VMware.