Thursday, October 8, 2020

The IT intelligence foundation for digital business transformation rests on HPE InfoSight AIOps

 

The next BriefingsDirect podcast explores how artificial intelligence (AI) increasingly supports IT operations.

One of the most successful uses of machine learning (ML) and AI for IT efficiency has been the InfoSight technology developed at Nimble Storage, now part of Hewlett Packard Enterprise (HPE).


Initially targeting storage optimization, HPE InfoSight has emerged as a broad and inclusive capability for AIOps across an expanding array of HPE products and services.

Please welcome a Nimble Storage founder, along with a cutting-edge machine learning architect, to examine the expanding role and impact of HPE InfoSight in making IT resiliency better than ever.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about the latest IT operations solutions that help companies deliver agility and edge-to-cloud business continuity, we’re joined by Varun Mehta, Vice President and General Manager for InfoSight at HPE and founder of Nimble Storage, and David Adamson, Machine Learning Architect at HPE InfoSight. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Varun, what was the primary motivation for creating HPE InfoSight? What did you have in mind when you built this technology?

Mehta: Various forms of call home were already in place when we started Nimble, and that’s what we had set up to do. But then we realized that the call home data was used to do very simple actions. It was basically to look at the data one time and try and find problems that the machine was having right then. These were very obvious issues, like a crash. If you had had any kind of software crash, that’s what call home data would identify.

We found that if, instead of just scanning the data one time, we could store it in a database and actually look for problems over time in areas wider than just a single use, we could come up with something very interesting. Part of the problem until then was that a database that could store this amount of data cheaply was just not available, which is why people would just do the one-time scan.

The enabler was that a new database became available. We found that rather than just scan once, we could put everyone’s data into one place, look at it, and discover issues across the entire population. That was very powerful. And then we could do other interesting things using data science such as workload planning from all of that data. So the realization was that if the databases became available, we could do a lot more with that data.

Gardner: And by taking advantage of that large data capability and the distribution of analytics through a cloud model, did the scope and relevancy of what HPE InfoSight did exceed your expectations? How far has this now come?

Mehta: It turned out that this model was really successful. They say that, “imitation is the sincerest form of flattery.” And that was proven true, too. Our customers loved it, our competitors found out that our customers loved it, and it basically spawned an entire set of features across all of our competitors.

The reason our customers loved it -- followed by our competitors -- was that it gave people a much broader idea of the issues they were facing. We then found that people wanted to expand this envelope of understanding that we had created beyond just storage.

Data delivers more than a quick fix

And that led to people wanting to understand how their hypervisor was doing, for example. And so, we expanded the capability to look into that. People loved the solution and wanted us to expand the scope into far more than just storage optimization.

Gardner: David, you hear Varun describing what this was originally intended for. As a machine learning architect, how has HPE InfoSight provided you with a foundation to do increasingly more when it comes to AIOps, dependability, and reliability of platforms and systems?

Adamson: As Varun was describing, the database is full of data that not only tracks everything longitudinally across the installed base, but also over time. The richness of that data set gives us an opportunity to come up with features that we otherwise wouldn’t have conceived of if we hadn’t been looking through the data. Also very powerful from InfoSight’s early days was the proactive nature of the IT support because so many simple issues had now been automated away.
 

That allowed us to spend time investigating more interesting and advanced problems, which demanded ML solutions. Once you’ve cleaned up the Pareto curve of all the simple tasks that can be automated with simple rules or SQL statements, you uncover problems that take longer to solve and require a look at time series and telemetry that’s quantitative in nature and multidimensional. That data opens up the requirement to use more sophisticated techniques in order to make actionable recommendations.

Gardner: Speaking of actionable, something that really impressed me when I first learned about HPE InfoSight, Varun, was how quickly you can take the analytics and apply them. Why has that rapid capability to dynamically impact what’s going on from the data proved so successful? 

Support to succeed

Mehta: It turned out to be one of the key points of our success. I really have to compliment the deep partnership that our support organization has had with the HPE InfoSight team.

The support team right from the beginning prided themselves on providing outstanding service. Part of the proof of that was incredible Net Promoter Scores (NPS), an independent measurement of how satisfied customers are with our products. Nimble’s NPS was 86, which is even higher than Apple’s. We prided ourselves on providing a really strong support experience to the customer.

Whenever a problem would surface, we would work with the support team. Our goal was for a customer to see a problem only once. And then we would rapidly fix that problem for every other customer. In fact, we would fix it preemptively so customers would never have to see it. So, we evolved this culture of identifying problems, creating signatures for these problems, and then running everybody’s data through the signatures so that customers would be preemptively inoculated from these problems. That’s why it became very successful.
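To make the signature idea concrete, here is a minimal sketch of running every system's telemetry through a shared set of problem signatures. It is an illustration only, not InfoSight's implementation: the `Signature` class, the `scan_installed_base` function, and the `write_retries` telemetry field are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Signature:
    """A known problem, expressed as a test over one system's telemetry (hypothetical shape)."""
    name: str
    matches: Callable[[dict], bool]
    advice: str


def scan_installed_base(systems: Dict[str, dict],
                        signatures: List[Signature]) -> Dict[str, List[str]]:
    """Return, for each affected system, the names of the signatures it matches."""
    findings: Dict[str, List[str]] = {}
    for system_id, telemetry in systems.items():
        hits = [sig.name for sig in signatures if sig.matches(telemetry)]
        if hits:
            findings[system_id] = hits
    return findings


# Hypothetical signature written after the first customer hits a problem.
RETRY_STORM = Signature(
    name="retry_storm",
    matches=lambda t: t.get("write_retries", 0) > 10,
    advice="Check the network path and apply the vendor patch.",
)
```

Once a signature exists, every other system in the population can be checked against it before its owner notices anything, which is the "see a problem only once" goal described above.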

Gardner: It hasn’t been that long since we were dealing with red light-green light types of IT support scenarios, but we’ve come a long way. We’re not all the way to fully automated, lights-out, machines running machines operations.

David, where do you think we are on that automated support spectrum? How has HPE InfoSight helped change the nature of systems’ dependability, getting closer to that point where they are more automated and more intelligent?

Adamson: The challenge with fully automated infrastructure stems from the variety of different components in the environments -- and all of the interoperability among those components. If you look at just a simple IT stack, there are typically applications on top of virtual machines (VMs), on top of hosts -- which may or may not have independent storage attached -- and then the networking of all these components. That’s discounting all the different applications and various software components required to run them.

There are just so many opportunities for things to break down. In that context, you need a holistic perspective to begin to realize a world in which that entire unit is managed in a comprehensive way. And so we strive for observability models and services that collect all the data from all of those sources. If we can get that data in one place to look at the interoperability issues, we can follow the dependency chains.

But then you need to add intelligence on top of that, and that intelligence needs to not only understand all of the components and their dependencies, but also what kinds of exceptions can arise and what is important to the end users.

So far, with HPE InfoSight, we go so far as to pull in all of our subject matter expertise into the models and exception-handling automation. We may not necessarily have upfront information about what the most important parts of your environment are. Instead, we can stop and let the user provide some judgment. It’s truly about messaging to the user the different alternative approaches that they can take. As we see exceptions happening, we can provide those recommendations in a clean and interpretable way, so [the end user] can bring context to bear that we don’t necessarily have ourselves.

Gardner: And the timing for these advanced IT operations services is very auspicious. Just as we’re now able to extend intelligence, we’re also at the point where we have end-to-end requirements -- from the edge, to the cloud, and back to the data center.

And under such a hybrid IT approach, we are also facing a great need for general digital transformation in businesses, especially as they seek to be agile and best react to the COVID-19 pandemic. Are we able yet to apply HPE InfoSight across such a horizontal architecture problem? How far can it go?

Seeing the future: End-to-end visibility

Mehta: Just to continue from where David started, part of our limitation so far has been from where we began. We started out in storage, and then as Nimble became part of HPE, we expanded it to compute resources. We targeted hypervisors; we are expanding it now to applications. To really fix problems, you need to have end-to-end visibility. And so that is our goal, to analyze, identify, and fix problems end-to-end.

That is one of the axes of development we’re pursuing. The other axis of development is that things are just becoming more and more complex. As businesses require their IT infrastructure to become highly adaptable, they also need scalability, self-healing, and enhanced performance. To achieve this, there is greater and greater complexity. And part of that complexity has been driven by really poor utilization of resources.

Go back 20 years and we had standalone compute and storage machines that were not individually very well-utilized. Then you had virtualization come along, and virtualization gave you much higher utilization -- but it added a whole layer of complexity. You had one machine, but now you could have 10 VMs in that one place.

Now, we have containers coming out, and that’s going to further increase complexity by a factor of 10. And right on the horizon, we have serverless computing, which will increase the complexity another order of magnitude.

So, the complexity is increasing, the interconnectedness is increasing, and yet the demands on businesses to stay agile and competitive and scalable are also increasing. It’s really hard for IT administrators to stay on top of this. And that’s why you need end-to-end automation and to collect all of the data to actually figure out what is going on. We have a lot of work cut out for us.
 
There is another area of research, and David spends a lot of time working on this, which is you really want to avoid false positives. That is a big problem with lots of tools. They provide so many false positives that people just turn them off. Instead, we need to work through all of your data to actually say, “Hey, this is a recommendation that you really should pay attention to.” That requires a lot of technology, a lot of ML, and a lot of data science experience to separate the wheat from the chaff.

 

One of the things that’s happened with the COVID-19 pandemic response is the need for very quick responses. For example, people have had to quickly set up web sites for contact tracing, reporting on the disease, and for vaccine use. That shows how much faster people now need digital solutions -- and it’s just not possible without serious automation.

Gardner: Varun just laid out the complexity and the demands for both the business and the technology. It sounds like a problem that mere mortals cannot solve. So how are we helping those mere mortals to bring AI to bear in a way that allows them to benefit -- but, as Varun also pointed out, allows them to trust that technology and use it to its full potential?

Complexity requires automated assistance

Adamson: The point Varun is making is key. If you are talking about complexity, we’re well beyond the point where people could realistically expect to log in to each machine to find, analyze, or manage exceptions that happen across this ever-growing, complex regime.

Even if you’re at a place where you have the observability solved, and you’re monitoring all of these moving parts together in one place -- even then, it easily becomes overwhelming, with pages and pages of dashboards. You couldn’t employ enough people to monitor and act to spot everything that you need to be spotting.

You need to be able to trust automated exception [finding] methods to handle the scope and complexity of what people are dealing with now. So that means doing a few things.

People will often start with naïve thresholds. They create manual thresholds to give alerts to handle really critical issues, such as all the servers went down.

But there are often more subtle issues that show up that you wouldn’t necessarily have anticipated setting a threshold for. Or maybe your threshold isn’t right. It depends on context. Maybe the metrics that you’re looking at are just the raw metrics you’re pulling out of the system and aren’t even the metrics that give a reliable signal.


What we see from the data science side is that a lot of these problems are multi-dimensional. There isn’t just one metric that you could set a threshold on to get a good, reliable alert. So how do you do that right?
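A toy illustration of why one threshold per metric can miss a multidimensional problem: the sketch below scores a two-metric sample against a healthy baseline using the squared Mahalanobis distance, which accounts for how the metrics normally move together. This technique is swapped in purely for illustration and is not claimed to be what InfoSight uses; the function names and the (IOPS, latency) pairing are assumptions.

```python
from statistics import mean


def fit_baseline(points):
    """Estimate the mean and sample covariance of 2-D healthy samples."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    mx, my = mean(xs), mean(ys)
    n = len(points)
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, baseline):
    """Squared Mahalanobis distance of a 2-D point from the baseline."""
    (mx, my), (sxx, sxy, syy) = baseline
    dx, dy = point[0] - mx, point[1] - my
    det = sxx * syy - sxy * sxy
    if det == 0:
        raise ValueError("baseline metrics are perfectly correlated")
    # Uses the closed-form inverse of the 2x2 covariance matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


# Hypothetical healthy samples: latency (ms) rises with load (IOPS).
healthy = [(100, 1.0), (200, 2.1), (300, 2.9), (400, 4.2), (500, 4.8), (600, 6.1)]
baseline = fit_baseline(healthy)
```

With this baseline, a sample such as `(200, 5.0)` stays inside the range each metric has shown on its own, so neither a load threshold nor a latency threshold set at the observed maximum would fire, yet its distance score is far higher than that of a typical sample like `(400, 4.0)` because high latency at low load breaks the usual relationship.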

 

For the problems that IT support provides to us, we apply automation and we move down the Pareto chart to solve things in priority of importance. We also turn to ML models. In some of these cases, we can train a model from the installed base and use a peer-learning approach, where we understand the correlations between problem states and indicator variables well enough so that we can identify a root cause for different customers and different issues.

Sometimes though, if the issue is rare enough, scanning the installed base isn’t going to give us a high enough signal-to-noise ratio. Then we can take some of these curated examples from support and do a semi-supervised loop. We basically say, “We have three examples that are known. We’re going to train a model on them.” Maybe it’s a few tens of thousands of data points, but they still come from just three examples, so there’s correlation among them that we are worried about.


In that case we say: “Let me go fishing in that installed base with these examples and pull back what else gets flagged.” Then we can turn those back over to our support subject matter experts and say, “Which of these really look right?” And in that way, you can move past the fact that your starting data set of examples is very small and you can use semi-supervised training to develop a more robust model to identify the issues.
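The loop described here can be sketched in a few lines. This is a deliberately simple stand-in, assuming a nearest-centroid "model" and a fixed distance cutoff; the real models are not described in the discussion. The names `semi_supervised_loop`, `expert_confirms`, and `radius` are hypothetical, and `expert_confirms` stands in for the support experts' review.

```python
from math import dist
from statistics import mean


def centroid(vectors):
    """Component-wise mean of a list of equal-length feature vectors."""
    return tuple(mean(column) for column in zip(*vectors))


def semi_supervised_loop(seed_positives, unlabeled, expert_confirms, radius, rounds=3):
    """Grow a small labeled set by alternating model 'fishing' and expert review."""
    positives = list(seed_positives)
    remaining = list(unlabeled)
    for _ in range(rounds):
        center = centroid(positives)          # "train" on what is known so far
        candidates = [v for v in remaining if dist(v, center) <= radius]
        if not candidates:
            break                             # nothing new gets flagged
        for v in candidates:
            remaining.remove(v)               # each candidate is reviewed once
            if expert_confirms(v):            # experts keep only real examples
                positives.append(v)
    return positives
```

Each round retrains on the confirmed examples, so the flagged region can shift as the labeled set grows, while rejected candidates are never added to the training set.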

Gardner: As you are refining and improving these models, one of the benefits in being a part of HPE is to access growing data sets across entire industries, regions, and in fact the globe. So, Varun, what is the advantage of being part of HPE and extending those datasets to allow for the budding models to become even more accurate and powerful over time?

Gain a global point of view

Mehta: Being part of HPE has enabled us to leapfrog our competition. As I said, our roots are in storage, but really storage is just the foundation of where things are located in an organization. There is compute, networking, hypervisors, operating systems, and applications. With HPE, we certainly now cover the base infrastructure, which is storage followed by compute. At some point we will bring in networking. We already have hypervisor monitoring, and we are actively working on application monitoring.

HPE has allowed us to radically increase the scope of what we can look at, which also means we can radically improve the quality of the solutions we offer to our customers. And so it’s been a win-win solution, both for HPE where we can offer a lot of different insights into our products, and for our customers where we can offer them faster solutions to more kinds of problems.

Gardner: David, anything more to offer on the depth, breadth, and scope of data as it’s helping you improve the models?

Adamson: I certainly agree with everything that Varun said. The one thing I might add is in the feedback we’ve received over time. And that is, one of the key things in making the notifications possible is getting us as close as possible to the customer experience of the applications and services running on the infrastructure.

We’ve done a lot of work to make sure we identify what look like meaningful problems. But we’re fundamentally limited if the scope of what we measure is only at the storage or hypervisor layer. So gaining additional measurements from the applications themselves is going to give us the ability to differentiate ourselves, to find the important exceptions to the end user, what they really want to take action on. That’s critical for us -- not sending people alerts they are not interested in but making sure we find the events that are truly business-critical.
 

Gardner: And as we think about the extensibility of the solution -- extending past storage into compute, ultimately networking, and applications -- there is the need to deal with the heterogeneity of architecture. So multicloud, hybrid cloud, edge-to-cloud, and many edges to cloud. Has HPE InfoSight been designed in a way to extend it across different IT topologies?

Across all architecture

Mehta: At heart, we are building a big data warehouse. You know, part of the challenge is that we’ve had this explosion in the amount of data that we can bring home. For the last 10 years, since InfoSight was first developed, the tools have gotten a lot more powerful. What we now want to do is take advantage of those tools so we can bring in more data and provide even better analytics.

The first step is to deal with all of these use cases. Beyond that, there will probably be custom solutions. For example, you talked about edge-to-cloud. There will be locations where you have good bandwidth, such as a colocation center, and you can send back large amounts of data. But if you’re sitting as the only compute in a large retail store like a Home Depot, for example, or a McDonald’s, then the bandwidth back is going to be limited. You have to live within that and still provide effective monitoring. So I’m sure we will have to make some adjustments as we widen our scope, but the key is having a really strong foundation and that’s what we’re working on right now.
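One common way to live within limited bandwidth is to summarize telemetry at the edge before sending it home. The sketch below is an assumption about how that could look, not a description of what HPE ships: the `summarize` function and the choice of (min, mean, max) per window are illustrative only.

```python
from statistics import mean


def summarize(samples, window):
    """Collapse raw samples into one (min, mean, max) tuple per window."""
    summaries = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]   # the last chunk may be shorter
        summaries.append((min(chunk), mean(chunk), max(chunk)))
    return summaries
```

Sending one summary per window instead of every sample trades detail for bandwidth; keeping the minimum and maximum alongside the mean preserves short spikes that averaging alone would hide.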

Gardner: David, anything more to offer on the extensibility across different types of architecture, of analyzing the different sources of analytics?

Adamson: Yes, originally, when we were storage-focused and grew to the hypervisor level, we discovered some things about the way we keep our data organized. If we made it more modular, we could make it easier to write simple rules and build complex models to keep turnaround time fast. We developed some experience and so we’ve taken that and applied it in the most recent release of recommendations into our customer portal.


We’ve modularized our data model even further to help us support more use cases from environments that may or may not have specific components. Historically, we’ve relied on having Nimble Storage as a hub for everything to be collected. But we can’t rely on that anymore. We want to be able to monitor environments that don’t necessarily have that particular storage device, and we may have to support various combinations of HPE products and other non-HPE applications.

Modularizing our data model to truly accommodate that is something we have started on, and I think we’re making good strides.

The other piece is in terms of the data science. We’re trying to leverage longitudinal data as much as possible, but we want to make sure we have a sufficient set of meaningful ML offerings. So we’re looking at unsupervised learning capabilities that we can apply to environments for which we don’t have a critical mass of data yet, especially as we onboard monitoring for new applications. That’s been quite exciting to work on.

Gardner: We’ve been talking a lot about the HPE InfoSight technology, but there also have to be considerations for culture. A big part of digital transformation is getting silos between people broken down.

Is there a cultural silo between the data scientists and the IT operations people? Are we able to get the IT operations people to better understand what data science can do for them and their jobs? And perhaps, also allow the data scientists to understand the requirements of a modern, complex IT operations organization? How is it going between these two groups, and how well are they melding?

IT support and data science team up

Adamson: One of the things that Nimble did well from the get-go was have tight coupling between the IT support engineers and the data science team. The support engineers were fielding the calls from the IT operations guys. They had their fingers on the pulse of what was most important. That meant not only building features that would help our support engineers solve their escalations more quickly, but also things that we can productize for our customers to get value from directly.

Gardner: One of the great ways for people to better understand a solution approach like HPE InfoSight is through examples.  Do we have any instances that help people understand what it can do, but also the paybacks? Do we have metrics of success when it comes to employing HPE InfoSight in a complex IT operations environment?

Mehta: One of the examples I like to refer to was fairly early in our history but had a big impact. It was at the University Hospital of Basel in Switzerland. They had installed a new version of VMware, and a few weeks afterward things started going horribly wrong with their implementation that included a Nimble Storage device. They called VMware and VMware couldn’t figure it out. Eventually they called our support team and, using InfoSight, our support team was able to figure it out really quickly. The problem turned out to be a result of a new version of VMware. If there was a holdup in the networking, some sort of bottleneck in their networking infrastructure, this VMware version would try really hard to get the data through.

So instead of submitting each write to the storage array once, it would try 64 times. Suddenly, their traffic went up by 64 times. There was a lot of pounding on the network, pounding on the storage system, and we were able to tell with our analytics that, “Hey, this traffic is going up by a huge amount.” As we tracked it back, it pointed to the new version of VMware that had been loaded. We then connected with the VMware support team and worked very closely with all of our partners to identify this bug, which VMware very promptly fixed. But, as you know, it takes time for these fixes to roll out to the field.
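A jump like this is the kind of change that stored history makes easy to spot. The sketch below compares the current write rate with the median of recent history; it is a minimal illustration under assumed names (`amplification_ratio`, `is_traffic_spike`) and an assumed alert factor of 10, not the analytics actually used in this case.

```python
from statistics import median


def amplification_ratio(history, current):
    """How many times larger the current rate is than the historical median."""
    baseline = median(history)
    if baseline <= 0:
        return None          # no meaningful baseline to compare against
    return current / baseline


def is_traffic_spike(history, current, factor=10.0):
    """Flag the sample when it exceeds the baseline by at least `factor`."""
    ratio = amplification_ratio(history, current)
    return ratio is not None and ratio >= factor
```

For a system that normally writes about 100 operations per second, a reading of 6,400 gives a ratio of 64, matching the retry behavior described above, while ordinary fluctuation such as 120 stays well under the alert factor.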

We were able to preemptively alert other people who had the same combination of VMware on Nimble Storage and say, “Guys, you should either upgrade to this new patch that VMware has made or just be aware that you are susceptible to this problem.”

So that’s a great example of how our analytics was able to find a problem, get it fixed very quickly -- quicker than any other means possible -- and then prevent others from seeing the same problem.

Gardner: David, what are some of your favorite examples of demonstrating the power and versatility of HPE InfoSight?

Adamson: One that comes to mind was the first time we turned to an exception-based model that we had to train. We had been building infrastructure designed to learn across our installed base to find common resource bottlenecks and identify and rank those very well. We had that in place, but we came across a problem that support was trying to write a signature for. It was basically a drive bandwidth issue.

But we were having trouble writing a signature that would identify the issue reliably. We had to turn to an ML approach because it was fundamentally a multidimensional problem. Looking across the data, we had probably 10 to 20 different metrics that we tracked per drive per minute on each system. We needed to, from those metrics, come up with a good understanding of the probability that this was the biggest bottleneck on the system. This was not a problem we could solve by just setting a threshold.

So we had to really go in and say, “We’re going to label known examples of these situations. We’re going to build the sort of tooling to allow us to do that, and we’re going to put ourselves in a regime where we can train on these examples and initiate that semi-supervised loop.”

We actually had two to three customers that hit that specific issue. By the time we wanted to put that in place, we were able to find a few more just through modeling. But that set us up to start identifying other exceptions in the same way.

We’ve been able to redeploy that pattern now several times to several different problems and solve those issues in an automated way, so we don’t have to keep diagnosing the same known flavors of problems repeatedly in the future.

Gardner: What comes next? How will AI impact IT operations over time? Varun, why are you optimistic about the future?

Software eats the world 

Mehta: I think having a machine in the loop is going to be required. As I pointed out earlier, complexity is increasing by leaps and bounds. We are going from virtualization to containers to serverless. The number of applications keeps increasing and demand on every industry keeps increasing. 

Andreessen Horowitz, a famous venture capital firm, once said, “Software is eating the world,” and really, it is true. Everything is becoming tied to a piece of software. The complexity of that is just huge. The only way to manage this and make sure everything keeps working is to use machines.

That’s where the challenge and opportunity is. Because there is so much to keep track of, one of the fundamental challenges is to make sure you don’t have too many false positives. You want to make sure you alert only when there is a need to alert. It is an ongoing area of research.

There’s a big future in terms of the need for our solutions. There’s plenty of work to keep us busy to make sure we provide the appropriate solutions. So I’m really looking forward to it.


There’s also another axis to this. So far, people have stayed in the monitoring and analytics loop and it’s like self-driving cars. We’re not yet ready for machines to take over control of our cars. We get plenty of analytics from the machines. We have backup cameras. We have radars in front that alert us if the car in front is braking too quickly, but the cars aren’t yet driving themselves.

 

It’s all about analytics yet we haven’t graduated from analytics to control. I think that too is something that you can expect to see in the future of AIOps once the analytics get really good, and once the false positives go away. You will see things moving from analytics to control. So lots of really cool stuff ahead of us in this space.

Gardner: David, where do you see HPE InfoSight becoming more of a game changer and even transforming the end-to-end customer experience where people will see a dramatic improvement in how they interact with businesses?

Adamson: Our guiding light in terms of exception handling is making sure that not only are we providing ML models that have good precision and recall, but we’re making recommendations and statements in a timely manner that come only when they’re needed -- regardless of the complexity.
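For readers less familiar with the two terms: precision is the share of alerts that point at real problems, and recall is the share of real problems that received an alert. A minimal sketch follows, with a hypothetical `precision_recall` helper that takes alerts and real problems as collections of identifiers.

```python
def precision_recall(alerts, real_problems):
    """Return (precision, recall); a value is None when it is undefined."""
    alerts, real = set(alerts), set(real_problems)
    true_positives = len(alerts & real)
    precision = true_positives / len(alerts) if alerts else None
    recall = true_positives / len(real) if real else None
    return precision, recall
```

If a tool raises four alerts and two of them match the three problems that really occurred, precision is 0.5 and recall is two-thirds. Low precision corresponds to the false-positive problem raised earlier in the discussion, where people end up turning noisy tools off.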

A lot of hard work is being put into making sure we make those recommendation statements as actionable and standalone as possible. We’re building a differentiator through the fact that we maintain a focus on delivering a clean narrative, a very clear-cut, “human readable text” set of recommendations. 

And that has the potential to save a lot of people a lot of time in terms of hunting, pecking, and worrying about what’s unseen and going on in their environments.

Gardner: Varun, how should enterprise IT organizations prepare now for what’s coming with AIOps and automation? What might they do to be in a better position to leverage and exploit these technologies even as they evolve?

Pick up new tools

Mehta: My advice to organizations is to buy into this. Automation is coming. Too often we see people stuck in the old ways of doing things. They could potentially save themselves a lot of time and effort by moving to more modern tools. I recommend that IT organizations make use of the new tools that are available.


HPE InfoSight is generally available for free when you buy an HPE product, sometimes with only the support contract. So make use of the resources. Look at the literature with HPE InfoSight. It is one of those tools that can be fire-and-forget, which is you turn it on and then you don’t have to worry about it anymore.

It’s the best kind of tool because we will come back to you and tell you if there’s anything you need to be aware of. So that would be the primary advice I would have, which is to get familiar with these automation tools and analytics tools and start using them.

Sponsor: Hewlett Packard Enterprise.


Wednesday, September 23, 2020

How Unisys ClearPath mainframe apps now seamlessly transition to Azure Cloud without code changes

When applications are mission-critical, where they are hosted matters far less than keeping them operating smoothly.

As many organizations face a ticking time bomb to modernize mainframe applications, one solution is to find a dependable, repeatable way to transition to a public cloud without degrading these vulnerable and essential systems of record.

The next BriefingsDirect cloud adoption discussion explores the long-awaited means to solve the mainframe to cloud transition for essential but aging applications and data. We’re going to learn how Unisys and Microsoft can deliver ClearPath Forward assets to Microsoft Azure cloud without risky code changes.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To learn more about the latest on-ramps to secure and agile public cloud adoption, we welcome Chuck Lefebvre, Senior Director of Product Management for ClearPath Forward at Unisys, and Bob Ellsworth, Worldwide Director of Mainframe Transformation at Microsoft. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Bob, what’s driving the demand nowadays for more organizations to run more of their legacy apps in the public cloud?

Ellsworth: We see that more and more customers are embracing digital transformation, and they are finding the cloud an integral part of their digital transformation journey. And when we think of digital transformation, at first it might begin with optimizing operations, which is a way of reducing costs by taking on-premises workloads and moving them to the cloud.

But the journey just starts there. Customers now want to further empower employees to access the applications they need to be more efficient and effective, to engage with their customers in different ways, and to find ways of using cloud technologies to transform products, such as machine learning (ML), artificial intelligence (AI), and business intelligence (BI).

Gardner: And it’s not enough to just have some services or data in the cloud. It seems there’s a whole greater than the sum of the parts for organizations seeking digital transformation -- to get more of their IT assets into a cloud or digitally agile environment.

Destination of choice: Cloud

Ellsworth: Yes, that’s absolutely correct. The beauty is that you don’t have to throw away what you have. You can take legacy workloads such as ClearPath workloads and move those into the cloud, but then continue the journey by embracing new digital capabilities, such as advanced ML, AI, or BI services, so you can extend the usefulness and benefits of those legacy applications.
 

Gardner: And, of course, this has been a cloud adoption journey for well over 10 years. Do you sense that something is different now? Are there more means available to get more assets into a cloud environment? Is this a tipping point?

Ellsworth: It is a tipping point. We’ve seen -- especially around the mainframe, which is what I focus on -- a huge increase in customer interest and selection of the cloud in the last 1.5 years as the preferred destination. And one of the reasons is that Azure has absolutely demonstrated its capability to run these mission- and business-critical workloads.

Gardner: Are these cloud transitions emerging differently across the globe? Is there a regional bias of some sort? Is the public sector lagging or leading? How about vertical industries? Where is this cropping up first and foremost?

Ellsworth: We’re seeing it occur in all industries; in particular, financial services. We find there are more mainframes in financial services -- banking, capital markets, and insurance -- than in any other industry.

So we see a propensity there where, again, the cloud has become a destination of choice because of its capability to run mission- and business-critical workloads. But in addition, we’re seeing this in state and local governments, and in the US Federal Government. The challenge in the government sector is the long cycle it takes to get funding for these projects. So, it’s not a lack of desire, it’s more the time it takes to move through the funding process.

Gardner: Chuck, I’m still surprised all these years into the cloud journey that there’s still such a significant portion of data and applications that are not in the cloud environment. What’s holding things back? What’s preventing enterprises from taking advantage of cloud benefits?

Lefebvre: A lot of it is inertia. And in some cases, incorrect assumptions about what would be involved in moving. That’s what’s so attractive about our Unisys ClearPath solution. We can help clients move their ClearPath workloads without change. We take that ClearPath software stack from MCP initially and move it and re-platform it on Microsoft Azure.

And that application and data come across with no recompilation and no refactoring of the data; it’s a straightforward transition. So, I think now that we have that in place, that transition is going to go a lot smoother and really enable that move to occur.
 

I also second what Bob said earlier. We see a lot of interest from our financial partners. We have a large number of banking application partners running on our ClearPath MCP environment, and those partners are ready to go and help their clients as an option to move their workloads into the Azure public cloud.

Pandemic puts focus on agility

Gardner: Has the disruption from the coronavirus and the COVID-19 disease been influencing this transition? Is it speeding it up? Slowing it down? Maybe some other form of impact?

Lefebvre: I haven’t seen it having any effect, either acceleration or deceleration. In our client base, most of the people were primarily interested initially in ensuring their people could work from home with the environments they have in place.

I think now that that’s settled in, and they’ve sorted out their virtual private networks (VPNs) and their always-on access processes, perhaps we’ll see some initiatives evolving. I think, initially, it was just businesses supporting their employees working from home.

My perspective is that that should be enabled equally as well, whether they are running their backend systems of record in a public cloud or on-premises. Either way would work for them.

Gardner: Bob, at Microsoft, are you seeing any impact from the pandemic in terms of how people are adopting cloud services?

Ellsworth: We’re actually seeing an increase in customer interest and adoption of cloud services because of COVID-19. We’re seeing that in particular in some of our solutions such as Teams for doing collaboration and webinars, and connecting with others remotely. We’re seeing a big increase there.

And Office 365, we’ve seen a huge increase in deployments of customers using the Office 365 technology. In addition, Azure; we’ve also seen a big increase in Azure consumption as customers are dealing with the application growth and requirements of running these applications.

As far as new customers that are considering moving to the cloud, I had thought originally, back in March when this was starting to hit, that our conversations would slow down as people dealt with more immediate needs. But, in fact, it was about a two-to-three-week slow down. But now, we’re seeing a dramatic increase in interest in having conversations about what are the right solutions and methods to move the workloads to the cloud.

So, the adoption is accelerating as customers look for ways to reduce cost, increase agility, and find new ways of running the workloads that they have today.

Gardner: Chuck, another area of impact in the market is around skills. There is often either a lack of programmers for some of these older languages or the skills needed to run your own data centers. Is there a skill factor that’s moving the transition to cloud?

Lefebvre: Oh, there certainly is. One of the attractive elements of a public cloud is that the infrastructure layer of the IT environment is managed by the cloud provider. So as we see our clients showing interest in moving to the public cloud -- first with things like, as Bob said, Office 365 and maybe file shares with SharePoint -- they are now looking at doing that for mainframe applications. And when they do that, they no longer have to worry about finding the talent to do the care and feeding of that infrastructure. As we move those clients in that direction, we’re going to take care of that ClearPath infrastructure, the day-to-day management of that environment, and that will be included as part of our offering.
 

We expect most clients -- rather than managing it themselves in the cloud -- will defer to us, and that will free up their staff to do other things. They will have retirements, but less risk.

Gardner: Bob, another issue that’s been top-of-mind for people is security. One of the things we found is that security can be a tough problem when you are transitioning, when you change a development environment, go from development to production, or move from on-premises to cloud. 

How are we helping people remain secure during a cloud transition, and also perhaps benefit from a better security posture once they make the transition?

Time to transition safely, securely

Ellsworth: We always recommend making security part of the planning process. When you’re thinking of transforming from a datacenter solution to the cloud, part of that planning is for the security elements. We always look to collaborate with our partners, such as Unisys, to help define that security infrastructure and deployment.

What’s great about the Azure solutions is we’ve focused on hybrid as the way of deploying customers’ workloads. Most customers aren’t ready to move everything to the cloud all at the same time. For that reason, and with the fact that we focus on hybrid, we allow a customer to deploy portions of the workload to the cloud and the other portions in their data center. Then, over time, they can transition to the cloud.

But during that process, supporting your high levels of security for user access, identity management, or even controls of access to the right applications and data -- that’s all done through planning and using technologies such as Microsoft Active Directory and synchronization with Azure Active Directory. So that planning is so important to ensure successful deployments and the high levels of security that customers require.

Gardner: Chuck, anything to offer on the security?

Lefebvre: Yes, we’ll be complementing everything Bob just described with our Unisys Stealth technology. It allows always-on access and isolation capabilities for deployment of any of our applications from Unisys, but in particular the ClearPath environment. And that can be completely deployed in Azure or, as Bob said, in a hybrid environment across an enterprise. So we are excited about that deployment of Stealth to complement the rigor that Microsoft applies to the security planning process.

Gardner: We’ve described what’s driving organizations to the cloud, the fact that it’s accelerating, that there’s a tipping point in what adoption can be accomplished safely and reliably. We’ve also talked about what’s held people back and their challenges.

Let’s now look at what’s different about the latest solutions for the cloud transition journey. For Unisys, Chuck, how are your customers reconciling the mainframe past with the cloud future?

No change in digital footprint

Lefebvre: We are able to transition ClearPath applications with no change. It’s been roughly 10 years since we’ve been deploying these systems on Intel platforms, and in the case of MCP hosting it on a Microsoft Windows Server kernel. That’s been in place under our Unisys Libra brand for more than 10 years now.

In the last couple of years, we’ve also been allowing clients to deploy that software stack on virtualized servers of their choice: on Microsoft Hyper-V and the VMware virtualization platforms. So it’s a natural transition for us to move that and offer that in Azure cloud. We can do that because of the layered approach in our technology. It’s allowed us to present an approach to our clients that is very risk-free, very straightforward.

The ClearPath software stack sits on a Windows kernel, which is also at the foundation level offered by the Azure hybrid infrastructure. The applications therefore don’t change a bit, literally. The digital footprint is the same. It’s just running in a different place, initially as platform-as-a-service (PaaS).

The cloud adoption transition is really a very low-risk, safe, and efficient journey to the public cloud for those existing solutions that our clients have on ClearPath.

Gardner: And you described this as an ongoing logical, cascading transition -- standing on the shoulders of your accomplishments -- and then continuing forward. How was that different from earlier migrations, or a lift-and-shift, approach? Why is today’s transition significantly different from past migrations?

Lefebvre: Well, a migration often involves third-parties doing a recompilation, a refactoring of the application, so taking the COBOL code, recompiling it, refactoring it into Java, and then breaking it up, and moving the data out of our data formats and into a different data structure. All of those steps have risk and disruption associated with them. I’m sure there are third-parties that have proven that. That can work. It just takes a long time and introduces risk.

For Unisys ClearPath clients who have invested years and years in those systems of record, that entire stack can now run in a public cloud using our approach -- as I said before -- with absolutely not a single bit of change to the application or the data.

Gardner: Bob, does that jibe with what you are seeing? Is the transition approach as Chuck described it an advantage over the migration process he described?

Ellsworth: Yes, Chuck described it very well. We see the very same thing. What I have found -- and I’ve been working with Unisys clients since I joined Microsoft in 2001, early on going to the Unisys UNITE conference -- is that Unisys clients are very committed and dedicated to their platform. They like the solutions they are using. They are used to using those developer tools. They have built up the business-critical, mission-critical applications and workloads.

For those customers that continue to be committed to the platform, absolutely, this kind of what I call “re-platforming” could easily be called a “transition.” You are taking what you currently have and simply moving it onto the cloud. It is absolutely the lowest risk, the least cost, and the quickest time-to-deployment approach.

For those customers, just like with every platform, when you have an interest in transforming to a different platform, there are other methods available. But I would say the vast majority of committed Unisys customers want to stay on the platform, and this provides the fastest way to get to the cloud -- with the least risk and the quickest benefits.

Gardner: Chuck, the process around cloud adoption has been going on for a while. For those of us advocating for cloud 10 or 12 years ago, we were hoping that it would get to the point where it would be a smooth transition. Tell me about the history and the benefits of how ClearPath Forward and Azure have come together specifically. How long have Microsoft and Unisys been at this? Why is now, as we mentioned earlier, a tipping point?

Lefebvre: We’ve been working on this for a little over a year. We did some initial work with two of our financial application partners and North America banking partners and the initial testing was very positive. Then as we were finishing our engineering work to do the validation, our internal Unisys IT organization, which operates about 25 production applications to run the business, went ahead in parallel with us and deployed half of those on MCP in Azure, using the very process that I described earlier.

Today, they are running 25 production applications. About half of them have been there for nine months and the other half for the last two months. They are supporting things like invoicing our customers and tracking our supply chain status, so a number of critical applications.

We have taken that journey not just from an engineering point of view, but we’ve proven it to ourselves. We drank our own champagne, so to speak, and that’s given us a lot of confidence. It’s the right way to go, and we expect our clients will see those benefits as well.

Gardner: We haven’t talked about the economics too much. Are you finding, now that you’ve been doing this for a while, that there is a compelling economic story? A lot of people are fearful that a transition or migration would be very costly, that they won’t necessarily save anything by doing this, and so maybe are resistant. But what’s the dollars-and-cents impact that you have been seeing now that you’ve been transitioning ClearPath to Azure for a while?

Rapid returns

Lefebvre: Yes, there are tangible financial benefits that our IT organization has measured. In these small isolated applications, they calculated about a half-a-million dollars in savings across three years in their return on investment (ROI) analysis. And that return was nearly immediate because the transition for them was mostly about planning the outage period to ensure a non-stop operation and make sure we always supported the business. There wasn’t actually a lot of labor, just more planning time. So that return was almost immediate.

Gardner: Bob, anything to offer on the economics of making a smooth transition to cloud?

Ellsworth: Yes, absolutely. I have found a couple of catalysts for customers as far as cost savings. If a customer is faced with a potential hardware upgrade -- perhaps the server they are running on is near end-of-life -- by moving the workload to the cloud and only paying for the consumption of what you use, it allows you to avoid the hardware upgrade costs. So you get some nice and rapid benefits in cost avoidance.

In addition, for workloads such as test and development environments, or user acceptance testing environments, in addition to production uses, the beauty of the cloud pricing is you only pay for what you are consuming.

 

So for those test and development systems, you don’t need to have hardware sitting in the corner waiting to be used during peak periods. You can spin up an environment in the cloud, do all of your testing, and then spin it back down. You get some very nice cost savings by not having dedicated hardware for those test and development environments.

Gardner: Let’s dig into the technology. What’s under the hood that’s allowing this seamless cloud transition, Chuck?

Secret sauce

Lefebvre: Underneath the hood is the architecture that we have transformed to over the last 10 years where we are already running our ClearPath systems on Intel-based hardware on a Microsoft Windows Server kernel. That allows that environment to be used and re-platformed in the same manner.

To accomplish that, originally, we had some very clever technology that allows the Unisys compilers generating unique instructions to be emulated on an Intel-based, Windows-based server.

That’s really the fundamental underpinning that first allowed those clients to run on Intel servers instead of on proprietary Unisys-designed chips. Once that’s been completed, we’re able to be much more flexible on where it’s deployed. The rigor to which Microsoft has ensured that Windows is Windows -- no matter if it’s running on a server you buy, whether it’s virtualized on Hyper-V, or virtualized in Azure -- really allows us to achieve that seamless operation of running in any of those three different models and environments.

Gardner: Where do you see the secret sauce, Bob? Is it the capability to have Windows be pure, if you will, across the hybrid spectrum of deployment models?

Ellsworth: Absolutely, the secret sauce as Chuck described was that transformation from proprietary instruction sets to standard Intel instruction sets for their systems, and then the beauty of running today on-premises on Hyper-V or VMware as a virtual machine (VM). 

And then the great thing is with the technologies available, it’s very, very easy to take VMs running in the data center and migrate them to infrastructure as a service (IaaS) VMs running in the cloud. So, seamless transformation and making that migration.

You’re taking everything that’s running in your production system, or test and development systems, and simply deploying them up in the cloud’s VM instead of on-premises. So, a great process. Definitely, the earlier investment that was made allows that capability to be able to utilize the cloud.

Gardner: Do you have early adopters who have gone through this? How do they benefit?

Private- and public-sector success stories

Lefebvre: As I indicated earlier, our own Unisys IT operation has many production applications running our business on MCP. Those have all been moved from our own data center on an MCP Libra system to now running in the Microsoft Azure cloud. Our Unisys IT organization has been a long-time partner and user of Microsoft Office 365 and SharePoint in the cloud. Everything has now moved. This, in fact, was one of the last remaining Unisys IT operations that was not in the public cloud. That was part of our driver, and they are achieving the benefits that we had hoped for.

We also have two external clients. A banking client is about to deploy a disaster recovery (DR) instance of their on-premises MCP banking application. That’s coming from our partner, Fiserv. Fiserv’s premier banking application is now available for running in Azure on our MCP systems. One of the clients is choosing to host a DR instance in Azure to support their on-premises production workload. They like that because, as Bob said, they only have to pay for it when they fire it up if they need to use that DR environment.

We have another large state government project that we’re just about to sign, where that client will transition some of their ClearPath MCP workload to an Azure public cloud and manage it there.

Once that contract is signed and we get agreement from that organization, we will be using that as one of our early use case studies.

Gardner: The public sector, with all of their mainframe apps, seems like a no-brainer to me for these transitions to the cloud. Any examples from the public sector that illustrate that opportunity?

Ellsworth: We have a number of customers, specifically on the Unisys MCP platform, that are evaluating moving their workloads from their data centers into the cloud. We don’t have a production system as far as I know yet, but they’re in the late stages of making that decision.

There are so many ways of utilizing the cloud, for things like DR, at a very low cost, instead of having to have a separate data center or failover system. Customers can even leave their production on-premises in the short term, stand up their test and development in the cloud, and run the MCP system that way.

And then, once they’re in the cloud, they gain the capability to set up a high-availability DR system or high-availability production system, either within the same Azure data center, or failover from one system to another if they have an outage, and all at a very low cost. So, there are great benefits.

One other benefit is elasticity. When I talk to customers, they say, “Well, gee, I have this end-of-month process and I need a larger mainframe then because of those occasional higher capacity requirements.” Well, the beauty of the cloud is the capability to grow and shrink those VMs when you need more capacity for such an end-of-month process, for example.

Again, you don’t have to pre-purchase the hardware. You really only pay for the consumption of the capacity when you need it. So, there are great advantages and that’s what we talk to customers about. They can get benefits from considering deploying new systems in the cloud. Those are some great examples of why we’re in late-stage conversations with several customers about deploying the solution.

Increased data analytics

Gardner: I suppose it’s a little bit early to address this, but are there higher-order benefits when these customers do make the cloud transition? You mentioned earlier AI, ML, and getting more of your data into an executable environment where you can take advantage of analytics across more and larger data sets.

Is there another shoe to drop when it comes to the ROI? Will they be able to do things with their data that just couldn’t have been done before, once you make a transition to cloud?

Ellsworth: Yes, that’s absolutely correct. When you move the systems up to the cloud, you’re now closer to all the new workloads and the advanced cloud services you can utilize to, for example, analyze all that data. It’s really about turning more data into intelligent action. 

Now, if you think back to the 1980s and 1990s, or even the 2000s, when you were building custom applications, you had to pretty much code everything yourselves. Today, the way you build an application is to consume services. There’s no reason for a customer to build an ML application from scratch. Instead, you consume ML services from the cloud. So, once you’re in the cloud, it opens up a world of possibilities for continuing that digital business transformation journey.

Lefebvre: And I can confirm that that’s a key element for our product proposition as well as from a ClearPath point of view. We have some existing technology, a particular component called Data Exchange, that does an outstanding change and data capture model. We can pump the data coming into that backend system of record and, using Kafka, for example, feed that data directly into an AI or ML application that’s already in place.

One of the key areas for future investment -- now that we have done the re-platforming to PaaS and IaaS -- is extending our ePortal technology and other enabling software to ensure that these ClearPath applications really fit in well and leverage that cloud architecture. That’s the direction we see a lot of benefit in as we bring these applications into the public Azure cloud.

The cloud journey begins

Gardner: Chuck, if you are a ClearPath Forward client, you have these apps and data, what steps should you be taking now in order to put yourself in an advantageous position to make the cloud transition? Are there pre-steps before the journey? Or how should you be thinking in order to take advantage of these newer technologies?

Lefebvre: First of all, you should engage with the Unisys and Microsoft contacts that work with your organization to begin consultation on that journey. Data backup, data replication, DR, those elements around data and your policy with respect to data are the things that are likely going to change the most as you move to a different platform -- whether that’s from a Libra system to an on-premises virtualized infrastructure or to Azure.


What you’ve done for replication with a certain disk subsystem probably won’t be there any longer. It’ll be done in a different way, and likely it’ll be done in a better way. The way you do your backups will be done differently.

Now, we have partnered with Dynamic Solutions International (DSI) and they offer a virtualized virtual tape solution so that you can still use your backup scripts on MCP to do backups in exactly the same way in Azure. But you may choose to alter the way you do backups.

So, your strategy for data and how you handle that, which is so very important to these enterprise class mainframe applications, that’s probably the place where you’ll need to do the most thinking and planning, around data handling.

Gardner: For those familiar with BriefingsDirect, we like to end our discussions with a forward-looking vision, an idea of what’s coming next. So when it comes to migrating, transitioning, getting more into the cloud environments -- be that hybrid or pure public cloud -- what’s going to come next in terms of helping people make the transition but also giving them the best payoff when they get there?

The cloud journey continues

Ellsworth: It’s a great question, because you should think of the world of opportunity, of possibility. I look back at my 47 years in the industry and it’s been incredible to see the transformations that have occurred, the technology advancements that have occurred, and they are coming fast and furious. There’s nothing slowing it down.

And so, when we look at the cloud today, a lot of customers are still considering a cloud-first strategy for building any new solutions: You go into the cloud first and have to justify staying on-premises. Then customers move to a cloud-only strategy where they’re able to not only deploy new solutions but also migrate their existing workloads such as ClearPath up to the cloud. They get to a point where they are able to shut down most of what they run in their data centers, get out of the business of operating IT infrastructure, and have operations support provided for them as-a-service.

Next, they move into transforming through cultural change in their own staff. Today the people that are managing, maintaining, and running new systems will have an opportunity to learn new skills and new ways of doing things, such as cloud technology. What I see over the next two to three years is a continuation of that journey, the transformation not just of the solutions the customers use, but also the culture of the people that operate and run those solutions.

Gardner: Chuck, for your installed base across the world, why should they be optimistic about the next two or three years? What’s your vision for how their situation is only going to improve?

Lefebvre: Everything that we’ve talked about today is focused on our ClearPath MCP market and the technology that those clients use. As we go forward into 2021, we’ll be providing similar capabilities for our ClearPath OS 2200 client base, and we’ll be growing the offering.


Today, we’re starting with the low-end of the customer base: development, test, DR, and the smaller images. But as the Microsoft Azure cloud matures, as it scales up to handle our scaling needs for our larger clients, we’ll see that maturing. We’ll be offering the full range of our products in the Azure cloud, right on up to our largest systems.

That builds confidence across the board in our client base; in Microsoft and in Unisys. We want to crawl, then walk, and then run. That journey, we believe, is the safest way to go. And as I mentioned earlier, this initial workload transformation is occurring through a re-platforming approach. The real exciting work is bringing cloud-native capabilities to do better integration of those systems of record, with better systems of engagement, that the cloud-native technology is offering.

And we have some really interesting pieces under development now that will make that additional transformation straightforward. Our clients will be able to leverage that -- and continue to extend that back-end investment in those systems. So we’re really excited about the future.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsors: Unisys and Microsoft.

A discussion on how many organizations face a reckoning to move mainframe applications to a cloud model without degrading the venerable and essential systems of record. Copyright Interarbor Solutions, LLC, 2005-2020. All rights reserved.
