Thursday, July 16, 2020

AWS and Unisys join forces to accelerate and secure the burgeoning move to cloud

https://www.unisys.com/offerings/cloud-and-infrastructure-services/cloudforte/cloudforte-for-aws

A powerful and unique set of circumstances is combining in mid-2020 to make safe and rapid cloud adoption both more urgent and easier than ever.

Dealing with the novel coronavirus pandemic has pushed businesses to not only seek flexible IT hosting models, but to accommodate flexible work, hasten applications’ transformation, and improve overall security while doing so.

This next BriefingsDirect cloud adoption best practices discussion examines how businesses plan to further use cloud models to cut costs, manage operations remotely, and gain added capability to scale their operations up and down.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. 

To learn more about the latest on-ramps to secure an agile cloud adoption, please welcome Anupam Sahai, Vice President and Cloud Chief Technology Officer at Unisys, and Ryan Vanderwerf, Partner Solutions Architect at Amazon Web Services (AWS). The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.


Here are some excerpts:

Gardner: Anupam, why is going to the public cloud an attractive option more now than ever?

Sahai: There are multiple driving factors leading to these tectonic shifts. One is that the whole IT infrastructure is moving to the cloud for a variety of business and technology reasons. And then, as a result, the entire application infrastructure -- along with the underlying application services infrastructure -- is also moving to the cloud.

The reason is very simple because of what cloud brings to the table. It brings a lot of capabilities, such as providing scalability in a cost-effective manner. It makes IT and applications behave as a utility and obviates the need for every company to host local infrastructure, which otherwise becomes a huge operations and management challenge.

So, a number of business and technological factors are at work, along with the COVID-19 pandemic, which essentially makes us work remotely. Having cloud-based services and applications available as a utility makes them easy to consume and use.

Public cloud on everyone’s horizon 

Gardner: Ryan, have you seen in your practice over the past several months more willingness to bring more apps into the public cloud? Are we seeing more migration to the cloud?

Vanderwerf: We’ve definitely had a huge uptick in migration. As people can’t be in an office, things like workspaces and remote desktops have also seen a huge increase. People are trying to find ways to be elastic, cost-efficient, and make sure they’re not spending too much money.

Following up on what Anupam said, the reasons people are moving to the cloud haven’t changed. They have just been accelerated because they need agility and to speed up access to the resources they need. They need cost savings by not having to maintain data centers by themselves.

By being more elastic, they can provision only for what they’re using and not have stuff running and costing money when they don’t need it. They can also deploy globally in minutes across many regions, which is a big deal and allows people to innovate faster.

And right now, there’s a need to innovate faster, get more revenue, and cut costs – especially in times where fluctuation in demand goes up and down. You have to be ready for it.

Gardner: Yes, I recently spoke with a CIO who said that when the pandemic hit, they had to adjust workloads and move many from a certain set of apps that they weren’t going to be using as much to a whole other set that they were going to be using a lot more. And if it weren’t for the cloud, they just never would have been able to do that. So agility saved them a tremendous amount of hurt.

Anupam, why when we seek such cloud agility do we also have to think about lower risk and better security?

Sahai: Risk and security are critical because you’re talking about commercial, mission-critical workloads that have potentially sensitive data. As we move to the cloud, you should think about three different trajectories. And some of this, of course, is being accelerated because of the COVID-19 pandemic.
One of the cloud-migration trajectories, as Ryan said earlier, is the need for elastic computing, cost savings, performance, and efficiencies when building, deploying, and managing applications. But as we move applications and infrastructure to the cloud, there is a need to ensure that the infrastructure falls under what is called the shared responsibility model, where the cloud service provider protects and secures infrastructure up to a certain level and then the customers have their responsibility, a shared responsibility, to ensure that they’re protecting their workloads, applications, and critical data. They also have to comply with the regulations that those customers need to adhere to. 

In such a shared responsibility model, customers need to work very closely with the service providers, such as AWS, to ensure they are taking care of all security and compliance-related issues.

https://www.unisys.com/offerings/cloud-and-infrastructure-services/cloudforte/cloudforte-for-aws
You know, security breaches in the cloud -- while fewer than in on-premises deployments -- are still pretty rampant. That’s because some of the cloud security hygiene-related issues are still not being taken care of. That’s why solutions have to manage security and compliance for both the infrastructure and the apps as they move from on-premises to the cloud.

Gardner: Ryan, shared responsibility in practice can be complex when it’s hard to know where one party’s responsibility begins and ends. It cuts across people, process, and even culture.

When doing cloud migrations, how should we make sure there are no cracks for things to fall through? How do we make sure that we segue from on-premises to cloud in a way that the security issues are maintained throughout?

Stay safe with best-practices

Vanderwerf: Anupam is exactly right about the shared responsibility model. AWS manages and controls the components from the host operating system and virtualization layer down to physically securing the facilities. But it is up to AWS customers to build secure applications and manage their hygiene.

We have programs to help customers make sure they’re using those best practices. We have a well-architected program. It’s available on the AWS Management Console, and we have several lenses if you’re doing specific things like serverless, Internet of things (IoT), or analytics, for example.

Things like that have to be focused toward the business, but solutions architects can help the customer review all of their best practices and do a deep-dive examination with their teams to raise any flags that people might not be aware of and help them find solutions to remedy them.

We also have an AWS Technical Baseline Review that we do for partners. In it we make sure that partners are also following best practices around security and make sure that the correct things are in place for a good experience for their customers as well.


Gardner: Anupam, how do we ensure security-as-a-culture from the beginning and throughout the lifecycle of an application, regardless of where it’s hosted or resides? DevSecOps has become part of what people are grappling with. Does the security posture need to be continuous?

Sahai: That’s a very critical point. But first I want to double-click on what Ryan mentioned about the shared responsibility model. If you look at the overall challenges that customers face in migrating or moving to the cloud, there is certainly the security and compliance part of it that we mentioned.

There is also the cost governance issue and making sure it’s a well-architected framework architecture. The AWS Well-Architected Framework (WAF), for example, is supported by Unisys.

https://www.prnewswire.com/news-releases/new-security-and-optimization-features-in-unisys-cloudforte-bolster-services-delivered-on-amazon-web-services-300967623.html

Additionally, there are a number of ongoing issues around optimization, cost governance, security, compliance governance, and optimization of workloads that are critical for our customers. Unisys does a Cloud Success Barometer study every year, and what we find is very interesting.

One thing is clear: about 90 percent of organizations have transitioned to the cloud. So no surprise there. But in the journey to the cloud what we also found is that 60 percent of the organizations are unable to move to the cloud, or hold on to their cloud migrations, because of some of these unexpected roadblocks. And so that’s where partners like Unisys and AWS are coming together to offer visibility and solutions to address them. Those challenges remain, and, of course, we are able to help address them.

Coming back to the DevSecOps question, let’s take a step back and understand why DevOps came into being. It was basically because of the migration to the cloud that we had the need to break down the silos between development and operations to deploy infrastructure-as-code. That’s why DevOps essentially brings about faster, shorter development cycles; faster deployment, faster innovation.

Studies have shown that DevOps leads to at least 60 percent faster innovation and turnaround time compared to traditional approaches, not to mention the cost savings and the IT headcount savings when you merge the dev and ops organizations.

But as DevOps goes mainstream, and as cloud-centric applications are becoming mainstream, there is a need to inject security into the DevOps cycle. So, having DevSecOps is key. Security is yet another silo, and you want to break it down so that developers, operations, and security professionals work together as one team.

But we also need to provide tools that are amenable to the DevOps processes, continuous integration/continuous delivery (CI/CD) tools that enable the speed and agility needed for DevOps, but also injecting security -- without slowing them down. It is a challenge, and that’s why the all-new field of DevSecOps enables security and compliance injection into the DevOps cycle. It is very, very critical.

Gardner: Right, you want to have security but without giving up agility and speed. How have Unisys and AWS come together to ease and reduce the risk of cloud adoption while greasing the skids to the on-ramps to cloud adoption?

Smart support on the cloud journey

Sahai: Unisys in December 2019 announced CloudForte capabilities with the AWS cloud. A number of capabilities were announced that help customers adopt cloud without worrying about security and compliance.

CloudForte today provides a comprehensive solution to help customers manage their customer cloud journeys, whether it’s greenfield or brownfield; and there is hybrid cloud support, of course, for the AWS cloud along with multi-cloud support from a deployment perspective.

The solution combines products and services that enable three primary use cases: Cloud migration, as we talked about, and apps migration using DevSecOps. We’ve codified that in terms of best practices, reference architecture, and well-architected principles, and we have wrapped that in advisory services and deployment services as well.
The third use case is around cloud posture management, which is understanding and optimizing existing deployments, including hybrid cloud deployments, to ensure you’re managing costs, managing security and compliance, and also taking care of any other IT-related issues around governance of resources to make sure that you migrate to the cloud in a smart and secure manner.

Gardner: Ryan, why did AWS get on-board with CloudForte? What was it about it that was attractive to you in helping your customers?

Vanderwerf: We are all about finding solutions that help our customers and enabling our partners to help their customers. With the shared responsibility model, that’s on the customer, and CloudForte has really good risk management and a portfolio of applications and services to help people get ahold of that responsibility themselves.

Instead of customers trying to go on their own -- or just following general best practices – Unisys also has the tooling in place to help customers. That’s pretty important because with DevSecOps, people suffer from a lack of business agility, security agility, and face the risks around change to their businesses. People fear that.

These tools have really helped customers manage that journey. We have a good feeling about being secure and being compliant, and the dashboards they have inside of it are very informative, as a matter of fact.

Gardner: Of course, Unisys has been around for quite a while. They have had a very large and consistent installed base over the years. Are the tooling, services, and value in CloudForte bringing in a different class of organization, or different parts of organizations, into AWS?

Vanderwerf: I think so, especially in the enterprise area where they have a lot of things to wrangle on the journey to the cloud -- and it’s not easy. When you’re migrating as much as you can to a cloud setting – seeking to keep control over assets and making sure there are no rogue things running -- it’s a lot for an enterprise IT manager to handle. And so, the more tools they have in their tool-belt to manage that is way better than them trying to cook up their own stuff.


Gardner: Anupam, did you have a certain type of organization, or part of an organization, in mind when you crafted CloudForte for AWS?

Sahai: Let’s take a step back and understand the kind of services we offer. Our services are tailored and applicable for both enterprises and the public sector. We offer advisory services to begin with, which essentially allows us to pass-through products. You have the CloudForte Navigator product, which allows us to assess the current posture of the customer and understand the application capabilities the customer has, whether it needs a transformation, and, of course, this is all driven by business outcomes that the customer desires.

https://securitybrief.eu/story/unisys-delivers-new-cloud-security-features-on-aws

Second, through CloudForte we bring best practices, reference architectures, and blueprints for the various customer journeys that I mentioned earlier. Greenfield or brownfield opportunities, whatever the stage of adoption, we have created a template to help with the specific migration and customer journey.

Once customers are able and ready to get on-boarded, we enable DevSecOps using CI/CD tools, best practices, and tools to ensure the customers use a well-architected framework. We also have a set of accelerators provided by Unisys that enable customers to get on-boarded with guardrails provided. So, in short, the security policies, compliance policies, organizational framework, and the organizational architectures are all reflected in the deployment.

Then, once it's up and running, we manage and operate the hybrid cloud security and compliance posture to ensure that any deviations, any drifts, are monitored and remediated to ensure they are continuously having an acceptable posture.
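
The drift monitoring and remediation described here can be pictured with a small sketch. It is only an illustration under stated assumptions: the resource names, settings, and the `find_drift` and `remediate` helpers are hypothetical, and nothing below reflects how CloudForte or any AWS service is actually implemented. The idea is simply to compare the observed configuration with an approved baseline and reset whatever has deviated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drift:
    """One setting whose observed value differs from the approved baseline."""
    resource: str
    setting: str
    expected: object
    actual: object


def find_drift(baseline, observed):
    """Compare observed resource settings against the approved baseline."""
    drifts = []
    for resource, expected_settings in baseline.items():
        actual_settings = observed.get(resource, {})
        for setting, expected in expected_settings.items():
            actual = actual_settings.get(setting)
            if actual != expected:
                drifts.append(Drift(resource, setting, expected, actual))
    return drifts


def remediate(observed, drifts):
    """Return a copy of the observed state with each drift reset to baseline."""
    fixed = {name: dict(settings) for name, settings in observed.items()}
    for d in drifts:
        fixed.setdefault(d.resource, {})[d.setting] = d.expected
    return fixed


# Hypothetical baseline (what policy requires) and observed state.
baseline = {
    "storage-bucket": {"encrypted": True, "public": False},
    "app-server": {"ssh_open_to_world": False},
}
observed = {
    "storage-bucket": {"encrypted": True, "public": True},  # drifted
    "app-server": {"ssh_open_to_world": False},
}

drifts = find_drift(baseline, observed)
for d in drifts:
    print(f"{d.resource}.{d.setting}: expected {d.expected}, found {d.actual}")

fixed = remediate(observed, drifts)
```

Running the comparison on a schedule, rather than once at deployment, is what makes the posture "continuous" in the sense used above.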

https://www.unisys.com/offerings/cloud-and-infrastructure-services/cloudforte/cloudforte-for-aws
Finally, we also have AIOps capabilities, which include AI-enabled outcomes that the customer is looking for. We use artificial intelligence and machine learning (AI/ML) technologies to optimize the resources. We drive cost savings through resource optimization. We also have an instant management capability to bring down costs dramatically using some of those analytics and AIOps capabilities.

So our objective is to drive digital transformation for customers using a combination of products and services that CloudForte has, and working in close conjunction with what AWS offers, so that we create a compelling offering that’s complementary to each other, but very compelling from a business outcomes perspective.

Gardner: The way you describe them, it sounds like these services would be applicable to almost any organization, regardless of where they are on their journey to the cloud. Tell us about some of the secret sauce under the hood. The Unisys Stealth technology, in particular, is unique in how it maintains cloud security.

Stealth solutions for hybrid security 

Sahai: The Unisys Stealth technology is very compelling, especially in the hybrid cloud security sense. As we discussed earlier, the shared responsibility model requires customers to take care of and share the responsibility to make sure that workloads in the cloud infrastructure are compliant and secure.

And we have a number of tools in that regard. One is the CloudForte Cloud Compliance Director solution, which allows you to assess and manage your security and compliance posture for the cloud infrastructure. So it’s a cloud security posture management solution.

Then we also have the Stealth solution, essentially a zero trust, micro-segmentation capability that leverages the identity, or the user roles, in an organization to establish a community that’s trusted and is capable of doing certain actions. It creates communities of interest that allow and secure access through a combination of micro-segmentation and identity management.

https://www.marketscreener.com/UNISYS-CORPORATION-14744/news/Unisys-Achieves-Amazon-Web-Services-Managed-Service-Provider-and-Amazon-Web-Services-Well-Architec-30855799/

Think of that as a policy management and enforcement solution that essentially manipulates the OS native stacks to enforce policies and rules that otherwise are very hard to manage.
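
As a rough illustration of the communities-of-interest idea, here is a minimal sketch that assumes a zero trust default of deny: two endpoints may communicate only if their identities share at least one community. The identities, community names, and helper functions are hypothetical and are not Stealth's actual policy model or API.

```python
# Hypothetical identities mapped to the communities of interest they belong to.
MEMBERSHIPS = {
    "registrar-app": {"student-records"},
    "records-db": {"student-records", "backups"},
    "marketing-site": {"public-web"},
}


def shared_communities(memberships, source, destination):
    """Return the communities that both endpoints belong to."""
    return memberships.get(source, set()) & memberships.get(destination, set())


def is_allowed(memberships, source, destination):
    """Zero trust default: deny unless the endpoints share a community."""
    return bool(shared_communities(memberships, source, destination))


print(is_allowed(MEMBERSHIPS, "registrar-app", "records-db"))
print(is_allowed(MEMBERSHIPS, "marketing-site", "records-db"))
print(is_allowed(MEMBERSHIPS, "unknown-host", "records-db"))
```

Because an unknown identity belongs to no community, it is denied without any explicit rule, which is the property that makes this style of policy easier to manage than long allow and deny lists.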

If you take Stealth and marry that with CloudForte compliance, some of the accelerators, and Navigator, you have a comprehensive Unisys solution for hybrid cloud security, both on-premises and in the AWS cloud infrastructure and workloads environment.

Gardner: Ryan, it sounds like zero trust and micro-segmentation augment the many services that AWS already provides around identity and policy management. Do you agree that the zero trust and micro-segmentation aspects of something like Stealth dovetail very well with AWS services?

Vanderwerf: Oh, yes, absolutely. And in addition to that, we have a lot of other security tools like AWS WAF, AWS Shield, Security Hub, Macie, IAM Access Analyzer and Inspector. And I am sure under the hood they are using some of these services directly.

The more power you have the better. And it’s tough to manage. Some people are just getting into cloud and they have challenges. It’s not always technical, sometimes it's just communications issues at a company or lack of sponsorship or resource allocation or undefined key performance indicators (KPI). So all these things, or even just timing, those are all important for a security situation.

Gardner: All those spinning parts, those services, that’s where the professional services come in so that organizations don’t have to feel like they are doing it alone. How do the professional services and technical support fit into helping organizations go about these cloud journeys?

Sahai: Unisys is trusted by our customers to get things right. So we say that we do cloud correctly, and we do cloud right, and that includes a combination of trusted advisory services. That means everything from identifying legacy assets, to billing, and to governance, and then using a combination of products and services to help customers transform as they move to the cloud.

Our cloud-trained people and expertise speed up migrations, give visibility, and provide operational improvements. Thereby we are able to do cloud right and in a secure fashion by establishing security practices and establishing trust through a combination of micro-segmentation, security and compliance operations, and AIOps. That is the combination of products and services that we offer today.

And our customers tell us we are rated very highly, 95 percent-plus in terms of customer satisfaction. It’s a testament to the fact that our professional services -- along with our products -- complement the AWS services and products that customers need to deliver their business outcomes.

Gardner: Anupam, do you have any examples of organizations that leveraged both AWS and Unisys CloudForte? What have they been doing and what did they get from it?

Student success supported 

Sahai: I have a number of examples where a combination of CloudForte and AWS deployments are happening. One is right here where I live in the San Francisco Bay Area. The business challenge they faced was to enhance the student learning experience and deliver technology services critical to student success and graduation initiatives. And given the COVID-19 scenario, you can understand why cloud becomes an important factor in that.

Unisys cloud and infrastructure services, using CloudForte, helped them deploy a hybrid cloud model with AWS. We had Ansible for automation, ServiceNow for IT service management (ITSM), AIOps, and we deployed LogRhythm and a portfolio of tools and services.

They were then able to accelerate their capability to offer critical administrative services, such as student scheduling and registration, to about half-a-million students and 52,000 faculty and staff members across 23 campuses. It delivered 30 percent better performance while realizing about 33 percent cost savings and 40 percent growth in usage of these services. So, great outcomes, great cost savings, and you are talking about a reduction of about $4.5 million in compute and storage costs and about $3 million in cost avoidance.
So this is an example of a customer who leveraged the power of the AWS Cloud and the CloudForte products and services to deliver these business outcomes, which is a win-win situation for us. So that’s one example.

Gardner: Ryan, what do you expect for the next level of cloud adoption benefits? Is the AIOps something that we are going to be doubling-down on? Or are there other services? How do you see the future of cloud adoption improving?

The future is integrated 

Vanderwerf: It’s making sure everything is able to integrate. Like, for example, with a hybrid cloud situation we now have AWS Outposts. Now people can run a rack of servers in their data center and be connected directly to the cloud.

Some things don’t make sense always to go to cloud. Perhaps machinery running analytics, for example, has very low latency requirements. You could still write native applications to work with the cloud in AWS and run those apps locally.

Also, AIOps is huge because so many people are doing AI/ML in their workloads, from deciding security posture threats, to finding whether machines are breaking down. There are so many options in data analytics and then wrangling all these things together with data lakes. Definitely, the future is about better integrating all of these things.

AI/MLOps is really popular now because there are so many data scientists and people integrating ML into things. They need some sort of organizational structure to keep that organized, just like CI/CD did for DevOps. And all of those areas continue to grow. At AWS, we have 175-plus services, and they are always coming up with new ones every day. I don’t see that slowing down anytime soon.

Gardner: Anupam, for your future outlook, to this point that Ryan raised about integration, how do you see organizations like Unisys helping to manage the still growing complexity around the adoption and operations in the cloud and hybrid cloud environments?

Sahai: Yes, that is a huge challenge. As Ryan mentioned, hybrid cloud is here to stay. Not everything will move to the cloud. And while cloud migration trends will continue, there will be some core set of apps that will be staying on-premises. So leveraging AWS Outposts, as he said, to help with the hybrid cloud journeys will be important. And Unisys offers hybrid cloud and multi-cloud offerings that we are certainly committed to.

The other thing is that security and compliance issues are not going away, unfortunately. Cloud breaches are out there. And so there is a need to actively manage and be proactive about managing your security and compliance posture. And so that’s another area that I think our customers are going to be working together with AWS and Unisys to help them fortify not just their defenses, but also the offense -- to be proactive in dealing with these threats and breaches and preventing them.

The third area is around AIOps and this whole notion of AI-enabled CloudForte. We see AI and ML permeating every part of the customer journey: not just in AIOps, which is the operations and management piece and a critical part of what we do, but also in enabling the customer journeys through prediction.

So let’s say a customer is trying to move to the cloud, we want to be able to use predictions to predict what their customer journey would look like if they move to the cloud and to be proactive about predicting and remediating issues that might come up.

And, of course, AI is fueled by the data revolution -- the data lakes, the data buses -- that we have today to transport data seamlessly across applications, across hybrid cloud infrastructures, and to tie all of this together. You have the app migration, the CI/CD, and the DevSecOps capabilities that are part of the CloudForte advisory and product services.


We are enabling customers to move to the cloud without compromising speed, agility, and security and compliance, whether they are moving infrastructure to the cloud, using infrastructure as code, or moving applications to the cloud using applications as code by leveraging the micro-services infrastructure, the cloud native infrastructure that AWS provides -- and Kubernetes included.

We have support for a lot of these capabilities today, and we will continue to evolve them to make sure no matter where the customer is in their customer journey to the cloud -- whatever the stage of evolution -- we have a compelling set of products and services that customers can use to get to the cloud and stay there with the help of Unisys and AWS.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Unisys and Amazon Web Services.


Wednesday, July 1, 2020

How REI used automation to cloudify infrastructure and rapidly adjust its digital pandemic response

https://www.rei.com/about-rei

Like many retailers, Recreational Equipment, Inc. (REI) was faced with drastic and rapid change when the COVID-19 pandemic struck. REI’s marketing leaders wanted to make sure that their online e-commerce capabilities would rise to the challenge. They expected a nearly overnight 150 percent jump in REI’s purely digital business.

Fortunately REI’s IT leadership had already advanced their systems to heightened automation, which allowed the Seattle-based merchandiser to turn on a dime and devote much more of its private cloud to the new e-commerce workload demands.

The next BriefingsDirect Voice of Innovation interview uncovers how REI kept its digital customers and business leadership happy, even as the world around them was suddenly shifting.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

To explore what works for making IT agile and responsive enough to re-factor a private cloud at breakneck speed, we’re joined by Bryan Sullins, Senior Cloud Systems Engineer at REI in Seattle. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.


Here are some excerpts:

Gardner: When the pandemic required you to hop-to, how did REI manage to have the IT infrastructure to actually move at the true pace of business? What put you in a position to be able to act as you did?

Digital retail demands rise 

Sullins: In addition to the pandemic stay-at-home orders a couple months ago, we also had a large sale previously scheduled for the middle of May. It’s the largest sale of the year, our anniversary sale.

And ramping up to that, our marketing and sales department realized that we would have a huge uptick in online sales. People really wanted to get outside, because people could go outside without breaking any of the social distancing rules.

For example, bicycle sales were up 310 percent compared to the same time last year. So in ramping up for that, we anticipated our online presence at rei.com was going to go up by 150 percent, but we wanted to scale up by 200 percent to be sure. In order to do that, we had to reallocate a bunch of ESXi hosts in VMware vSphere. We either had to stand up new ones or reallocate from other clusters and put them into what we call our digital retail presence.

As a result of our fully automated process, using Hewlett Packard Enterprise (HPE) OneView, Synergy, and Image Streamer, we were able to reallocate 6 out of the 17 total hosts needed. We were able to do that in 18 minutes, all at once -- and that’s single touch, that’s launching the automation and then pulling them from one cluster and decommissioning them and placing them all the way into the digital retail clusters.

We also had to move some from our legacy platform; they aren’t on HPE Synergy yet, and those took an additional three days. But those are in transition, and we are moving toward that fully automated platform all around.

Gardner: That’s amazing because just a few years ago that sort of rapid and automated transition would have been unheard of. Even at a slow pace you weren’t guaranteed to have the performance and operations you wanted.

If you were not able to do this using automation – if the pandemic had hit, heaven forbid, five or seven years ago – what would have been the outcome?

Sullins: There were actually two outcomes from this. The first is the fairly obvious issue of not being able to handle the online traffic on our rei.com retail presence. It could have been that people weren’t able to put stuff into a shopping cart, or inventory decrement, and so on. It could have been a very broad range of things. We needed to make sure we had the infrastructure capacity so that none of that fails under a heavy load. That was the first part.

Gardner: Right, and when you have people in the heat of a purchasing moment, if you’re not there and it’s not working, they have other options. Not only would you lose that sale, you might lose that customer, and your brand suffers as well.

Sullins: Oh, without a doubt, without a doubt.

The other issue, of course, would have been if we did not meet our deadline. We had just under a week to get this accomplished. And if we had to do this without a fully automated approach, we would have had to return to our managers and say, “Yeah, so like we can’t do it that quickly.” But with our approach, we were able to do it all in the time frame -- and be able to get some sleep in the interim. So it was a win-win.

Gardner: So digital transformation pays off after all?

Sullins: Without a doubt.

Gardner: Before we learn more about your journey to IT infrastructure automation, tell us about REI, your investments in advanced automation, and why you consider yourself a data-driven digital business?

Automation all the way 

Sullins: Well, a lot of that precedes me by quite a bit. Going back to the early 2000s, based on what my managers tell me, there was a huge push for REI to become an IT organization that just happens to do retail. The priority is on IT being a driving force behind everything we do, and that is something that, at the time, REI really needed to do. There are other competitors, which we won’t name, but you probably know who they are. REI needed to stay ahead of that curve.

So since then there have been constant sweeping and cyclical changes for that digital transformation. The most recent one is the push for automating all things. So that’s the priority we have. It’s our marching orders.

Gardner: In addition to your company, culture, and technology, tell us about yourself, Bryan. What is it about your background and personal development that led you to be in a position to act so forthrightly and swiftly?

Sullins: I got my start in IT back in 1999. I was a public school teacher before that, and then I made the transition to doing IT training. I did IT training from 1999 to about 2012. During those years, I got a lot of technology certifications, because in the IT training world you have to.

I began with what was, at the time, called the Microsoft Certified Systems Engineer (MCSE) certification. Then I also did the Linux Professional Institute. I really glommed on to Linux. I wanted to set myself apart from the rest of the field back then, so I went all-in on Linux.

And then, 2008-2009-ish, I jumped on the VMware train and went all-in on VMware and did the official VMware curriculum. I taught that for about three years. Then, in 2012, I made the transition from IT training into actually doing this for real as an engineer working at Dell. At the time, Dell had an infrastructure-as-a-service (IaaS) healthcare cloud that was fairly large – 1,200-plus ESXi hosts. We were also responsible for the storage and for the 90-plus storage area network (SAN) arrays as well.

In an environment that large, you really have to automate. I cut my teeth on automating through PowerCLI and Ansible. Since then, about 2015, it’s been the focus of my career. I’m not saying I’m a guru, by any means, but it’s been a focus of my career.

Then, in 2018, REI came calling. I jumped on that opportunity because they are a super-awesome company, and right off the bat I got free rein: if you want to automate it, then you automate it. And I have been doing that ever since August of 2018.

Gardner: What helped you make the transition from training to cloud engineer?

Sullins: I typically jump right into new technology. I don’t know if that comes from the training or if that’s just me as a person. But one of the positives I’ve gotten from the training world is that you learn 100 percent of the feature base that’s available with said technology. I was able to take what I learned and knew from VMware and then say, “Okay, well, now I am going to get the real-world experience to back that up as well.” So it was a good transition.

Gardner: Let’s look at how other organizations can anticipate the shift to automation. What are some of the challenges that organizations typically face when it comes to being agile with their infrastructure?

Manage resistance to cloud 

Sullins: The challenges that I have seen aren’t usually technical. Usually the technology that people use to automate things is ready at hand. Much of it is free: Ansible, for example, is free. PowerCLI is free. Jenkins is free.

So, people can start doing that tomorrow. But the real challenge is in changing people’s mindset about a more automated approach. I think that it’s tough to overcome. It’s what I call provisioning by council. More traditional on-premises approaches have application owners who want to roll out x number of virtual machines (VMs), with all their particular specs and whatnot. And then a council of people typically looks at that and kind of scratches their chin and says, “Okay, we approve.” But if you need to scale up, that council approach becomes a sort of gate-keeping process.

With a more automated approach, like we have at REI, we use a cloud management platform to automate the processes. We use that to enable self-service VMs instead of having a roll out by council, where some of the VMs can take days or weeks to roll out because you have a lot of human beings touching it along the way. We have a lot of that process pre-approved, so everybody has already said, “Okay, we are okay with the roll out. We are okay with the way it’s done.” And then we can roll that out in 7 to 10 minutes rather than having a ticket-based model where somebody gets to it when they can. Self-service models are able to do that much better.

But that all takes a pretty big shift in psychology. A lot of people are used to being the gatekeeper. It can make them uncomfortable to change. Fortunately for me, a lot of the people at REI are on-board with this sort of approach. But I think that resistance can be something a lot of people run into.

Gardner: You can’t just buy automation in a box off of a shelf. You have to deal with an accumulation of manual processes and habits. Why is moving beyond the manual processes culture so important?

Sullins: I call it a private cloud because that means there is a healthy level of competition between what’s going on in the public cloud and what we do in that data center.

The public cloud team has the capability of “selling” their solution side-by-side with ours. When you have application owners who are technically adept -- and pretty much all of them are at REI -- they can be tempted to say, “Well, I don’t want to wait a week or two to get a VM. I want to create one right now out on the public cloud.”

That’s a big challenge for us. So what we are trying to accomplish -- and we have had success so far through the transition – is to offer our customers a spectrum of services. So that’s great.

The stakeholders consuming that now gain flexibility. They can say, “Okay, yeah, I have this application. I want to run it in the public cloud, but I can’t based on the needs for that application. We have to run it on-premises.” And now they can do that in an automated way. That’s a big win, and that’s what people expect now, quite honestly.

Gardner: They want the look and feel of a public cloud but with all the benefits of the private cloud. It’s up to you to provide that. Let’s find out how you did.

How did you overcome the challenges that we talked about and what are the investments that you made in tools, platforms, and an ecosystem of players that accomplished it?

Sullins: As I mentioned previously, a lot of our utilities are “free,” the Ansibles of the world, PowerCLI, and whatnot. We also use Morpheus to do self-service and the implications behind automating things on what I call the front end, the customer face. The issue you have there is you don’t get that control of scaling up before you provision the VM. You have to monitor and then roll it out on the backend. So you have to monitor for usage and then scale up on the backend, and seamlessly. The end users aren’t supposed to know that you are scaling up. I don’t want them to know. It’s not their job to know. I want to remain out of their way.


In order to do that, we’ve used a combination of technologies. HPE actually has a GitHub link for a lot of Ansible playbooks that plug right in. And then the underlying hardware adjacent management ecosystem platform is HPE OneView with HPE Synergy and Image Streamer. With a combination of all of those technologies we were able to accomplish that 18-minute roll-out of our various titles.

Gardner: Even though you have an integrated platform and solutions approach, it sounds like you have also made the leap from ushering pets through the process into herding cattle. If you understand my metaphor, what has allowed you to stop treating each instance as a pet into being able to herd this stuff through on an automated basis?

From brittle pets to agile cattle 

Sullins: There is a psychological challenge with that. In the more traditional approach – and the VMware shop listeners are going to be very well aware of this -- I may need to have a four-node cluster with a number of CPUs, a certain amount of RAM, and so on. And that four-node cluster is static. Yes, if I need to add a fifth down the line I can do that, but for that four-node cluster, that’s its home, sometimes for the entire lifecycle of that particular host.

With our approach, we treat our ESXi hosts as cattle. The HPE OneView-Synergy-Image Streamer technology allows us to do that in conjunction with those tools we mentioned previously, for the end point in particular.

So rather than have a cluster, and it’s static and it stays that way -- it might have a naming convention that indicates what cluster it’s in and where -- in reality we have cattle-based DNS names for ESXi hosts. At any time, the understanding throughout the organization, or at least for the people who need to know, is that any host can be pulled from one cluster automatically and placed into another, particularly when it comes to resource usage on that cluster. My dream is that the robots will do this automatically.

So if you had a cluster that goes into the yellow, with its capacity usage based on a threshold, the robot would interpret that and say, “Oh, well, I have another cluster over here with a host that is underutilized. I’m going to pull it into the cluster that’s in the yellow and then bring it back into the green again.” This would happen all while we sleep. When we wake up in the morning, we’d say, “Oh, hey, look at that. The robots moved that over.”
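
The threshold-driven rebalancing described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 75 percent "yellow" threshold, the 50 percent donor ceiling, the cluster and host names, and the `Cluster` and `rebalance` names are all hypothetical, and the sketch only decides which host to move. It does not call HPE OneView, vSphere, or any other real API, and it is not REI's decision engine.

```python
from dataclasses import dataclass

# Assumed values for illustration only; the interview does not give real thresholds.
YELLOW_THRESHOLD = 0.75  # usage fraction at which a cluster is "in the yellow"
DONOR_CEILING = 0.50     # a donor must stay at or below this usage after losing a host


@dataclass
class Cluster:
    name: str
    hosts: list[str]   # cattle-style host names; assumed non-empty
    load: float        # total demand, measured in "host-equivalents"

    @property
    def usage(self) -> float:
        """Fraction of the cluster's host capacity currently in use."""
        return self.load / len(self.hosts)

    def usage_without_one_host(self) -> float:
        """Usage the cluster would have after giving up one host."""
        return self.load / (len(self.hosts) - 1)


def rebalance(clusters: list[Cluster]) -> list[tuple[str, str, str]]:
    """Move at most one host into each yellow cluster.

    Returns a list of (host, from_cluster, to_cluster) moves.
    """
    moves = []
    for needy in clusters:
        if needy.usage < YELLOW_THRESHOLD:
            continue  # cluster is still in the green
        # A donor needs a spare host and must remain underutilized afterwards.
        donors = [
            c for c in clusters
            if c is not needy
            and len(c.hosts) > 1
            and c.usage_without_one_host() <= DONOR_CEILING
        ]
        if not donors:
            continue  # nothing safe to pull; leave this for a human
        donor = min(donors, key=lambda c: c.usage)
        host = donor.hosts.pop()
        needy.hosts.append(host)
        moves.append((host, donor.name, needy.name))
    return moves
```

Run on a schedule, a loop like this would move one host per pass until every cluster is back in the green. Any real engine would also have to trigger the actual host reallocation, which is the part the push-button automation already handles.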

Gardner: Algorithmic operations. It sounds very exciting.

Automation begets more automation 

Sullins: Yes, we have the push-button automation in place for that. It’s the next level of what that engine is that’s going to make those decisions and do all of those things.

Gardner: And that raises another issue. When you take the plunge into IT automation, making your way down the Chisholm Trail with your cattle, all of a sudden it becomes easier along the way. The automation begets more automation. As you learn and grow, does it become more automated along the way?

Sullins: Yes. Just to put an exclamation point on this topic, imagine the situation we opened the podcast with, which is, “Okay, we have to reallocate a bunch of hosts for rei.com.” If it’s fully automated, and we have robots making those decisions, the response is instantaneous. “Oh, hey, we want to scale up by 200 percent on rei.com.” We can say, “Okay, go ahead, roll out your VM. The system will react accordingly. It will add physical hosts as you see fit, and we don’t have to do anything, we have already done the work with the automation.” Right?

But to the automation begetting automation, which is a great way of putting it, by the way, there are always opportunities for more automation. And on a career side note, I want to dispel the myth that you automate your way out of a job. That is a complete and total myth. I’m not saying it doesn’t happen, where people get laid off as a result of automation. I’m not saying that doesn’t happen, but that’s relatively rare because when you automate something, that automation is going to need to be maintained because things change over time.

The other piece of that is a lot of times you have different organizations at various states of automation. Once you get your head above water to where it's, “Okay, we have this process and now it's become trivial because it's been automated,” you can concentrate on automating more things -- or you have new things that need to be automated. And whether that’s the process for only VMs, a new feature base, monitoring, or auto-scaling -- whatever it is -- you have the capability from day one to further automate these processes.

Gardner: What was it specifically about the HPE OneView and Synergy that allowed you to move past the manual processes, firefighting, and culture of gatekeeping into more herding of cattle and being progressively automated?

Sullins: It was two things. The Image Streamer was number one. To date, we don’t run PXE boot infrastructure -- not that we can't; it’s just not something that we have traditionally done. We needed a more standard process for doing that, and Image Streamer fit that and solved that problem.

The second piece is the provided Ansible playbooks that HPE has to kick off the entire process. If you are somewhat versed in how HPE does things through OneView, you have a server profile that you can impose on a blade, and that can be fully automated through Ansible.

And, by the way, you don’t have to use Image Streamer to use Ansible automation. This is really more of an HPE OneView approach, whereby you can actually use it to do automated profiles and whatnot. But the Image Streamer is really what allows us to say, “Okay, we build a gold image. We can apply that gold image to any frame in the cluster.” That’s the first part of it, and the rest is configuring the other side.

Gardner: Bryan, it sounds like the HPE Composable Infrastructure approach works well with others. You are able to have it your way because you like Ansible, and you have a history of certain products and skills in your organization. Does the HPE Composable Infrastructure fit well into an ecosystem? Is it flexible enough to integrate with a variety of different approaches and partners?

Sullins: It has been so far, yes. We have anticipated leveraging HPE for our bare metal Linux infrastructure. One of the additional driving forces and big initiatives right now is Kubernetes. We are going all-in on Kubernetes in our private cloud, as well as in some of our worker nodes. We eventually plan on running those as bare metal. And HPE OneView, along with Image Streamer, is something that we can leverage for that as well. So there is flexibility, absolutely, yes.

Coordinating containers 

Gardner: It’s interesting, you have seen the transition from having VMware and other hypervisor sprawl to finding a way to manage and automate all of that. Do you see the same thing playing out for containers, with the powerful endgame of being able to automate containers, too?

Sullins: Right. We have been utilizing Rancher as part of our coordination tool for our Kubernetes infrastructure and utilizing vSphere for that. So we are using that.

As far as the containerization approach, REI has been doing containers since before containers were a big thing. Our containerization platform has been around since at least 2015. So REI has been pretty cutting edge as far as that is concerned.

And now that Kubernetes has won the orchestration wars, as it were, we are looking to standardize that for people who want to do things online, which is to say, going back to the digital transformation journey.

Basically, the industry has caught up with what our super-awesome developers have done with containerization. But we are looking to transition the heavy lifting of maintaining a platform away from the developers. Now that we have a standard approach with Kubernetes, they don’t have to worry so much about it. They can just develop what they need to develop. It will be a big win for us.

Gardner: As you look back at your automation journey, have you developed a philosophy about automation? How should this best work in the future?

Trust as foundation of automation 

Sullins: Right. Have you read Gene Kim’s The Unicorn Project? Well, there is also his The Phoenix Project. My take from that is the whole idea of trust, of trusting other people. And I think that is big.

I see that quite a bit in multiple organizations. For REI, we are going to work as a team and we trust each other. So we have a pretty good culture. But I would imagine that in some places that is still a big challenge.

And if you take a look at The Unicorn Project, a lot of the issues have to do with trusting other human beings. Something happened, somebody made a mistake, and it caused an outage. So they lock it up and lock it away and say only certain people can do that. And then if you multiply that happening multiple times -- and then different individuals locking that down -- it leads to not being able to automate processes without somebody approving it, right?

Gardner: I can't imagine you would have been capable, when you had to transition your private cloud for more online activity, if you didn’t have that trust built into your culture.

Sullins: Yes, and the big challenge that might still come up is the idea of trusting your end users, too. Once you go into the realm of self-service, you come up on the typical what-ifs. What if somebody adds a zero and they meant to only roll out 4 VMs but they roll out 40? That’s possible. How do you create guardrails that are seamless? If you can, then you can trust your users. You decrease the risk and can take that leap of faith that bad things won’t happen.
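
One way to picture a "seamless" guardrail for the 4-versus-40 case is a check that runs before a self-service request is provisioned. The Python sketch below is a minimal illustration, not REI's or Morpheus's actual mechanism: the per-request cap of 10, the per-owner quota of 50, and the names `VmRequest` and `check_request` are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration; real values would come from policy.
MAX_VMS_PER_REQUEST = 10  # above this, ask the user to confirm
MAX_VMS_PER_OWNER = 50    # hard quota per application owner


@dataclass
class VmRequest:
    owner: str
    count: int


def check_request(req: VmRequest, running_by_owner: dict[str, int]) -> tuple[str, str]:
    """Return (decision, reason); decision is 'approve', 'confirm', or 'reject'."""
    if req.count < 1:
        return ("reject", "count must be at least 1")
    in_use = running_by_owner.get(req.owner, 0)
    if in_use + req.count > MAX_VMS_PER_OWNER:
        return ("reject", f"quota of {MAX_VMS_PER_OWNER} VMs would be exceeded")
    if req.count > MAX_VMS_PER_REQUEST:
        # Likely typo (an extra zero): pause for confirmation instead of failing.
        return ("confirm", f"{req.count} VMs is above the usual {MAX_VMS_PER_REQUEST}")
    return ("approve", "within pre-approved limits")
```

With these assumed limits, a request for 4 VMs is approved without anyone touching it, while a request for 40 is held for a one-click confirmation rather than being rolled out or sent to a council.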

Gardner: Tell us about your wish list for what comes next. What would you like HPE to be doing?

Small steps and teamwork rewards 

Sullins: My approach is to first automate one thing and then work out from there. You don’t have to boil the ocean. Start with something small and work your way up.

As far as next steps, we want auto-scaling at the physical layer, with the robots doing all of that. The robots will scale up and down our requesters while we sleep.

We will continue to do application programming interface (API)-capable automation with anything that has a REST API. If we can connect to that and manipulate it, we can do pretty much whatever automation we want.

We are also containerizing all things. If an application can be containerized properly, containerize it.

As far as what decision-making engine we have to do the auto-scaling on the physical layer, we haven’t really decided upon what that is. We have some ideas but we are still looking for that.

Gardner: How about more predictive analytics using artificial intelligence (AI) with the data that you have emanating from your data center? Maybe AIOps?

Sullins: Well, without a doubt. I, for one, haven’t done any sort of deep dive into that, but I know it’s all the rage right now. I would be open to pretty much anything that will encompass what I just talked about. If that’s HPE InfoSight, then that’s what it is. I don’t have a lot of experience quite honestly with InfoSight as of yet. We do have it installed in a proof of concept (POC) form, although a lot of the priorities for that have been shifted due to COVID-19. We hope to revisit that pretty soon, so absolutely.


Gardner: To close out, you were ahead of the curve on digital transformation. That allowed you to be very agile when it came time to react to the COVID-19 pandemic.  What did that get you? Do you have any results?

Sullins: Yes, as a matter of fact, our boss’s boss, his boss -- so three bosses up from me -- he actually sits in on our load testing. It was an all-hands-on-deck situation during that May online sale. He said that it was the most seamless one that he had ever seen. There were almost no issues with this one.

What I attribute that to is, yes, we had done what we needed on the infrastructure side to make sure that we met dynamic demands. Also, everybody worked as a team. Everybody, all the way up the stacks, from our infrastructure contribution, to the hypervisor and hardware layer, all the way on up to the application layer and the containers, and all of our DevOps stuff. It was very successful. We went past our goals of what we had thought for the sale, so it was a win-win all the way around.

Gardner: Even though you were going through this terrible period of adjustment, that’s very impressive.

Sullins: Yes.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.

You may also be interested in: