Wednesday, August 30, 2017

Inside story on developing the ultimate SDN-enabled hybrid cloud object storage environment

The next BriefingsDirect inside story interview explores how a software-defined data center (SDDC)-focused systems integrator developed an ultimate open-source object storage environment.

We’re now going to learn how Key Information Systems crafted a storage capability that may have broad extensibility into such realms as hybrid cloud and multi-cloud support. 

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. 

Here to help us better understand a new approach to open-source object storage is Clayton Weise, Director of Cloud Services at Key Information Systems in Agoura Hills, California. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What prompted you to improve on the way that object storage is being offered as a service? How might this become a new business opportunity for you?

Weise: About a year ago, at Hewlett Packard Enterprise (HPE) Discover, I was wandering the event floor. We had just gotten out of a meeting with SwitchNAP, which is a major data center in Las Vegas. We had been talking to them about some preferred concepts and deployments for storage for their clients.
That discussion evolved into realizing that there are a number of clients inside of Switch and their ecosystem that could make use of storage that was more locally based, that needed to be closer at hand. There were cost savings that could be gained if you have a connection within the same data center, or within the same fiber network.

Pulling data in and out of a cloud

Under this model, there would be significantly less expensive ways of pulling data in and out of a cloud, since you wouldn’t have transfer fees as you normally would. There would also be an advantage to privacy, and to cutting latency, and other beneficial things because of a private network all run by Switch and through their fiber network. So we looked at this and thought this might be interesting.

In discussions with a number of groups within HPE while wandering the floor at Discover, we found that there were some pretty interesting ways that we could play games with the network to allow clients to not have to uproot the way they do things, or force them to do things, for lack of a better term, “Our way.”

If you go to Amazon Web Services or you go to Microsoft Azure, you do it the Microsoft way, or you do it the Amazon way. You don’t really have a choice, since you have to follow their guidelines.

Where we saw value is, there are times in the mid-market space for clients -- ranging from a couple of hundred million dollars up to maybe a couple of billion dollars in annual revenue -- where they generally use object storage as kind of an inexpensive way to store archival, or less-frequently accessed, data. So [the cloud storage] became an alternative to tape and long-term storage.

We've had this massive explosion of unstructured data, files, and all sorts of things. We have a number of clients in medical and finance, and they have just seen this huge spike in data.

The challenge is: To deploy your own object storage is a fairly complex operation, and it requires a minimum number of petabytes to get started. In that mid-market, they are not typically measuring their storage at the petabyte level.

These customers are more typically in the tens to hundreds of terabytes range, and so they need an inexpensive way to offload that data and put it somewhere where it makes sense. In the medical industry particularly, there's a lot of concern about putting any kind of patient data up in a public cloud environment -- even with encryption.

We thought that if we are in the same data center, and it is a completely private operation that exists within these facilities, that will fulfill the total need -- and we can encrypt the data.

But we needed a way to support such private-cloud object storage that would be multitenant. Also, we just have had better luck working with open standards. The challenge with dealing with proprietary systems is you end up locked into a standard, and if you pick wrong, you find yourself having to reinvent everything later on.

I come from a networking background; I was an Internet plumber for many years. We saw the transition then on our side when routing protocols first got introduced. There were proprietary routing protocols, and there were open standards, and that’s what we still use today.

Transition to
HPE Data Center Networking

So we took a similar approach in object storage as a private-cloud service. We went down the open source path in terms of how we handled the provisioning. We needed something that integrated well with that. We needed a system that had the multitenancy, that understood the tenancy, and that is provided by OpenStack. We found a solution from HPE called Distributed Cloud Networking (DCN) that allows us to carve up the network in all sorts of interesting ways, and that way we don't have to dictate to the client how to run it.

Many clients are still running traditional networks. The adoption of Virtual Extensible LAN (VXLAN) and other types of SDDC within the network is still pretty low, especially in the mid-market space. So to go to a client and dictate that they have to change how they run the network is not going to work.

And we wanted it to be as simple as possible. We wanted to treat this as much as we could as a flat network. By using a combination of DCN, Altoline switches from HPE, and some other software, we were able to give clients a complete network carrying regular Virtual Local Area Networks (VLANs) across it. We then could tie this together in a hybrid fashion, whereby the customers can actually treat our cloud environment as a natural extension of their existing networks, of their existing data centers.

Gardner: You are calling this hybrid storage as a service. It’s focused on object storage at this point, and you can take this into different data center environments. What are some of the sweet spots in the market?

Weise: The areas where we are seeing the most interest have been backup and archive. It’s an alternative to tape. The object service becomes a very inexpensive way to store large amounts of data, and unlike tape -- where it's inconvenient to access the data -- with object as a service everything is accessible very, very easily.

For customers that cannot directly integrate into that object service as supported by their backup software, we can make use of object gateways to provide a method that's more like traditional access. It looks like a file, or file share, and you edit the file share to be written to the object storage, and so it acts as a go-between. For backup and archive, it makes a really, really great solution.
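
The go-between role of an object gateway can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: `ObjectStore` is an in-memory stand-in rather than the service described in the interview, and `FileGateway`, the bucket name, and the path are hypothetical. It only shows the idea of accepting a file-share style path and storing the data as an object.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectStore:
    """Hypothetical in-memory stand-in for an object store (not a real API)."""
    buckets: dict = field(default_factory=dict)

    def put(self, bucket: str, key: str, data: bytes) -> None:
        # Objects live in a flat namespace per bucket, addressed by key.
        self.buckets.setdefault(bucket, {})[key] = data

    def get(self, bucket: str, key: str) -> bytes:
        return self.buckets[bucket][key]


@dataclass
class FileGateway:
    """Hypothetical gateway: file-share style calls in, object puts and gets out."""
    store: ObjectStore
    bucket: str

    @staticmethod
    def _key(path: str) -> str:
        # Accept either slash style and drop empty segments to form the object key.
        parts = [p for p in path.replace("\\", "/").split("/") if p]
        return "/".join(parts)

    def write_file(self, path: str, data: bytes) -> str:
        key = self._key(path)
        self.store.put(self.bucket, key, data)
        return key

    def read_file(self, path: str) -> bytes:
        return self.store.get(self.bucket, self._key(path))


store = ObjectStore()
gateway = FileGateway(store, bucket="tenant-a-backups")
key = gateway.write_file("/backups/2017-08/db.bak", b"backup bytes")
```

In this sketch the backup software only ever sees paths, while the store only ever sees bucket and key pairs, which is the sense in which the gateway acts as a go-between.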

The other two areas where we have seen the most interest have been in the medical space, specifically for large medical image files and archival. We’re working now specifically to build that type of solution, with HIPAA compliance. We have gone through the audits and compliance verification.

The second use-case has been in the media and entertainment industry. In fact, they are the very first to consume this new system and put in hundreds of terabytes worth of storage -- they are an entertainment industry client in Burbank, California. A lot of these guys are just shuffling along on external drives.

For them it’s often external arrays, and it's a lot more Mac OS users. They needed something that was better, and so hybrid object storage as a service has created a great opportunity for them and allows them to collaborate.

They have a location in Burbank, and then they brought up another office in the UK. There is yet another office for them coming up in Europe. The object storage approach allows a kind of central repository, an inexpensive place to place the data -- but it also allows them to be more collaborative as well.

Gardner: We have had a weak link in cloud computing storage, which has been the network -- and you solved some of those issues. You found a prime use-case with backup and archival, but it seems to me that given the storage capabilities that we've seen that this has extensibility. So where might it go next in terms of a storage-as-a-service that hybrid cloud providers would use? Where can this go?

Carving up the network 

Weise: It’s an interesting question because one of the challenges we have all faced in the world of cloud is we have virtualized servers and virtualized storage, meaning there is disaggregation; there is a separation between the workload that’s running and the actual hardware it’s running on.

In many cases, and for almost all clients in the mid-market, that level of virtualization has not occurred at the network level. We are still nailed to things. We are all tied down to the cable, to the switch port, and to the human that can figure those things out. It’s not as flexible or as extensible as some of the other solutions that are out there.

In our case, when we build this out, the real magic is with the network. That improved connection might be a cost savings for a client -- especially from a bandwidth standpoint. But as you get a private cross-connect into that environment to make use of, in this case, storage as a service, we can now carve that up in a number of different ways and allow the client to use it for other things.

For example, if they want to have burst capability within the environments, they can have it -- and it’s on the same network as their existing system. So that’s where it gets really interesting: Instead of having to have complex virtual guest package configurations, and tiny networks, and dealing with some of the routing of other pieces, you can literally treat our cloud environment as if it's a network cable thrown over the wall -- and it becomes just an extension of the existing network.

That opens up some additional possibilities. Some things to work on eventually would be block storage, file storage, right there existing on the same network. We can secure that traffic and ensure that there is high-performance, low-latency and complete separation of tenancy. So if you have Coke and Pepsi as clients, they will never see each other.

Gardner: Very cool. You can take this object storage benefit -- and by the way, the cost of that can be significantly lower because you don’t have egress charges and some of the other unfriendly aspects of economics of public cloud providers. But you also have an avenue into a true hybrid cloud environment, where you can move data but also burst workloads and manage that accordingly. Now, what about making this work toward a multi-cloud capability?

Weise: Right. So this is where HPE’s DCN software-defined networking (SDN) really starts to shine and separates itself from the pack. We can tie environments together regardless of where they are. If there is a virtual endpoint or physical appliance; if it's at a remote location that can be deployed, which can act as a gateway -- that links everything together.

We can take a client network that's going from their environment into our environment, we can deploy a small virtual machine inside of a public cloud, and it will tie the networks together and allow them to treat it all as the same. The same policy enforcement engine and things that they use to segregate traffic in microsegmentation and service chaining can be done just as easily in the public cloud environment.

One of the reasons we went to Switch was because they have multiple locations. So in the case of our object storage, we deployed the objects across all three of their data center sites. Data written to a single repository is distributed among three different regions. This protects against a possible regional outage that could mean data is inaccessible, which is the kind of thing that we in the US have seen recently, where clients were down anywhere from 6 to 16 hours.

One big network, wherever you are

This eliminates that. But the nice thing is because of the network technology that they were using from HPE, it allowed us to treat that all as one big network -- and we can carve that up and virtualize it. So clients inside of the data center -- maybe they need resources for disaster recovery or for additional backups or those things -- it's all part of that. We can tie-in from a network standpoint and regardless of where you want to exist -- if you are in Vegas, you may want to recover in Reno, or you may want to recover in Grand Rapids. We can make that network look exactly the same in your location.

You want to recover in AWS? You want to recover in Azure? We can tie it in that way, too. So it opens up these great possibilities that allow this true hybrid cloud -- and not as a completely separate entity.

Gardner: Very cool. Now there’s nothing wrong, of course, with Switch, but there are other fiber and data center folks out there. Some names that begin with “E” come to mind that you might want to drop in this and that should even increase the opportunity for distribution.

Weise: That’s right. So this initial deployment is focused on Switch, but we do have a grand scheme to work this into other data centers. There are a handful of major data center operators out there, including the one that starts with an “E” along with another that starts with a “D.” We do have plans to expand this, or use this as a success use-case.

As this continues to grow, and we get some additional momentum and some good feedback, and really refine the offering to make sure we know exactly what everything needs to be, then we can work with those other data center providers.

From the data center operators’ perspective, if you're one of those facilities, you are at war with AWS or with Azure. Because whenever clients deploy their workloads in those public clouds, that means there is equipment that has not been collocated inside one of your facilities.

So they have a vested interest in doing this, and there is a benefit to the clients inside of those facilities too because they get to live inside of the ecosystem that exists within those data centers, and the private networks that they carry in there deliver the same benefits to all in that ecosystem.

We do plan to use this hybrid cloud object storage as a service capability as a model to deploy in several other data center environments. There is not only a private cloud, but also a multitenant private cloud that could be operative for clients that have a large enough need. You can talk about this in a multi-petabyte scale, or you talk about thousands of virtual machines. Then it's a question of should you do a private cloud deployment just for you? The same technology, fulfilling the same requirements, and the same solutions could still be used. 

Partners in time

Gardner: It sounds like it makes sense, on the back of a napkin basis, for you and HPE to get together and brand something along these lines and go to market together with it.

Weise: It certainly does. We've had some great discussions with them. Actually there is a group that was popular in Europe that is now starting to take its growth here in the US called Cloud28+.

We had some great discussions with them. We are going to be joining that, and it’s a great thing as well.

The goal is building out this sort of partner network, and HPE has been extremely supportive in working with us to do that. In addition to these crazy ideas, I also have a really crazy timeline for deployment. When we initially met with HPE and talked about what we wanted to do, they estimated that I should reserve about 6 to 8 weeks for planning and then another 1.5 months for deployment.

I said, “Great we have 3 weeks to do the whole thing,” and everyone thought we were crazy. But we actually had it completed in a little over 2.5 weeks. So we have a huge amount of thanks to HPE, and to their technical services group who were able to assist us in getting this going extremely quickly.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy. Sponsor: Hewlett Packard Enterprise.


Tuesday, August 22, 2017

How IoT and OT collaborate to usher in the data-driven factory of the future

The next BriefingsDirect Internet of Things (IoT) technology trends interview explores how innovation is impacting modern factories and supply chains.

We’ll now learn how a leading-edge manufacturer, Hirotec, in the global automotive industry, takes advantage of IoT and Operational Technology (OT) combined to deliver dependable, managed, and continuous operations.

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to help us to find the best factory of the future attributes is Justin Hester, Senior Researcher in the IoT Lab at Hirotec Corp. in Hiroshima, Japan. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: What's happening in the market with business and technology trends that’s driving this need for more modern factories and more responsive supply chains?

Hester: Our customers are demanding shorter lead times. There is a drive for even higher quality, especially in automotive manufacturing. We’re also seeing a much higher level of customization requests coming from our customers. So how can we create products that better match the unique needs of each customer?

As we look at how we can continue to compete in an ever-competitive environment, we are starting to see how the solutions from IoT can help us.

Gardner: What is it about IoT and Industrial IoT (IIoT) that allows you to do things that you could not have done before?

Hester: Within the manufacturing space, a lot of data has been there for years; for decades. Manufacturing has been very good at collecting data. The challenge we've had, though, is bringing in that data in real-time, because the amount of data is so large. How can we act on that data quicker, not on a day-by-day basis or week-by-week basis, but actually on a minute-by-minute basis, or a second-by-second basis? And how do we take that data and contextualize it?

It's one thing in a manufacturing environment to say, “Okay, this machine is having a challenge.” But it’s another thing if I can say, “This machine is having a challenge, and in the context of the factory, here's how it's affecting downstream processes, and here's what we can do to mitigate those downstream challenges that we’re going to have.” That’s where IoT starts bringing us a lot of value.

The analytics, the real-time contextualization of that data that we’ve already had in the manufacturing area, is very helpful.

Gardner: So moving from what may have been a gather, batch, analyze, report process -- we’re now taking more discrete analysis opportunities and injecting that into a wider context of efficiency and productivity. So this is a fairly big change. This is not incremental; this is a step-change advancement, right?

A huge step-change 

Hester: It’s a huge change for the market. It's a huge change for us at Hirotec. One of the things we like to talk about is what we jokingly call the Tuesday Morning Meeting. We talk about this idea that in the morning at a manufacturing facility, everyone gets together and talks about what happened yesterday, and what we can do today to make up for what happened yesterday.

Instead, now we’re making that huge step-change to say,  “Why don't we get the data to the right people with the right context and let them make a decision so they can affect what's going on, instead of waiting until tomorrow to react to what's going on?” It’s a huge step-change. We’re really looking at it as how can we take small steps right away to get to that larger goal.

In manufacturing areas, there's been a lot of delay, confusion, and hesitancy to move forward because everyone sees the value, but it's this huge change, this huge project. At Hirotec, we’re taking more of a scaled approach, and saying let's start small, let’s scale up, let’s learn along the way, let's bring value back to the organization -- and that's helped us move very quickly.

Gardner: We’d like to hear more about that success story but in the meantime, tell us about Hirotec for those who don't know of it. What role do you play in the automotive industry, and how are you succeeding in your markets?

Hester: Hirotec is a large, tier-1 automotive supplier. What that means is we supply parts and systems directly to the automotive original equipment manufacturers (OEMs), like Mazda, General Motors, FCA, Ford, and we specialize in door manufacturing, as well as exhaust system manufacturing. So every year we make about 8 million doors, 1.8 million exhaust systems, and we provide those systems mainly to Mazda and General Motors, but also we provide that expertise through tooling.

For example, if an automotive OEM would like Hirotec’s expertise in producing these parts, but they would like to produce them in-house, Hirotec has a tooling arm where we can provide that tooling for automotive manufacturing. It's an interesting strategy that allows us to take advantage of data both in our facilities, but then also work with our customers on the tooling side to provide those lessons learned and bring them value there as well.

Gardner: How big of a distribution are we talking about? How many factories, how many countries; what’s the scale here?

Hester: We are based in Hiroshima, Japan, but we’re actually in nine countries around the world, currently with 27 facilities. We have reached into all the major continents with automotive manufacturing: we’re in North America, we’re in Europe, we’re all throughout Asia, in China and India. We have a large global presence. Anywhere you find automotive manufacturing, we’re there supporting it.

Discover How the 
IoT Advantage
Works in Multiple Industries 

Gardner: With that massive scale, very small improvements can turn into very big benefits. Tell us why the opportunity in a manufacturing environment to eke out efficiency and productivity has such big payoffs.

Hester: So especially in manufacturing, what we find when we get to those large scales like you're alluding to is that a 1 percent or 2 percent improvement has huge financial benefits. And so the other thing is in manufacturing, especially automotive manufacturing, we tend to standardize our processes, and within Hirotec, we’ve done a great job of standardizing that world-class leadership in door manufacturing.

And so what we find is when we get improvements not only in IoT but anywhere in manufacturing, if we can get 1 percent or 2 percent, not only is that a huge financial benefit but because we standardized globally, we can move that to our other facilities very quickly, doubling down on that benefit.

Gardner: Well, clearly Hirotec sees this as something to really invest in, they’ve created the IoT Lab. Tell me a little bit about that and how that fits into this?

The IoT Lab works

Hester: The IoT Lab is a very exciting new group, it's part of our Advanced Engineering Center (AEC). The AEC is a group out of our global headquarters and this group is tasked with the five- to 10-year horizon. So they're able to work across all of our global organizations with tooling, with engineering, with production, with sales, and even our global operations groups. Our IoT group goes and finds solutions that can bring value anywhere in the organization through bringing in new technologies, new ideas, and new solutions.

And so we formed the IoT Lab to find how can we bring IoT-based solutions into the manufacturing space, into the tooling space, and how actually can those solutions not only help our manufacturing and tooling teams but also help our IT teams, our finance teams, and our sales teams.

Gardner: Let's dig back down a little bit into why IT, IoT and Operational Technology (OT) are into this step-change opportunity, looking for some significant benefits but being careful in how to institute that. What is required when you move to a more IT-focused, standard-platform approach -- across all the different systems -- that allows you to eke out these great benefits?

Tell us about how IoT as a concept is working its way into the very edge of the factory floor.

Hester: One of the things we’re seeing is that IT is beginning to meld, like you alluded to, with OT -- and there really isn't a distinction between OT and IT anymore. What we're finding is that we’re starting to get to these solution levels by working with partners such as PTC and Hewlett Packard Enterprise (HPE) to bring our IT group and our OT group all together within Hirotec and bring value to the organization.

What we find is there is no longer a need in OT that becomes a request for IT to support it, and also that IT has a need and so they go to OT for support. What we are finding is we have organizational needs, and we’re coming to the table together to make these changes. And that actually within itself is bringing even more value to the organization.

Instead of coming last-minute to the IT group and saying, “Hey, we need your support for all these different solutions, and we’ve already got everything set, and you are just here to put it in,” what we are seeing, is that they bring the expertise in, help us out upfront, and we’re finding better solutions because we are getting experts both from OT and IT together.

We are seeing this convergence of these two teams working on solutions to bring value. And they're really moving everything to the edge. So where everyone talks about cloud-based computing -- or maybe it’s in their data center -- where we are finding value is in bringing all of these solutions right out to the production line.

We are doing data collection right there, but we are also starting to do data analytics right at the production line level, where it can bring the best value in the fastest way.

Gardner: So it’s an auspicious time because just as you are seeking to do this, the providers of technology are creating micro data centers, and they are creating Edgeline converged systems, and they are looking at energy conservation so that they can do this in an affordable way -- and with storage models that can support this at a competitive price.

What is it about the way that IT is evolving and providing platforms and systems that has gotten you and The IoT Lab so excited?

Excitement at the edge  

Hester: With IoT and IT platforms, originally to do the analytics, we had to go up to the cloud -- that was the only place where the compute power existed. Solution providers now are bringing that level of intelligence down to the edge. We’re hearing some exciting things from HPE on memory-driven computing, and that's huge for us because as we start doing these very complex analytics at the edge, we need that power, that horsepower, to run different applications at the same time at the production line. And something like memory-driven solutions helps us accomplish that.

It's one thing to have higher-performance computing, but another thing to gain edge computing that's proper for the factory environment. A manufacturing environment is not conducive to standard servers, or a standard rack that needs dust protection and heat protection -- that doesn't exist in a manufacturing environment.

The other thing we're beginning to see with edge computing, that HPE provides with Edgeline products, is that we have computers that have high power, high ability to perform the analytics and data collection capabilities -- but they're also proper for the environment.

I don't need to build out a special protection unit with special temperature control, humidity control – all of which drives up energy costs, which drives up total costs. Instead, we’re able to run edge computing in the environment as it should be on its own, protected from what comes in a manufacturing environment -- and that's huge for us.

Gardner: They are engineering these systems now with such ruggedized micro facilities in mind. It's quite impressive that the very best of what a data center can do, can now be brought to the very worst types of environments. I'm sure we'll see more of that, and I am sure we'll see it get even smaller and more powerful.

Do you have any examples of where you have already been able to take IoT in the confluence of OT and IT to a point where you can demonstrate entirely new types of benefits? I know this is still early in the game, but it helps to demonstrate what you can do in terms of efficiency, productivity, and analytics. What are you getting when you do this well?

IoT insights save time and money

Hester: Taking the stepped strategy that we have, we actually started at Hirotec very small with only eight machines in North America and we were just looking to see if the machines are on, are they running, and even from there, we saw a value because all of a sudden we were getting that real-time contextualized insight into the whole facility. We then quickly moved over to one of our production facilities in Japan, where we have a brand-new robotic inspection system, and this system uses vision sensors, laser sensors, force sensors -- and it's actually inspecting exhaust systems before they leave the facility.

We very quickly implemented an IoT solution in that area, and all we did was we said, “Hey, we just want to get insight into the data, so we want to be able to see all these data points. Over 400 data points are created every inspection. We want to be able to see this data, compared in historical ways -- so let’s bring context to that data, and we want to provide it in real-time.”
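
One way to picture "bringing context" to a single inspection data point is to compare it with its own history. The Python sketch below is an illustration only, not Hirotec's method: the function name `contextualize`, the z-score test, the threshold of 3.0, and the sample values are all assumptions.

```python
from statistics import mean, pstdev


def contextualize(reading: float, history: list[float], threshold: float = 3.0) -> dict:
    """Compare one inspection reading with the historical values for the same data point."""
    baseline = mean(history)
    spread = pstdev(history)
    # Guard against a perfectly flat history, where the z-score is undefined.
    z = 0.0 if spread == 0 else (reading - baseline) / spread
    return {
        "reading": reading,
        "baseline": baseline,
        "z_score": z,
        # Flag readings that sit far outside the historical spread (assumed rule).
        "drifting": abs(z) > threshold,
    }


# Made-up historical values for one of the inspection data points.
history = [10.0, 10.2, 9.8, 10.1, 9.9]
ok = contextualize(10.05, history)
odd = contextualize(11.5, history)
```

Here `ok` is not flagged and `odd` is, because 11.5 lies far outside the spread of the sample history. Repeating the comparison for each data point in an inspection would give the kind of historical view described above, rather than a simple pass or fail.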

What we found from just those two projects very quickly is that we're bringing value to the organization because now our teams can go in and say, “Okay, the system is doing its job, it's inspecting things before they leave our facility to make sure our customers always get a high-quality product.” But now, we’re able to dive in and find different trends that we weren't able to see before because all we were doing is saying, “Okay, this system leaves the facility or this system doesn't.”

And so already just from that application, we’ve been able to find ways that our engineers can even increase the throughput and the reliability of the system because now they have these historical trends. They were able to do a root-cause analysis on some improvements that would have taken months of investigation; it was completed in less than a week for us.

And so that's a huge value -- not only in that my project costs go down but now I am able to impact the organization quicker, and that's the big thing that Hirotec is seeing. It’s one thing to talk about the financial cost of a project, or I can say, “Okay, here is the financial impact,” but what we are seeing is that we’re moving quicker.

And so, we're having long-term financial benefits because we’re able to react to things much faster. In this case, we’re able to reduce months of investigation down to a week. That means that when I implement my solution quicker, I'm now bringing that impact to the organization even faster, which has long-term benefits. We are already seeing those benefits today.

Gardner: You’ll obviously be able to improve quality, you’ll be able to reduce the time to improving that quality, gain predictive analytics in your operations, but also it sounds like you are going to gain metadata insights that you can take back into design for the next iteration of not only the design for the parts but the design for the tooling as well and even the operations around that. So that intelligence at the edge can be something that is a full lifecycle process, it goes right back to the very initiation of both the design and the tooling.

Data-driven design, decisions 

Hester: Absolutely. These solutions can't live in a silo. We're really starting to look at these ideas of what some people call the Digital Thread, the Digital Twin. We’re starting to understand what that means as we loop this data back to our engineering teams -- what kind of benefits can we see, how can we improve our processes, how can we drive this out into the organization?

And one of the biggest things with IoT-based solutions is that they can't stay inside this box. We talked about OT and IT; we are also talking about manufacturing and engineering. At their best, all these IoT solutions really do is bring these groups together -- and bring a whole organization together -- with more contextualized data to make better decisions faster.

And so, exactly to your point, as we are looping back, we’re able to start understanding the benefit we’re going to be seeing from bringing these teams together.

Gardner: One last point before we close out. It seems to me as well that at a macro level, this type of data insight and efficiency can be brought into the entire supply chain. As you're providing certain elements of an automobile, other suppliers are providing what they specialize in, too, and having that quality control and integration and reduced time-to-value or mean-time-to-resolution of the production issues, and so forth, can be applied at a macro level.

So how does the automotive supplier itself look at this when it can take into consideration what all of its suppliers, like Hirotec, are doing?

Start small 

Hester: It's a very early phase, so a lot of the suppliers are starting to understand what this means for them. There is definitely a macro benefit that the industry is going to see in five to 10 years. Suppliers now need to start small. One of my favorite pictures is a picture of the ocean and a guy holding a lighter. It [boiling the ocean] is not going to happen. So we see these huge macro benefits of where we’re going, but we have to start out somewhere.

What we’re recommending to a lot of suppliers is to do the same thing we did: start small with a couple of machines, start getting that data visualized, and start pulling that data into the organization. Once you do that, you start benefiting from the data, and then you start finding new use-cases.

As these suppliers all start doing their own small projects and working together, I think that's when we are going to start to see the macro benefits across the industry, about five to 10 years out.

Tuesday, August 15, 2017

DreamWorks Animation crafts its next era of dynamic IT infrastructure

The next BriefingsDirect Voice of the Customer thought leader interview examines how DreamWorks Animation is building a multipurpose, all-inclusive, and agile data center capability.

Learn here why a new era of responsive and dynamic IT infrastructure is demanded, and how one high-performance digital manufacturing leader aims to get there sooner rather than later. 

Listen to the podcast. Find it on iTunes. Get the mobile app. Read a full transcript or download a copy.

Here to describe how an entertainment industry innovator leads the charge for bleeding-edge IT-as-a-service capabilities is Jeff Wike, CTO of DreamWorks Animation in Glendale, California. The discussion is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us why the older way of doing IT infrastructure and hosting apps and data just doesn't cut it anymore. What has made that run out of gas?

Wike: You have to continue to improve things. We are in a world where technology is advancing at an unbelievable pace. The amount of data, the capability of the hardware, the intelligence of the infrastructure are coming. In order for any business to stay ahead of the curve -- to really drive value into the business -- it has to continue to innovate.

Gardner: IT has become more pervasive in what we do. I have heard you all refer to yourselves as digital manufacturing. Are the demands of your industry also a factor in making it difficult for IT to keep up?

Wike: When I say we are a digital manufacturer, it’s because we are a place that manufactures content, whether it's animated films or TV shows; that content is all made on the computer. An artist sits in front of a workstation or a monitor, and is basically building these digital assets that we put through simulations and rendering so in the end it comes together to produce a movie.

That's all about manufacturing, and we actually have a pipeline, but it's really like an assembly line. I was looking at a slide today about Henry Ford coming up with the first assembly line; it's exactly what we are doing, except instead of adding a car part, we are adding a character, we’re adding a hair to a character, we’re adding clothes, we’re adding an environment, and we’re putting things into that environment.

We are manufacturing that image, that story, in a linear way, but also in an iterative way. We are constantly adding more details as we embark on that process of three to four years to create one animated film.

Gardner: Well, it also seems that we are now taking that analogy of the manufacturing assembly line to a higher plane, because you want to have an assembly line that doesn't just make cars -- it can make cars and trains and submarines and helicopters, but you don't have to change the assembly line, you have to adjust and you have to utilize it properly.

So it seems to me that we are at perhaps a cusp in IT where the agility of the infrastructure and its responsiveness to your workloads and demands is better than ever.

Greater creativity, increased efficiency

Wike: That's true. If you think about this animation process or any digital manufacturing process, one issue that you have to account for is legacy workflows, legacy software, and legacy data formats -- all these things are inhibitors to innovation. There are a lot of tools. We actually write our own software, and we’re very involved in projects related to computer science at the studio.

We’ll ask ourselves, “How do you innovate? How can you change your environment to be able to move forward and innovate and still carry around some of those legacy systems?”

And one of the things we’ve done over the past couple of years is start to re-architect all of our software tools in order to take advantage of massive multi-core processing to try to give artists interactivity into their creative process. It’s about iterations: How many things can I show a director? How quickly can I create the scene and get it approved so that I can hand it off to the next person? There are two things that you get out of that.

One, you can explore more and you can add more creativity. Two, you can drive efficiency, because it's all about how much time, how many people are working on a particular project and how long does it take, all of which drives up the costs. So you now have these choices where you can add more creativity or -- because of the compute infrastructure -- you can drive efficiency into the operation.

So where does the infrastructure fit into that, because we talk about tools and the ability to make those tools quicker, faster, more real-time? We conducted a project where we tried to create a middleware layer between running applications and the hardware, so that we can start to do data abstraction. We can get more mobile as to where the data is, where the processing is, and what the systems underneath it all are. Until we could separate the applications through that layer, we weren’t really able to do anything down at the core.

Core flexibility, fast

Now that we have done that, we are attacking the core. When we look at our ability to replace that with new compute, and add the new templates with all the security in it -- we want that in our infrastructure. We want to be able to change how we are using that infrastructure -- examine usage patterns, the workflows -- and be able to optimize.

Before, if we wanted to do a new project, we’d say, “Well, we know that this project takes x amount of infrastructure. So if we want to add a project, we need 2x,” and that makes a lot of sense. So we would build to peak. If at some point in the last six months of a show, we are going to need 30,000 cores to be able to finish it in six months, we say, “Well, we better have 30,000 cores available, even though there might be times when we are only using 12,000 cores.” So we were buying to peak, and that’s wasteful.
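To make the build-to-peak arithmetic concrete, here is a minimal Python sketch. Only the 30,000-core peak and the 12,000-core low point come from the interview; the month-by-month demand profile and the function names are assumptions made up for illustration, not DreamWorks data or tooling.

```python
# Illustrative only: the monthly demand profile is invented, apart from the
# 30,000-core peak and 12,000-core low point mentioned in the interview.
PEAK_CORES = 30_000

def idle_core_months(monthly_demand, provisioned):
    """Core-months left unused when capacity is fixed at `provisioned`."""
    return sum(provisioned - used for used in monthly_demand)

def utilization(monthly_demand, provisioned):
    """Fraction of the provisioned core-months that is actually consumed."""
    return sum(monthly_demand) / (provisioned * len(monthly_demand))

# Hypothetical six-month ramp from the valley up to the peak.
demand = [12_000, 12_000, 18_000, 24_000, 30_000, 30_000]

print(idle_core_months(demand, PEAK_CORES))       # 54000
print(round(utilization(demand, PEAK_CORES), 2))  # 0.7
```

Under these assumed numbers, roughly 30 percent of the purchased capacity sits idle, which is the kind of valley that could in principle host a different type of project.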

What we wanted was to be able to take advantage of those valleys, if you will, as an opportunity -- the opportunity to do other types of projects. But because our infrastructure was so homogeneous, we really didn't have the ability to do a different type of project. We could create another movie if it was very much the same as a previous film from an infrastructure-usage standpoint.

By now having composable, or software-defined infrastructure, and being able to understand what the requirements are for those particular projects, we can recompose our infrastructure -- parts of it or all of it -- and we can vary that. We can horizontally scale and redefine it to get maximum use of our infrastructure -- and do it quickly.

Gardner: It sounds like you have an assembly line that’s very agile, able to do different things without ripping and replacing the whole thing. It also sounds like you gain infrastructure agility to allow your business leaders to make decisions such as bringing in new types of businesses. And in IT, you will be responsive, able to put in the apps, manage those peaks and troughs.

Does having that agility not only give you the ability to make more and better movies with higher utilization, but also give perhaps more wings to your leaders to go and find the right business models for the future?

Wike: That’s absolutely true. We certainly don't want to ever have a reason to turn down some exciting project because our digital infrastructure can’t support it. I would feel really bad if that were the case.

In fact, that was the case at one time, way back when we produced Spirit: Stallion of the Cimarron. Because it was such a big movie from a consumer products standpoint, we were asked to make another movie for direct-to-video. But we couldn't do it; we just didn’t have the capacity, so we had to just say, “No.” We turned away a project because we weren’t capable of doing it. The time it would take us to spin up a project like that would have been six months.

The world is great for us today, because people want content -- they want to consume it on their phone, on their laptop, on the side of buildings and in theaters. People are looking for more content everywhere.

Yet projects for varied content platforms require different amounts of compute and infrastructure, so we want to be able to create content quickly and avoid building to peak, which is too expensive. We want to be able to be flexible with infrastructure in order to take advantage of those opportunities.

Gardner: How is the agility in your infrastructure helping you reach the right creative balance? I suppose it’s similar to what we did 30 years ago with simultaneous engineering, where we would design a physical product for manufacturing, knowing that if it didn't work on the factory floor, then what's the point of the design? Are we doing that with digital manufacturing now?

Artifact analytics improve usage, rendering

Wike: It’s interesting that you mention that. We always look at budgets, and budgets can be money budgets, they can be rendering budgets, they can be storage budgets, and networking -- I mean all of those things are commodities that are required to create a project.

Artists, managers, production managers, directors, and producers are all really good at managing those projects if they understand what the commodity is. Years ago we used to complain about disk space: “You guys are using too much disk space.” And our production department would say, “Well, give me a tool to help me manage my disk space, and then I can clean it up. Don’t just tell me it's too much.”

One of the initiatives that we have incorporated in recent years is in the area of data analytics. We re-architected our software and we decided we would re-instrument everything. So we started collecting artifacts about rendering and usage. Every night we ran every digital asset that had been created through our rendering, and we also collected analytics about it. We now collect 1.2 billion artifacts a night.

And we correlate that information to a specific asset, such as a character, basket, or chair -- whatever it is that I am rendering -- as well as where it’s located, which shot it’s in, which sequence it’s in, and which characters are connected to it. So, when an artist wants to render a particular shot, we know what digital resources are required to be able to do that.

One of the things that’s wasteful of digital resources is either having a job that doesn't fit the allocation that you assign to it, or not knowing when a job is complete. Some of these rendering jobs and simulations will take hours and hours -- it could take 10 hours to run.

At what point is it stuck? At what point do you kill that job and restart it because something got wedged and it was a dependency? And you don't really know, you are just watching it run. Do I pull the plug now? Is it two minutes away from finishing, or is it never going to finish?

Just the facts

Before, an artist would go in every night and conduct a test render. And they would say, “I think this is going to take this much memory, and I think it's going to take this long.” And then we would add a margin of error, because people are not great judges, as opposed to a computer. This is where we talk about going from feeling to facts.

So now we don't have artists do that anymore, because we are collecting all that information every night. We have machine learning that then goes in and determines requirements. Even though a certain shot has never been run before, it is very similar to another previous shot, and so we can predict what it is going to need to run.

Now, if a job is stuck, we can kill it with confidence. By doing that machine learning and taking the guesswork out of the allocation of resources, we were able to save 15 percent of our render time, which is huge.
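One way to picture predicting a new shot's needs from a similar earlier shot is a nearest-neighbor lookup. The Python sketch below is only that: the feature vectors, the history records, the 1.5x safety margin, and the function names are all hypothetical, and the interview does not say which machine learning technique the studio actually uses.

```python
import math

# Hypothetical history of rendered shots:
# (features, observed render hours, observed memory in GB).
# The two features are invented, e.g. (character count, assets in thousands).
HISTORY = [
    ((2.0, 10.0), 4.0, 32.0),
    ((5.0, 40.0), 10.0, 96.0),
    ((3.0, 12.0), 5.0, 40.0),
]

def predict(features, history=HISTORY):
    """Return (hours, memory_gb) of the most similar previously rendered shot."""
    nearest = min(history, key=lambda row: math.dist(row[0], features))
    return nearest[1], nearest[2]

def looks_stuck(elapsed_hours, predicted_hours, margin=1.5):
    """Flag a job that has run longer than the prediction times an assumed margin."""
    return elapsed_hours > predicted_hours * margin

hours, memory_gb = predict((3.0, 11.0))
print(hours, memory_gb)           # 5.0 40.0
print(looks_stuck(8.0, hours))    # True
print(looks_stuck(6.0, hours))    # False
```

With a prediction in hand, "is it two minutes from finishing or never going to finish?" becomes a comparison against an expected runtime rather than a guess.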

I recently listened to a gentleman talk about what a difference of 1 percent improvement would be. So 15 percent is huge, that's 15 percent less money you have to spend. It's 15 percent faster time for a director to be able to see something. It's 15 percent more iterations. So that was really huge for us.

Gardner: It sounds like you are in the digital manufacturing equivalent of working smarter and not harder. With more intelligence, you can free up the art, because you have nailed the science when it comes to creating something.

Creative intelligence at the edge

Wike: It's interesting; we talk about intelligence at the edge and the Internet of Things (IoT), and that sort of thing. In my world, the edge is actually an artist. If we can take intelligence about their work, the computational requirements that they have, and if we can push that data -- that intelligence -- to an artist, then they are actually really, really good at managing their own work.

It's only a problem when they don't have any idea that six months from now it's going to cause a huge increase in memory usage or render time. When they don't know that, it's hard for them to be able to self-manage. But now we have artists who can access Tableau reports every day and see exactly what the memory usage was or the compute usage of any of the assets they’ve created, and they can correct it immediately.

Megamind, a film DreamWorks Animation released several years ago, was made before we had the data analytics in place, and the studio encountered massive rendering spikes on certain shots. We really didn't understand why.

After the movie was complete, when we could go back and get printouts of logs to analyze, we determined that these peaks in rendering resources were caused by the main character’s watch. Whenever that watch was in a frame, the render times went up. We looked at the models, and well-intended artists had taken a model of a watch and modeled every gear, so it was just a huge, heavy asset to render.

But it was too late to do anything about it. But now, if an artist were to create that watch today, they would quickly find out that they had really over-modeled that watch. We would then need to go in and reduce that asset down, because it's really not a key element to the story. And they can do that today, which is really great.
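The watch story suggests a simple check: compare render times for frames that contain an asset against frames that do not. The Python sketch below illustrates that idea under stated assumptions; the frame records, asset names, and timings are invented, and this is not a description of the studio's actual analytics pipeline.

```python
# Hypothetical per-frame records: (assets visible in the frame, render minutes).
FRAMES = [
    ({"hero", "city"}, 20.0),
    ({"hero", "city", "watch"}, 65.0),
    ({"hero", "watch"}, 60.0),
    ({"city"}, 15.0),
    ({"hero"}, 18.0),
    ({"hero", "city"}, 22.0),
]

def mean(values):
    return sum(values) / len(values)

def render_cost_ratio(frames):
    """For each asset, mean render time with it in frame divided by mean without it."""
    all_assets = set().union(*(assets for assets, _ in frames))
    ratios = {}
    for asset in all_assets:
        with_asset = [m for assets, m in frames if asset in assets]
        without = [m for assets, m in frames if asset not in assets]
        if with_asset and without:  # need both groups to compare
            ratios[asset] = mean(with_asset) / mean(without)
    return ratios

ratios = render_cost_ratio(FRAMES)
heaviest = max(ratios, key=ratios.get)
print(heaviest)  # watch
```

A ratio well above 1 does not prove an asset is over-modeled, since heavy assets may simply appear in complex shots, but it points an artist at what to inspect first.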

Gardner: I am a big fan of animated films, and I am so happy that my kids take me to see them because I enjoy them as much as they do. When you mention an artist at the edge, it seems to me it’s more like an army at the edge, because I wait through the end of the movie, and I look at the credits scroll -- hundreds and hundreds of people at work putting this together.

So you are dealing with not just one artist making a decision, you have an army of people. It's astounding that you can bring this level of data-driven efficiency to it.

Movie-making’s mobile workforce

Wike: It becomes so much more important, too, as we become a more mobile workforce. 

Now it becomes imperative to be able to obtain the information about what those artists are doing so that they can collaborate. We know what value we are really getting from that, and so much information is available now. If you capture it, you can find so many things that we can really understand better about our creative process to be able to drive efficiency and value into the entire business.

Gardner: Before we close out, maybe a look into the crystal ball. With things like auto-scaling and composable infrastructure, where do we go next with computing infrastructure? As you say, it's now all these great screens in people's hands, handling high-definition, all the networks are able to deliver that, clearly almost an unlimited opportunity to bring entertainment to people. What can you now do with the flexible, efficient, optimized infrastructure? What should we expect?

Wike: There's an explosion in content and an explosion in delivery platforms. We are exploring all kinds of different mediums. I mean, there’s really no limit to where and how one can create great imagery. The ability to do that, the ability to not say “No” to any project that comes along, is going to be a great asset.

We always say that we don't know in the future how audiences are going to consume our content. We just know that we want to be able to supply that content and ensure that it’s the highest quality that we can deliver to audiences worldwide.

Gardner: It sounds like you feel confident that the infrastructure you have in place is going to be able to accommodate whatever those demands are. The art and the economics are the variables, but the infrastructure is not.

Wike: Having a software-defined environment is essential. I came from the software side; I started as a programmer, so I am coming back into my element. I really believe that now that you can compose infrastructure, you can change things with software without having to have people go in and rewire or re-stack, but instead change on-demand. And with machine learning, we’re able to learn what those demands are.

I want the computers to actually optimize and compose themselves so that I can rest knowing that my infrastructure is changing, scaling, and flexing in order to meet the demands of whatever we throw at it.