Wednesday, September 24, 2014

University of New Mexico delivers efficient IT services by centralizing on secure, managed cloud automation

The latest BriefingsDirect discussion focuses on one of the toughest balancing acts in seeking the best of cloud computing benefits. This balance comes from obtaining the proper degree of centralization or "common good" for infrastructure efficiency, while preserving a sufficient culture of decentralization for agility, innovation, and departmental-level control.

The requirement for empowering centralization is no more evident than in a large university setting, where support and consensus must be preserved among such constituencies as faculty, staff, students, and researchers -- across an expansive educational community.

But the typical IT model does not support localized agility when it takes weeks to spin up a server, if online services lack automation, or if manual processes hold back efficient ongoing IT operations. Too much IT infrastructure redundancy also means weak security, high costs, lack of agility, and slow upgrades.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

We're joined by an IT executive from the University of New Mexico (UNM) to learn more about moving to a streamlined and automated private cloud model to gain a common good benefit, while maintaining a vibrant and reassured culture of innovation. We're also joined by a VMware executive to learn more about the latest ways to manage cloud architectures and processes to attain the best of cloud efficiencies, while empowering improved services delivery and process agility.

They are: Brian Pietrewicz, Director of Computing Platforms at the University of New Mexico in Albuquerque, and Kurt Milne, Director of Product Marketing in the Management Business Unit at VMware. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about your IT organization at the university and how you've been able to drive change, but at the same time not alienate your users, who are, I imagine, used to having things their way.

Pietrewicz: At the University of New Mexico, it's a highly decentralized organization. In most cases, the departments are responsible for their own IT. In most cases, that means they don't have the resources to effectively run IT, in particular, things like data centers, servers, storage, disaster recovery (DR), and backups.

What we're doing to improve the process is providing infrastructure as a service (IaaS) to those groups so that they don’t have to worry about the heavy lifting of the infrastructure pieces that I mentioned before. They can stay focused on their core mission, whether that’s physics, or psychology, or who knows what.

So we offer IaaS. We're running a VMware stack, and we're also running vCloud Automation Center (vCAC). We've deployed the Self-Service Portal. We give departments, faculty members, or departmental IT folks the ability to go into the portal and deploy their own machines at will.

Then, they are administrators of that machine. They also have additional management features through the vCAC console so that they can effectively do whatever they need to do with the server, but not have to worry about any of the underlying infrastructure.

Gardner: That sounds like the best of both worlds. In a sense, you're a service provider in the organization, getting the benefits of centralization and efficiency, but allowing them to still have a lot of hands-on control, which I assume that they want.

Pietrewicz: Correct. The other part is the agility, the ability for them to be able to react quickly, to consume infrastructure on demand as they need it, and have the benefit of all the things that virtualization brings with redundant infrastructure, lower cost of ownership, and those sorts of things.

New expectations

Milne: It’s an interesting time to be in the IT space, because there's this new set of expectations being imposed on IT by the business to be strategic, to quickly adopt new technology, and boost innovation.

At the same time, IT still has the full set of responsibilities they've always had -- to stay secure, to avoid legacy debt, to drive operational excellence so they maintain uptime, security, and quality of service for transactional systems and business-critical systems.

It’s really an interesting paradox. How do you do these two things that are seemingly mutually exclusive -- go fast, but at the same time, stay in control?

Brian’s approach is what I call "push button IT," where you give folks a button to push and they get what they need when they want it. But if IT controls the button and controls what happens when the user pushes it, IT is able to maintain control. It’s really the best of both worlds.

Gardner: Brian, tell us a little bit about how long you have been there and what it was like before you began this journey?

Pietrewicz: I've been at UNM for about two-and-a-half years, and I can tell you the number one complaint. We suffer from a lot of the same problems that other large IT shops have, with funding and things like that. But the primary issue that we had when I walked in the door was customers being upset because we didn't have clearly-defined services, and we had sold these services to these customers.

We had sold virtual machines (VMs) with database backups, and all kinds of interesting things, with no service-level agreements (SLAs), no processes, nothing wrapped around it. The delivery of these services was completely inconsistent.

So I started out down the new path. The first thing that we did was to make the services more consistent. Just to give you an example, deploying a virtual machine for a customer. The way that it was when I got here was that a ticket came into the service desk. It went to a single technician, and then whichever technician got that ticket figured out their own way of getting that machine deployed.

As the next step in that process, instead of having it done a different way by whoever received the ticket, we identified all the steps involved. In doing so, we found more than 100 manual steps that went through six completely separate groups inside our organization.

Those included operating system, storage, virtualization, security, and networking for firewall changes. In all those various groups that deploy their individual piece of that puzzle, it was being done differently every time. Our deployment times were taking as long as three weeks. You can imagine how painful that is when it takes 20 minutes to spin up a VM -- but it was taking three weeks to deploy it to a customer.

We identified all the steps and defined the process very, very clearly; exactly what it takes to deploy a VM. The interesting thing that came out of that was that it gave us the content necessary to be able to start developing a true service description and an SLA.

Ticketing system

It also made it so that it was consistent. We did a few things after we did the process development. We generated workflows within our ticketing system, so that all that happened was a ticket was put in and then it auto-generated all the necessary tickets to deploy the VM, so it happened in a very consistent way.

That dropped the deployment time from three weeks down to about three days, because it still had to go through certain approval processes and things like that with security.
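
The ticket fan-out described above can be sketched roughly as follows. This is a minimal illustration, not UNM's actual ticketing workflow: the `Ticket` class, the `fan_out` function, and the group list (taken from the groups named earlier in the discussion) are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional, Tuple

# Groups named in the discussion; the real workflow may involve others.
GROUPS = ["operating system", "storage", "virtualization", "security", "networking"]

_ticket_ids = count(1)  # simple in-memory ID generator (assumption)


@dataclass
class Ticket:
    summary: str
    group: str
    id: int = field(default_factory=lambda: next(_ticket_ids))
    parent: Optional[int] = None  # ID of the originating request, if any


def fan_out(request_summary: str) -> Tuple[Ticket, List[Ticket]]:
    """Create one parent ticket plus one child ticket per group.

    Every request produces the same set of child tickets, which is what
    makes the deployment consistent from one request to the next.
    """
    parent = Ticket(request_summary, "service desk")
    children = [
        Ticket(f"{request_summary}: {group} step", group, parent=parent.id)
        for group in GROUPS
    ]
    return parent, children


if __name__ == "__main__":
    parent, children = fan_out("Deploy VM for a department")
    for child in children:
        print(child.id, child.group, "-> parent", child.parent)
```

The point of the sketch is only that a single incoming request deterministically produces the same downstream work items, rather than leaving the sequence to whichever technician picks up the ticket.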

For the next step we said, "Okay, how can we do this better?" We looked at all of those steps that we put in place and found that they were all repetitive, manual steps that could be easily automated. Enter VMware vCAC.

We took all the steps, after we had them clearly defined, and we automated all the steps that we could. We couldn’t automate all of them, for example, sending information to our billing system to bill the customer back. From vCAC we shoot an email over to our ticketing system, that generates a ticket. Then, the billing information is still entered manually, and we are working on an upgrade to that.

UNM is approximately 45,000 faculty, staff, and students. We have about 100 either departments or affiliates, and today, we're running about 660 VMs for our organization. For central IT, we're between 98 percent and 99 percent virtualized.

When I first got here, the services were not defined and the processes were not defined. Since then, we have clearly defined the processes, narrowed those down into the very specific processes and tasks that had to be done, and then we automated. We're going through the process of automating every step in that process.

Now, we have a thing we call Lobo Cloud -- our mascot is the Lobo. Customers can now go online and deploy a machine within 20 minutes. So basically everything has transformed from extremely inconsistent service that took as long as three weeks to deploy, to the equivalent of going into McDonald’s and ordering a Big Mac. It’s extremely consistent and down from three weeks to 20 minutes.

Gardner: I assume Brian that you've adopted some industry-standard methods, perhaps a framework, that gave you some guidance on this. How does your service delivery policy adhere to an industry standard like ITIL?

Pietrewicz: That’s what we use. We follow ITIL and we're at varying levels of maturity with it. ITIL is very challenging to implement, but it's extremely helpful, because it gives you a framework to work within, to start narrowing down these processes, defining services, setting SLAs. It gives you a good overarching framework to work within.

The absolute hardest part of all of this is implementing the ITIL framework, identifying your processes, identifying what your service is, and identifying your SLA. Walking through all of that is exponentially harder than putting the technology in place.

Gardner: It seems to me that not only are you going to get faster servers, response times, and automation, but there are some other significant benefits to this approach. I'm thinking about security, disaster recovery (DR), the ability to budget better through an OPEX model, and then ultimately reduce total costs.

Is it too soon, or have you seen some of these other benefits that I typically hear about when people move to a more automated cloud approach? How is that working for you?

Less expensive

Pietrewicz: We don’t really have good statistics on it. For the folks that had machines sitting underneath their desks and in closets before, we don’t have a lot of the statistics to know exactly the cost and the time they were spending on that.

Anybody who works with virtualization quickly learns that once you hit a certain size, it becomes significantly less expensive. You become far more agile and you get a huge number of benefits. Some of them are things that you mentioned -- the deployment time, DR, the ability to automate, the taking advantage of economies of scale.

Instead of deploying one $10,000 server per application, you're now loading up 70 machines on a $15,000 server. All of those things come into play. But we really don’t have good statistics, because we didn’t really have any good processes before we started.
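
The consolidation arithmetic in that example can be worked through in a few lines. This is a hedged, hardware-only sketch using the figures quoted above ($10,000 per dedicated server, 70 machines on a $15,000 host); it assumes every host is filled to 70 VMs and ignores licensing, storage, power, and staff costs, which are not given in the discussion. The function names are illustrative.

```python
import math

DEDICATED_SERVER_COST = 10_000  # one physical server per application (quoted figure)
HOST_COST = 15_000              # one virtualization host (quoted figure)
VMS_PER_HOST = 70               # machines loaded per host (quoted figure)


def dedicated_cost(applications: int) -> int:
    """Hardware cost when each application gets its own server."""
    return applications * DEDICATED_SERVER_COST


def virtualized_cost(applications: int) -> int:
    """Hardware cost when applications are packed onto shared hosts."""
    hosts = math.ceil(applications / VMS_PER_HOST)
    return hosts * HOST_COST


if __name__ == "__main__":
    for apps in (70, 660):  # 660 is the VM count cited for UNM
        print(apps, dedicated_cost(apps), virtualized_cost(apps))
    # 70 applications:  $700,000 dedicated vs $15,000 virtualized
    # 660 applications: $6,600,000 dedicated vs $150,000 virtualized (10 hosts)
```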

What’s interesting now is that our next step in the process is to automate our billing process. Once we do that, we're going to have everything from our virtual infrastructure deployed into our billing system and either a charge-back or a show-back methodology.

So we'll have complete detailed costs of all of our infrastructure associated with every department and every application that is using our service. We'll be able to really show the total cost of ownership (TCO).
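
A show-back report of the kind described here could be as simple as rolling up per-VM charges by department and application. The sketch below is only an assumption about what such a roll-up might look like; the record layout, the `show_back` function, and the sample departments and amounts are hypothetical and are not UNM data.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (department, application, charge in dollars) -- hypothetical record layout
Record = Tuple[str, str, int]


def show_back(records: Iterable[Record]):
    """Total charges per department and per (department, application)."""
    by_department: Dict[str, int] = defaultdict(int)
    by_application: Dict[Tuple[str, str], int] = defaultdict(int)
    for department, application, charge in records:
        by_department[department] += charge
        by_application[(department, application)] += charge
    return dict(by_department), dict(by_application)


if __name__ == "__main__":
    # Illustrative sample records only.
    sample = [
        ("physics", "lab-data", 1000),
        ("physics", "web", 1000),
        ("psychology", "survey", 1000),
    ]
    departments, applications = show_back(sample)
    print(departments)
    print(applications)
```

Whether the totals are actually billed (charge-back) or only reported (show-back) is a policy decision layered on top of the same roll-up.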

Milne: Brian, it sounds like you're on a path that a lot of our customers are on. What we see typically is that there is a change in consumption behavior when your customers know that they can get IaaS on demand. They stop hoarding resources. The same kind of tools and processes that can automate the delivery of those services can also automate tearing down those services when they're done.

Virtualization by itself increases capacity utilization quite a bit, but then going to this kind of services delivery, service consumption for infrastructure, actually further increases utilization and drives down over-provisioning.

Adding that cost transparency to that service will further change your consumers' behavior, and the ability to get it when you need it, and only pay for what you use, drives down the amount of resources that you have to keep in your data center.

Pietrewicz: Absolutely. It’s amazing what happens when you have to pay for something and it’s very visible.

Milne: I always feel that if IT is free that really changes the supply and demand equation, if you study economics. People don’t know what to do with free. They typically take too much.

Economic behavior

Pietrewicz: Right. This really starts driving basic economic and social behavior into the equation in IT. It’s a difficult thing for organizations to get their head around, and they're sort of getting it here at the university. It’s not completely in place. The way that we look at it is as a, "We'll build it, and they'll come" kind of thing.

Most folks have figured out that they can really save that money. Instead of going out and buying a $10,000 server, they can buy a $1,000 VM from us that does the exact same thing. If they don’t want it any more, they can turn it off and not pay any more. All of those things come into play.

Another piece on that is the university was experimenting with a thing called responsibility center management (RCM), which is a budgeting process that works toward the bottom line of a particular organization. That means that people have to be transparent and make clear decisions about where they're spending their money. That's also starting to drive adoption.

Ancillary benefits

Gardner: We talked about some of the ancillary benefits of your approach, but there are some direct benefits when you go to a cloud model, which gives you more options. You can have your private cloud. You can look to public cloud and other hosting models, and then you can start to see a path or a vision towards a hybrid cloud environment, where you might actually move workloads around based on the right infrastructure approach for the right job at the right time. Any thoughts about where your cloud goals are vis-à-vis the hybrid potential?

Pietrewicz: We have a few things in play that we're actively working. Today, we have people using various cloud providers. The interesting part about that is that they're just paying for it with a credit card out of their department, and the university doesn’t have any clear way of knowing exactly what’s out there. We don’t really have any good security mechanisms in place for determining whether there's any sensitive data being stored out there inadvertently.

We're working with a lot of the cloud providers that we are already spending money with and we are already working with to develop consolidated accounts. One, we can save money through economies of scale. And two, we can get some visibility into what folks are actually using the cloud for. And then three, IT would like to act as an adviser to be able to point out for the various cloud providers that are out there -- this particular provider is good at functionality or this particular provider is good at security.

The first step is to corral the use of public cloud for UNM and create an escorting process to the cloud. The second step is going to be a hybrid cloud that we'll set up from our private cloud here on site. We envision setting up hybrid cloud services with those public cloud providers to be able to move the workloads back and forth when necessary.

The other major benefit that we very much look forward to is being able to do DR in the cloud and taking advantage of the ability to replicate data and then spin up systems as you need them, rather than having a couple of million dollars in equipment sitting, waiting, and hoping you never use it -- equipment that you have to refresh every four years so that you have a viable DR plan.

Gardner: Is vCloud Automation Center something that will be useful in moving to this hybrid model? The one button to push, as it were, on the private cloud, will that become a one button to push in the hybrid model as well?

Pietrewicz: It will. I mentioned those various cloud service providers. Most of them are compatible with the vCloud Connector, so that you can simply just connect up that hybrid cloud service and with a little bit of work, be able to massage your portal.

We can have a menu option of public cloud providers through our portal that they could just select and say that they want to get a vCHS, Amazon, or Terremark, and then potentially move workloads back and forth. So vCAC and vCloud Connector are all at the center of it.

The other interesting piece that we're working on and going to try to figure out as part of this is that we really want to start looking into NSX and/or VIX to be able to provide very clear security boundaries, basically multi-tenancy, and then potentially be able to move those multi-tenant environments back and forth in the cloud or extend them from public to private cloud as well.

Software-defined networking

Gardner: Brian, you mentioned multi-tenancy earlier, and of course, there is a lot going on with software-defined data center, networking, and storage. What is it about it that’s interesting to you and why is this a priority for you, software-defined networking (SDN), for example?

Pietrewicz: SDN is the next sort of step in being able to truly automate your IaaS and your virtual environment. If you want to be able to dynamically deploy systems and have them be in a sandbox that is multi-tenant by customer, you really need to have an SDN-type solution, or at least that’s extremely helpful to do that.

One of the things that we are looking at next is to be able to implement something like NSX, so that we can deploy the equivalent of what’s a virtual wire, a multi-tenant environment, to individual customers, so that they can only see their stuff and can’t see their neighbors and vice versa.

The key is the ability to orchestrate that on demand and not have to deal with the legacy VLAN and firewall kind of issues that you have with the legacy environment.

Gardner: It’s interesting how a lot of these major trends -- service delivery, cloud, private cloud, DR, and SDN -- are interrelated. It’s a complex bundle, but the payoffs, when you do this inclusively, are pretty impressive.

Pietrewicz: Whenever you get to the point of abstracting things to the software level, you provide the ability to automate. When you have the ability to automate, you get tremendous flexibility. That sometimes can be an issue in and of itself, just making decisions on how you want to do something. But along with that flexibility, you get the ability to automate just about anything that you want or need to be able to do.

The second piece to that is that we're really excited about figuring out, when we build the hybrid cloud model, how we might be able to extend those tenants into the cloud, either as active running workloads or in a DR model, so that the multi-tenancy is retained.

Milne: From VMware’s perspective, that kind of network virtualization capability is critical for our hybrid cloud service. It’s that capability that NSX provides that creates that seamless experience from your data center out to the hybrid cloud.

As you said, Brian, that kind of network configuration, allocation, and reallocation of IP addresses, when you are moving things from one data center to another, is not something you want to do on a manual basis. So NSX is a key component of our hybrid cloud vision. It’s something that a lot of the other cloud providers just don’t have.

Pietrewicz: I see it as the next frontier in IT. I think that when SDN starts taking off, it’s going to be a game changer in ways that we are not even recognizing yet, and that’s one example. Moving a workload from one network to another network is extremely powerful.

Cloud broker

Gardner: Kurt, this sounds as if not only is Brian transitioning into being a service provider to his constituencies, but now he's also becoming a cloud broker. Is this typical of what you're seeing in the market as well?

Milne: It is. Some of our customers will take a step to try to get their arms around shadow IT, users going around IT, to just offer that provisioning option through the IT portal. So it’s like, "You're using Amazon? That’s fine. We can help you do that." So putting a button in the service catalog deploys the kind of work that they've been doing in a public cloud like Amazon, but it has to come through IT. Then, IT is aware of it.

There's a saying I like. It’s called the "cloud boomerang." A lot of times, the IT customers will put things out in the public cloud, but like a boomerang, it seems to always come back. The customer wants to integrate it with an existing system or they realize that they have to support it up in the cloud. A lot of times, those rogue deployments make their way back to the IT organization. So putting an Amazon service in the vCAC portal and not changing anything else is a nice first step in corralling that.

Pietrewicz: That is exactly what we're seeing. At a university, because there isn’t really governance, it’s more like build a good service and hope they come. We take the approach of trying to enable it. We want to make it very transparent and say that they can use Amazon or vCHS, but there's a better way to do it. If you do it through the portal, you may be able to move those workloads back and forth.

We are actually seeing exactly what you mentioned, Kurt. Folks are reaching the limitations of using some of the cloud providers, because they need to get access to data back here at UNM and are actually doing the boomerang approach. They started out there and now they're migrating their machines into our IaaS so that they can get access to the data that they need.

Gardner: Kurt, we heard some very interesting things at VMworld recently around the cloud-management platform. Why don’t you tell us a little bit about that and how that fits into what we've been discussing in terms of this ongoing maturity and evolution that a large organization like the University of New Mexico is well into?

Milne: We recently announced the vRealize Suite, which is a cloud management platform. So we're moving our product management strategy to a common platform.

Over the years, VMware has either built or acquired quite a few different management products. We've combined those products into a number of suites, like our automation, operations, and our business management suites. Now, we're taking that next step and combining a lot of those capabilities into a single platform.

There are a couple of guiding ideas there. What we see in organizations like Brian’s is that the lines between automation, the provisioning of those workloads, and the ongoing operations, maintenance, and support of those workloads are really starting to blur.

So you have automation tasks that might happen when you're doing a support call. Maybe you want to provision some more resources, and there are operations tasks like checking system health that you might want to do as a step in an automation routine.

Shared services

Our product strategy change is to move toward a shared-services model, similar to a service-oriented architecture. The different services that are underlying our management products would be executable through a tool like vCAC, through a command line interface, or through a REST API. There's kind of a mix-and-match opportunity to execute those services in different ways.

To build that platform with the shared service model on top, we need to start re-architecting some of our products in the back-end, so that we have a common orchestration engine, a common DR backup and a common policy engine. You don’t want one tool to undo the work that another tool did yesterday. You can’t have conflicting robots going out and doing automated tasks.

The general idea is to try to further consolidate these different management functions into a single platform. The overall goal is to try to help organizations maintain control, but then also increase flexibility and speed for their business users.

Gardner: Brian, is that something that you think is going to be on your radar? Is management so distributed now that you're looking for a more consolidated approach that’s inclusive?

Pietrewicz: That would be wonderful. We're doing things many different ways. If you take the example of orchestration, we are using Orchestrator, PowerShell, Perl, and starting to experiment with Puppet.

It would be really good if you could have one standardized way that you approach orchestration, as an example, and how that might tie into all the other pieces for back-end management, rather than handling it several different ways. As Kurt was mentioning, one part starts to step on another part. Having that be consolidated and consistent would be a huge value.

Milne: The other part of the strategy is also to make that work across environments. So the same tools and services would be available if you are provisioning up to Amazon or to your private cloud or hybrid cloud service, and even different hypervisors.

We're fully aware of the heterogeneous nature of the modern data center. So we're shifting to try to create that kind of powerful common management stack with that unified management experience across all of the environment. It’s kind of a nirvana. When we talk to people, they say that’s exactly what they want. So our vision is to kind of march towards delivering on that.

Gardner: Kurt, I am trying to recall from VMworld whether this was offered on-premises, as a service from a cloud, or some combination?

Service offerings

Milne: That’s the other interesting part of this. We're starting to go down the path of offering a number of our management products as a service. For example, at VMworld, we announced the availability of a beta for our vCAC product as software as a service (SaaS), so without installing any software you can get a service portal, get that workflow and policy engine, and deploy infrastructure services across different environments.

We'll be rolling out betas for our other products in subsequent quarters over the next year or so. Then potentially we could have the SaaS services interact with and combine with the services that are available through the products that are installed on-premises. Our goal is to get these out there and then understand what the best use cases are, but that kind of mix and match is part of the vision.

Gardner: It’s interesting. We might have a reverse boomerang when it comes to the management of all of this. Does that sound appealing Brian? Is that something you would look to as a cloud service, comprehensive management?

Pietrewicz: Absolutely, but it’s largely dependent on return on investment (ROI). It’s that balance of, when you get to a certain level in an IT shop, it’s sometimes cheaper to do things in-house than it is to outsource it, and sometimes not. You have to do the analysis on the ROI on what makes more sense to bring it in or to use a SaaS.

As an example, we completely outsourced all of our email, because it’s a lot of work. It's very simple and easy to do as a SaaS solution, but it’s a lot more work to do in-house. It’s definitely something that we would look into.

Milne: In a mid-sized organization that might have 300 different applications that the IT organization supports, maybe 50 of those are IT tools. Already we've seen progress with companies like ServiceNow that have a SaaS-based service desk. It makes sense to start to turn more of those management products into a SaaS delivery model.

Gardner: Brian, any thoughts about others who are starting to move in your direction, perhaps building their own Lobo Cloud, their own portal, rationalizing these services, and being able to measure them better? What in 20/20 hindsight do you have that you could recommend for them as they go about this? Any learned lessons you could share?

Process orientation

Pietrewicz: The biggest lesson learned, without a doubt, is the focus on the process orientation, the ITIL model. The technology is really not that hard. It’s determining what your service is, what are you trying to deliver, and then how do you build that into a consistently delivered service, complete with SLAs and service descriptions that meet the customer needs. That's the most difficult part.

The technical folks can definitely sling the technology. That doesn’t seem to be that big of a deal. The partners and providers do a very good job of putting together products that make it happen, but the hard part is defining the processes and defining the services and making sure that they are meeting the customer needs.

Gardner: Kurt, any thoughts in reaction to what Brian said in terms of getting started on the right path around cloud rationalization of your IT organization?

Milne: One of the things that I've seen is a lot of organizations go through this process that Brian has described, trying to clearly define their services and figure out which parts of those services they're going to automate.

A lot of organizations start that service definition effort from an inside-out perspective, get a bunch of IT guys together, and try to define what you do on a daily basis in a service. That's hard.

The easier approach is just to go talk to your customers and users and ask, "If I were going to give you a button you could click to get what you need, what would you put behind the button?" Then, you define your services more from an outside-in perspective. It seems to be where companies get to anyway, and you just shortcut a lot of teeth gnashing and internal meetings when you do it that way.

Sponsor: VMware.

Monday, September 22, 2014

The Open Group panel: Internet of things poses massive opportunities and obstacles

What The Open Group refers to as Open Platform 3.0 encompasses the combined impacts of cloud, big data, mobile, and social. But to each of these now we can add a new cresting wave of complexity and scale as we consider the rapid explosion of new devices, sensors, and myriad endpoints that will be connected using internet protocols, standards and architectural frameworks.

This so-called Internet of Things means more data, more cloud connectivity and management, and an additional tier of “things” that are going to be part of the mobile edge -- and extending that mobile edge ever deeper into even our own bodies.

Yet the Internet of Things is more than the “things” – it means a higher order of software platforms. For example, if we are going to operate data centers with new dexterity thanks to software-defined networking (SDN) and storage (SDS) -- indeed the entire data center being software-defined (SDDC) -- then why not a software-defined automobile, or factory floor, or hospital operating room -- or even a software-defined city block or neighborhood?


And so how does this all actually work? Does it easily spin out of control? Or does it remain under proper management and governance? Do we have unknown unknowns about what to expect with this new level of complexity, scale, and volume of input devices?

To help answer these questions, The Open Group and BriefingsDirect recently assembled a distinguished panel at The Open Group Boston Conference 2014 to explore the practical implications and limits of the Internet of Things.

The panelists are: Said Tabet, Chief Technology Officer for Governance, Risk and Compliance Strategy at EMC, and a primary representative to the Industrial Internet Consortium; Penelope Gordon, Emerging Technology Strategist at 1Plug Corporation; Jean-Francois Barsoum, Senior Managing Consultant for Smarter Cities, Water and Transportation at IBM; and Dave Lounsbury, Chief Technical Officer at The Open Group. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Jean-Francois, we have heard about this notion of "cities as platforms," and I think the public sector might offer us some opportunity to look at what is going to happen with the Internet of Things, and then extrapolate from that to understand what might happen in the private sector.

Hypothetically, the public sector has a lot to gain. It doesn't have to go through the same confines of a commercial market development, profit motive, and that sort of thing. Tell us a little bit about what the opportunity is in the public sector for smart cities.

Barsoum: It's immense. The first thing I want to do is link to something that Marshall Van Alstyne (Professor at Boston University and Researcher at MIT) had talked about, because I was thinking about his way of approaching platforms and thinking about how cities represent an example of that.

You don't have customers; you have citizens. Cities are starting to see themselves as platforms, as ways to communicate with their customers, their citizens, to get information from them and to communicate back to them. But the complexity with cities is that as good a platform as they could be, they're relatively rigid. They're legislated into existence and what they're responsible for is written into law. It's not really a market.

Chris Harding (Forum Director of The Open Group Open Platform 3.0) earlier mentioned, for example, water and traffic management. Cities could benefit greatly by managing traffic a lot better.

Part of the issue is that you might have a state or provincial government that looks after highways. You might have the central part of the city that looks after arterial networks. You might have a borough that would look after residential streets, and these different platforms end up not talking to each other.

They gather their own data. They put in their own widgets to collect information that concerns them, but do not necessarily share with their neighbor. One of the conditions that Marshall said would favor the emergence of a platform had to do with how much overlap there would be in your constituents and your customers. In this case, there's perfect overlap. It's the same citizen, but they have to carry an Android and an iPhone, despite the fact it is not the best way of dealing with the situation.

The complexities are proportional to the amount of benefit you could get if you could solve them.

More hurdles

Gardner: More hurdles, more interoperability issues, and when you say commensurate, you're saying that the opportunity is huge, but the hurdles are huge and we're not quite sure how this is going to unfold.

Barsoum: That's right.

Gardner: Let's go to an area where the opportunity outstrips the challenge, manufacturing. Said, what is the opportunity for the software-defined factory floor for recognizing huge efficiencies and applying algorithmic benefits to how management occurs across domains of supply chain, distribution, and logistics? It seems to me that this is a no-brainer. It's such an opportunity that the solution must be found.

Tabet: When it comes to manufacturing, the opportunities are probably much bigger. It's where we can see a lot of progress that has already been done and still work is going on. There are two ways to look at it.

One is the internal side of it, where you have improvements of business processes. For example, similar to what Jean-Francois said, in a lot of the larger companies that have factories all around the world, you'll see such improvements on a factory base level. You still have those silos at that level.

Now with this new technology, with this connectedness, those improvements are going to be made across factories, and there's a learning aspect to it in terms of trying to manage that data. In fact, they do a better job. We still have to deal with interoperability, of course, and additional issues that could be jurisdictional, etc.

However, there is that learning that allows them to improve their processes across factories. Maintenance is one of them, as well as creating new products, and connecting better with their customers. We can see a lot of examples in the marketplace. I won't mention names, but there are lots of them out there with the large manufacturers.

Gardner: We've had just-in-time manufacturing and lean processes for quite some time, trying to compress the supply chain and distribution networks, but these haven't necessarily been done through public networks, the internet, or standardized approaches.

But if we're to benefit, we're going to need to be able to be platform companies, not just product companies. How do you go from being a proprietary set of manufacturing protocols and approaches to this wider, standardized interoperability architecture?

Tabet: That's a very good question, because now we're talking about that connection to the customer. With the airline and the jet engine manufacturer, for example, when the plane lands and there has been some monitoring of the activity during the whole flight, at that moment, they'll get that data made available. There could be improvements and maybe solutions available as soon as the plane lands.

Interoperability

That requires interoperability. It requires Platform 3.0 for example. If you don't have open platforms, then you'll deal with the same hurdles in terms of proprietary technologies and integration in a silo-based manner.

Gardner: Penelope, you've been writing about the obstacles to decision-making that might become apparent as big data becomes more prolific and people try to capture all the data about all the processes and analyze it. That's a little bit of a departure from the way we've made decisions in organizations, public and private, in the past.

Of course, one of the bigger tenets of Internet of Things is all this great data that will be available to us from so many different points. Is there a conundrum of some sort? Is there an unknown obstacle for how we, as organizations and individuals, can deal with that data? Is this going to be chaos, or is this going to be all the promises many organizations have led us to believe around big data in the Internet of Things?

Gordon: It's something that has just been accelerated. This is not a new problem in terms of the decision-making styles not matching the inputs that are being provided into the decision-making process.

Former US President Bill Clinton was known for delaying making decisions. He's a head-type decision-maker and so he would always want more data and more data. That just gets into a never-ending loop, because as people collect data for him, there is always more data that you can collect, particularly on the quantitative side. Whereas, if it is distilled down and presented very succinctly and then balanced with the qualitative, that allows intuition to come to fore, and you can make optimal decisions in that fashion.

Conversely, if you have someone who is a heart-type or gut-type decision-maker and you present them with a lot of data, their first response is to ignore the data. It's just too much for them to take in. Then you end up completely going with whatever you feel is correct or whatever you have that instinct that it's the correct decision. If you're talking about strategic decisions, where you're making a decision that's going to influence your direction five years down the road, that could be a very wrong decision to make, a very expensive decision, and as you said, it could be chaos.

It just brings to mind to me Dr. Seuss’s The Cat in the Hat with Thing One and Thing Two. So, as we talk about the Internet of Things, we need to keep in mind that we need to have some sort of structure that we are tying this back to and understanding what are we trying to do with these things.

Gardner: Openness is important, and governance is essential. Then, we can start moving toward higher-order business platform benefits. But, so far, our panel has been a little bit cynical. We've heard that the opportunity and the challenges are commensurate in the public sector and that in manufacturing we're moving into a whole new area of interoperability, when we think about reaching out to customers and having a boundary that is managed between internal processes and external communications.

And we've heard that an overload of data could become a very serious problem and that we might not get benefits from big data through the Internet of Things, but perhaps even stumble and have less quality of decisions.

So Dave Lounsbury of The Open Group, will the same level of standardization work? Do we need a new type of standards approach, a different type of framework, or is this a natural continuation of the path and course we have followed in the past?

Different level

Lounsbury: We need to look at the problem at a different level than we institutionally think about an interoperability problem. The Internet of Things is riding two very powerful waves, one of which is Moore's Law, that these sensors, actuators, and networks get smaller and smaller. Now we can put Ethernet in a light switch, a tag, or something like that.

The other is Metcalfe's Law, which says that the value of all this connectivity goes up with the square of the number of connected points, and that applies both to the connection of the things and, more importantly, to the connection of the data.
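
To put rough numbers on that square-law growth: one common reading of Metcalfe's Law counts the distinct pairwise links among n connected points, n(n-1)/2, which grows roughly as n squared. The sketch below is illustrative only. The function name `pairwise_links` is hypothetical, and treating the pair count as a stand-in for "value" is a simplifying assumption, not something the panel specified.

```python
def pairwise_links(n: int) -> int:
    """Count the distinct pairs that can be formed among n connected points."""
    if n < 0:
        raise ValueError("n must be non-negative")
    return n * (n - 1) // 2


if __name__ == "__main__":
    # Ten times as many endpoints yields roughly a hundred times as many links.
    for n in (10, 100, 1000):
        print(f"{n} points -> {pairwise_links(n)} possible links")
```

Under this assumption, going from 100 to 1,000 connected points raises the number of possible links from 4,950 to 499,500.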

The trouble is, as we have said, that there's so much data here. The question is how do you manage it and how do you keep control over it so that you actually get business value from it. That's going to require us to have this new concept of a platform, not only to connect the data, but to aggregate it, correlate it, as you said, and present it in ways that let people make decisions however they want.

Also, because of the raw volume, we have to start thinking about machine agency. We have to think about the system actually making the routine decisions or giving advice to the humans who are actually doing it. Those are important parts of the solution beyond just a simple "How do we connect all the stuff together?"

Gardner: We might need a higher order of intelligence, now that we have reached this border of what we can do with our conventional approaches to data, information, and process.

Thinking about where this works best first in order to then understand where it might end up later, I was intrigued again this morning by Professor Van Alstyne. He mentioned that in healthcare, we should expect major battles, that there is a turf element to this, that the organization, entity or even commercial corporation that controls and manages certain types of information and access to that information might have some very serious platform benefits.

The openness element now is something to look at, and I'll come back to the public sector. Is there a degree of openness that we could legislate or regulate to require enough control to prevent the next generation of lock-in, which might not be to a platform but to access to data, information, and endpoints? Where is it in the public sector that we might look to a leadership position to establish needed openness and not just interoperability?

Barsoum: I'm not even sure where to start answering that question. To take healthcare as an example, I certainly didn't write the bible on healthcare IT systems and if someone did write that, I think they really need to publish it quickly.

We have a single-payer system in Canada, and you would think that would be relatively easy to manage. There is one entity that manages paying the doctors, and everybody gets covered the same way. Therefore, the data should be easily shared among all the players and it should be easy for you to go from your doctor, to your oncologist, to whomever, and maybe to your pharmacy, so that everybody has access to this same information.

We don't have that and we're nowhere near having that. If I look to other areas in the public sector, areas where we're beginning to solve the problem are ones where we face a crisis, and so we need to address that crisis rapidly.

Possibility of improvement

In the transportation infrastructure, we're getting to that point where the infrastructure we have just doesn't meet the needs. There's a constraint in terms of money, and we can't put much more money into the structure. Then, there are new technologies that are coming in. Chris had talked about driverless cars earlier. They're essentially throwing a wrench into the works or maybe offering the possibility of improvement.

On any given piece of infrastructure, you could fit twice as many driverless cars as cars with human drivers in them. Given that set of circumstances, the governments are going to find they have no choice but to share data in order to be able to manage those. Are there cases where we could go ahead of a crisis in order to manage it? I certainly hope so.

Gardner: How about allowing some of the natural forces of marketplaces, behavior, groups, maybe even chaos theory, where if sufficient openness is maintained there will be some kind of a pattern that will emerge? We need to let this go through its paces, but if we have artificial barriers, that might be thwarted or power could go to places that we would regret later.

Barsoum: I agree. People often focus on structure. So the governance doesn't work. We should find some way to change the governance of transportation. London has done a very good job of that. They've created something called Transport for London that manages everything related to transportation. It doesn't matter if it's taxis, bicycles, pedestrians, boats, cargo trains, or whatever, they manage it.

You could do that, but it requires a lot of political effort. The other way to go about doing it is saying, "I'm not going to mess with the structures. I'm just going to require you to open and share all your data." So, you're creating a new environment where the governance, the structures, don't really matter so much anymore. Everybody shares the same data.

Gardner: Said, to the private sector example of manufacturing, you still want to have a global fabric of manufacturing capabilities. This is requiring many partners to work in concert, but with a vast new amount of data and new potential for efficiency.

How do you expect that openness will emerge in the manufacturing sector? How will interoperability play when you don't have to wait for legislation, but you do need to have cooperation and openness nonetheless?

Tabet: It comes back to the question you asked Dave about standards. I'll just give you some examples. For example, in the automotive industry, there have been some activities in Europe around specific standards for communication.

The Europeans came to the US and started to have discussions, and the Japanese have interest, as well as the Chinese. That shows that, because there is a common interest in creating these new models from a business standpoint, these challenges have to be dealt with together.

Managing complexity

When we talk about the amounts of data, what we call now big data, and what we are going to see in about five years or so, you can't even imagine. How do we manage that complexity, which is multidimensional? We talked about this sort of platform and then further, that capability and the data that will be there. From that point of view, openness is the only way to go.

There's no way that we can stay away from it and still be able to work in silos in that new environment. There are lots of things that we take for granted today. I invite some of you to go back and read articles from 10 years ago that try to predict the future in technology in the 21st century. Look at your smart phones. Adoption is there, because the business models are there, and we can see that progress moving forward.

Collaboration is a must, because the problem is multidimensional. It's not just manufacturing like jet engines, car manufacturers, or agriculture, where you have very specific areas. They really have to work with their customers and the customers of their customers.

Gardner: Dave, I have a question for both you and Penelope. I've seen some instances where there has been a cooperative endeavor for accessing data, but then making it available as a service, whether it's an API, a data set, access to a data library, or even a set of analytics applications. The Ocean Observatories Initiative is one example, where it has created a sensor network across the oceans and generated data that it then makes available.

Do you think we should expect to see an intermediary organization level that gets between the sensors and the consumers or even controllers of the processes? Is there a model inherent in that that we might look to -- something like that cooperative data structure that in some ways creates structure and governance, but also allows for freedom? It's sort of an entity that we don't have yet in many organizations or many ecosystems and that needs to evolve.

Lounsbury: We're already seeing that in the marketplace. If you look at the commercial and social Internet of Things area, we're starting to see intermediaries or brokers cropping up that will connect the silo of my Android ecosystem to the ecosystem of package tracking or something like that. There are dozens and dozens of these cropping up.

In fact, you now see APIs even into a silo of what you might consider a proprietary system, and what people are doing is building a layer on top of those APIs that intermediates the data.

This is happening on a point-to-point basis now, but you can easily see the path forward. That's going to expand to large amounts of data that people will share through a third party. I can see this being a whole new emerging market much as what Google did for search. You could see that happening for the Internet of Things.

Gardner: Penelope, do you have any thoughts about how that would work? Is there a mutually assured benefit that would allow people to want to participate and cooperate with that third entity? Should they have governance and rules about good practices, best practices for that intermediary organization? Any thoughts about how data can be managed in this sort of hierarchical model?

Nothing new

Gordon: First, I'll contradict it a little bit. To me, a lot of this is nothing new, particularly coming from a marketing strategy perspective, with business intelligence (BI). Having various types of intermediaries, who are not only collecting the data, but then doing what we call data hygiene, synthesis, and even correlation of the data has been around for a long time.

It was interesting, when I looked at a recent listing of the big-data companies, that some notable companies were excluded from that list -- companies like Nielsen. Nielsen's been collecting data for a long time. Harte-Hanks is another one that collects a tremendous amount of information and sells that to companies.

That leads into another part of it that I think we're going to see. There is an increasing amount of opportunity that involves taking public sources of data and then providing synthesis on it. What remains to be seen is how much of the output of that is going to be provided for “free”, as opposed to “fee”. We're going to see a lot more companies figuring out creative ways of extracting more value out of data and then charging directly for that, rather than using that as an indirect way of generating traffic.

Gardner: We've seen examples of how this has been in place. Does it scale, and does the governance or lack of governance that might be in the market now sustain us through the transition into Platform 3.0 and the Internet of Things?

Gordon: That aspect is the lead-on part of “you get what you pay for”. If you're using a free source of data, you don't have any guarantee that it is from authoritative sources of data. Often, what we're getting now is something somebody put in a blog post, and then that will get referenced elsewhere, but there was nothing to go back to. It's the shaky supply chain for data.

You need to think about the data supply and that is where the governance comes in. Having standards is going to increasingly become important, unless we really address a lot of the data illiteracy that we have. A lot of people do not understand how to analyze data.

One aspect of that is a lot of people expect that we have to do full population surveys, as opposed to representative sampling, to get much more accurate and much more cost-effective collection of data. That's just one example, and we do need a lot more in governance and standards.
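
As a rough illustration of why a representative sample can stand in for a full population survey: assuming simple random sampling, a yes/no question, and a population much larger than the sample, the approximate 95 percent margin of error depends on the sample size rather than on the population size. The sketch below uses the standard normal-approximation formula; the function name `margin_of_error` and the default values are illustrative assumptions, not anything the panel described.

```python
import math


def margin_of_error(sample_size: int, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a surveyed proportion.

    Assumes simple random sampling from a population much larger than the sample.
    proportion=0.5 is the worst case (widest margin); z=1.96 corresponds to 95%.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(f"n={n}: +/- {margin_of_error(n) * 100:.1f} percentage points")
```

Under these assumptions, about 1,000 randomly chosen respondents give a margin of roughly plus or minus 3 percentage points, and quadrupling the sample only halves that margin.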

Gardner: What would you like to see changed most in order for the benefits and rewards of the Internet of Things to develop and overcome the drawbacks, the risks, the downside? What, in your opinion, would you like to see happen to make this a positive, rapid outcome? Let's start with you Jean-Francois.

Barsoum: There are things that I have seen cities start to do now. There are a couple of examples: Philadelphia is one and Barcelona does this too. Rather than do the typical request for proposal (RFP), where they say, "This is the kind of solution we're looking for, and here are our parameters. Can you tell us how much it is going to cost to build?" they come to you with the problem and they say, "Here is the problem I want to fix. Here are my priorities, and you're at liberty to decide how best to fix the problem, but tell us how much that would cost."

If you do that and you combine it with access to the public data that is available -- if the public sector opens up its data -- you end up with a very powerful combination that liberates a lot of creativity. You can create a lot of new business models. We need to see much more of that. That's where I would start.

More education

Tabet: I agree with Jean-Francois on that. What I'd like to add is that I think we need to push the relation a little further. We need more education, to your point earlier, around the data and the capabilities.

We need these platforms that we can leverage a little bit further with the analytics, with machine learning, and with all of these capabilities that are out there. We have to also remember, when we talk about the Internet of Things, it is things talking to each other.

So it is not human-machine communication. Machine-to-machine automation will be further than that, and we need more innovation and more work in this area, particularly more activity from the governments. We've seen that, but it is a little bit frail from that point of view right now.

Gardner: Dave Lounsbury, thoughts about what needs to happen in order to keep this on the tracks?

Lounsbury: We've touched on a lot of them already. Thank you for mentioning the machine-to-machine part, because there are plenty of projections that show that it's going to be the dominant form of Internet communication, probably within the next four years.

So we need to start thinking of that and moving beyond our traditional models of humans talking through interfaces to a set of services. We need to identify the building blocks of capability that you need to manage, not only the information flow and the skilled person that is going to produce it, but also how you manage the machine-to-machine interactions.

Gordon: I'd like to see not so much focus on data management, but focus on what the data is managing and helping us to do. Focusing on the machine-to-machine and the devices is great, but the focus should not be on the devices or on the machines… it should be on what they can accomplish by communicating, what you can accomplish with the devices, and then reverse-engineering from that.

Gardner: Let's go to some questions from the audience. The first one asks about a high order of intelligence, which we mentioned earlier. It could be artificial intelligence, perhaps, but they ask whether that's really the issue. Is the nature of the data substantially different, or are we just creating more of the same, so that it is a storage, plumbing, and processing problem? What, if anything, are we lacking in our current analytics capabilities that is holding us back from exploiting the Internet of Things?

Gordon: I've definitely seen that. That has a lot to do with not setting your decision objectives and your decision criteria ahead of time so that you end up collecting a whole bunch of data, and the important data gets lost in the mix. There is a term "data smog."

Most important

The solution is to figure out, before you go collecting data, what data is most important to you. If you can't collect certain kinds of data that are important to you directly, then think about how to indirectly collect that data and how to get proxies. But don't try to go and collect all the data for that. Narrow in on what is going to be most important and most representative of what you're trying to accomplish.

Gardner: Does anyone want to add to this idea of understanding what current analytics capabilities are lacking, if we have to adopt and absorb the Internet of Things?

Barsoum: There is one element around projection into the future. We've been very good at analyzing historical information to understand what's been happening in the past. We need to become better at projecting into the future, and obviously we've been doing that for some time already.

But so many variables are changing. Just to take the driverless car as an example. We've been collecting data from loop detectors, radar detectors, and even Bluetooth antennas to understand how traffic moves in the city. But we need to think harder about what that means and how we understand the city of tomorrow is going to work. That requires more thinking about the data, a little bit like what Penelope mentioned, how we interpret that, and how we push that out into the future.

Lounsbury: I have to agree with both. It's not about statistics. We can use historical data. It helps with a lot of things, but one of the major issues we still deal with today is the question of semantics, the meaning of the data. This goes back to your point, Penelope, around the relevance and the context of that information – how you get what you need when you need it, so you can make the right decisions.

Gardner: Our last question from the audience goes back to Jean-Francois’s comments about the Canadian healthcare system. I imagine it applies to almost any healthcare system around the world. But it asks why interoperability is so difficult to achieve, when we have the power of the purse, that is the market. We also supposedly have the power of the legislation and regulation. You would think between one or the other or both that interoperability, because the stakes are so high, would happen. What's holding it up?

Barsoum: There are a couple of reasons. One, in the particular case of healthcare, is privacy, but that is one that you could see going elsewhere. As soon as you talk about interoperability in the health sector, people start wondering where is their data going to go and how accessible is it going to be and to whom.

You need to put a certain number of controls over top of that. What is happening in parallel is that you have people who own some data, who believe they have some power from owning that data, and that they will lose that power if they share it. That can come from doctors, hospitals, anywhere.

So there's a certain amount of change management you have to get beyond. Everybody has to focus on the welfare of the patient. They have to understand that there has to be a priority, but you also have to understand the welfare of the different stakeholders in the system and make sure that you do not forget about them, because if you forget about them they will find some way to slow you down.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: The Open Group.


Wednesday, September 10, 2014

How Waste Management builds a powerful services continuum across IT operations, infrastructure, development, and processes

It's only been a few years since Waste Management's IT organization began rebuilding their quality assurance processes from the ground up.

"Our availability scorecard was pretty bad. Our services were down. At times, we didn’t know that our services were down. Our first indication of a problem was from customers calling us," remembers Gautam Roy, Vice President of Infrastructure, Operations and Technical Services at Waste Management in Houston, Texas.

"Now, fast-forward a few years -- with making the appropriate choices and investments in technology, such as in people and processes -- and our scorecard is very good. We know of the problems rapidly. We proactively detect problems and fix the problems before they impact our customers," he says.


To learn how Waste Management came to deliver 4 9s availability for its critical applications, BriefingsDirect sat down with Roy at the recent HP Discover conference in Las Vegas. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.
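
For scale, "4 9s" is conventionally read as 99.99 percent availability. Assuming a 365-day year and around-the-clock measurement, the yearly downtime budget that implies can be sketched as follows. The function name `allowed_downtime_minutes` is hypothetical and is not drawn from Waste Management's tooling.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(nines: int) -> float:
    """Yearly downtime budget for an availability of N nines (4 -> 99.99%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailability = 10 ** -nines  # e.g. 0.0001 for four nines
    return unavailability * MINUTES_PER_YEAR


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes of downtime per year")
```

Under these assumptions, four nines leaves a budget of roughly 52.6 minutes of downtime per year, compared with about 8.8 hours at three nines.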

Here are some excerpts:

Roy: Waste Management is an environmental services company. We have primarily three lines of business. First is waste service. This is our traditional waste pickup, transfer, and disposal. Our second line of business is renewable energy or green energy, and our third is recycling.

What makes Waste Management different from others in the waste industry is that we also invest quite a lot of effort in next-generation waste technology. We invest in companies like Agilyx, which converts very hard-to-recycle waste, such as plastic, into crude oil. We convert organic food waste into natural gas. We pressurize, scrub, and dry municipal solid waste into solid fuel, which burns cleaner than coal.

And we're quite diverse, a global company. We have operations in the US and Canada, Asia, and Europe. We have our renewable energy plants. There is quite a large array of technology and IT to support these business processes to ensure consistent business-services availability.

Gardner: As with many organizations, gaining greater visibility into operations -- having earlier detection of problems, and therefore earlier remediation -- means better performance. What were some of the drivers for your organization specifically to mature your IT operations?

Business transformation

Roy: I'll give a few business reasons, and a couple of technology reasons. From the business side, we began business transformation a couple of years ago. We wanted to ensure that we unlocked the value for our customers and for us, and to institutionalize the benefits for Waste Management.

Customer care -- providing outstanding, world-class customer service -- is aligned completely with our business strategy. Business services availability is crucial; it's in our DNA. Our IT business service availability scorecard a few years ago wasn't too good. So we had to put the focus on people, process, and technology to ensure that we provide a very consistent service set to our customers.

Gardner: Moving across the spectrum of development, test, and operations can be challenging for many organizations. You have put in place standardized processes to measure, organize, and perform better across the DevOps spectrum. Tell us how you accomplished that. How did you get there?

Roy: That's a very good question. For us, IT business-service availability is really not about having a great monitoring solution. It starts even before the services are in production. It starts with partnership with our business and business requirements. It starts with having a great development methodology and a robust testing program. It starts with architecture processes, standardization, and communication. All those things have to be in place. And you have to have security services and a monitoring solution to wrap it up.

What we are trying to do is to not fight the issue at the back-end. If a service is down, our monitoring software picks it up, our operational team and engineering team jump on it, and we are able to fix the problem ASAP before it impacts the customer. Great. But, boy, wouldn’t it be nice if those services weren't going down in the first place? So we try to approach it from the front-end, instead of just chasing it from the back-end.

Gardner: So it’s Application Lifecycle Management (ALM) and Business Service Management (BSM), not one or the other, but really both -- and simultaneously?

Roy: Exactly, ALM, BSM, testing, and security products. We also want to make sure that the services are not down from intentional disruption. We want to make sure that we produce code with quality and velocity, and code that is consistent with the experience of our customer.

With our operational processes, ITIL and Lean IT, we want to make sure that the change management and incident management are followed to our prescription. We want to make sure that the disaster-recovery (DR) program, the high-availability (HA) program, the security operation center (SOC), the network operation center (NOC), and the command centers are all working together to ensure that the services are up 24/7, 365.

Gardner: And when you do this well, when you have put in place many of the capabilities that we have been describing, do you have any sense of payback? Do you keep score?

Availability scorecard

Roy: A few years ago, when we were not as good at it, we started rebuilding this all from the ground up, and our availability scorecard was pretty bad. Our services were down. At times, we didn’t know that our services were down. Our first indication of a problem was from customers calling us.

Now, fast-forward a few years, with making the appropriate choices and investments in technology -- such as in people and processes -- and our scorecard is very good. We know of the problems rapidly. We proactively detect problems and fix the problems before they impact our customers.

We have 4 9s availability for our critical applications. We're able to provide services to our customers via wm.com, our digital channel, and it has been quite a success story. We still have work to cover, but it has been following the right trajectory.
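
To put the 4 9s figure in perspective, the following is a minimal sketch that converts an availability target into a yearly downtime budget. It assumes that "4 9s" means 99.99 percent availability measured over a 365-day year; the function name is hypothetical, and the interview does not say how Waste Management measures its scorecard.

```python
# Yearly downtime budget implied by an availability target expressed in "nines".
# Assumption: availability is measured over a 365-day year.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def allowed_downtime_minutes(nines: int) -> float:
    """Return the yearly downtime budget, in minutes, for N nines of availability."""
    unavailability = 10 ** -nines  # 4 nines -> 0.0001, i.e. 99.99% available
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (2, 3, 4, 5):
        print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Under these assumptions, four nines leaves a budget of a little under an hour of downtime per year for a critical application.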

Gardner: Here at HP Discover, are there any developments that you're monitoring closely? Are there some things that you're particularly interested in that might help you continue to close the gap on quality?

Roy: Sure. Things like understanding what's happening in the world of big data and HP’s views and position on that. I want to understand and learn about testing, software testing, how to test faster and produce better code, and to ensure, on a continuous basis, that we're reducing the cost of running the business. We want to provide optimal solutions at the right price point for our customers and our business.

Gardner: On that topic of big data, are you referring to the data generated within IT, in your systems, to be able to better analyze and react to that? Or perhaps also the data from your marketplace, things that your customers might be saying in social media, for example? Or is it all of the above?

Roy: It’s all of the above. We have internal data that we're harvesting. We want to understand what it’s telling us. And we'd like to predict certain trends of our system, across the use of our applications.

Externally, we have 18 call centers. We get user calls. We also want to know our customer better and serve them the best. So we want to move into a situation where we can take their issues, frame them into solutions, and proactively service them the best in our industry.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

Monday, September 8, 2014

GSN Games hits top prize using big data to uncover deep insights into gamer preferences

It's a shame when the data analysis providers inside a company get the cold shoulder from the business leaders because the data keeps proving the status quo wrong, or contradicts the conventional corporate wisdom.

Fortunately for GSN Games in San Francisco, there's no such culture clash there. "The real thing that's helped us get to the point we are is a culture where everybody is open to being wrong -- and open to being proven wrong by the data," says Portman Wills, Vice President of Data at GSN Games.

"One of the things we use data for is to challenge all of our assumptions about our own products and our own businesses," says Wills. "It's really gotten to a point where it's almost religious in our company. The moment two people start debating what should or shouldn't happen, they say, 'Well let's just let the data decide.' That's been a core change not just for us, but for the game industry as a whole."

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy.

How did GSN Games get to the point where the data usually wins? It took a blazing fast data warehouse of 1.3 trillion rows that consumes, stores and produces analysis from some 110 million registered game-players in near real time. The next BriefingsDirect podcast focuses on just how GSN Games exploits such big data to effectively uncover game-changing entertainment trends for their audience. Oh, and it changes corporate cultures, too.

The discussion, at the recent HP Discover conference in Barcelona, is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Wills: GSN started as a cable network in the U.S. We’re distributed in 80 million households as the Game Show Network, and then we also have a digital wing that produces casual and social games on Facebook, web, tablets, and mobile. That division has 110 million registered game-players. My team takes data from all over those worlds, throws them into a big data warehouse, and starts trying to find trends and insights for both our TV audience and our online game-players.

In terms of the games, which is really where the growth is, our core demographic is older females, believe it or not, who love playing casual games. We skew more in the 55-plus age range, and we have players from all over the world.

Gardner: The word “games” means a lot of different things to a lot of people. We’re talking about a heritage of network television games back in the ’60s and ’70s that have led us to what is now your organization. But what sort of newer games are we talking about, and what proportion of them are online games, versus more of the passive watching like that on a cable or other media outlets?

Wills: Originally, when our games division started as a branch of GSN, it was companion games to Wheel of Fortune, Minute to Win It, whatever the hot game show was. That's still a part of it, but the growth in the last few years has been in social games on Facebook, where a lot of our games are more casual titles and have nothing to do with the game shows -- tile-matching games or solitaire games, for example.

Then, in the last year or year-and-a-half for us, like everyone else, there’s been this explosion in mobile. So it’s iPad, Android, and iPhone games, and there we have the solitaires and the tile matching, too.

Increasingly, a lot of our success and growth has come from virtual casino games. People are playing Bingo, video poker, even slots, virtual slots. We have this title called GSN Casino. That’s an umbrella app with a lot of mini games that are casino-themed, and that one has really just exploded in the last six months. It's a long way from the Point A of Family Feud reruns to the Point Z of virtual slot machines, but hopefully you can see how we got there.

Gardner: It seems like a long distance, but it’s been also a fairly short amount of time. It wasn't that long ago that the information you might have in your audience came through Nielsen for passive audiences, and you had basically a one- or two-dimension view of that individual, based on the estimate of what time was devoted to a show. But now, with the mobile devices in particular, you have a plethora of data.

Tell us about the types of data that you can get, and what volumes are we talking about.

Mobile experience

Wills: Let’s take mobile, because I think it's easy to grok. Everything about the device is exposed to us. The fact that you’re playing on an iPad Mini Retina versus an iPad 1 tells us a lot about you, whether you know it or not.

Then, a lot of our users sign-in via Facebook, which is another vector for information. If you sign-in via Facebook, Facebook provides us your age range, gender, some granular location information. For every player, we get between 40 and 50 dimensions of data about that player or about that device.

That’s one bucket. But the actual gameplay is another whole bucket. What games do you choose to play in our catalog? How long do you play them? What time of day do you play them? Those start to classify users into various buckets -- from the casual commute player, who plays for 15 minutes every morning and afternoon, to the hard-core player who spends 8 to 10 hours a day, believe it or not, playing our games on their mobile devices.

At that point, and this is a little bit of a pet peeve of mine, mobile doesn’t necessarily mean mobile, like out and about. A lot of our players are on their iPad, sitting on the couch in their home.

It’s not mobility. They’re not using 3G. They’re not using augmented reality. It’s just a device that happens to be a very convenient device for playing games. So it’s much more of a laptop replacement than any sort of mobile thing. That’s sort of a side track.

We collect all of this data, and it’s a fair amount. Right now, we’re generating about 900 million events per day across all of our players. That’s all streamed into our HP Vertica data warehouse, and there are a few tables, event time series tables, that we put the stuff into. A small table for us would be a few hundred billion records, and a large table, as I said, is 1.3 trillion records right now.

So the scale is big for us. I know that for other companies that seems like peanuts. It’s funny how big data is so broad. What’s big to one person is tiny to someone else, but this is the world that we’re dealing in right now.

We have 110 million players. Thankfully, not all of them are active at one time. That would be really big data. But we will have about 20 million at any given time in peak time playing concurrently. That’s a little bit about the numbers in our data warehouse.

Gardner: Understanding your audience through this data is something fairly new. Before, you couldn’t get this amount of data. Now that you have it, what is it able to do for you? Are you crafting new games based on your findings? Are you finding information that you can deliver back to a marketer or advertiser that links them to the audience better? There must be many things you can do.

No advertising

Wills: First of all, we don’t do any advertising in our mobile games. So that’s one piece that we’re not doing, although I know others are. But there are two broad buckets in which we use data. The first is that we run a lot of A/B tests and experiments. All of our games are constantly being multivariate tested, with different versions of the same game in the field.

We run 20 to 40 tests per week. As an example, we have a Wheel of Fortune game that we recently released, and there was all this debate about the difficulty of the puzzles. How hard should the puzzles be? Should they be very obscure pieces of Eastern literature, mainstream pop culture, or even easier?

So, we tested different levels of difficulty. Some players got the easy, some players got the medium, and some players got the hard ones. We can measure the return rate, the session duration, and the monetization for people who buy power-ups, and we see which level of difficulty performs the best. In the first test of easy, medium, hard, easy overwhelmingly did the best.

So we generated a whole bunch of new puzzles that were even easier than were the previous easy ones and tested that against what was now the control level. The easier puzzles won again. So we generated a whole new set of puzzles that were absurdly easy. We were trying to prove the point that if we gave Wheel of Fortune puzzles that are four-letter words like “bird” and “cups,” nobody would enjoy playing something that simplistic.

Well, it turns out that they do -- surprise, surprise -- and so that’s how we evolved into a version of Wheel of Fortune that, compared to the game show, looks very different, but it’s actually what customers want. It’s what players want. They want to relax and solve simple puzzles like “door.”
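
The difficulty experiment described above can be sketched as a per-variant comparison of the three measures Wills names: return rate, session duration, and power-up spending. The following is a minimal illustration only. The `PlayerResult` record, the `summarize` helper, and every number in `sample` are hypothetical and are not GSN data or GSN code.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class PlayerResult:
    variant: str            # difficulty the player was assigned: "easy", "medium", or "hard"
    returned: bool          # whether the player came back after the first session
    session_minutes: float  # how long the player's session lasted
    spend: float            # amount spent on power-ups


def summarize(results):
    """Group results by variant and compute the three measures for each group."""
    by_variant = {}
    for r in results:
        by_variant.setdefault(r.variant, []).append(r)
    return {
        variant: {
            "players": len(rows),
            "return_rate": mean(1.0 if r.returned else 0.0 for r in rows),
            "avg_session_minutes": mean(r.session_minutes for r in rows),
            "avg_spend": mean(r.spend for r in rows),
        }
        for variant, rows in by_variant.items()
    }


# Made-up sample, two players per variant, purely to exercise the function.
sample = [
    PlayerResult("easy", True, 30.0, 2.0),
    PlayerResult("easy", True, 20.0, 0.0),
    PlayerResult("medium", True, 15.0, 1.0),
    PlayerResult("medium", False, 5.0, 0.0),
    PlayerResult("hard", False, 4.0, 0.0),
    PlayerResult("hard", False, 6.0, 0.0),
]

summary = summarize(sample)
best = max(summary, key=lambda v: summary[v]["return_rate"])

if __name__ == "__main__":
    for variant, stats in summary.items():
        print(variant, stats)
    print("highest return rate:", best)
```

A real experiment would involve far more players per variant and a significance check before declaring a winner; the sketch only shows the shape of the comparison.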

Gardner: So Vertica analysis determined that everyone is a winner on GSN, but you’re able to do real-time focus-group types of activities. The data -- because it's so fast, because there is so much information available and you can deal with it so quickly -- means that you’re able to tune your games to the audience virtually overnight.

Wills: Hopefully faster than overnight. Overnight is a little too slow these days. We push twice a day, both platform code and updates to all of our games, in the morning around 11 a.m. and in the afternoon around 3:30. Each one of those releases is based on the data that came from the prior release.

So we're constantly evolving these games. I want to go back to your previous question, because I only got to talk about one bucket, which is this experimentation. The other bucket is using the usage patterns that customers have to evolve our product in ways that aren’t necessarily structured around an A/B test.

We thought when we launched our iPhone app that there would be a lot of commuting usage. We had in our head this hypothetical bus player, who plays on the bus in the morning. And so we thought we would build all the stuff around daily patterns. We built this daily return bonus that you can do in the morning and then again in the evening.

The data showed us that that really was only a tiny fraction of our players. There were, in fact, very few players who had this bimodal, morning and evening usage pattern. Most people didn't play at all until after dinner and then they would play a lot, sometimes even binge from 7 p.m. until 2 a.m. on games.

False assumptions

That was an area where we didn't even set up an experiment. We just had false assumptions about our player base. And that happens a surprising amount of the time. We all -- especially the game-design team and people who spent their careers designing video games -- have assumptions about their audience that half the time are just wrong. One of the things we use data for is to challenge all of our assumptions about our own products and our own businesses.

It's really gotten to a point where it's almost religious in our company. The moment two people start debating what should or shouldn't happen, they say, “Well let's just let the data decide.” That's been a core change not just for us, but for the game industry as a whole.

Because we’re here in Spain, a quick tidbit that we uncovered recently is that our main time-frame in every country on Earth, when people play games, is 7 p.m. to 11 p.m., except in Spain where it’s 1 p.m. to 3 p.m. -- siesta time. That’s just one of the examples of how we use big data to discover insights about our players and our audiences worldwide.

Understanding the audience

Gardner: I have to imagine that the data that led you to that inference in Spain was something other than what we might consider typical structured data. How did the different data brought together allow you to understand your audience better?

Wills: We use this product from HP called Vertica, which is just a tremendous data warehouse, that lets us throw every single click, touch, or swipe in all of our games into a big table. By big, I mean right now it’s I think 1.3 trillion rows. We keep saying that we should really archive this thing. Then, we say we’ll archive it when it slows down, and then it just never slows down, so we have yet to archive it.

We put all of the click stream data in there. The traditional joins, schemas, and all of that don’t really have to happen because we have one table with all of the interactions. You have the device, the country, the player, all these attributes. It’s a very wide table. So if you want to do things like ask what is the usage in five-minute slices by country, it’s a simple SQL query, and you get your results.
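
As a rough illustration of the "usage in five-minute slices by country" query on a single wide event table, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not HP Vertica, and the `events` table, its columns, and the sample rows are all hypothetical. Timestamps are assumed to be stored as integer seconds so that integer division by 300 yields the start of each five-minute slice.

```python
import sqlite3

# In-memory SQLite database standing in for the warehouse (assumption: not Vertica).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events ("
    "event_ts INTEGER, country TEXT, player_id INTEGER, device TEXT, action TEXT)"
)

# Hypothetical click-stream rows: (seconds since start, country, player, device, action).
rows = [
    (0, "ES", 1, "iPad", "tap"),
    (120, "ES", 2, "iPhone", "swipe"),
    (310, "ES", 1, "iPad", "tap"),
    (20, "US", 3, "Android", "tap"),
    (299, "US", 3, "Android", "tap"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)

# One table, no joins: bucket each event into a 300-second slice and group by country.
query = """
SELECT country,
       (event_ts / 300) * 300 AS slice_start,
       COUNT(*) AS events,
       COUNT(DISTINCT player_id) AS players
FROM events
GROUP BY country, slice_start
ORDER BY country, slice_start
"""
result = conn.execute(query).fetchall()

if __name__ == "__main__":
    for country, slice_start, events, players in result:
        print(country, slice_start, events, players)
```

Because every attribute lives on the same row as the interaction, the query needs only a GROUP BY; that is the property of the wide-table design the passage above describes.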

Gardner: What you’re describing is very much desired by a lot of types of businesses through understanding a massive amount of data from their audience, to be able to react quickly to that, and then to stop guessing about products and pricing and distribution and logistics and supply chain and be driven purely by the data. You’re a really interesting harbinger of things to come.

Portman, tell me a little bit about the process by which you were able to do this. Did you have an older data warehouse? What did you use before, and how did you make a transition to HP Vertica?

Wills: When we started the social mobile business three years ago, we were on MySQL, which we are still on for our transactional load. We have three data centers around the world. When people are playing our games, it’s recording, reading, and writing 125,000 transactions per second, and that MySQL, sharded out, works great for that.

When you want to look at your entire player base and do a cross-shard query, we found that MySQL really fell down. Our original Vertica proof of concept (POC) was just to replace these A/B test queries, which have to look across the entire population.

So in comes Vertica. We set up a single-node Vertica data warehouse. We pulled in a year's worth of data, and the same query to synthesize these sessions ran in 800 milliseconds.

So the thing that took 24 hours, which is 86,400 seconds, ran in less than one second. By the way, that 24-hour query was running across dozens of machines, and this Vertica query was running on a single server of commodity hardware.
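
The size of that improvement can be worked out from the two figures quoted: 24 hours for the earlier query and 800 milliseconds on the single Vertica node. The following is a small arithmetic sketch; it compares elapsed time only and does not account for the difference in hardware (dozens of machines versus one server).

```python
# Speedup implied by the two quoted run times, computed in whole milliseconds
# to avoid floating-point rounding.
OLD_RUNTIME_MS = 24 * 60 * 60 * 1000  # 24 hours = 86,400 seconds = 86,400,000 ms
NEW_RUNTIME_MS = 800                  # 800 milliseconds

speedup = OLD_RUNTIME_MS / NEW_RUNTIME_MS

if __name__ == "__main__":
    print(f"Elapsed-time speedup: {speedup:,.0f}x")
```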

That's when we really became believers in the power of the column store and column-oriented data warehouses. From the small beginning of just one simple query, it’s now expanded -- and pretty much our whole business runs on top of HP Vertica on the data warehouse side.

Lessons learned

Gardner: As I said, I think GSN Games really is a harbinger of what a lot of other companies in many different vertical industries will be seeking. Looking back, if you had to do it again, what might you have done differently or what suggestions might you have for others who would like to be able to do what you are doing?

Wills: I definitely wish that we had switched to a column store sooner. I think the reason that we've been so successful at this is because of our game design team, which was so open to using data.

I’ve heard hard stories from other companies where they want to use a data-driven approach, and there's just a lot of cultural inertia and push back against doing that. It's hard to be consistently proven wrong in your job, which is always what happens when you rely on data.

The real thing that's helped us get to the point we are in is a culture and a company where everybody is open to being wrong -- and open to being proven wrong by the data, which I am very thankful for.

Gardner: Well, it's good to be data-driven, and I think you should feel good being responsible for making 110 million people feel good about themselves every day.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

You may also be interested in: