Tuesday, August 18, 2015

The future of business intelligence as a service with GoodData and HP Vertica

The next BriefingsDirect big data innovation case study interview highlights how GoodData expands the realms and possibilities for delivering business intelligence (BI) and data warehousing as a service. We'll learn how they're exploring new technologies to make that more seamless across more data types for more types of users -- all in the cloud.

Listen to the podcast. Find it on iTunes. Get the mobile app for iOS or Android. Read a full transcript or download a copy.

To learn the ups and downs of BIaaS, we welcome Jeff Morris, Vice President of Marketing at GoodData in San Francisco, and Chris Selland, Vice President for Business Development at HP Vertica. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Tell us about GoodData, what you do, and why it's different.

Morris: GoodData is an analytics platform as a service (PaaS). We cover the full spectrum end-to-end use case of creating an analytic infrastructure as a service and delivering that to our customers.

We take on the challenges of collecting the data, whatever it is, structured and unstructured. We use a variety of technologies as appropriate, as we do that. We warehouse it in our multitenant, massively scalable data warehouse that happens to be powered by HP Vertica.

We then combine and integrate it into whatever the customer’s particular key performance indicators (KPIs) are. We present that in aggregate in our extensible analytics engine and then present it to the end users through desired dashboards, reports, or discoverable analytics.

Our business is set up such that about half of our business operates on an internal use case, typically a sales and marketing and social analytic kind of use case. The other half of our business we call "Powered by GoodData," and those customers are embedding the GoodData technology in their own products. So we have a number of companies creating these customer-facing data products that ultimately generate new streams of revenue for their business.

40,000 customers

We've been at this since 2007. We're serving about 40,000 customers at this point and enjoying somewhere around 2.4 million data uploads a week. We've built out the service such that it's massively scalable. We deliver incredibly fast time to market. Last quarter, about two thirds of our deployments were delivered within 16 weeks or less.

One of the divisions of HP, in fact, deployed GoodData in less than six weeks. They are giving their first set of KPIs and delivering that value to them. What’s making us different in the marketplace right now is that we're eliminating all of the headaches associated with creating your own big data lake-style BI infrastructure and environment.

What we end up doing is affording you the time to focus on the analytics and the results that you gain from them -- without having to manage the back-end operations.

Gardner: You're creating analytic applications on datasets that are easily contributed to your platform.

Morris: Yes, indeed. The datasets themselves also tend to be born in the cloud. As I said, the types of applications that we're building typically focus on sales and marketing and social, and e-commerce related data, all of which are very, very popular, cloud-based data sources. And you can imagine they're growing like crazy.

We see a leaning in our customer base of integrating some on-premise information, typically from their legacy systems, and then marrying that up with the Salesforce, or the market data or social information that they want to integrate and build a full view of their customers -- or a full exposure of what their own applications are doing.

Gardner: So you're providing an excellent example of how HP Vertica forms a cloud-borne analytics platform. Are any of your clients doing this both on-premises and taking advantage of what the cloud does best? Are we now on the vanguard of hybrid BI?

Morris: We're getting there, and there are certainly some industries that are more cloud friendly than others right now. Interestingly, the healthcare space is starting to, but they're still nascent. The financial services industry is still nascent. They're very protective of their information. But retailers, e-commerce organizations, technology ISVs, and digital media agencies have adopted the cloud-based model very aggressively.

We're seeing terrific growth and expansion there, and we do see use cases right now where we're beginning to park the cloud-based environment alongside your more traditional analytics environments to create that hybrid effect. Often, those customers are recognizing that the speed at which data is growing in the cloud is driving them to look for a solution like ours.

Gardner: Chris, how unique is GoodData in terms of being all cloud moving toward hybrid?

Special relationship

Selland: GoodData is certainly a very special partner and a very special relationship for us. As you said, Vertica is fundamentally a software platform that was purpose-built for big data that is absolutely cloud-enabled. But GoodData is the best representation of the partner who has taken our platform and then rolled out service offerings that are specifically designed to solve specific problems. It's also very flexible and adaptable.

So, it’s a special partnership and relationship. It's a great proof point for the fact that the HP Vertica platform absolutely was designed to be running in the cloud for those customers who want to do it.

As Jeff said, though, it really varies greatly by industry. A large majority of the customers in our customer advisory board (CAB), which tend to be some of our largest customers and some pretty well-known industries, were saying how they will never put their data in the cloud.

Never is a very long time, but at the same time, there are other industries that are adopting it very rapidly. So there is a rate of change that’s going on in the industry. It varies by size of company, by the type of competitive environment, and by the type of data. And yes, there is a lot of hybridization going on out there. We're seeing more of the hybridization in existing organizations that are migrating to the cloud. There's a lot of new breed companies who started in the cloud and have every intent of staying there.

But there's a lot of dynamism in this industry, a lot of change, and this is a partnership that is a true win-win. As I said, it's a very special relationship for both companies.

Gardner: There's more than just HP Vertica. There's HP Haven, which includes Hadoop, Autonomy, security and applications. Is there a path that you see whereby you can try to be as many things to as many types of customer and vertical industries?

Morris: Absolutely. The HP Haven-style architecture is a vision in a direction that we are going. We do use Hadoop right now for special use cases of expanding and providing structure, creating structure out of unstructured information for a number of our customers, and then moving that into our Vertica-based warehouse.

The beauty of Vertica in the cloud is the way we have set this up, and this also helps address both the security and the reliability issues that might be thought of as issues in the cloud. We're triple clustering each set of instances of our Vertica warehouses, so they are always reliable and redundant.

Daily updates

We, like the biggest enterprises out there, are vigilantly maintaining our network. We update our network on behalf of our customers on a daily basis, as necessary. We roll out and maintain a very standardized operating environment, including an OpenStack-based operating environment, so that customers never need to even care about what versions of the SSL libraries exist or what versions of the VPN exist.

We're taking care of all of that really deep networking and things that the most stalwart enterprise-style IT architects are concerned about. We have to do that, too, and we have to do it at scale for this multi-tenant kind of use-case.

As I said, the architecture itself is very Haven-like, it just happens to be exclusively in the cloud -- which we find interesting and unique for us. As for the Hadoop piece, we don’t use Autonomy yet, but there are some interesting use cases that we are exploring there. We use Vertica in a couple of places in our architecture, not only that central data warehouse, but we also use it as a high-performance storage vehicle for our analytic data marts.

So when our customers are pushing a lot of information through our system, we're tapping into Vertica’s horsepower in two spots. Then, our analytic engine can ingest and deal with those massive amounts of data as we start to present it to customers.

On the Haven architecture side, we're a wonderful example of where Haven ends up in the cloud. For the applications themselves, the kind of things that customers are creating, might be these hybrid styles where they're drawing legacy information in from their existing on-premise systems. Then, they're gathering up, as I said before, their sales and marketing information and their social information.

The one that we see as a wonderful green field for us is capturing social information. We have our own social analytic maturity model that we describe to customers and partners on how to capitalize on your campaigns and how to maximize your exposure through every single social channel you can think of.

We're very proficient at that, and that's what's really driving the immense sizes of data that our customers are asking for right now. Where we used to talk in tens of terabytes for a big system, we're now talking in the world of hundreds, multiple hundreds of terabytes, for a system. Case by case by case, we're seeing this really take off.

Gardner: Do you have any companies, either named or unnamed, that provide a great use case example of BI as a service?

Morris: One of our oldest and most dear customers is Zendesk. They have a very successful customer-support application in the cloud. They provide both a freemium model and degrees of for-fee products to their customers.

And the number one reason why their customers upgrade from freemium to general and then general to the gold level of product is the analytics that they're supplying inside of there. They very recently announced a whole series of data products themselves, all powered by GoodData, as the embedded analytic environment within Zendesk.

We have another customer, Service Channel, which is a wonderful example of marrying together two very disparate user communities. Service Channel is a facilities management enterprise resource planning (ERP) application. They bring together the facility managers of your favorite brick-and-mortar retailers with the suppliers who service those retail facilities: the janitorial services, the air-conditioning guy, the plumbers.

Disparate customers

Marrying disparate types of customers, they create their own data products as well, where they are integrating third-party information like weather data. They score their customers, both the retailers as well as the suppliers, and benchmark them against each other. They compare how well one vendor provides service to another vendor and they also compare how much one of the retailers spends on maintaining their space.

Of course, Apple gets incredibly high marks. RadioShack, right now, as they transition their stores, not so much. Service Channel knew this information long before the industry did, because they're watching spend. They, too, are starting to create almost a bidding network.

When they integrated their weather data into the environment, they started tracking and saying, "Apple would like to gain first right of refusal on the services that they need." So if Apple’s air conditioning goes out, the service provider comes in and fixes the air-conditioning sooner than Best Buy and all of their competitors. And they'll bid up for that. So they've created almost a marketplace. As I said before, these data products are really quite an advantage for us.

Gardner: What's coming next?

Morris: We're seeing a number of great opportunities, and many are created and developed by the technologies we've chosen as our platform. We love the idea of creating not only predictive, but prescriptive, types of applications in use cases on top of the GoodData environment. We have customers that are doing that right now and we expect to see them continue to do that.

What I think will become really interesting is when the GoodData community starts to share their analytic experiences or their analytic product with each other. We feel like we're creating a central location where analysts, data scientists, and our regular IT can all come together and build a variety of analytic applications, because the data lives in the same place. The data lives in one central location, and that’s an unusual thing. In most of the industry your data is still siloed. Either you keep it to yourself on-premise or your vendors keep it to themselves in the cloud and on-premise.

But we become this melting pot of information and of data that can be analytically evaluated and processed. We love the fact that Vertica has its own built-in analytic functions right in the database itself. We love the fact that they run our predictive language without any other issue and we see our customers beginning to build off of that capability.

My last point about the power of that central location and the power of GoodData is that our whole goal is to free time for those data scientists and those IT people to actually perform analytics and get out of the business of maintaining the systems that make analytics available, so that you can focus on the real intellectual capital that you want to be creating.

Identifying trends

Gardner: So, Chris, to cap this off, I think we've identified some trends. We have PaaS for BI. We have hybrid BI. We have cloud data joins and ecosystems that create a higher value abstraction from data. Any thoughts about how this comes together, and does this fit into the vision that you have at HP Vertica and that you're seeing in other parts of your business?

Selland: We're very much only at the front end of the big data analytics revolution. I ultimately don’t think we are going to be using the term "big data" in 10 years.

I often compare big data today to eBusiness 10, 12 years ago. Nobody uses that term anymore, but that was when everything was going online, and now everything is online, and the whole world has changed. The same thing is happening with analytics today.

With a hundred times more data we can actually get 10,000 times more insight. And that's true, but it's not just the amount of data; it's the ability to cross-correlate. That's the whole vision of what Jeff was just talking about that GoodData is trying to do.

It's the vision of Haven, to bring in all types of data and to be able to look at it more holistically. One of my favorite examples, just to make that concrete, is that there is an airline we were talking to. They were having a customer service issue. They were having a lot of their passengers tweeting angrily about them, and they were trying to analyze the social media data to figure out how to make this stop and how to respond.

In a totally separate part of the organization, they had a predictive maintenance project, almost an Internet-of-things (IoT) type of project, going on. They were looking at data coming off the fleet, and trying to do a better job of keeping their flights on time.

If you think about this, you say, "Duh." There was a correlation between the fact that they were having service problems and late flights and the fact that the passengers were angry. Suddenly, they realized that by focusing less on the social data in this case, and looking at it as the symptom as opposed to the cause, they were able to solve the problem much more effectively. That's a very, very simple example.

I cite that because it makes real for people that it's when you really start cross-correlating data you wouldn't normally think belong together -- social data and maintenance data, for example -- that you get true insights. It's almost a silly simple example, but we're going to see many more examples of that type. The more of this we can do, the more power we are going to get. I think that the front end of the revolution is here.

Sponsor: HP Enterprise.
