Wednesday, December 3, 2014

HP launches Haven OnDemand to deliver big data services suite in the cloud

The next BriefingsDirect big data news analysis discussion examines some major announcements made at the HP Discover conference this week: the debut of HP Haven OnDemand, a new set of analytics-in-the-cloud services.

Our panel of users and experts unpacks the details from Barcelona, and explores the implications of the delivery of cloud-based HP Vertica OnDemand and HP IDOL OnDemand components within the HP Haven OnDemand suite.

To learn more about how big data changes everything via these new HP cloud offerings, we're joined by Fernando Lucini, Chief Technology Officer for HP Big Data; Howard Brown, Founder and CEO of RingDNA, based in Los Angeles, and Neal Holley, Operations Director at GateWest New Media Ltd., based in Bristol, UK. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: Fernando, we've heard quite a bit of news the last few days at the HP Discover 2014 Conference in Barcelona, and HP Software General Manager Robert Youngjohns delivered the details Tuesday about HP OnDemand. Let's look at this from the big picture. Why are data and analytics, combined with the cloud-hosting model and delivery model, such a good fit? Why is this an important milestone for the cloud?

Lucini: It's exciting in a number of ways. If you think about what we've launched, we recognized early that our customers, our partners, and developers out there were going to consume technologies in a new way. This is something that the industry all agreed on. We were just early birds in this and we recognized that it's all going to be about on-demand consumption, self-service, speed, elasticity, and all those nice things.

So in some respects, the industry wants to consume things in this fashion. We recognize it, and then the next step for us is to think about the people and what they're going to do with these kinds of services.

You can think about it in two different ways. You have the people out there in the real world who are creating applications on top of very rich information, and that's the mobile apps that we all use. It's the applications to look at both human information, as well as business information, or very structured information, creating applications that do that. We have that persona and we really wanted to make sure that that developer had all the right tools in that model on-demand, self-service.
The other part of the equation is the world of the data warehouse, where we have very large amounts of information. We're traditionally applying analysis, but in this new generation, we need the tools that can do this at a bigger scale, can do it quicker, and can be more flexible. This is our Vertica technology and the same kind of on-demand, self-service needs are out there. So the second part of our answer to the question for industry is that we'll provide you an on-demand way to serve that particular purpose.

The announcement comes for a number of good reasons. It provides the market with an answer to both of these people's needs. It does so in an incredibly elastic fashion and it does it with incredible richness. It has quite a unique degree of depth and variety.

If you look at the IDOL OnDemand functionality, there are new APIs that you can explore and use with the freemium model.

If you look at the Vertica OnDemand space, it allows you to manage whatever size warehouse you need in an incredibly elastic and transparent way, but still on-demand.

There’s so much to tell. It’s such an exciting time for the industry, and being in HP, leading the charge, is pretty, pretty impressive and important.

Great importance

Gardner: Clearly, this isn't news just for one part of an IT organization. This seems to have a great importance for data scientists, IT operators, developers, even line of business users of business intelligence (BI).

So let's look at this a little bit from the perspective of the IT operator. This is something that's a cost issue in many respects and broadens the use of something like IDOL and Vertica to a much larger market. With it being in the cloud, you don't need to set up your data center and you don’t need to have those capital expenditures.

Let’s start at the top, where we're talking about this as a cloud model. Why does this broaden the market for data and analytics?

Lucini: Go back to this IT operator. This guy or gal has always wanted to provide their business with the tools. There was an element there where these guys want to provide the analysis capabilities, they want to have the ingestion and the features, but it’s a tough thing, as you very well put it. There is capital expenditure, maintenance, and training.

The differentiator here is that the acceleration is going to be immediate. Let’s use simple examples. I want to be able to take video and do face recognition, extract license plates, extract behaviors, or listen to voice and do something. I want to do that, and I don’t want the burden of all the science that goes behind doing these things.

This IT operator is going to say, "No problem. Here’s the link. You pay this as you go. Enjoy." And that's as complex as it gets. So the acceleration is going to be immediate, which translates almost immediately into creating more and more applications and doing more and more analysis, which is what we all want, at a lower cost point and in shorter times.

IT operators are going to be incredibly happy that they can provide the business with what the business needs at a lower cost and get outcomes quicker.

Gardner: This should be of interest to large enterprises that might want to augment their current warehouse approach and strategy. It also sounds like for those organizations that may have been too small or didn’t have the budget to set up their own on-premises data warehouse, they now have an opportunity to walk right into a deep, powerful analytics capability.

Lucini: It democratizes the whole idea of analytics. You want to make it as democratic as possible. Size isn't necessarily important with regards to intelligence, interest, having something to say, or having something to analyze. It’s all about making it democratic, and the cloud really helps in that.

It's also about giving functionality that wasn't accessible to some of these guys. We're talking about very advanced analysis -- technologies for video, voice, or text analysis, let alone warehousing. It’s now available to everybody. They can go in there, test it out, play with it, see how valuable it is to them, and stop dreaming about the value, but make the value. Then, if that’s what they need, they can just start paying as they go and getting on with their lives.

General availability

Gardner: Let’s dig into a little of the details. HP announced Haven OnDemand on December 2, with general availability coming in Q1 2015, so pretty rapidly. Vertica is the one that's coming up first, and IDOL OnDemand is currently available as a freemium model, as you mentioned, on an early access basis, but will be generally available a few months later in 2015.

What else should we know about the pricing here? Why is this compelling not only as OPEX versus CAPEX, but also with pricing that is very attractive?

Lucini: Indeed. In some respects, because you're removing the necessity to own the hardware and to scale it up, we're also providing economies of scale in what we're doing. In HP Cloud Services, we have an amazing cloud that we can go to elastically, and everybody gets the advantage of this.

If you think about it, ultimately in one of these models, you get a lot of people who come in, have a look, play, investigate, understand, and learn. Then, you get a smaller percentage that actually commit, do the greater applications, and run their warehouses.

It balances out and it allows us to have a lower price point. It also allows us to charge as we go. It allows us a pay-as-you-go model. It all works out. Over time, we'll understand more and more what people want. This is being done in a very collaborative fashion, listening to the market for on-demand.

In the very beginning, we have been very Net Promoter Score focused. I challenge anybody to get yourself a login, and you'll see the Net Promoter kick in.

All the analysis is very much linked to what you want to do, what’s important for you, what’s being used most, and what gives us the most economies. That drives us to be more competitive.

It’s very transparent. It’s very clean. You should be in a position where you understand exactly what you're using and what you are paying for it, and it should allow you to toggle back and forth on that need. It’s pretty cool.

Gardner: As for the actual cloud that this is running on, is there a choice with that, or is this starting out on HP Helion Cloud, the HP public cloud? What's the roadmap for the public-cloud infrastructure that this operates on?

Lucini: At the moment, this is running in HP Cloud Services, which is Helion-based, of course. It is all designed on top of Helion. So the roadmap for it in the next few quarters will be that it will be deployed in any Helion implementation. As long as you have Helion, you can deploy the services underneath.

Of course, Helion is a flavor of OpenStack. So you have the ability to use this in other flavors of OpenStack, but we're principally focused on Helion. We're principally focused on the Public HP Cloud Services and the private Helion implementations with our colleagues from Enterprise Services.

No difference

In some respects, in the next year it should be a choice for you to go public cloud for what you need to do. If you're a developer and you just want to create your own app, the private-versus-public doesn’t make a difference to you.

Corporates may want to use this inside a firewall. As you know, at HP we have some of the largest corporates out there. If you're one of these guys and need that privacy, you can install Helion and run these services on top of Helion. Following the HP philosophy, it’s a matter of what the client requires, and we'll achieve that.

Gardner: It sounds as if this has been made of, by, and for a hybrid cloud model over time.

Lucini: Correct. Most of our big customers are hybrid, and we're delighted to serve them.

In the meantime, as they go into a mode of using this stuff on Helion inside of the firewall, they'll still get all the elasticity that Helion provides them. They'll still get all the simplicity that REST and Web Services OnDemand provides them, and the flexibility that Vertica OnDemand provides them for scalability. In some respects, there is no downside. There is absolutely no downside to anything that’s happening here. It’s just a matter of choice.

Gardner: We'll get to our use cases and the examples of how this is being used shortly, but I just want to look at the competitive landscape. A big player out there, of course, in the public cloud is Amazon Web Services, and Amazon has what’s called Redshift. It's their data warehouse in the cloud. How does what HP has announced compare and contrast to Redshift? Why is it a worthy competitor, and is the price comparable?

Lucini: Of course, guys out there and everybody listening might know Vertica is a leading product in the analytics space and in the warehousing space. So we're coming at this already as a leader proven inside the firewall.

You get all of the economies, flexibility, and features that Vertica provides; the Flex Zones, all of the optimizations, and the incredible scaling growth factors; and you get it in an on-demand package.

Just because we now have an on-demand version, these things don’t go away. It's quite the opposite. They're immediately available. In that respect, I think we have a strong proposal against Redshift, because you have all the features and functions, not only just the database itself. 

In terms of pricing, I think we're competitive. The features and functions are worth the spend. Our customer base, our history, and our legacy certainly prove that to be the case. Little by little, more and more of the features will seep in, and more customers will start to get comfortable with using it. We already have a few out there in beta land.

We're going to compete. Because of the features, the Flex Zones and other things, we'll carve our own space as well.

What is the differentiator?

Gardner: One of the things that seems unique to me, Fernando, is the IDOL OnDemand being so broad in terms of the types of media, content, information, and data that can now be brought into what’s essentially the type of analytics engine you would only think of for structured information. So it's the best of the structured analytics and high-performance environment, with that breadth and depth of the various types of content. Is that a differentiator in your opinion?

Lucini: Absolutely. I call it everything on-demand. As you notice, I tend not to differentiate between VOD [Vertica OnDemand] and IOD [IDOL OnDemand]. The whole philosophy was that we deal with unstructured, structured, and semi-structured information every day to build what we need for our businesses. So why should we see this differently?

If I happen to have an image, it's an image. If I happen to have a file, it's a file. If I happen to have an Excel sheet, it's an Excel sheet. All of these things are materially important. So let’s give our application developer and our data analyst a way to consume all this.

We have the connectors in the cloud, ways for you to suck information into the platform. We have the ability for you to index them and analyze them. We have some protected APIs for you to have a play around with.

We have text-mining APIs. Obviously, this is a platform for us. So even though we're using the word Vertica and IDOL, underneath IDOL OnDemand, we have Vertica powering some of our features for user management. All our billing and other APIs are coming up.

It's all about giving the application developer all the tools. What the data is, isn't necessarily important. What's important is that they can process it, use it, extract as much value from it as possible, and make their business successful.

So you are absolutely right. It's as broad data-wise as possible. It's as broad in analytics as possible. At the same time, it's still market leading in every single one of those APIs, which is pretty cool stuff.   
Gardner: Now, when you're able to bring all sorts of information and media together, when you're able to tap web services, social media, when you're able to create a sentiment engine and a search engine capability, you're really starting to develop intelligence in new ways.

It seems to me, you can gain insight into markets, prospects, competition, customer inclinations, and directions. It's really about bringing more of a data-driven aspect to a business in ways that had really been sort of an art before, something that was not always by experience, but was by gut instinct.

Before we go to our use cases, how are we really changing a business environment here? Are we talking about a data-driven approach? Are we giving the type of tools that will move a marketing organization, for example, from guesswork into a scientific approach to how they make decisions?

Testing instincts

Lucini: You put it very nicely. We're moving into a world where we're allowing instincts to be tested, and tested quickly. In the past, we had a lot of clever professionals in the marketing world making educated guesses about what’s going on, what I like and don’t like, what you like and don’t like, or what’s popular and what’s not.

We're opening the door for businesses to take data, take a sample of it or take it all, it's their choice, whatever that may be, and in whatever varieties they come, to test out their theories, to see if this theory is correct.

I used to call it the CIO conundrum, where the CIO thinks they've got something and it becomes very difficult for them to prove if they do or don’t, and then they question the results when they get them.

We want them to be able to test this out. If they have an opportunity with their voice data and they think there's massive value in the voice data and they want to cross-correlate it to the social presence, do it, and let the data speak for itself.

It's now no longer difficult. Just go into the platform, put the voice in there, put the text in there, use the analytics tools, give us our enterprise resource planning (ERP) warehouse. We'll do the queries and we'll create what we call combinations -- which is everything coming together as one -- and test the value.

Now, it no longer matters that this is not a very large project with very large budget. It will prove out the case. We have a next generation of proving things out and being capable of proving things out.

That might lead you to a very interesting onsite project with our tools, where you're inside a firewall, but you have proven it out. Or it might take you to a very interesting on-demand implementation. Either way you perform the testing or the proving or the thinking in a much more practical way.

It's very exciting stuff, because there is a real change in the industry, and we all have to adapt to it.

Gardner: Let's learn how some people have been using this already to change their business. Let's go first to RingDNA. Howard Brown, tell us a little bit about your company, what you do, and then how you've been using Haven OnDemand from HP?

Brown: RingDNA is a comprehensive sales acceleration platform that allows companies to create high-performance sales teams by combining powerful communications tools with prospect or customer DNA. That's a combination of marketing data, social data, customer relationship management (CRM) data, and account history, and pulling that all together to allow a sales rep to perform sales faster.

Data for inside sales

Gardner: It’s almost as if you're putting the tools of a data scientist in the hands of a salesperson without them having to be a scientist, to get all sorts of information to make the best call on a call in real-time on an inside sales basis.

Brown: You've got it. It's applying a scientific approach to sales. It's taking all of the data that exists out there which can be truly overwhelming, prioritizing it, and making it contextual to make sales much more effective.

Gardner: And this cuts across communications, as well as data, applications, and web services. Is that correct?

Brown: Absolutely. We apply both a theory-testing model and a set of communication tools. When a RingDNA customer walks in in the morning, they know exactly who they should be calling, emailing, or texting, with the messages prioritized so that they know exactly who to call, how to reach out to them, and what to say.

What’s so exciting is that you can start to understand buyer intent from marketing data from past interactions with your customers. We can look at voice transcripts and sentiment analysis and have a whole new way of determining who the right prospect is, how we should be contacting them, and with what messages.

Gardner: So it's up to your organization to take the best of technology, data, and analytics and empower those inside salespeople. It sounds like it's been up to HP to take the best of its technology in the cloud model and analysis to empower you. How, in fact, has HP empowered RingDNA with your early access use of HP Haven OnDemand?

Brown: It's been truly game-changing. You nailed it when you talked about taking business information and human information and combining those two. What HP IDOL OnDemand has provided us is the ability to test all kinds of theories, because every business we work with tends to have a different theory of what a hot prospect may be.

They can simply and easily test those theories using RingDNA and HP IDOL OnDemand. If there are buying signals, like someone visiting a website and downloading a whitepaper in combination with other factors, such as that person viewing web pages or maybe tweeting about their product or service, we can look at that buyer’s sentiment through HP IDOL OnDemand.

We're taking a bunch of this data, processing it through IDOL, and making our reps that much more productive and that much more powerful.
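
To make the theory-testing idea concrete, here is a minimal sketch of how buying signals like the ones Brown mentions (a website visit, a whitepaper download, page views, and a sentiment reading) might be combined into a single prospect score. This is not RingDNA's or HP's implementation and does not call IDOL OnDemand; the `ProspectSignals` fields, the weights, and the function names are all hypothetical, and the sentiment value is assumed to arrive from some external analysis step as a number between -1 and 1.

```python
from dataclasses import dataclass


@dataclass
class ProspectSignals:
    """Hypothetical bundle of buying signals for one prospect."""
    visited_website: bool
    downloaded_whitepaper: bool
    pages_viewed: int
    sentiment: float  # assumed range: -1.0 (negative) to 1.0 (positive)


# Hypothetical weights: one "theory" of what makes a hot prospect.
# A different business would plug in a different set to test its own theory.
DEFAULT_WEIGHTS = {
    "visited_website": 1.0,
    "downloaded_whitepaper": 2.0,
    "page_view": 0.2,
    "sentiment": 3.0,
}


def score_prospect(signals: ProspectSignals, weights=DEFAULT_WEIGHTS) -> float:
    """Combine the signals into one score using a simple weighted sum."""
    score = 0.0
    if signals.visited_website:
        score += weights["visited_website"]
    if signals.downloaded_whitepaper:
        score += weights["downloaded_whitepaper"]
    # Cap page views so one very active visitor does not dominate the score.
    score += weights["page_view"] * min(signals.pages_viewed, 10)
    score += weights["sentiment"] * signals.sentiment
    return score


def rank_prospects(prospects: dict[str, ProspectSignals], weights=DEFAULT_WEIGHTS) -> list[str]:
    """Return prospect names ordered from hottest to coldest."""
    return sorted(
        prospects,
        key=lambda name: score_prospect(prospects[name], weights),
        reverse=True,
    )
```

Testing a different theory then amounts to passing a different `weights` dictionary to `rank_prospects` and comparing which prospects rise to the top of the call list.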

Gardner: One of the things you're doing is you are joining and bringing together very disparate data and information and tidbits of analysis. Is HP IDOL OnDemand doing that for you? Are you doing that? How do you make those joins that bring all that information together? Is the cloud the key to doing that?

Cloud is key

Brown: The cloud certainly is the key. We couldn’t deliver the type of product and service we do today without the cloud. RingDNA is all about accelerating a sales team’s ability to close deals. The last thing you want is to negatively impact those teams.

The cloud model means we can quickly implement a RingDNA process within an organization, bring in all that contextual data, bring in all that metadata, and make that rep that much more productive without negatively impacting their workflow. That’s critical to any business today.

It’s one thing to be able to deliver information. It’s another thing to be able to deliver information and insight without negatively impacting the business. Let's face it, in this day and age, we can’t afford to slow down. With tools like IDOL OnDemand and RingDNA, you’re not slowing down teams. You're actually accelerating them beyond what you ever thought was possible.

Gardner: Fernando, as you're listening to Howard, is there anything about the way that RingDNA is using Haven OnDemand that you think highlights some specific benefits or values here? Are they a poster child for a certain way in which you can use Haven OnDemand?

Lucini: Certainly they understand that they need to use tools to solve their problems and they go ahead and do it. In that respect, it’s great to see. There are a bunch of things we could learn as an industry from them in terms of seeing the opportunity of mixing two pieces of data, how these things collide, and how we get them to customers. I would challenge anybody to check them out because ultimately the end result is key, and I think everybody would be impressed.

Gardner: Let’s go to our next example. We're also joined by GateWest and Neal Holley. Neal, tell us a little bit about GateWest, what you do, and how you’ve been using HP Haven OnDemand.

Holley: We're HP Autonomy partners and have been since about 2002. During that time, we have deployed and maintained many IDOL-based systems. We provide a lot of support services to our clients on an annual basis. We also provide user interfaces to the core engine, built by our internal development team.

As well as enterprise search, we also specialize in knowledge management (KM). We have a couple of products addressing the management of knowledge, particularly within law firms, and recently we launched an application on the iTunes App Store providing mobile access to IDOL OnDemand. We see this as part of our strategy of what we’ve termed Mobile KM.

Gardner: Tell me a bit more about the iTunes App Store app. What is it called, and how did you use IDOL OnDemand to build it?

Holley: The app is called KnowGate and it was developed in direct response to the offering of IDOL OnDemand. Over the years, we’ve found that IDOL on-premise had a large cost of entry. Obviously, with IDOL OnDemand coming on stream, we’ve found that we had a whole world of options opened up to us. We were very surprised how straightforward it was to take the standard tools for producing the iPhone apps and iPad apps and interface them with IDOL OnDemand.

Great performer

It’s given us that opportunity to take the technology that we've worked with for so many years, and found to be such a great performer, to the audience that we’ve always wanted to bring it to. The offering has allowed us to do that through its low cost of entry. As Fernando said, it’s democratizing the tools of the very large corporates that we've traditionally worked for.

Gardner: Help me to better understand this. There is no easier way to adopt a technology than to download it for a few dollars from the app store and instantly fire it up on your mobile device. If I were to download that app today, what would I be able to do with it? Who is the typical user? What is the function that they would gather from it?

Holley: The typical user is predominantly a business user. The first instance is that you would be able to access your KM, your valuable documents or your key information that you need whether in a law firm, or whether it's engineering specifications or your latest contracts.

That’s the first element of it. The second element is being able to actually capture knowledge while on the move and being able to take information from an email or take a photograph of a document, OCR it, and then be able to ingest that into IDOL OnDemand and share it with the rest of your organization.

So it really opens up that kind of ability, and of course, once it’s shared it becomes valuable.

Gardner: Very interesting. Fernando, we're seeing with GateWest, this joining of the cloud model with the mobile model. How is that accelerating the use of analytics? That is to say, an application that can gather data and information and extend it to the cloud and then the cloud can create an analytics value and then send it back to that mobile device? How are you seeing that as a powerful new way of broadening the use and value of analytics in general?

Lucini: If you think about it, mobility is everywhere. We all create mobility and mobility apps for everything you have. I'm sure you guys walk around with a mobile device.

We have to be very clear that all of our consumers, even if it's enterprise-consumers versus consumer-consumers, all become little data analysts. We're all much better versed on information than we ever were.

Now you see 18-year-old or 20-year-old kids coming out of university, and their ability to manage information in their devices, in their environment, is incredible. You no longer have a situation where you can dissociate analytics from mobile.

Mobile apps are mostly about analytics of some description, certainly about adding value to the data that a user creates. When I say "create it," I mean create it indirectly, by the motion of your wrist, versus you directly writing something down. So you get these two sources of data.

But it's certainly now such a rich space. Let me give you an example. You can take what's coming out of the back of a device, which is probably machine-driven, all the stuff that really the machine produces. You can put that in Vertica OnDemand and that will be your warehouse for doing the analysis on that: What am I doing, when, how, for how long, all that kind of jazz.

Creating context

At the same time, I'm producing the information directly from my mind. I'm creating context, I'm writing, I'm speaking, or I'm recording, whatever the case may be. Now, IDOL OnDemand can deal with that.

Anybody creating a mobile app is not going to want to have a hard server-based infrastructure, because the whole point of mobility is that it is distributed. It is a distributed computing model.

Those are the kind of solutions that are on-demand, in the cloud, elastic, pay-as-you-go. They're perfect for this generation, whether it's enterprise or not. The kind of partners we have are guys who understand that their intelligence and the value they add is not necessarily that they know a tool, but that they are the experts in their space and they know how to balance Vertica OnDemand.

I have my machine or business information and I need to do something important with that. I have my human information and anything in between, and it's the understanding of how this information adds value to people’s lives and how they execute them that’s the key.

So it's a really important moment. Mobile is the linchpin of much of what's going on around this that makes sense. If you look at any company today, there's no chance that they won't have a mobile intent.

At the same time, we have a lot of hackathons in OnDemand. I can tell you that 90 percent of the products that are created as a result of hackathons are mobile. It kind of speaks for itself.

Gardner: I know. The combination of the cloud-delivery model, analysis on demand, or as a service and the mobile device is just creating entirely new opportunities to add value as a consumer and as a company. It's really flipping many businesses around.

Let’s look at a particular business when we think about the impact of this new series of models and how they interact. I'm thinking about the IT organization in a company, in an enterprise.

With HP Software having a very broad portfolio of applications, many of which are designed and geared towards those IT organizations and developer organizations in companies, how can Haven OnDemand with that analysis-as-a-service capability be brought to bear on other HP software applications focused on IT organizations?

Lucini: The beauty of our OnDemand infrastructure is that it was created for everyone. It was created for our customers and it was created for ourselves. Not to unveil too many wonderful things, but there will be a number of announcements of our own tools, which will be powered by OnDemand. And we made a distinction of what is on demand versus what we call core. It’s our language to speak about our internal use versus our external use.

Organizational tools

These are tools that help the IT organizations. We have tools for backup, where the on-demand model will add great flexibility to what the IT operators can do with the information and how they can serve the legal compliance and partner infrastructures.

We have uses of OnDemand for a wider HP software family where they provide analytics, both for security as well as operational systems, and things like that. So it's a very democratic tool. We recognize that the world of information pivots on two things, and that’s why we created a platform.

It pivots on our ability to incredibly scale up and analyze structured information and semi-structured information. That’s why we have a Vertica core engine. We recognize that human beings create information and so we have our IDOL infrastructure.

And it's these two things together that every single one of our internal partners, IT, our own software products that cater to IT, as well as external customers, are going to leverage. In some cases it goes very heavily one way or very heavily another, where you have a very, very strong warehouse.

You always have that roadmap of possibility to get you to the other side, either more heavily toward IDOL or Vertica. You can really start, for example, with a Vertica OnDemand warehousing cloud, make it super-flexible, and put information in Flex Zones, really massage that data, don’t be upset by schemas, and then work as you go, and scale up.

At the same time, think about what happens if you need some enrichment. What if you need to take some information that’s coming in, say your social feed? So I need to take a voice feed and text information, classify it, and put it into my Flex Zones. That is available, and in the opposite direction, it’s exactly the same.

All of our internal partners look at us and say that they're coming at it either from very human or from very machine, or actually in most cases, both. This is the roadmap to get them to take advantage of both in the same platform. So you can see, it's very, very compelling for our internal partners to use, and we are delighted to serve them.
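The enrichment flow described in this answer (take a mixed feed, classify it, and land it without committing to a schema first) can be sketched in a few lines. This is a minimal illustration in plain Python, not the Vertica or IDOL OnDemand APIs; the sample records, the keyword rule in `classify`, and the `select` helper are all hypothetical stand-ins.

```python
import json

# Hypothetical semi-structured records, e.g. a social feed and a transcribed voice feed.
raw_lines = [
    '{"source": "social", "user": "ana", "text": "love the new release"}',
    '{"source": "voice", "call_id": 17, "text": "I want to cancel my order"}',
    '{"source": "social", "user": "raj", "text": "checkout keeps failing", "lang": "en"}',
]


def classify(text):
    """Toy keyword classifier standing in for a real enrichment service."""
    lowered = text.lower()
    if "cancel" in lowered or "failing" in lowered:
        return "complaint"
    return "other"


# Load step: keep every record as-is, whatever keys it has, and add a label.
store = []
for line in raw_lines:
    record = json.loads(line)
    record["category"] = classify(record["text"])
    store.append(record)


def select(rows, columns):
    """Query step: project the requested columns; absent keys come back as None."""
    return [tuple(row.get(col) for col in columns) for row in rows]


result = select(store, ["source", "user", "category"])
print(result)
```

The design point is that the shape of the data is decided at query time: the voice record has no `user` key, yet it loads and queries alongside the social records.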

Gardner: I'm seeing a great deal of flexibility on the applicability of this. We've seen from RingDNA how this can help an inside sales organization do things they just could never have done before.

We have seen from GateWest how this is essential to bringing knowledge management and document management to a whole new level by combining the best of cloud and mobile devices.

Then, as you're now saying, we're only scratching the surface about how IT organizations can use the cloud and the analytics as a service for improving their application lifecycle management, their business service management, or their application development test. So it's really an exciting time.

I'm afraid we are about out of time for today’s discussion, but there's a lot more that people can learn about Haven OnDemand. Let’s just end with one more peek into the future. Fernando, what might we expect next? Where do you think Haven OnDemand will go in the near future in terms of a new type of business value?

Disrupting markets

Lucini: Let me just say that we're going to disrupt a bunch of markets. We're going to be looking to take over some markets out there that have been very traditionally on premises and we're going to try to democratize them. You can guess that we're going to take the world of video and voice and we are going to make that very democratic.

There are going to be lots of interesting things coming out where we're going to allow our customers to create their own APIs and extend the platform themselves. So there is a lot of that to look forward to.

We'll also be extending our Vertica OnDemand presence, getting more and more customers in there and getting more modes, using more of our Vertica technology to add functionality in a REST kind of way, in a web-service kind of way to the on-demand picture, and adding more and more APIs just to reflect the richness of a platform. So it's clear to everyone that this is only the beginning of an amazing story. So there are quite a lot of APIs, but there are many, many more to come. So there is quite a lot to look forward to.

Listen to the podcast. Find it on iTunes. Download the transcript. Sponsor: HP.


Monday, December 1, 2014

Hortonworks accelerates the big data mashup between Hadoop and HP Haven

This latest BriefingsDirect deep-dive big data thought leadership interview examines how Hortonworks is working with HP on improved management of very large -- and very active -- datasets.

We'll explore how HP and Hortonworks are integrating Hadoop into more of the HP Haven family to make it easier for developers and data scientists to access business intelligence (BI) and analytics as a service.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. 

To learn how, BriefingsDirect sat down with Mitch Ferguson, Vice President of Business Development at Hortonworks at the recent HP Big Data 2014 Conference in Boston. The discussion is moderated by me, Dana Gardner, Principal Analyst at Interarbor Solutions.

Here are some excerpts:

Gardner: We heard the news earlier this year about HP taking a $50-million stake in Hortonworks, and then about Hortonworks' IPO plans. Please fill us in a little bit about why Hortonworks and HP are coming together.

Ferguson: There are two core parts to that answer. One is that the majority of Hadoop came out of Yahoo. Hortonworks was formed by the major Hadoop engineers at Yahoo moving to Hortonworks. This was all in complete cooperation with Yahoo to help evolve the technology faster. We believe the ecosystem around Hadoop is critical to the success of Hadoop and critical to the success of how enterprises will take advantage of big data.

If you look at HP, it's a major provider of technology to enterprises, not only at the compute and storage level but also at the data management level, the analytics level, and the systems management level. The complementary nature of Hadoop as part of the modern data architecture with the HP hardware and software assets provides a very strong foundation for enterprises to create the next-generation modern data architecture.

Gardner: I'm hearing a lot about the challenges of getting big data into a single set or managing the large datasets.

Users are also trying to figure out how to migrate from SQL or other data stores into Hadoop and into HP Vertica. It’s a challenge for them to understand a roadmap. How do you see these datasets as they grow larger, and we know they will, in terms of movement and integration? How is that path likely to unfold?

Machine data

Ferguson: Look at the enterprises that have been adopting Hadoop. Very early adopters like eBay, LinkedIn, Facebook, and Twitter are generating significant amounts of machine data. Then we started seeing large enterprises, aggressive users of technology, adopt it.

One of the core things is that the majority of data being created every day in an enterprise is not coming from traditional enterprise resource planning (ERP), customer relationship management (CRM), or financial management systems. It's coming from websites, like clickstream data and log data, or from sensor data. The reason there is so much interest in Hadoop is that it allows companies to cost-effectively capture very large amounts of data.

Then, they begin to understand patterns across semi-structured, structured, and unstructured data to begin to glean value from that data. Then, they leverage that data in other technologies like Vertica, analytics technologies, or even applications, or move the data back into the enterprise data warehouse.

As a major player in this Hadoop market, one of the core tenets of the company was that the ecosystem is critical to the success of Hadoop. So, from day one, we’ve worked very closely with vendors like Microsoft, HP, and others to optimize how their technologies work with Hadoop.

SQL has been around for a long time. Many people and enterprises understand SQL. That's a critical access mechanism to get data out of Hadoop. We’ve worked with both HP and Microsoft. Who knows SQL better than anyone? Microsoft. We're trying to optimize how SQL access to Hadoop can be leveraged by existing tools that enterprises know about, analytics tools, data management tools, whatever.

That's just one way that we're looking at leveraging existing integration points or access mechanisms that enterprises are used to, to help them more quickly adopt Hadoop.

Gardner: But isn’t it clear that what happens in many cases is that they run out of gas with a certain type of database and that they seek alternatives? Is that not what's driving the market for Hadoop?

Ferguson: It's not that they're running out of gas with an enterprise data warehouse (EDW) or relational database. As I said earlier, it's the sheer amount of data. By far, the majority of data is not coming from those traditional ERP, CRM, or transactional systems. As a result, the technology like Hadoop is optimized to allow an enterprise to capture very, very large amounts of that data.

Some of that data may be relevant today. Some of that data may be relevant three months or six months from now, but if I don't start capturing it, I won't know. That's why companies are looking at leveraging Hadoop.

Many of the earlier adopters are looking at leveraging Hadoop to drive a competitive advantage, whether they're providing a high level of customer service, doing things more cost-effectively than their competitors, or selling more to their existing customers.

The reason they're able to do that is because they're now able to leverage more of the data that their businesses are creating on a daily basis, understanding that data, and then using it for their business value.

More than size

Gardner: So this is an alternative for an entirely new class of data problem for them in many cases, but there's more than just the size. We also heard that there's interest in moving from a batch approach to a streaming approach, something that HP Vertica is very popular around.

What's the path that you see for Hortonworks and for Hadoop in terms of allowing it to be used in more than a batch sense, perhaps more toward this streaming and real-time analytics approach?

Ferguson: That movement is under way. Hadoop 1.0 was very batch-oriented. We're now in 2.0 and it's not only batch, but interactive and also real-time, and there's a common layer within Hadoop. Hortonworks is very influential in evolving this technology. It's called YARN. Think of it as a data operating system that is part of Hadoop, and it sits on top of the file system.

Via YARN, applications or integration points, whether they're for batch-oriented applications, interactive integration, or real-time like streaming or Spark, are access mechanisms. Then, those payloads or applications, when they leverage Hadoop, will go through these various batch, interactive, and real-time integration points.

They don't need to worry about where the data resides within Hadoop. They'll get the data via their batch, real-time, or interactive access point, based on what they need. YARN will take care of moving that data in and out of those applications. Streaming is just one way of moving data into Hadoop. That's very common for sensor data. It’s also a way to move it out. SQL is a way, among others, to move data.

Gardner: So this is giving us choice about how to manage larger scales of data. We're seeing choice about the way in which we access that data. There's also choice around the type of the underlying infrastructure to reduce costs and increase performance. I am thinking about in-memory or columnar.

What is there about the Hadoop community and Hortonworks, in particular, that allows you to throw the right horsepower at the problem?

Ferguson: It was very important, from Hortonworks' perspective from day one, to evolve the Hadoop technology as fast as possible. We decided to do everything in open source to move the technology very quickly and leverage the community effect of open source, meaning lots of different individuals helping to evolve this technology fast.

The ability for the ecosystem to easily and optimally integrate with Hadoop is important. So there are very common integration points. For example, for systems management, there is the Ambari Hadoop services integration point.

Whether it's an HP OpenView or System Center in the Microsoft world, that allows it to leverage, manage, or monitor Hadoop along with other IT assets that those management technologies integrate with.

Access points

Then there's SQL access via Hive, an access point to allow any technology that integrates with or understands SQL to access Hadoop.

Storm and Spark are other access points. So, common open integration points well understood by the ecosystem are really designed to help optimize how various technologies at the virtualization layer, at the operating system layer, data movement, data management, access layer can optimally leverage Hadoop.
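The SQL access point mentioned above comes down to this: an existing tool emits ordinary SQL text and gets rows back, without knowing how the data is stored. The sketch below shows that shape only. It uses Python's built-in `sqlite3` module as a stand-in for a Hive connection (an assumption made so the example runs without a cluster), and the `clickstream` table and its columns are hypothetical.

```python
import sqlite3

# Stand-in for a Hive connection so the sketch runs without a cluster.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clickstream (page TEXT, visitor TEXT)")
conn.executemany(
    "INSERT INTO clickstream VALUES (?, ?)",
    [("/home", "a"), ("/home", "b"), ("/pricing", "a")],
)

# Plain SQL of the kind a BI tool would generate; table and columns are hypothetical.
query = """
    SELECT page, COUNT(*) AS hits
    FROM clickstream
    GROUP BY page
    ORDER BY hits DESC, page
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Against a real cluster, a query of this kind would be sent through a Hive client rather than `sqlite3`; the calling code would otherwise look much the same, which is the point of a common access layer.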

Gardner: One of the things that I hear a lot from folks who don't understand yet how things will unfold, is where data and analytics applications align with the creation of other applications or services, perhaps in a cloud setting like a platform as a service (PaaS).

It seems to me that, at some point, more and more application development will be done through PaaS with an associated or integrated cloud. We're also seeing a parallel trajectory here with the data, along the same lines of moving from traditional systems of record into relational, and now into big data and analytics in a cloud setting. It makes a lot of sense.

I talked to a lot of people about that. So the question, Mitch, is how do we see a commingling and even an intersection between the paths of PaaS in general application development and PaaS in BI services, or BI as a service, somehow relating?

Ferguson: I'll answer that question in two ways. One is about the companies that are using Hadoop today, and using it very aggressively. Their goal is to provide Hadoop as a service, irrespective of whether it's on premises or in the cloud.

Then we'll talk about what we see with HP, for example, with their whole cloud strategy, and how that will evolve into a very interesting hybrid opportunity and maybe pure cloud play.

When you think about PaaS in the cloud, the majority of enterprise data today is on premises. So there's a physics issue of trying to run all of my big data in the cloud. As a result, what a number of people are doing with this concept is called the data lake. They're provisioning large Hadoop clusters on premises, moving large amounts of data into this data lake.

That's providing data as a service to those business units that need data in Hadoop -- structured, semi-structured, unstructured for new applications, for existing analytics processes, for new analytics processes -- but they're providing effectively data as a service, capturing it all in this data lake that continues to evolve.

Think about how companies may then want to leverage a PaaS. It's the same thing on premises. If my data is on premises, because that's where the physics requires it to be, I can leverage various development tools or application frameworks on top of that data to create new business apps. About 60 percent of our initial sales at Hortonworks are new business applications by an enterprise. It’s business and IT being involved.

Leveraging datasets

Within the first five months, 20 percent of those customers begin to migrate to the data-lake concept, where now they are capturing more data and allowing other business entities within the company to leverage these datasets for additional applications or additional analytics processes. We're seeing Hadoop as a service on premises already. When we move to the cloud, we'll begin to see more of a hybrid model.

We are already starting to see this with one of Hortonworks' large partners, where you move archive data from on premises into low-cost storage in the cloud. I think HP will have that same opportunity with Hadoop and their cloud strategy.

Already, through an initiative at HP, they're providing Hadoop as a service in the cloud for those entities that would like to run Hadoop in a managed service environment.

That’s the first step of HP beginning to provide Hadoop in a managed service environment off premises. I believe you'll begin to see that migrate to on-prem/off-prem integration as a hybrid opportunity. In some companies, as their data moves off prem, they'll just want to run all of their big-data services or have Hadoop as a service running completely in HP cloud, for example.

Gardner: So, we're entering an era now where we're going to be rationalizing how we take our applications as workloads, and continue to use them either on premises, in the cloud, or hybrid. At the same time, over on the side, we're thinking along the same lines architecturally with our data, but they're interdependent.

You can’t necessarily do a lot with the data without applications, and the applications aren’t as valuable without access to the analytics and the data. So how do these start to come together? Do you have a vision on that yet? Does HP have a vision? How do you see it?

Ferguson: The Hadoop market is very young. The vision today is that companies are implementing Hadoop to capture data that they were just letting fall on the floor. Now, they're capturing it. The majority of that data is on premises. They're capturing that data and they're beginning to use it in new business applications or existing analytics processes.

As they begin to capture that data, as they begin to develop new applications, and as vendors like HP working in combination with Hortonworks provide the ability to effectively move data from on premises to off premises and provide the ability to govern where that data resides in a secure and organized fashion, you'll begin to see much tighter integration of new business or big-data applications being developed on prem, off prem, or an integration of the two. It won't matter.

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.
