Tuesday, August 6, 2013

HP Vertica General Manager Colin Mahony on the next generation of anywhere analytics platforms

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

This next edition of the HP Discover Performance Discussion Series welcomes Colin Mahony, General Manager at HP Vertica, on this first day of the inaugural HP Vertica Big Data Conference in Boston.

It's been well over two years since HP acquired Vertica, and the analytics platform has become a pillar of HP's recently announced HAVEn initiative. Now Vertica is poised to advance beyond its MPP column-store database origins into a next-generation anywhere analytics platform. New Vertica benefits include ease of cloud deployment and appliance delivery, as well as new features coming later this year for improved speed, lower cost, and greater ease of data input and access.

Learn how Mahony is guiding the future of the HP Vertica Analytics Platform, and how users are finding new ways to leverage its unique speed and attributes. The interview is conducted by Dana Gardner, Principal Analyst at Interarbor Solutions. [Follow Colin on Twitter.] [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:
Gardner: One of the things that strikes me about the market nowadays is that there seems to be a sense of tradeoffs going on when organizations are trying to pick their big data engine or platform. They have a set of values on one side, opposed by values on the other. They can't have everything. One size does not fit all.

So how are you at Vertica able to help people deal with these tradeoffs that they're facing when it comes to a next-generation data platform?

Mahony: Vertica was founded on the premise that one size does not fit all. Using a single OLTP transactional database to do everything, including analytics, just doesn't make a lot of sense.

If you think about the areas where people have to trade off, usually it's scale for performance or analytics functionality for performance. One of the things that I've spent a lot of time looking at, especially over the last couple of years, is some of the alternative platforms, not just for analytics, but for all of the different data needs.

You can take something like Hadoop as an example. Hadoop really is a distributed file system with capabilities to run rudimentary analytics and to transform and process data. But I think what people love about Hadoop is that it's really easy to load data into it. You don't have to define the schema or anything.

Instead of schema on write at load time, it's schema on read at query time. People like that. They also like at least the perception that it is free, and they like its scalability. On the database side, what people love is that you're going to get really good performance, because the data is structured. If you're using a next-gen MPP platform like Vertica, you'll get the performance and the scalability.
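As a rough illustration of the schema-on-write versus schema-on-read tradeoff Mahony describes, here is a minimal sketch. The data and function names are invented for the example and do not come from any HP product:

```python
# Hypothetical illustration: schema on write vs. schema on read.
import json

raw_records = [
    '{"user": "alice", "clicks": 3}',
    '{"user": "bob", "clicks": "7", "region": "EU"}',  # extra field, string number
]

# Schema on write (database style): validate and shape each record at load time.
def load_with_schema(line):
    rec = json.loads(line)
    return {"user": str(rec["user"]), "clicks": int(rec["clicks"])}

table = [load_with_schema(line) for line in raw_records]

# Schema on read (Hadoop style): store raw lines as-is; interpret at query time.
stored = list(raw_records)

def query_clicks(lines):
    return sum(int(json.loads(line).get("clicks", 0)) for line in lines)

print(sum(r["clicks"] for r in table))   # 10
print(query_clicks(stored))              # 10
```

Both paths answer the same query; the difference is whether the cost of imposing structure is paid once at load time or repeatedly at query time.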

Hadoop-like

We've been doing a lot of work in areas like making it easier to get the data into the platform, doing more with it, making it seem much more like a Hadoop-like environment. You can look at our past releases and see that there's been a lot of work done on that and we continue to make those investments.

One thing has been consistent at Vertica since the beginning. What we focus on is to make it really easy for people to get information onto the platform. Then, we make sure we continue to deliver new capabilities, performance, and functionality within the platform.

We make sure we're enabling our customers and partners to deploy Vertica anywhere and everywhere, whether it's cloud, appliances, software, or the like. Those are the three tenets of the company. It's all around this notion of making data matter and helping people make better decisions that lead to better outcomes with superior information.

There's so much that can be done in this space, but I think the key for us is to focus on the things that we know we do really well. The good news is that it's such a large space with so many demands that we know we can make a huge impact without trying to take on the world.

I think you'll continue to see some interesting developments along the lines of what I'm describing, and it's very much in line with where we've been.

Gardner: Do more and more IT functions and business functions begin and end with big data? It seems to be at the center of so many things.

Exponential growth

Mahony: It is. To go back to the founding of Vertica [in 2005], I remember when Mike Stonebraker was giving the early presentations on the need for it. He talked a lot about the exponential growth of data and how it was outpacing Moore's Law and other hardware laws. So much information was being created that there was no way just throwing more parallelized hardware at the problem was going to address the issue.

The state of the union back then was that there was no such thing as "big data." But I think Mike, as a visionary, knew what was going to happen in the industry. And it has happened.

It wasn't that long ago, but I remember trying to find our first sample dataset that was over a terabyte, and we had a difficult time finding one. When we would talk to the early customers, they looked at us like we were crazy when we asked about a terabyte.

We have an easy time now finding terabytes of data. The state of the union today is that what's driving so much around big data is obviously the volume, variety, and velocity that we talk about often. But what's really driving those three things is human information, whether it's social media, tweets, or other expressive content that's so prevalent right now, as well as machine information.

If you look at the traditional structured database market by any number, it’s a small percentage of the amount of data that’s out there. The strength of Vertica, and really the strength of HP overall, is that we have the best assets for the unstructured human information in Autonomy, as well as the best assets when it comes to machine information and large data.

That has some structure. It’s semi-structured information, but it’s not your traditional transaction system. The power of all of that data comes together when you can have an engine that applies some structure to it and then is able to deliver the analytics that the organization needs. It's both IT as well as line of business, and even this new category we often talk about, which is the data scientist.

One of the great things about this show here is that we’ve got Billy Beane of Moneyball fame as our keynote speaker. The reason that we wanted Billy to come speak here is that Moneyball is exactly what’s happening right now in the world when it comes to big data.

You have the data scientist or the statistician, you have the line-of-business folks, and you have IT. They all have a part to play in the success of how information is used in companies. By bringing them together, and by making the software that much easier for them to come together and solve these problems, you can create very real and differentiated value within an organization.

So Moneyball is exactly what’s happening, certainly in corporate America, but also in government and in many other institutions that want to leverage information to be more efficient and create a competitive advantage.

Gardner: Colin, what about the notion of big data as an agent for business transformation? We've been hearing about this for 30 years. It's been a big part of the academic work in business schools. Process re-engineering has evolved into balanced scorecards. Getting more detailed information in real time about customers and the marketplace probably has as much or more of an opportunity to transform businesses than just about anything else that's happened over the past 20 years.

More than technology

Mahony: It's an enormous opportunity for business transformation, and definitely the whole is greater than the sum of the parts. What makes companies really successful with information is not trying to boil the ocean, not trying to do a traditional enterprise data warehouse project that's going to take 24 months, if you're lucky, 36 most likely.

They’ll end up with some monolithic inflexible platform that will probably be outdated by the time it gets deployed. What is making a lot of companies successful is they find a particular use, they find a problem area that they want to drill down on, and they mobilize to do it.

For that, they need a solution that can be deployed quickly, but that also has the capability to become something much larger. Whether it's Vertica, Talend, or any of the other products we offer, we strive to make sure that somebody can get up and running quickly, whether it's Autonomy for human information analytics, or Vertica for machine data and other types of structured transactional data.

The most important thing is that you find that business case, you focus on it, and you prove it very quickly. There's something we refer to as "Time to Terabyte," which for Vertica is typically less than a month. You get a return on investment (ROI) in less than a month on the investments that you made. If you prove that out, then everybody in the organization is happy: the line of business, the technology folks in IT, even the statisticians and data scientists.

From there, you start expanding the project, and that's exactly how we win most of our customers. We very rarely go in and say, "Buy an enterprise license for our product across the company." We certainly do those, but more typically we get into a business unit, we find the acute pain, and we solve that problem.

What they're betting on is the ability for us to expand and for them to expand in this platform. That's why we are, on the one hand, all about the platform and the integration, but on the other hand, not about to lose the flexibility and the modularity of what we do, because that's also a huge differentiator for HP's portfolio.

I think that this is a wonderful time in the world of business transformation, and I think, unlike what has been talked about for the last 30 years, you now have the data that can back it up and prove it in real-time to the organization.

That's the big difference. You gave the balanced scorecard as an example. If you look at the balanced scorecard methodology, you can take that methodology and drill down into a thousand fields of detail and get that information in real time. That's the opportunity here, and that's why I think this market is so huge.

It's not just about faster speeds and feeds. It's about fundamentally stepping back and asking how we're running this business. What assets, especially information assets, do we have that could dramatically boost productivity to the same extent that computers boosted it when they were first introduced? That's the goal that everybody is looking for when it comes to information.

Gardner: Tell our listeners and readers a bit more about yourself and your background.

Mahony: I've been with Vertica since the beginning. In fact, long before Vertica, my background has always been databases. I've always loved computer science, and had a minor in computer science in my undergraduate degree. In my first job out of school, I was working with databases for civilian US Government clients, getting a lot of information published to the web in the earliest days of the web.

I had a couple of other roles, but they were always very technology focused. Then I got my MBA on the business side and went into venture capital for seven years. That's where I met Mike Stonebraker, the founder of Vertica.

I just loved the idea. Everything I knew about databases and the challenges of traditional databases, and everything I knew about the new world order of information -- at the time we didn't even talk about the term big data -- just seemed to align really well.

So I decided to leave the dark side of venture capital and jumped into something that I've been incredibly passionate about. If you look at that lifecycle, even my own background with Vertica and where we've come, it's just been great. The timing was great, and, as always, it takes a lot more than just great technology and great people.

Gardner: It's been well over two years since HP acquired Vertica and, as we begin the inaugural 2013 Big Data Conference, how would you best characterize how Vertica has evolved since its founding back in 2005?

Mahony: Yes, this is our first user conference. It's ironic that we've never had one before, but I think this is also a testament to the scale that HP can bring. We have wanted a user conference since the beginning. Obviously, it takes some critical mass to get there, which we now have, but it also takes the support of an organization that knows how to do these conferences and understands the value of them.

And we've evolved quite a bit. It's been a busy couple of years here, certainly post the HP acquisition. But I think at a high level, we've really shifted and expanded from being a narrowly focused MPP column-store database company into an analytics platform company.

With that comes several developments, obviously on the product side, but also as an organization, going through that maturation in terms of being able to operate at a global scale across the spectrum of what you would expect an analytics provider to offer.

Gardner: And how do you characterize the difference between a store and a platform? Are there many ecosystem players or is this an organic evolution of your capabilities or both?

Mahony: It's both: the ecosystem and the tools that you interact with. Of course, we support a very rich and vibrant ecosystem of business intelligence (BI) tools, extract, transform, and load (ETL) tools, and other types of management tools. But it's not just the ecosystem around the product; it's also what's within our own products.

So it's adding a lot of capabilities like backup and recovery, additional analytics capabilities beyond standard SQL through the SDKs that Vertica supports, the ability to run procedural and other types of code within the product, and being able to express things like MapReduce beyond what a traditional database system would do.

Since the founding of the company, we've tried to take the best part of the database world and the best parts of the SQL world, but address the most challenging issues that traditional databases have had. So whether it is scalability or it’s being able to run things beyond SQL or it’s just the performance, those are all the things that we have taken into account while we built Vertica, and I think we have always been on the fast track to a platform.

We knew it would be a journey, and we knew that building a product and a platform from the bottom up is not an easy thing. But we also knew that once we got there, once we crossed that chasm, if you will, all those decisions we made in the beginning about building an engine from the bottom up would pay off.

Platform modularity

For probably the last year, that's where we’ve been. Right now, we're seeing that it’s easy to add functionality to the platform because of the modularity of the platform, and we can add that functionality without giving up any of the performance.

For me, it’s probably the most exciting time. Being part of HP offers us so many things that make it a lot easier to become a platform, not only on the development side, but a much greater ecosystem, a global scale, being able to support customers globally 24/7.

Gardner: It's only been a few months since the HP Discover 2013 Conference in Las Vegas, where the HAVEn initiative was announced. This puts Vertica in a very prominent place among other HP properties, technologies, platforms, and approaches to solving this big data issue. Recap for us, if you would, what HAVEn is and why Vertica forms such an important pillar of this larger HP initiative.

Big-data lake

Mahony: What companies are looking for is this notion of the big-data lake. To me, it can mean many different things, but at the end of the day, companies want to take all the information assets that they have and they want to put them into a safe place, but a place where access to that information can be used by many different constituencies, whether it's IT, line of business, or data scientist.

So the notion of having a safe place, a harbor, or a port is what we announced as HP HAVEn, which is HP’s big data platform. It is primarily for analytics, but it can be used for just about anything when it comes to information and data.

What's so important about information right now is that there are different constituencies in companies that want to use it. First of all, they want to capture all the information: not just structured, not just unstructured, but 100 percent of their information.

They want to get it to a place where they can leverage it and use it for a lot of different use cases, but the first part is getting that information into the right place. For us, that is one of the three components of HAVEn: the connectors.

We have over 700 connectors as part of HAVEn, coming from Autonomy, from our Enterprise Security Group, and from the ArcSight core Logger. Those connectors can handle human information, extreme log information, or traditional structured database information.

Step one is the connectors to get the data in. Step two is to put that data into the best engine for it. Vertica obviously is one component, but you also have the Autonomy IDOL engine, the ArcSight Logger engine, and open-source technologies like Hadoop, which is the "H" in HP HAVEn. So we've got a place to put the information.

Step three is any N number of applications. What I'm seeing happening in the industry right now is that, just as we went from mainframe to client-server, and from client-server to the LAN, we're in a period now where new applications are being developed. They're certainly web-based and distributed, but they're also analytical in nature.

They're driven by vast volumes of information, and they close the loop, meaning that information about the experiences happening within an application, whether you're driving a car or whatever it might be, is passed back, closed loop, to a system that can then optimize the experience. That is creating a new class of applications.

For that new class of applications, you need a platform that can drive them. What we're bringing together in HAVEn is Hadoop, Autonomy, Vertica, the Enterprise Security core assets, and the N number of applications.

At Discover, we announced some of our own internal applications, which are powered by the HAVEn platforms. We announced our HP Analytics offering, which is built using Hadoop, Vertica, Enterprise Security, and Autonomy assets.

About community

We're making some of our own applications, but this is about the community and getting people to build a new set of applications that can use these components to really change how people are interacting with their data.

That's HAVEn, and I'm always careful to point out that HAVEn itself is not a product. It's a platform, and it's broader than Vertica, Autonomy, or Enterprise Security alone. It's a platform where 1+1+1+1+1, instead of equaling 5, should equal 8 or 10 or 12. That's the goal. Of course, it's also a roadmap into areas that each of these components is working on to bring them closer together. So it's exciting.

One thing I've certainly noticed over the years with our customers is that the shiny object that leads a customer to choose Vertica may look very different from customer to customer. For some, it's the price. For some, it's the performance and the scale with massive volumes. For some, it's a particular analytic function or pattern-matching capability. And for others, it's something entirely different.

But what's so exciting, especially about this conference, is that no matter what on-ramp they take, they tend to find a lot of the other capabilities once they get on. Hopefully, here at the conference, we're going to accelerate some of that just by getting our customers and our partners together in an environment where they can share stories.

Cloud and hybrid

Gardner: For our last item today, I wonder if we could take out our crystal ball apparatus and try to do a little blue-sky thinking. One of the other big trends these days of course is cloud computing and hybrid models for the distribution of workloads for applications, but also for data. I'm wondering, as we go down this journey over the next year or two, how do big data and cloud computing come together?

Mahony: As I mentioned in terms of the three things that we are focused on, number one is make it easy to get data into the platform. Number two is do a lot more with the platform, so that there are better analytic capabilities, better pattern matching, and better analytics packs on top of it.

Number three is make sure you can deploy Vertica everywhere, and in the everywhere and anywhere categories, the cloud is certainly the first name that comes to mind. That is absolutely the future of computing. In some ways, I guess, it's the past, but it's interesting how the past repeats itself.

We do run Vertica on hosted environments like the Amazon cloud. We're in a private beta on the HP Cloud Service. So there are definitely offerings and developments that have been underway here at Vertica for a while.

We embrace that, and to us, it's not mutually exclusive. What you described is the hybrid environment, where you can run certain things locally and burst up to the cloud for other workloads, especially if you're looking to pull in some quick processing power and storage. That's the future, and that's the way, just like with any other utility, that we're going to consume some of these capabilities.

This is one of the strengths of a company the size and scale of HP. We have these offerings, whether it's software only, appliance, or cloud. We have the ability to deliver however the customer wants it, and we can also provide not only the flexible technologies, but the flexible business capabilities to make that happen with a lot of ease.

It's an exciting time. If you look at the pillars of HP, we have cloud, mobility, big data, and security. All four of those pillars tie well into one another, because they're all related. Of course, all these activities happening up on the cloud are generating a lot of information, information that will be analyzed, I'm sure, in many different ways.

So it's something that feeds on itself, the same way mobility does. All of that is a good thing for the analytics space, wherever it is. The final thing I would say is that the most important thing about analytics is that you want it embedded into the various applications, just like when you're driving a car, you just want the GPS system to tell you where you're going.

Analytics is the same. You want it within the context of whatever it is that you are doing. Given that so many things are going to be served off the cloud, it's natural that that's the place that will host some of the analytics as well.

So it's an incredibly exciting time, and we're looking forward to having many more of these user conferences and are certainly going to enjoy the rest of the show this week. [Follow Colin on Twitter.]



Wednesday, July 31, 2013

Businesses can remain dependable only if they get a full grip on risk and complexity, says The Open Group CEO Allen Brown

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: The Open Group.

This latest BriefingsDirect discussion from The Open Group Conference earlier this month in Philadelphia explores the essential role of standards in an increasingly complex and unpredictable world.

From risks around cybersecurity to supply chain concerns to fast-changing trends around cloud computing, the pace of change and pressures on businesses to adjust well have never been higher. To gain a fuller grip on such risk and complexity, The Open Group is shepherding a series of standards and initiatives to provide better tools for understanding and managing true operational dependability.

BriefingsDirect sat down with the President and CEO of The Open Group, Allen Brown, at the July conference to gather an update on the efforts. The interview was conducted by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: The Open Group is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:
Gardner: What are the environmental variables that many companies are facing now as they try to improve their businesses and assess the level of risk and difficulty?

Brown: There are a lot of moving targets. We're looking at a situation where organizations are having to put in increasingly complex systems. They're expected to make them highly available, highly safe, highly secure, and to do so faster and cheaper. That’s kind of tough.

Gardner: One of the ways that organizations have been working toward a solution is to have a standardized approach, perhaps some methodologies, because if all the different elements of their business approach this in a different way, we don’t get too far too quickly, and it can actually be more expensive.

Perhaps you could paint for us the vision of an organization like The Open Group in terms of helping organizations standardize and be a little bit more thoughtful and proactive toward these changed elements?

Brown: With the vision of The Open Group, the headline is "Boundaryless Information Flow." That was established back in 2002, at a time when organizations were breaking down the stovepipes or silos within and between organizations and getting people to work together across functions. They found, having done that, or having made some progress toward it, that the applications and systems were still built for those silos. So how can we provide integrated information for all those people?

As we have moved forward, those boundaryless systems have become bigger and much more complex. Now, boundarylessness and complexity are giving everyone different types of challenges. Many of the forums or consortia that make up The Open Group are all tackling it from their own perspective, and it’s all coming together very well.

We have got something like the Future Airborne Capability Environment (FACE) Consortium, which is a managed consortium of The Open Group focused on federal aviation. In the federal aviation world they're dealing with issues like weapons systems.

New weapons

Over time, building similar weapons is going to get more expensive; inflation happens. But the changing nature of warfare is such that you've got a situation where you have to produce new weapons, produce them quickly, and produce them inexpensively.

So how can we have standards that make for more plug-and-play? How can the avionics within the cockpit of whatever airborne vehicle be more interchangeable, so that they can be adapted more quickly and do things faster and at lower cost? After all, cost is a major pressure on government departments right now.

We've also got the challenges of the supply chain. Because of the pressure on costs, it's critical that large, complex systems are developed using a global supply chain. It's impossible to do it all domestically at a reasonable cost. Given that, countries around the world, including the US and China, are all concerned that what they're putting into their complex systems may include tainted or malicious code or counterfeit products.

The Open Group Trusted Technology Forum (OTTF) provides a standard that ensures that, at each stage along the supply chain, we know that what’s going into the products is clean, the process is clean, and what goes to the next link in the chain is clean. And we're working on an accreditation program all along the way.

We're also in a world in which, when we mention security, everyone is concerned about being attacked, whether it's cybersecurity or other areas of security, and we've got to concern ourselves with all of those along the way.

Our Security Forum is looking at how we build those things out. The big thing about large, complex systems is that they're large and complex. If something goes wrong, how can you fix it in a prescribed time scale? How can you establish what went wrong quickly and how can you address it quickly?

If you've got large, complex systems that fail, it can cost human lives, as it did with the BP Deepwater Horizon oil disaster or with the Space Shuttle Challenger. Or it could be financial. In many organizations, when something goes wrong, you end up giving away service.

An example that we might use is at a railway station where, if the barriers don’t work, the only solution may be to open them up and give free access. That could be expensive. And you can use that analogy for many other industries, but how can we avoid that human or financial cost in any of those things?

A couple of years after the Space Shuttle Challenger disaster, a number of criteria were laid down for making sure you had dependable systems, could assess risk, and knew you could mitigate it.

What The Open Group members are doing is looking at how you can get dependability and assuredness through different systems. Our Security Forum has done a couple of standards that have got a real bearing on this. One is called Dependency Modeling, and you can model out all of the dependencies that you have in any system.

Simple analogy

A very simple analogy is that if you are going on a road trip in a car, you’ve got to have a competent driver, have enough gas in the tank, know where you're going, have a map, all of those things.

What can go wrong? You can assess the risks. You may run out of gas or you may not know where you're going, but you can mitigate those risks, and you can also assign accountability. If the gas gauge is going down, it's the driver's accountability to check the gauge and make sure that more gas is put in.

We're trying to get that same sort of thinking through to these large, complex systems. What you're looking to do, as you develop or evolve large, complex systems, is to build in this accountability, build in an understanding of the dependencies and of the assurance cases that you need, and have ways of identifying anomalies early to prevent anything from failing. If something does fail, you want to minimize the stoppage and, at the same time, minimize the cost and the impact, and, more importantly, make sure that that failure never happens again in that system.
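The road-trip analogy can be sketched as a toy dependency model. This is a rough illustration only; the class and field names are invented here and do not follow the actual schema of The Open Group's Dependency Modeling standard:

```python
# Toy sketch of dependency modeling with accountability, in the spirit of
# the road-trip analogy above. Structure invented for illustration.
from dataclasses import dataclass, field

@dataclass
class Dependency:
    name: str
    owner: str          # accepted accountability
    risk: str           # what can go wrong
    mitigation: str     # how the owner mitigates it
    ok: bool = True

@dataclass
class Goal:
    name: str
    dependencies: list = field(default_factory=list)

    def at_risk(self):
        """Return the dependencies that currently threaten the goal."""
        return [d for d in self.dependencies if not d.ok]

trip = Goal("Complete the road trip")
trip.dependencies = [
    Dependency("Fuel in tank", owner="driver", risk="runs out of gas",
               mitigation="check gauge, refuel early"),
    Dependency("Route known", owner="navigator", risk="gets lost",
               mitigation="carry a map"),
]

trip.dependencies[0].ok = False  # gauge shows empty: an anomaly is detected
print([d.name for d in trip.at_risk()])  # ['Fuel in tank']
```

The point of the model is that each risk has a named, accountable owner and a mitigation, so an anomaly can be traced to a responsible party before it becomes a failure.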

The Security Forum has done the Dependency Modeling standard. They have also provided us with the Risk Taxonomy. That's a separate standard that helps us analyze risk and go through all of the different areas of risk.
You can't just dictate that someone is accountable. You have to have a negotiation.

Now, the Real-time and Embedded Systems Forum has produced the Dependability through Assuredness standard of The Open Group, which brings all of these things together. We've had a wonderful international endeavor on this, bringing a lot of work from Japan, working with the folks in the US and other parts of the world. It's been a unique activity.

Dependability through Assuredness depends upon having two interlocked cycles. The first is a Change Management Cycle that says that, as you look at requirements, you build out the dependencies, you build out the assurance cases for those dependencies, and you update the architecture. Everything has to start with architecture now.

You build in accountability, and accountability, importantly, has to be accepted. You can't just dictate that someone is accountable. You have to have a negotiation. Then, through ordinary operation, you assess whether there are anomalies that can be detected and fix those anomalies by new requirements that lead to new dependabilities, new assurance cases, new architecture and so on.

The other cycle that’s critical in this, though, is the Failure Response Cycle. If there is a perceived failure or an actual failure, there is understanding of the cause, prevention of it ever happening again, and repair. That goes through the Change Accommodation Cycle as well, to make sure that we update the requirements, the assurance cases, the dependability, the architecture, and the accountability.

So the plan is that with a dependable system through that assuredness, we can manage these large, complex systems much more easily.

Gardner: Many of The Open Group activities have been focused at the enterprise architect or business architect levels. With these risk and security issues, you're also focusing on chief information security officers or governance, risk, and compliance (GRC) officials or administrators. It sounds as if the Dependability through Assuredness standard shoots a little higher. Is this something board-level leadership should be thinking about, and is this something that reports to them?

Board-level issue

Brown: In an organization, risk is a board-level issue, security has become a board-level issue, and so has organization design and architecture. They're all up at that level. It's a matter of the fiscal responsibility of the board to make sure that the organization is sustainable, and to make sure that they've taken the right actions to protect their organization in the future, in the event of an attack or a failure in their activities.

The risks to an organization are financial and reputational, and those risks can be very real. So, yes, they should be up there. Interestingly, when we're looking at areas like business architecture, sometimes that might be part of the IT function, but very often now we're seeing it report through the business lines. Even in governments around the world, the business architects are very often reporting up to business heads.

Gardner: Here in Philadelphia, you're focused on some industry verticals, finance, government, health. We had a very interesting presentation this morning by Dr. David Nash, who is the Dean of the Jefferson School of Population Health, and he had some very interesting insights about what's going on in the United States vis-à-vis public policy and healthcare.

One of the things that jumped out at me was, at the end of his presentation, he was saying how important it was to have behavior modification as an element of not only individuals taking better care of themselves, but also how hospitals, providers, and even payers relate across those boundaries of their organization.
One of the things about The Open Group standards is that they're pragmatic and practical standards.

That brings me back to this notion that these standards are very powerful and useful, but without getting people to change, they don't have the impact that they should. So is there an element that you've learned and that perhaps we can borrow from Dr. Nash in terms of applying methods that actually provoke change, rather than react to change?

Brown: Yes, change is a challenge for many people. Getting people to change is like taking a horse to water, but will it drink? We've got to find methods of doing that.

One of the things about The Open Group standards is that they're pragmatic and practical standards. We've seen, in many of our standards, that where they apply to a product or service, there is a procurement pull-through. In the FACE Consortium, for example, a $30 billion procurement means that this is real and true.

In the case of healthcare, Dr. Nash was talking about the need for boundaryless information sharing across the organizations. This is a major change and it's a change to the culture of the organizations that are involved. It's also a change to the consumer, the patient, and the patient advocates.

All of those will change over time. Some of that will be social change, where the change is expected and becomes a social norm. Some of that change will come as generations develop. The younger generations are more comfortable questioning the authority they perceive in healthcare professionals, and also with modifying the behavior of those professionals.

The great thing about the healthcare service very often is that we have professionals who want to do a number of things. They want to improve the lives of their patients, and they also want to be able to do more with less.

Already a need

There's already a need. If you want to make any change, you have to create a need, but in healthcare, there is already a pent-up need for change that people can see. We can provide them with the tools and the standards that enable them to do that, and standards are critically important, because you are using the same language across everyone.

It's much easier for people to apply the same standards if they are using the same language, and you get a multiplier effect on the rate of change that you can achieve by using those standards. But I believe that there is this pent-up demand. The need for change is there. If we can provide them with the appropriate usable standards, they will benefit more rapidly.

Good folks

The focus of The Open Group for the last couple of decades or so has always been on horizontal standards, standards that are applicable to any industry. Our focus is always about pragmatic standards that can be implemented and touched and felt by end-user consumer organizations.

Now, we're seeing how we can make those even more pragmatic and relevant by addressing the verticals, but we're not going to lose the horizontal focus. We'll be looking at what lessons can be learned and what we can build on. Big data is a great example: the same approach of gathering data from different sources, whatever they are, mixing it up, and being able to analyze it can be applied anywhere.

The challenge with that, of course, is being able to capture it, store it, analyze it, and make some sense of it. You need the resources, the storage, and the capability of actually doing that. It's not just a case of, "I'll go and get some big data today."

I do believe that there are lessons learned that we can move from one industry to another. I also believe that, since some geographic areas and some countries are ahead of others, there's also a cascading of knowledge and capability around the world in a given time scale as well.
Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: The Open Group.

You may also be interested in:

Monday, July 22, 2013

HP Vertica architecture gives massive performance boost to toughest BI queries for Infinity Insurance

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

The next edition of the HP Discover Performance Podcast Series highlights how Infinity Insurance Companies in Birmingham, Alabama, has been deploying a new data architecture -- native column store databases -- to improve productivity for its analysis and business intelligence (BI) queries.

To learn more about how Infinity has improved their performance and their results for their business analytics, BriefingsDirect interviewed Barry Ralston, Assistant Vice President for Data Management at Infinity Insurance Companies. The discussion, which took place at the recent HP Discover 2013 Conference in Las Vegas, is moderated by Dana Gardner, Principal Analyst at Interarbor Solutions. [Learn more about the upcoming Vertica conference in Boston Aug. 5.]

Among other findings, Ralston and his team have seen a 100-times improvement in their top 12 worst-performing, longest-running queries when moving from a row-store-based Oracle Exadata implementation to a column-store-based HP Vertica deployment. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:
Gardner: What was it that you've been doing with your BI and data warehousing that prompted you to seek an alternative?

Ralston: Like many companies, we have constructed an enterprise data warehouse deployed to a row-store technology. In our case, it was initially Oracle RAC and then, eventually, the Oracle Exadata engineered hardware/software appliance.

We were noticing that analysis that typically occurs in our space wasn’t really optimized for execution via that row store. Based on my experience with Vertica, we did a proof of concept with a couple of other alternative and analytic store-type databases. We specifically chose Vertica to achieve higher productivity and to allow us to focus on optimizing queries and extracting value out of the data.

Gardner: What does Infinity Insurance Companies do? How big are you, and how important is data and analysis to you?

Ralston: We are a billion-dollar property and casualty company, headquartered in Birmingham, Alabama. Like any insurance carrier, data is key to what we do. But one of the things that drew me to Infinity, after years of being in a consulting role, was their determination to use data as a strategic weapon -- not just IT as a whole, but data specifically, within that larger IT, as a strategic or competitive advantage.

Vertica environment

Gardner: You have quite a bit of internal and structured data. Tell me a bit what happened when you moved into a Vertica environment, first in the proof of concept phase and then into production?

Ralston: For the proof of concept, we took the most difficult, worst-performing queries from our Exadata implementation and moved that entire enterprise data warehouse set into a Vertica deployment on three dual hex-core, DL380-type machines, running at the same scale, with the same data and the same queries.

We took the top 12 worst-performing queries or longest-running queries from the Exadata implementation, and not one of the proof of concept queries ran less than 100 times faster. It was an easy decision to make in terms of the analytic workload, versus trying to use the Oracle row-store technology.

Gardner: Let’s dig into that a bit. I'm not a computer scientist and I don’t claim to fully understand the difference between row store, relational, and the column-based approach for Vertica. Give us the quick "Data Architecture 101" explanation of why this improvement is so impressive? [Learn more about the upcoming Vertica conference in Boston Aug. 5.]

Ralston: The original family of relational databases -- the current big three being Oracle, SQL Server, and DB2 -- is based on what we call row-storage technology. These databases store information in blocks on disk, writing an entire row at a time.

If you had a record for an insured, you might have the insured's name, the date the policy went into effect, the date the policy next shows a payment, and so on. All those attributes are written at the same time, in series, to a row, which is combined into a block.
It’s an optimal way of storing data for transaction processing.

So storage has to be allocated in a particular fashion to facilitate things like updates. It's an optimal way of storing data for transaction processing, and for now, it's probably the state of the art for that. If I'm running an accounting system or a quote system, that's the way to go.

Analytic queries are fundamentally different than transaction-processing queries. Think of the transaction processing as a cash register. You ring up a sale with a series of line items. Those get written to that row store database and that works well.

But when I want to know the top 10 products sold to my most profitable 20 percent of customers in a certain set of regions in the country, those set-based queries don’t perform well without major indexing. Often, that relates back to additional physical storage in a row-storage architecture.

Column store databases -- Vertica is a native column store database -- store data fundamentally differently than row stores do. We break a record down into its individual columns and store each column distinctly. This allows me to do a couple of different things at an architectural level.

Sort, compress, organize

First and foremost, I can sort, compress, and organize the data on disk much more efficiently. Compression has recently been added to row-storage architectures, but in a row-storage database, you largely have to compress the entirety of a row.

I can't choose an optimal compression algorithm for just a date, because in that row I will have text, numbers, and dates. In a column store, I can apply a specific compression algorithm to the data that's in that column. So a date gets one algorithm; a monotonically increasing key, like a surrogate key you might have in a dimensional data warehouse, gets a different encoding algorithm; and so on.

That covers sorting and storage. How data gets retrieved is fundamentally different as well, another pain point for row-storage databases at query time. I could say, "Tell me all the customers that bought a product in California, but I only want to know their last name."

If I have 20 different attributes, a row-storage database actually has to read all 20 attributes off disk. The query engine eliminates the ones I didn't ask for from the eventual results, but I've already incurred the penalty of the input/output (I/O). This has a huge impact when you think of things like call detail records in telecom, which have 144-some-odd columns.

If I'm only asking a column store database, "Give me all the people who have last names, who bought a product in California," I'm essentially asking the database to read two columns off disk, and that's all that happens. My I/O factor is improved by an order of 10 or, in the case of the CDR, on the order of 1 in 144.
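The I/O point can be illustrated with a toy column store (a hypothetical layout, for illustration only): each column lives in its own list, standing in for its own region on disk, so a query touches only the columns it names.

```python
# Each column is stored separately; a query reads only the columns it needs.
columns = {
    "last_name": ["Smith", "Jones", "Lee"],
    "state":     ["CA", "AL", "CA"],
    # ...imagine 142 more CDR-style columns that are never touched...
}

def select(wanted, where_col, predicate):
    """Scan one predicate column, then fetch only the wanted columns."""
    rows = [i for i, v in enumerate(columns[where_col]) if predicate(v)]
    return {c: [columns[c][i] for i in rows] for c in wanted}

# Reads exactly two columns: state (the filter) and last_name (the result).
print(select(["last_name"], "state", lambda s: s == "CA"))
# {'last_name': ['Smith', 'Lee']}
```

In a row store, the same query would drag every attribute of every matching row off disk before discarding all but the last name.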
The great question is what ends up being the business value.

Gardner: You can’t just go back and increase your I/O improvements in those relational environments by making it in-memory or cutting down on the distance between the data and the processing? That only gets you so far, and you can only throw hardware at it so much. So fundamentally, it’s all about the architecture.

Ralston: Absolutely correct. You've seen a lot of these -- I think one of the fun terms around this is "unnatural acts with data," as to how data gets either scattered or put into a cache or other things. Every time you introduce one of these mechanisms, you're putting another bottleneck between near real-time analytics and getting the data from a source system into a user’s hands for analytics. Think of a cache. If you’re going to cache, you’ve got to warm that cache up to get an effect.

If I'm streaming data in from a sensor, real-time location servers, or something like that, I don’t get a whole lot of value out of the cache to start until it gets warmed up. I totally agree with your point there, Dana, that it’s all about the architecture.

In short, in leveraging Vertica, the underlying architecture allows me to create a playfield, if you will, for business analysts. They don’t necessarily have to be data scientists to enjoy it and be able to relate things that have a business relationship between each other, but not necessarily one that’s reflected in the data model, for whatever reason.
Performance suffers

Obviously, in a row-storage architecture, and specifically within dimensional data warehouses, if there is no index between a pair of columns, your performance begins to suffer. Vertica creates no indexes; it self-indexes the data via sorting and encoding.

So if I have an end user who wants to analyze something that’s never been analyzed before, but has a semantic relationship between those items, I don’t have to re-architect the data storage for them to get information back at the speed of their decision.

Gardner: What about opening this up to some new types of data and/or giving your users, the folks in the insurance company, the opportunity to look to external types of queries and learn more about markets where they can apply new insurance products and grow the top line?

Ralston: That's definitely part of our strategic plan. Right now, 100 percent of the data being leveraged at Infinity is structured. We're leveraging Vertica to manage all that structured data, but we have a plan to leverage Hadoop and the Vertica Hadoop connectors, based on what I'm seeing around HAVEn, the idea of being able to work seamlessly with structured and unstructured data from one point.
Then, I’ve delivered what my CIO is asking me in terms of data as a competitive advantage.

Insurance is an interesting business in that, as my product and pricing people look for the next great indicator of risk, we essentially get to ride a wave of that competitive advantage for as long a period of time as it takes us to report that new rate to a state. The state shares that with our competitors, and then our competitors have to see if they want to bake into their systems what we’ve just found.

So we can use Vertica as a competitive hammer, Vertica plus Hadoop to do things that our competitors aren’t able to do. Then, I’ve delivered what my CIO is asking me in terms of data as a competitive advantage.
Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: HP.

You may also be interested in:

Tuesday, July 16, 2013

Hackett research points to big need for spot buying automation amid general B2B procurement efficiency drive

Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Ariba, an SAP Company.

This latest BriefingsDirect podcast, from the recent 2013 Ariba LIVE Conference in Washington, D.C., explores the rapid adoption of better means for companies to conduct so-called spot buying -- a more ad-hoc and agile, yet managed, approach to buying products and services.

We'll examine new spot-buying research from The Hackett Group on the latest and greatest around agile procurement of low-volume purchases, and we'll learn how two companies are benefiting from making spot buying a new competency.

The panel consists of Kurt Albertson, Associate Principal Advisor at The Hackett Group in Atlanta; Ian Thomson, Koozoo's Head of Business Development, based in San Francisco; and Cal Miller, Vice President of Business Development for Blue Marble Media in Atlanta. The interview is conducted by Dana Gardner, Principal Analyst at Interarbor Solutions. [Disclosure: Ariba, an SAP company, is a sponsor of BriefingsDirect podcasts.]

Here are some excerpts:
Gardner: How did we get to the need for tactical sourcing, and how did we actually begin dividing tactical and strategic sourcing at all?

Albertson: When you look at enterprises out there, our Key Issues Study for 2013 identified the top priority area as profitability. So companies are continuing to focus on the profitability objective.

Customer satisfaction

The second slot was customer satisfaction, and you can view customer satisfaction as external customers, but also internal customers and the satisfaction around that.

With that as the overlay in terms of the two most important objectives for the enterprise -- the third, by the way, is revenue growth -- let’s cascade down to why tactical sourcing or spot buying is important.

The importance comes from those two topics. Companies are continuing to drive profitability, which means continuing to take out cost. Most mature organizations have very robust and mature strategic-sourcing processes in place. They've hired very seasoned category managers to run those processes and they want them focused on the most valuable categories of spend, where you want to align your most strategic assets.

On the other side of that equation, you have this transactional stuff. Someone puts through a purchase order, where procurement has very little involvement. The requisitioners make the decision on what to buy and they go out and get pricing. Purchasing’s role is to issue a purchase order, and there is no kind of category management or expense management practice in place.

That’s been the traditional approach by organizations, this two-tiered approach to procurement. The issue, however, comes when you have your category managers trying to get involved in spend where it’s not necessarily strategic, but you still want some level of spend management applied to it. So you've got these very seasoned resources focused on categories of spend that aren’t necessarily where they can add the biggest bang for the buck.
It's putting in place a better model to support that type of spend, so your category managers can go off and do what you hired them to do.

That’s what caused this phenomenon around spot buy, or tactical buy: taking this middle ground of spend, which our research shows is about 43 percent of spend on average. More importantly, sometimes more than half of the transactional activity comes through it. So it's about putting in place a better model to support that type of spend, so your category managers can go off and do what you hired them to do.

Gardner: And that 43 percent, does that cut across large companies as well as smaller ones?

Albertson: The 43 percent is an average, and there are going to be variances in that, depending on the industry, spend profile, and scale of the company, as you noted. Companies need to look at their spend, get the spend analytics in place to understand what they're buying to nail down the value proposition around this.

Smaller companies generally aren't going to have the maturity in place in terms of managing their spend. They're not going to have the category-manager capabilities in place. In all likelihood, they could be handling a much higher percentage of their spend through a more transactional nature. So for them, the opportunity might even be greater.

Cycle time

When we think about the reasons for doing spot buying, profitability was one reason, but customer service was the other, and customer service translates into cycle time.

That’s usually the issue with this type of spend. You can’t afford to have a category manager take it through a strategic sourcing process, which can take anywhere from six to 30 weeks.

People need this tomorrow. They need it in a week, and so you need a mechanism in place to focus on shorter cycle times and meet the needs of the customers. If you can’t do that, they're just going to bypass procurement, go do their own thing, and apply no rigor of spend management against that.
If we think about the reasons for doing this, profitability was one, but customer service was the other, and customer service translates into cycle time.

It's a common misperception that the 43 percent of influenced spend we would consider tactical is all emergency buys. A lot of it isn't necessarily emergency buys. It's just that a large percentage of it is more category-specific types of purchases, where companies just don't have the preferred suppliers or the category expertise in place to go out, identify suppliers, and manage that spend. It falls under the standard thresholds that companies might have for sending something through strategic sourcing.

Gardner: Let’s go to some organizations that are grappling with these issues. First, Koozoo. Ian, tell us a little bit about Koozoo and how spot buying plays a role in your life.

Thomson: Koozoo is a technology startup based in San Francisco. We're venture-backed and we've made it very easy to share your view using an existing device. You take an old mobile phone, and we can convert that, using our software application, into a live-stream webcam.

In terms of efficiency, we're like many organizations, but as a start-up, in particular, we're resource constrained. I'm also the procurement manager, as it turns out. It’s not in my job title, but we needed to find something fast. We were launching a product and we needed something to support it.

It wasn’t a catalog item, and it wasn’t something I could find on Amazon. So I looked for some suppliers online and found somebody that could meet our need within two weeks, which was super important, as we were looking at a launch date.

More developed need

I had gone to Alibaba and I looked at what Alibaba’s competitors were. Ariba Discovery came up as one of them. So that’s pretty much how I ran into it.

I think I "spot buyed" Ariba in order to spot buy. I tested Alibaba and, to be fair, it was not a very clean approach. I got a lot of messy inbound input and responses when I asked for what I thought was a relatively simple request.

There were things that weren’t meeting my needs. The communication wasn’t very easy on Alibaba, maybe because of the international nature of the would-be suppliers.

Gardner: Let’s go to Cal Miller at Blue Marble Media. First, Cal, tell us a bit about Blue Marble and why this nature of buying is important for you?

Miller: Blue Marble is a very small company, but we develop high profile video, film, motion graphics, and animation. We came to be involved with Ariba about three years ago. We were selected as a supplier to help them with a marketing project. The relationship grew, and as we learned more about Ariba, someone said, "You guys need to be on the Discovery Network program." We did, and it was a very wise decision, very fortunate.

Gardner: Are you using the spot buying and Discovery as a way of buying goods or allowing others to buy your goods in that spot-buying mode or both?

Miller: Our involvement is almost totally as a seller. In our business, at least half of our clients are in a spot-buy scenario. It’s not something they do every month or even every year. We even have Fortune 500 companies that will say they need to do a series of videos and haven’t done it for three years. So for whoever gets assigned to start that project, it is a spot buy, and we're hopeful that they'll find us and we'll get that opportunity. So spot buying is a real strategy for us and for developing our revenue.

Gardner: You found therefore a channel in Ariba through which people who are in this ad-hoc need to execute quickly, but not with a lot of organization and history to it, can find you. How did that compare to other methods that you would typically use to be found?

Miller: Actually, there is very little comparison. The batting average, if you will, is excellent. The quality of people who are coming out to say, "We would like to meet you" is outstanding. Most generally, it’s a C-level contact. What we find is the interaction allows for a real relationship-development process. So even if we don’t get that particular opportunity, we're secure as one of their shortlisted go-to people, and that’s worth everything.

Gardner: Kurt Albertson, when you listen to both a buyer and a seller, it seems to me that there is a huge untapped potential for organizing and managing spot buying in the market.

Finding new customers

Albertson: Listening to Cal talk about Blue Marble’s experience, certainly from a business development perspective, it’s another tool that I'm sure Cal appreciates in terms of going out and finding new customers.

Listening to Ian talk about it from the buy side is interesting. You have users like Ian who don’t have a mature procurement organization in place, and this is a tool they're using to go out and drive their procurement process.

But then, on the other end of that scale, you do have large global companies as well. As I talked about, these are large global companies that haven’t done a good job of managing what we would consider tactical spend, which again is about 43 percent of what’s influenced.

For them, while they have built out very robust procurement organizations to manage the more strategic spend, it’s this 43 percent of influenced spend that’s sub-optimized. So it’s more of an evolution of their procurement strategy to start putting in place the capabilities to address that chunk of spend that’s been sub-optimized.
There is a very strong business case for going out and putting in place the capabilities to address the spend.

Gardner: Tell us a bit more about your research. Were there any other findings that would benefit us, as we try to understand what spot buying is and why it should be important to more buyers and sellers?

Albertson: The first question that everyone generally tends to ask when trying to build out a new type of capability is what’s the return on that. Why would we do this? We have already talked about the issue of longer cycle times that occur, if you try to manage the spend through a traditional kind of procurement process and the dissatisfaction that causes. But the other option is to just let the requesters do what they want, and you don’t drive any kind of spend management practices around it.

When we look at the numbers, Dana, typically going through a traditional strategic sourcing process with highly skilled category managers, on average you'll drive just over 6 percent savings on that spend. Whereas, if you put in place more of a tactical spot-buy type process, the savings you drive is less: 4.3 percent on average, according to our research.

So there's a little bit of a delta there by putting it through a more formal process. But the important thing is that if you look at the return, you're obviously not spending as much time and you're not having as mature resources and as experienced resources having to support that spend. So the investment is less. The return on investment that you get from a tactical process, as opposed to the more strategic process, is actually higher.
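The return-on-investment point can be made concrete with a worked example. Only the 6 percent and 4.3 percent savings rates come from the research quoted above; the spend figure and the two process costs below are invented purely to show the shape of the comparison.

```python
spend = 1_000_000  # hypothetical addressable spend

strategic_savings = 0.060 * spend   # just over 6% via category managers
tactical_savings  = 0.043 * spend   # 4.3% via a spot-buy process

# Invented process costs: the strategic route ties up seasoned category
# managers for a 6-30 week cycle; the tactical route is far lighter.
strategic_cost = 20_000
tactical_cost  = 5_000

print(strategic_savings / strategic_cost)  # 3.0  (return per dollar invested)
print(tactical_savings / tactical_cost)    # 8.6
```

The tactical route saves fewer dollars in absolute terms, but because the investment is so much smaller, the return on each dollar invested comes out higher.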

There is a very strong business case for going out and putting in place the capabilities to address the spend. That’s the question that most organizations will ask -- what is the return on the investment?

Gardner: Are all the procurement providers, service providers jumping on this? Is Ariba in front of the game in any way?

Process challenges

Albertson: There are some challenges with this process, and if you look at Ariba, they evolved from the front end of the sourcing process, built out capabilities to support that, and have a lot of maturity in that space.

The other thing that they have built out is the networked community. If you look at tactical buying and spot buying, both of those are extremely important. First of all, you want a front-end eRFx process that you can quickly enable, so you can quickly go to the market with a standard methodology and standard requirements.

But the other component is that you need a network of a whole bunch of suppliers out there that you can then send that to. That’s where Ariba’s strength lies: they have built out a very large network, the largest out there, for suppliers and buyers to interact.

And that’s really the most significant advantage that Ariba has in this space -- that network of buyers and suppliers, so they can very quickly go out and implement a supplier discovery type of execution and identify particular suppliers.

We may call this tactical spend, but it's still important to the people within the companies who are looking to procure a product or service. There needs to be a level of due diligence against these suppliers, and a level of trust. Compare that to doing a Google search and just finding suppliers out there. The Ariba Network provides that additional level of comfort, trust, and prequalification of suppliers to participate in this process.
For the larger organizations, the bigger bang for the buck for them is going after and getting control over the strategic spend.

You're going to find companies coming at it from both ends. The smaller, less mature organizations from a procurement perspective are going to come at it from a primary buying and sourcing channel, whereas for the larger organizations, the bigger bang for the buck for them is going after and getting control over the strategic spend.
Again, we're in an environment right now, particularly for the larger organizations, where everyone is trying to continue to evolve the value proposition. Strategic category managers are moving into supplier-relationship management and into innovation, asking how they collaborate with suppliers to drive it.
We all know that across the G&A function, including procurement, significant resource investments are not being made. So the only way they're going to be able to do that is to extract themselves from this kind of tactical activity and build out a different type of capability internally, including leveraging solutions like Ariba and the Supplier Discovery capability to help facilitate that buy, so that those category managers can continue to evolve the value they provide to the business.

Cloud model

Gardner: It seems that the cloud model really suits this spot-buying and tactical-buying approach very well. You log on, the network can grow rapidly, and buyers and sellers can participate in this networked economy. Is this something that wouldn’t have happened 5 or 10 years ago, when we only looked at on-premise systems? Is the cloud a factor in why spot buying works now?

Albertson: Obviously, one of the drivers of this is how quickly can you get up to speed and start leveraging the technology and enabling the spot-buy tactical sourcing capabilities that you're building.
One of the drivers of this is how quickly can you get up to speed and start leveraging the technology.

Then on the supply end, one of the driving forces is bringing as many suppliers and participants into this environment as possible. That is going to be one of the key factors that determines success in this area, and certainly a software-as-a-service (SaaS) model works better for accomplishing that than an on-premise model does.
Listen to the podcast. Find it on iTunes. Read a full transcript or download a copy. Sponsor: Ariba, an SAP Company.
