Thursday, April 2, 2009

Amazon's BI-on-the-fly using MapReduce-as-a-service brings huge cloud data crunching to the masses

Amazon's announcement of a cloud-based data mining and analysis service, using the Hadoop implementation of MapReduce, potentially opens advanced business intelligence (BI) activities to many more businesses and organizations. It's an excellent example of just how much cloud computing can change the world.

In essence, the service, Amazon Elastic MapReduce, if it works as advertised, abstracts away the complexity and cost of massively parallel programming and processing so non-computer scientists -- you know, business types -- can examine and query huge data sets.
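For readers curious about what is being abstracted away, here is a minimal single-machine sketch of the MapReduce pattern in Python. It is not Amazon's API or Hadoop code; the function names (map_phase, shuffle, reduce_phase, run_job) and the sample records are made up for illustration, and a real job would spread the map and reduce steps across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all the values for one key into a single total."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a list of text records."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample data, standing in for a much larger data set.
    sample = ["cloud data cloud", "data mining in the cloud"]
    print(run_job(sample))
```

The point of the model is that map_phase and reduce_phase each look at only one record or one key at a time, which is what lets a framework run them in parallel without the author writing any parallel code.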

Think of it as having your own tuned supercomputer that you can plug gigantic data sets into and ask questions that will determine the course of your business for the next decade. Oh, and you can pay for the pleasure on a credit card.

This high-end BI value has pretty much been the sole purview of large, skilled and deep-pocketed enterprises. But there are plenty of people, researchers, government agencies, academics, small to medium enterprises, venture capitalists and the like that would hugely benefit from sussing out important trends and findings from the growing reams of raw data generated by modern businesses and societies. Talk about metadata on steroids! Here's another way to use social networks, folks.

For more on the business implications of MapReduce and advanced BI, take a look at a podcast I recently moderated. For more on the technical aspects of what MapReduce-oriented computing means, there's a second podcast discussion.

Given the intriguing price points Amazon is providing, this service could be a game-changer. It will likely force other cloud providers to follow suit, which will make advanced BI services more available and affordable for more kinds of tasks. I can even imagine communities of similarly interested user parties sharing query formulations and search templates of myriad investigations. A whole third-party BI consulting and services industry could crop up virtually overnight.

It will be interesting to see if Business Intelligence 2.0 types of analysis can also be brought to the service, through third parties or even outright products that leverage the cloud BI services in the background.

Their pitch: We can bring what Google does for the Web to your entire universe of data. For any of your users. Oh, and we can bring other useful and available data sets into the mix, too. And you can afford this. Your executives can figure out how to use it directly. No lab coats required.

Governments and legislators in particular -- which have access to huge stores of publicly financed data -- could significantly drop the cost of providing data and analysis services to the masses. As I understand it, the federal and state governments are a bit better at creating data than leveraging it in near real time. As in, the once-a-decade census data takes almost 10 years to get published. This could help that a lot.

Part of the challenge will be getting to the data and making the largest -- sometimes in the petabyte scale -- sets available to a service like Amazon's. The garbage-in, garbage-out parable does not change. And moving and managing these large sets is not trivial.

What's more, trust remains a hurdle. For sensitive data, the handling and security of the bits need to be managed. But if a sales force trusts its daily grind to Salesforce.com, perhaps other sensitive data too has a place on someone else's cloud fabric.

For those that can get access to good data on matters of importance to them, and perhaps do unique joins against other data sets, this cloud-based BI development could be a boon. Things that were never possible at any price are now doable.

With Amazon's move, the important BI work moves up, away from cost inhibitors and the pain of infrastructure access, to the level of data access, data quality, and query development skills, where it belongs.

Particularly in this economy, taking the risk out of weighty business and market decisions -- at an affordable cost on someone else's cloud fabric -- is a no-brainer.

Sunday, March 29, 2009

HP advises strategic view of virtualization to dramatically cut IT costs, gain efficiency and usher in cloud benefits

Listen to the podcast. Download the podcast. Find it on iTunes/iPod and Podcast.com. Sponsor: Hewlett-Packard.

Read a full transcript of the discussion. Access more HP resources on virtualization.

Virtualization has become imperative to enterprises and service providers as they seek to better manage IT resources, cut total costs, reduce energy use, and improve data center agility.

But virtualization is more than just installing hypervisors. The effects and impacts of virtualization cut across many aspects of IT operations. The complexity of managing virtualization IT runtime environments can easily slip out of control.

A comprehensive level of planning and management, however, can assure a substantive economic return on virtualization investments. The proper goal then is to do virtualization right -- to be able to scale the use of virtualization in terms of numbers of instances elastically while automating management and reducing risks.

To gain the full economic benefits, IT managers also must extend virtualization from hardware to infrastructure, data, and application support -- all with security, control, visibility, and compliance baked in.

What's more, implementing virtualization at the strategic level with best practices ushers in the ability to leverage service oriented architecture (SOA), enjoy data center consolidation, and explore cloud computing benefits.

To learn more about how virtualization can be adopted rapidly with low risk using sufficient governance, I recently interviewed Bob Meyer, the worldwide virtualization lead in HP's Technology Solutions Group.

Here are some excerpts:
For the last couple of years, people have realized the value of virtualization in terms of how it can help consolidate servers, or how it can help do such things as backup and recovery faster. But, now with the economy taking a turn for the worse, anyone who was on the fence, who wasn’t sure, who didn’t have a lot of experience with it, is now rushing headlong into virtualization.

They realize that it touches so many areas of their IT budget, it just seems to be a logical thing to do in order for them to survive these economic times and come out a leaner, more efficient IT organization. ... It’s gone to virtualization everywhere, for everything -- "How much can I put in and how fast can I put it in." ... Everybody will have a mix of virtual and physical environments.

We're not just talking about virtualization of servers. We're talking about virtualizing your infrastructure -- servers, storage, network, and even clients on the desktop. People talk about going headlong into virtualization. It has the potential to change everything within IT and the way IT provides services.

Throughout the data center, virtualization is one of those key technologies that help you get to that next generation of the consolidated data center. If you just look at it from a consolidation standpoint, a couple of years ago, people were happy to be consolidating five servers into one or six servers into one. When you get this right, do it on the right hardware with the right services setup, 32 to 1 is not uncommon -- a 32-to-1 consolidation rate.

Yet the business can be affected negatively, if the virtualized infrastructure is managed incompletely or managed outside the norms that you have set up for best practices. One of the blessings of virtualization is its speed. That’s also a curse in this case, because in traditional IT environments, you set up things like a change advisory board and, if you did a change to a server, if you moved it, if you had to move to a new network segment, or if you had to change storage, you would put it through a change advisory board. There were procedures and processes that people followed and received approvals.

In virtualization, because it’s so easy to move things around and it can be done so quickly, the tendency is for people to say, "Okay, I'm going to ignore that best practice, that governance, and I am going to just do what I do best, which is move the server around quickly and move the storage around." That’s starting to cause all sorts of IT issues.

Initial virtualization projects probably get handled with improper procedures. ... Just putting a hypervisor on a machine doesn’t necessarily get you virtualization returns.

You have to start asking, "Do I have the right solutions in place from an infrastructure perspective, from a management perspective, and from a process perspective to accommodate both environments?"

The danger is having parallel management structures within IT [with a separate one for virtualized resources]. It does no one any good. If you look at it as a means to an end, which virtualization is, the end of all this is more agile and cost-effective services and more agile and cost-effective use of infrastructure.

Virtualization really does touch everything that you do, and that everything is not just from a hardware perspective. It not only touches the server itself or the links between the server, the storage, and the network, but it also touches the management infrastructure and the client infrastructure.

What we intend to do is take that hypervisor and make sure that it's part of a well-managed infrastructure, a well-managed service, well-managed desktops, and bringing virtualization into the IT ecosystem, making it part of your day-to-day management fabric.

The focus right now is, "How does it save me money?" But, the longer-term benefit, the added benefit, is that, at some point the economy will turn better, as it always does. That will allow you to expand your services and really look at some of the newer ways to offer services. We mentioned cloud computing before. It will be about coming out of this downturn more agile, more adaptable, and more optimized.

No matter where your services are going -- whether you're going to look at cloud computing or enacting SOA now or in the near future -- virtualization has that longer term benefit of saying, "It helps me now, but it really sets me up for success later."

We fundamentally believe, and CIOs have told us a number of times, that virtualization will set them up for long-term success. They believe it’s one of those fundamental technologies that will separate their company as winners going into any economic upturn.