Monday, September 29, 2008

Greenplum pushes envelope with MapReduce and parallelism enhancements to its extreme-scale data offering

Greenplum has delivered on its promise to wrap MapReduce into the newest version of its data solutions. The announcement from the data warehousing and analytics supplier comes to a fast-changing landscape, given last week's HP-Oracle Exadata announcements.

It seems that data infrastructure vendors are rushing to the realization that older database architectures have hit a wall in terms of scale and performance. The general solution favors exploiting parallelism to the hilt and aligning database and logic functions in close proximity, while also exploiting MapReduce approaches to provide super-scale data delivery and analytics performance.

Greenplum's Database 3.2 takes on all three, but makes significant headway in embedding the MapReduce parallel-processing data-analysis technique pioneered by Google. The capability is accompanied by new tooling to extend the reach of the technology. The result is Web-scale analytics and performance for enterprises and carriers -- or cloud compute data models for the masses. [Disclosure: Greenplum is a sponsor of BriefingsDirect podcasts.]

The newest offering from the San Mateo, Calif.-based Greenplum provides users new capabilities for analytics, as well as in-database compression, and programmable parallel analytic tools.

With the new functionality, users can combine SQL queries and MapReduce programs into unified tasks executed in parallel across thousands of cores. The in-database compression, Greenplum says, can increase performance and reduce storage requirements dramatically.
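To make the MapReduce idea concrete, here is a minimal sketch in plain Python. It is not Greenplum's API; the partition data, function names, and thread pool are all assumptions made for illustration. Each partition is mapped independently (the part that can run in parallel), the emitted pairs are grouped by key, and each group is reduced to a total.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample data: (url, bytes_served) rows split into two partitions,
# standing in for data spread across separate database segments.
PARTITIONS = [
    [("/home", 120), ("/cart", 300), ("/home", 80)],
    [("/cart", 50), ("/home", 100), ("/help", 10)],
]


def map_partition(rows):
    """Map step: emit (key, value) pairs from one partition."""
    return [(url, nbytes) for url, nbytes in rows]


def shuffle(mapped_partitions):
    """Shuffle step: group every emitted value under its key."""
    groups = defaultdict(list)
    for pairs in mapped_partitions:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(key, values):
    """Reduce step: collapse one key's values into a single total."""
    return key, sum(values)


def run_job(partitions):
    # Partitions are mapped independently, so the map step can run concurrently.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_partition, partitions))
    groups = shuffle(mapped)
    return dict(reduce_group(key, values) for key, values in groups.items())


totals = run_job(PARTITIONS)
print(totals)
```

The result matches what a SQL query of the form SELECT url, SUM(bytes_served) ... GROUP BY url would return over the same rows, which hints at why the two models can be combined into one parallel task.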

The programmable analytics allow mathematicians and statisticians to use the statistical language R or build custom functions using linear algebra and machine learning primitives and run them in parallel directly against the database.

Greenplum's massively parallel, shared-nothing architecture fully utilizes each core, with linear scalability to thousands of processors. This means that Greenplum's open source-powered database software can scale to support the demands of petabyte data warehousing. The company's standards-based approach enables customers to build high-performance data warehousing systems on low-cost commodity hardware.

Database 3.2 offers a new GUI and infrastructure for monitoring database performance and usage. These gather, store, and present comprehensive details about database usage and the internals of current and historical queries, down to the iterator level, making them well suited to profiling queries and managing system utilization.

Now that HP and Oracle have taken the plunge and integrated hardware and software, we can expect that other hardware makers will be seeking software partners. Obviously IBM has DB2, Sun Microsystems has MySQL, but Dell, Hitachi, EDS and a slew of other hardware and storage providers may need to respond to the HP-Oracle challenge.

On Greenplum's blog, Ben Werther, director, Professional Services & Product Management at Greenplum, says: "Oracle has been getting beat badly in the high-end warehousing space ... Once you cut through the marketing, this is really about swapping out EMC storage for HP commodity gear, taking money from EMC's pocket and putting it in Oracle's."

It will also be interesting to watch how the new bedfellows are evaluated, from Microsoft/DatAllegro to what happens with Ingres, and whether Sun with MySQL can enter this higher-end data performance echelon. This could mean that players like Greenplum and Aster Data Systems get some calling cards from a variety of suitors. The Sun-Greenplum match-up makes sense at a variety of levels.

Stay tuned. This market is clearly heating up.

Thursday, September 25, 2008

Interview: From OpenWorld, HP's John Santaferraro on latest BI Modernization strategies

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Hewlett-Packard.

Read a full transcript of the discussion.

Leading up to HP and Oracle's blockbuster announcement Sept. 24 of record-breaking data warehouse appliance performance, the business value of these infrastructure breakthroughs was the topic of a BriefingsDirect interview with John Santaferraro, director of marketing for HP's Business Intelligence Portfolio.

Now that the optimized hardware and software are available to produce the means to analyze and query huge data sets in near real-time, the focus moves to how to best leverage these capabilities. Soon, business executives will have among the most powerful IT tools ever developed at their disposal to deeply and widely analyze vast seas of data and content in near real time to help them run their business better, and to steer clear of risks.

Think of it as business intelligence (BI) on steroids.

At the Oracle OpenWorld unveiling, HP Chairman and CEO Mark Hurd called the new HP Oracle Database Machine a “data warehouse appliance.” It leverages the architecture improvements in the Exadata Programmable Storage Server, but at a much larger scale and with other optimization benefits.

The 10x to 72x performance improvements cited by Oracle Chairman and CEO Larry Ellison are due to bringing the “intelligence” closer to the data -- that is, bringing the Exadata Programmable Storage Server appliance into close proximity to the Oracle database servers, and then connecting them through InfiniBand connections. In essence, this architecture mimics some of the performance value created by cloud computing environments like Google, with its MapReduce technology.

To better understand how such technologies fit into the Oracle-HP alliance, with an emphasis on professional services and methodologies, I asked HP's Santaferraro about how BI is changing and how enterprises can best take advantage of such new and productive concepts as "operational BI" and "BI Modernization."

The Santaferraro interview, moderated by yours truly from San Francisco, comes as part of a series of discussions with IT executives I’ll be doing this week from the Oracle OpenWorld conference. See the full list of podcasts and interviews.

Wednesday, September 24, 2008

HP and Oracle team up on 'data warehouse appliances' that re-architect database-storage landscape

Oracle CEO Larry Ellison today introduced the company's first hardware products, a joint effort with Hewlett-Packard, to re-architect large database and storage configurations and gain whopping data warehouse and business intelligence performance improvements from the largest data sets.

The Exadata Programmable Storage Server appliance and the HP Oracle Database Machine, a black and red refrigerator-size full database, storage and network data center on wheels, made their debut at the Oracle OpenWorld conference in San Francisco. Ellison called the Machine the fastest database in the world.

HP Chairman and CEO Mark Hurd called the HP Oracle Database Machine a "data warehouse appliance." It leverages the architecture improvements in the Exadata Programmable Storage Server, but at a much larger scale and with other optimization benefits. [Disclosure: HP is a sponsor of BriefingsDirect podcasts.]

The hardware-software tag team also means Oracle is shifting its relationships with storage array vendors, including EMC, Netezza, NetApp and Teradata. The disk array market has been hot, but the HP-Oracle appliance may upset the high end of the market, and then bring the price-performance story down market, across more platforms.

I think we can safely say that HP is a preferred Oracle storage partner, and that Oracle wants, along with HP, some of those high-growth storage market profits for their own. There's no reason to not expect a wider portfolio of Exadata appliances and more configurations like the HP Oracle Database Machine to suit a variety of market segments.

"We needed radical new thinking to deliver high performance," said Ellison of the new hardware configurations, comparing the effort to the innovative design for his controversial America's Cup boat. "We need much more performance out of databases than what we get."

This barnburner announcement may also mark a market shift to combined and optimized forklift data warehouses, forcing the other storage suppliers to find database partners. IBM will no doubt have to respond as well.

The 10x to 72x performance improvements cited by Ellison are due to bringing the "intelligence" closer to the data -- that is, bringing the Exadata Programmable Storage Server appliance into close proximity to the Oracle database servers, and then connecting them through InfiniBand connections. In essence, this architecture mimics some of the performance value created by cloud computing environments like Google, with its MapReduce technology.

Ellison said that rather than large data sets moving between storage and database servers, which can slow up performance at 1TB and larger databases, the new Exadata-driven configuration moves only the query information across the networks. The current versions of these optimized boxes use Intel dual-core technology, but they will soon also be fired up by six-way Intel multi-core processors.
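The effect Ellison describes, filtering at the storage tier so that less data crosses the network, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the StorageNode class, the sample rows, and the predicate are all hypothetical and do not represent Exadata's actual interfaces.

```python
from dataclasses import dataclass


@dataclass
class StorageNode:
    """Hypothetical storage node holding a list of row dictionaries."""
    rows: list

    def scan_all(self):
        """Naive approach: ship every row to the database server."""
        return list(self.rows)

    def scan_filtered(self, predicate):
        """Pushed-down approach: filter at the node, ship only matching rows."""
        return [row for row in self.rows if predicate(row)]


def make_nodes():
    # Two nodes with 1,000 made-up sales rows each.
    return [
        StorageNode([{"region": "west", "sales": s} for s in range(1000)]),
        StorageNode([{"region": "east", "sales": s} for s in range(1000)]),
    ]


def big_west_sale(row):
    """Example query predicate: large sales in the west region."""
    return row["region"] == "west" and row["sales"] >= 990


nodes = make_nodes()
shipped_naive = sum(len(node.scan_all()) for node in nodes)
shipped_pushed = sum(len(node.scan_filtered(big_west_sale)) for node in nodes)
print(shipped_naive, shipped_pushed)
```

In this toy setup the naive scan moves all 2,000 rows to the database server, while the pushed-down scan moves only the 10 rows that satisfy the query.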

Talk about speeds and feeds .... But the market driver in these moves is massive data sets that need to be producing near real-time analytics paybacks. We're seeing more and more data, and varying kinds of data, brought into data warehouses and being banged on by queries from applications and BI servers on behalf of a variety of business users across the enterprise.

HP and Oracle share some 150,000 joint customers worldwide, said HP Executive Vice President, Technology Solutions Group Ann Livermore. That means that these database boxes will have an army of sales and support personnel. HP will support the Machine hardware, Oracle the software. Both will sell it.

Hey, you, get onto my cloud!

We're very early in the private cloud business -- which is precisely why such large and influential vendors as Oracle, Intel, HP, VMware, Citrix and Red Hat are jumping into the market with initiatives and pledges for standards and support. We're seeing some whoppers here at Oracle OpenWorld, from Oracle, Intel and HP in particular.

Why? The early birds that can establish de facto standards on data portability and resources governance -- minding the boundaries between the private and public clouds and their digital condensates -- will be in a position to define the next abstraction of meta operating system (for lack of a better term).

In just the last two weeks, VMware, Citrix and now Oracle have pledged to come to market with the infrastructure that enterprises and service providers alike will want. The cloud wanters need cloud makers, the picks and shovels, to build out on the vision of next-generation data center fabrics -- of dynamic resource pools of infrastructure, platform, data applications and management services.

How these services are supported, and how they are managed to inter-relate with each other and the services-abstracted older IT assets, forms the new uber platform -- the new target through which to attract developers, architects, partners and users -- lots and lots of users all feeding off of huge clouds of dynamic, low-cost services.

Yes, a market critical-mass cloud platform standard implementation could create yet a new way to lock in huge multi-billion-dollar markets to ... need. To need, and to want, and to buy, and to have a heck of a hard time stopping that needing. The picks and shovels. The lock-in, the black hole-pull of the infrastructure, hard to resist, and then ... impossible.

Such a prize! And just like in the past, the crass business interests side of the vendors will want to own, dominate and lock in to their proprietary platform implementations. Opposing forces, also inside the same vendors, will opine on the need (correctly) for openness and standards to provide the real value the users and ecology players demand. The new lock-in, they will say (correctly), is not technical but in terms of convenience, simplicity, power, and cost. Seduce them, don't force them, might be the mantra.

So seduce or lock-in, early-days private cloud platform definitions require the best management of two sets of boundaries. The first properly falls between the public-facing clouds and the nascent "private" or on-premises or enterprise clouds. The pay-off comes not just from operating efficiencies but from how well the services generated from either type of cloud can interoperate and play well in supporting extended enterprise and B2C processes.

This need to cross boundaries well will also prompt the handful of public cloud providers (Amazon, Google, Yahoo, Microsoft, Apple, etc.) to embrace sufficient levels of standards-based interoperability. Think of it as mass markets balancing interests ... like globalization ... where economics more than proprietary technologies wins the day.

The second boundary to be defined properly is between the legacy systems, SOAs, business applications and middleware -- and the private cloud fabrics that will increasingly be where new applications/services are "natively" deployed, and where the integrations to the old stuff occur. We can't really have two kinds of clouds -- one for IT and one for consumers. There needs to be one cloud that suits all of the digital universe, within certain (as yet undefined) parameters. They really need to get this boundary right so that B2E and B2B is also B2C.

Clouds will, of course, be highly virtualized, and so they will be able to support many of the older proprietary and standards-based IT systems and development environments. But why virtualize the new stuff, too? Why have B2E/B2B old and separately B2B/B2C new? We should want one cloud approach that newer apps and services can target directly, and then virtualize all the older stuff.

The question then is what constitutes the new "native" platform that is of, for, and by the standard cloud. If there is a fairly well-defined, standards-based approach to cloud computing that manages all these boundaries -- between public and private, between the old and the new of IT -- and which can serve as the target for all the new apps, services, data abstractions, modeling tools, workflow/policy/governance/ESBs and development needs -- well that's a business worth shooting for.

Who cares how the lock-in occurs? This is the next $100 billion company business. In other words, getting this right is a very big deal. The time is nigh for defining IT for at least a decade, maybe longer.

But like the Unix wars of old (and the app server wars of not-so-old) there will be jockeying for cloud implementation supremacy, brinkmanship over whose this or that is better, and the high-stakes race for who gets the definitions of the boundaries correct best for the users, developers, channel, and partners. Who can woo the best?

What is different this time, in cloud time, is that there are few players that can play this game, less of a channel to be concerned about, and fewer developer communities to woo. Far more than in the past, developers can use the tools and frameworks of their choice, and the clouds will support them. Users also have new choices -- not between a Mac and a PC, between Unix and x86, between Java and .Net, between Linux and Windows -- but between cloud ecologies of vast services providers. The better the bundle of services (and therefore interop and cooperation), the better the customer attraction and loyalty. The seduction, the lock in, comes from packaging and execution on the services delivery.

More important than in past vendor sporting events, the business model rules. The cloud model that wins is the "preferred cloud model" that gives IT shops in enterprises high performance at manageable complexity and dramatically lower total costs. That same "preferred" cloud attracts the platform as a service developer crowd, allows mashups galore, allows for pay-as-you-use subscription fees. Viral adoption on a global scale. Oh, and the winning cloud also best plays out the subsidy from online advertising in all its forms and permutations.

Yes, we can expect several fruitful years of jockeying among the major vendors, the rain makers for the cloud providers -- and see gathering clouds of alliances among some, and against others. We're only seeing the very beginning of the next chapter of IT in the last few weeks of IT vendor news.

The cloud wars, however, won't be won on technical merits alone; it will be a real beauty pageant too. It will be more of a seduction and an election, less of a sleight of hand and leveraging of incumbency ... and that will be a big switch.

From OpenWorld, Oracle and HP align forces to modernize legacy apps and spur IT transformation

Listen to the podcast. Download the podcast. Find it on iTunes/iPod. Learn more. Sponsor: Hewlett-Packard.

Read a full transcript of the discussion.

The avenues to IT transformation are many, but the end result must include modernization of data, applications, systems, and operational best practices. It's no surprise then that the partnership of Oracle and Hewlett-Packard gained new ground Sept. 24 at Oracle OpenWorld in San Francisco.

The companies are providing products and services that holistically support the many required variables to successfully implement IT transformation. HP hardware and storage systems have been tuned to support Oracle databases, applications and software infrastructure for many years, and the partnership continues to expand in the age of SOA, legacy modernization, and cloud computing.

To learn more about how HP and Oracle will continue to act in concert, especially as enterprises seek the highest data center performance at the lowest cost, BriefingsDirect interviewed Paul Evans, worldwide marketing lead for IT transformation solutions at HP, and Lance Knowlton, vice president for modernization at Oracle. The discussion took place Sept. 23, 2008 at the Oracle OpenWorld conference.

The application modernization and IT transformation interview, moderated by yours truly from San Francisco, comes as part of a series of discussions with IT executives I’ll be doing this week from the Oracle OpenWorld conference. See the full list of podcasts and interviews.