Wednesday 28 April 2010

The reducing cost of data warehousing

This post in many ways relates to the mission for edexe: performance, price and productivity.

As a result of recent activities, I am stunned by how much the cost of data warehousing has fallen in the last couple of years.

MPP (massively parallel processing) technologies have fallen in price on both the ETL/DI front and the database front, and database automation tools are becoming more popular as a result.

Competition is hot with regard to MPP in the database/appliance market. Teradata and Netezza have been the dominant players for the last 5-6 years; however, a host of new appliances, cluster-based databases and columnar databases have hit the market in the last 2-3 years. The increased competition from the likes of Oracle Exadata, HP Neoview, Aster, Greenplum, Kognitio, Kickfire and Vertica is rapidly bringing the price of MPP database processing down to below £100k for entry-level databases.

This is bringing the MPP database well within the reach of mid-tier companies, enabling enterprise performance on a budget.

On the data integration front, expressor’s parallel processing engine delivers speeds that match or even beat those of the most established DI vendors such as Ab Initio, Informatica and DataStage, yet remains priced below £50k for an entry-level DI product. Talend also deliver an MPP option to the market with the MPx version of their Integration suite.

Again, high-performance DI/ETL products are now available for the mid-tier company.

So I’ve discussed price and performance; what about productivity?

Well, the database products deliver significant productivity benefits over the traditional OLAP databases such as Oracle and SQL Server. The MPP database standard was set by Netezza, delivering MPP performance with minimal DBA activity. Even in large organisations, Netezza DBAs may spend only 1 day a week maintaining the system. Within the MPP space this is pretty much standard, with self-organising databases being the norm.

Between DI tools, there is little to choose in terms of productivity. Compared with SQL or other hand-cranked code, however, the productivity gains are huge (anywhere from a 50-80% reduction in coding time).

Staying with productivity, one tool that has really impressed me is BIReady. It is a database automation solution that really does deliver on productivity, on two main fronts. First, changes to the data model do not necessarily require changes to the data structures, since data is automatically organised in a model-independent normalised schema. Second, key assignments are managed within BIReady, so they need not be maintained by the ETL solution. This is a significant productivity gain: it reduces DBA activity (as the MPP databases do), and it simplifies the ETL process and shortens development times by taking away the need for key management. What’s more, BIReady pricing also fits comfortably into mid-tier budgets.
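To give a feel for what "key management" means when the ETL layer has to do it by hand, here is a minimal Python sketch of surrogate key assignment. It is only an illustration of the general technique, not how BIReady works internally; the class name `SurrogateKeyMap` and the sample customer codes are made up for the example.

```python
from itertools import count


class SurrogateKeyMap:
    """Hypothetical helper: maps business (natural) keys to surrogate keys."""

    def __init__(self):
        self._keys = {}            # natural key -> surrogate key
        self._next_key = count(1)  # surrogate keys start at 1

    def key_for(self, natural_key):
        # Reuse the existing surrogate key, or assign the next free one.
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._next_key)
        return self._keys[natural_key]


# Illustrative source rows: the same customer code appears twice.
customers = SurrogateKeyMap()
rows = ["CUST-A", "CUST-B", "CUST-A"]
assigned = [customers.key_for(code) for code in rows]
print(assigned)  # [1, 2, 1]
```

In a hand-built warehouse, a lookup like this has to be written, persisted and kept consistent for every dimension in every load job; that is the work a database automation tool takes off the ETL developer.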

So there we have it: price, performance and productivity. It is now possible to purchase a low-maintenance, high-performance, end-to-end MPP warehousing technology for under £300k. The nature of this beast also means that the effort of delivering and maintaining the solution is reduced.

High-performance data warehousing is finally within reach of mid-market companies.

1 comment:

  1. well said to work and get in lined with the price and performance of data warehousing...get more insight in exploring a data warehouse at:

    http://datawarehousing-services.blogspot.com/2010/04/exploring-datawarehouse.html