Ask the Experts Question: Do Organizations Really Need a Physical Data Warehouse Structure?

  • Written By: TEC Staff
  • Published: July 22 2009

We recently received a question from one of our readers (through our Ask the Experts page) about QlikView’s approach to data collection.

Reader’s Question
“QlikView says its innovative way of collecting data and not needing a physical data warehouse (DW) structure is the right thing to do in DW/business intelligence (BI) solutions. Can one expect to build a sustainable / scalable corporate data warehouse through such an approach?”

TEC Analyst Response
Here are a few main points to consider:

•    QlikView’s patented Associative Query Logic (AQL) technology is what underpins its in-memory architecture. The technology replaces both relational joins (used for reporting) and the building of dimensional hierarchies for online analytical processing (OLAP). It seems impressive and merits further reading.
•    The company does claim to support a “no data warehouse” model. Data from a number of sources can, in theory, be loaded into memory. Of course, memory requirements grow with the size of the data.
•    Does the company support data quality and data cleansing? It seems to support data cleansing on the fly, but we think this needs some deliberation and testing. To us, this is a vital part of the BI process, and implementations mostly fail because of unreliable data. Data quality “on the fly” is a tricky business.
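To make the associative, “no data warehouse” idea above concrete, here is a minimal sketch in Python using entirely hypothetical data. It is not QlikView’s actual AQL implementation; it only illustrates the principle that when all source tables are held in memory, a selection in one field can propagate to related records without a pre-built join or OLAP cube.

```python
# Hypothetical in-memory "source tables" (not real QlikView structures).
orders = [
    {"order_id": 1, "customer": "Acme", "amount": 120.0},
    {"order_id": 2, "customer": "Beta", "amount": 75.5},
    {"order_id": 3, "customer": "Acme", "amount": 300.0},
]
customers = [
    {"customer": "Acme", "region": "East"},
    {"customer": "Beta", "region": "West"},
]

def select_by_region(region):
    """Propagate a region selection to the associated orders,
    resolving the association at query time rather than through
    a pre-built relational join."""
    selected = {c["customer"] for c in customers if c["region"] == region}
    return [o for o in orders if o["customer"] in selected]

east_orders = select_by_region("East")
total = sum(o["amount"] for o in east_orders)
print(total)  # 420.0
```

The association between tables is resolved at query time from data already in memory, which is the essence of the model the vendor describes.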

To add to the above points, organizations have pushed DW and BI solutions to deliver results from queries over enormous amounts of data in short periods of time. With the increasing use of 64-bit processors and hardware with new scalable, massively parallel processing (MPP) capabilities, it is now possible to store larger amounts of data in memory and take advantage of its processing speed.

In traditional data warehouses, a query goes to the database, reads the information directly from a hard disk, and loads the results into memory. With in-memory tools, all information, or at least a large portion of it, is stored directly in memory, so all query processing takes place in memory. This enables users to manipulate large amounts of data very quickly.
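The load-once, query-in-memory pattern described above can be sketched as follows. This uses SQLite’s `:memory:` database purely as an illustration (not the engine any of these vendors use): the data set is loaded into RAM in a one-time step, and every subsequent query is served from memory with no disk I/O.

```python
import sqlite3

# An in-memory store: no file on disk backs this database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, amount REAL)")

# One-time load step (in a real tool this would read from the
# source systems or a data warehouse).
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("widget", 10.0), ("gadget", 25.0), ("widget", 15.0)],
)

# From here on, queries run entirely against data held in RAM.
row = conn.execute(
    "SELECT SUM(amount) FROM sales WHERE product = 'widget'"
).fetchone()
print(row[0])  # 25.0
```

The performance argument in the text follows directly: once the load step is paid, repeated analytical queries avoid the disk round-trip entirely.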

Vendors like QlikView and Kognitio use this approach to deliver in-memory BI and data analysis capabilities. Other vendors like SAP, Teradata, and even Microsoft with its Project Gemini, are catching up with in-memory technologies.

We think it’s definitely possible to apply in-memory technologies to BI, and in the near future we will be seeing more and more use of this kind of technology. But data will still need a place to be loaded from (i.e., a data warehouse), and we’re not sure that loading it directly from the operational systems is the way to go. We also think that, like every new technology, it will undergo a maturity process. We will cover this subject in more depth in upcoming blog posts. Please stay tuned.