Thinking Radically about Data Warehousing and Big Data: Interview with Roger Gaskell, CTO of Kognitio

Managing data, particularly in large volumes, is and will probably remain the number one priority for many organizations in the coming years. Many of the traditional ways to store and analyze large amounts of data are being replaced with new technologies and methodologies that can handle the new volume, complexity, and analysis requirements. These include new approaches to data warehousing, the emergence of big-data providers that store and analyze very large data sets, and new hardware and software technologies that speed up data analysis.

One of the important players in this fiercely competitive space is Kognitio, a company founded in 2005 with headquarters in Bracknell, United Kingdom, and offices in the United States in Chicago, New York, and Minneapolis.

Kognitio is a provider of data warehouse and business intelligence (BI) systems that incorporate in-memory processing and massively parallel processing (MPP), as well as an interesting virtual online analytical processing (OLAP) cube solution.

We had the pleasure of interviewing Roger Gaskell, chief technical officer (CTO) of Kognitio, and got some interesting insights into Kognitio’s systems as well as the BI and data warehouse space in general.


Roger Gaskell is responsible for all the product development that goes on at Kognitio. Prior to Kognitio, Mr. Gaskell worked as a test development manager at AB Electronics, primarily for the development and testing of the first mass production of IBM personal computers.

1.    Hello, Mr. Gaskell. Could you give us a brief introduction to Kognitio and the products the company offers?
Certainly. Kognitio is a technology software vendor that develops a high-performance in-memory analytic database called WX2. The relational database allows organizations to mine and analyze large and complex data sets, commonly known as Big Data, to unearth insights and intelligence. From large corporations to smaller businesses, companies of all shapes and sizes run WX2 to analyze their data quickly and efficiently, allowing them to make better, more informed business decisions that help them drive growth and reduce costs.

Although we offer WX2 as an on-premises solution, either as a software-only license or as a fully configured data warehouse appliance, Kognitio also allows companies to get the analytic insight they need for a low-cost monthly fee via the Kognitio Cloud, where we run our Data Warehousing as a Service offering. The Kognitio Cloud is a secure “private cloud” solution that offers end users maximum flexibility and a low cost of entry into data analytics and warehousing.

2.    How did you start working in data warehousing and BI?
I have been active in the data warehousing space for a number of years, having started my career at AB Electronics and then at Whitecross Systems. I have been instrumental in all generations of the WX2 analytic database to date, evolving it from a database application that ran on proprietary hardware to a software-only analytic database that runs on industry-standard servers. In my career, I have been lucky enough to talk to a large number of companies about the benefits of high-performance analytics and have seen a wide range of applications that have been built on top of the database.

3.    What are, in your view, the two most important advances to date in the data warehouse and BI space?
The number one advance to date is massively parallel processing, or MPP. Without it we could not possibly cope with the ever-increasing data volumes and analytic complexity that business applications command. While Moore’s Law, which dictates that processing power doubles every two years, is still valid for most applications, it does not ring true in the data warehousing space, where data volumes double every year.

There are a number of advances that compete for number two. These would include: data warehouse appliances, interoperability standards, in-memory analytic processing, and cloud computing. However, if I had to choose only one, I would currently opt for Hadoop, which seems to be taking over some parts of the role of the enterprise data warehouse (EDW).
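The arithmetic behind the first point can be made concrete. The following is a minimal sketch that takes the two doubling periods quoted in the answer at face value (processing power every two years, data volumes every year); the function names are illustrative only and do not come from Kognitio.

```python
def growth_factor(years, doubling_period_years):
    """Multiplicative growth after `years`, given a doubling period."""
    return 2 ** (years / doubling_period_years)


def data_to_compute_gap(years, compute_doubling=2.0, data_doubling=1.0):
    """How much faster data has grown than per-processor power.

    The defaults are the figures quoted in the interview, not measurements.
    """
    return growth_factor(years, data_doubling) / growth_factor(years, compute_doubling)


for years in (2, 4, 10):
    gap = data_to_compute_gap(years)
    print(f"after {years} years, data has outgrown compute by a factor of {gap:g}")
```

Under these assumptions the gap itself doubles every two years, which is the shortfall that adding processors in parallel is meant to cover.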

4.    What are the main features of Kognitio’s WX2 appliance?
The Kognitio appliance is architected to allow users to analyze data, not just store it. The more complex the analytics, the better. For this reason, the Kognitio appliance has more central processing unit (CPU) cores per unit of data than any other data warehouse appliance on the market. Typically, our appliances ship with one CPU core per 20GB of data.

The Kognitio appliance delivers this enormous power using industry-standard commodity servers; there is no proprietary hardware involved. We package the hardware and software as a single entity, which represents a completely optimized package of software, servers, and storage.

The combination of WX2 and industry-standard servers (blade or rackmount) is purpose-built to handle analysis against terabytes of data quickly and simply. As a result, the appliances deliver extreme performance and scalability for all database applications, including data warehousing, BI, and OLAP.

We currently offer three varieties of the appliance: Rapids, Rivers, and Lakes. Each variety represents a certain blend of disk and RAM to offer the right performance and storage levels.

5.    Pablo is an interesting solution for doing OLAP-based analysis. What is the difference between Pablo and other OLAP applications on the market?
Pablo truly represents a new vision for creating and supporting OLAP environments and removes the headache of physical cube proliferation. As a feature of the WX2 analytic database, Pablo is a perfect solution for organizations that have made a major commitment to OLAP-based analysis, as it allows them to use OLAP visualization tools and get OLAP class performance but without the need to create and populate physical OLAP cubes. The underlying data is held as low-level transactional data in standard relational structures, but to the application layer it appears as a “virtual cube.”

By virtualizing the cube structures, organizations can still get a holistic view of their data but with zero latency, given that large virtual cubes can be created within just seconds. Moreover, users can benefit from one single version of the truth: instead of contending with multiple physical cubes that represent data snapshots that age as soon as they are created, Pablo allows users to access the same underlying data volumes. Most importantly, analysts can get on with their analyses instead of getting bogged down with the overhead of cube creation, saving the company time and money.
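The general idea of a virtual cube, as described in this answer, can be sketched in a few lines. This is only an illustration of aggregating low-level rows on demand instead of storing precomputed cubes; it is not Pablo's implementation, which the interview does not detail, and the `SALES` rows and `virtual_cube` function are hypothetical.

```python
from collections import defaultdict

# Hypothetical low-level transactional rows held in a relational-style structure.
SALES = [
    {"region": "UK", "product": "A", "year": 2011, "amount": 100.0},
    {"region": "UK", "product": "B", "year": 2011, "amount": 50.0},
    {"region": "US", "product": "A", "year": 2011, "amount": 70.0},
    {"region": "US", "product": "A", "year": 2012, "amount": 30.0},
]


def virtual_cube(rows, dimensions, measure):
    """Aggregate `measure` over the requested dimensions at query time.

    Nothing is materialized in advance, so every call sees the current rows.
    """
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)


# Two different "cubes" served from the same underlying rows.
by_region = virtual_cube(SALES, ["region"], "amount")
by_region_year = virtual_cube(SALES, ["region", "year"], "amount")
print(by_region)
print(by_region_year)
```

Because each view is computed from the same rows when it is requested, there is no snapshot to go stale and no separate cube to rebuild when the data changes. The cost moves to query time, which is why such an approach depends on a fast underlying engine.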

6.    What is your take on the evolution of the data warehouse? What are the main challenges companies still have to face in doing big data management and analysis?
Hadoop seems to be becoming the de facto technology for “Big Data” processing. While the advantages may be clear (no license fees, open source, etc.), the barriers to entry can be large. Organizations considering Hadoop must realize that implementing a Hadoop environment and maintaining its operation calls for considerable engineering skills and resources.

Despite the need for this large engineering investment, many organizations are still willing to take the plunge. However, the organizations that are more “Hadoop mature” have realized the critical point with the platform: although Hadoop can be engineered to do a good job of processing data, storing it, and providing routine reporting, the solution cannot be used as an ad hoc, interactive analytic platform. This is because Hadoop is very slow and lacks sophisticated visualization tools. Those more savvy organizations are starting to combine Hadoop processing and storage clusters with purpose-built, fast analytic platforms such as in-memory databases.

7.    Kognitio provides cloud-based data warehousing services. How do you view the adoption of the cloud for data warehousing? And what’s ahead for Kognitio? What opportunities and challenges do you foresee?
Cloud-based data warehousing is emerging as a deployment model for large-scale analytics, and Kognitio has always offered this, helping businesses to solve several issues that are commonplace today, most notably the inability to prove business value from data warehousing and analytics projects before spending large capital sums. By running analytics in the cloud, there is no overhead of implementing a data warehouse onsite, and there is no need for an information technology (IT) department to service, maintain, and support the database. Instead, organizations can focus on meeting their business goals, and not get bogged down with infrastructure issues.

The challenge is always getting access to the data in the underlying source systems and deciding the type of cloud model that best suits the company. Is it ok to run analytic projects in the public cloud or would it be better to run in a private, trusted, or exclusive cloud? These are questions that need to be addressed by any company before opting for the cloud-based model. But there are certainly a number of opportunities here for end-user organizations to do more business-focused work and drive their operations instead of getting stuck in the weeds with bits and bytes.

8.    In-memory technologies are becoming mainstream. What is the advantage of Kognitio over other providers?
Kognitio has always been a proponent of in-memory analytics, and WX2 has always allowed data to be resident in memory for faster analytics. However, in-memory is only part of the story. Having data memory-resident allows very fast access to the data, but turning this into scalable performance that satisfies complex analytic workloads on large data volumes requires the ability to use huge amounts of scalable processing power.

WX2 is truly massively parallel. This means that every core on every processor in every server works simultaneously on every query. So, if you increase the number of servers, WX2 will automatically reconfigure itself in order to utilize and maximize the usage of every new core. It is through this intelligent use of RAM and CPU power that WX2 offers unprecedented speeds to access and query complex data sets.

So, as the data explosion continues, organizations count on Kognitio and its in-memory analytic database solution to get comprehensive answers to their questions. Where other solutions cannot cope with overwhelming data volumes and force the user to rely on data samples, Kognitio has helped numerous customers survive the big data phenomenon—analyze whatever, whenever—and get answers to complex questions in sub-second response times. It is the in-memory and MPP feature set of the database that drives this performance.
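The scatter-and-gather shape behind "every core works on every query" can be sketched as follows. This is a simplified illustration under stated assumptions, not WX2's implementation: the function names are hypothetical, the data lives in a plain Python list, and a thread pool stands in for the cores of an MPP cluster (CPython threads do not give true CPU parallelism, so this shows the structure, not the speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def partition(values, n_workers):
    """Spread the in-memory data evenly across the workers."""
    return [values[i::n_workers] for i in range(n_workers)]


def partial_sum_and_count(chunk):
    """Work done independently by one worker on its own slice."""
    return sum(chunk), len(chunk)


def parallel_mean(values, n_workers=4):
    """Answer a single query by having every worker scan its slice."""
    if not values:
        raise ValueError("cannot take the mean of an empty data set")
    chunks = partition(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_sum_and_count, chunks))
    # Combine the partial results into the final answer.
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


data = list(range(1, 101))
print(parallel_mean(data, n_workers=4))
print(parallel_mean(data, n_workers=8))
```

Changing `n_workers` repartitions the data and leaves the answer unchanged, which mirrors the claim that adding servers spreads the same query over more cores.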

9.   How do you imagine the BI space in 10 years?
I foresee an interesting landscape developing over the next decade, one in which large Hadoop clusters process and store every possible piece of data that an organization can collect, and where processed data sets are then sent to very high-performance in-memory databases to meet the demands of complex train-of-thought analyses.

10.   What is your favorite sport or sports team?
I am a passionate skier and travel every year to both North America and France to do my skiing. My other passion is soccer, having been a Manchester United supporter and fan all my life.

I welcome your thoughts—please leave a comment below, and I’ll respond as soon as I can.