Certainly. Kognitio is a software vendor that develops a high-performance in-memory analytic database called WX2. This relational database allows organizations to mine and analyze large, complex data sets (commonly known as Big Data) to unearth insights and intelligence. From large corporations to smaller businesses, companies of all shapes and sizes run WX2 to analyze their data quickly and efficiently, allowing them to make better-informed business decisions that drive growth and reduce costs.
Although we offer WX2 as an on-premises solution, either as a software-only license or as a fully configured data warehouse appliance, Kognitio also lets companies get the analytic insight they need for a low monthly fee via the Kognitio Cloud, our Data Warehousing as a Service offering. The Kognitio Cloud is a secure "private cloud" solution that offers end users maximum flexibility and a low cost of entry into data warehousing and analytics.
I have been active in the data warehousing space for a number of years, having started my career at AB Electronics and then at Whitecross Systems. I have been instrumental in all generations of the WX2 analytic database to date, evolving it from a database application that ran on proprietary hardware to a software-only analytic database that runs on industry-standard servers. In my career, I have been lucky enough to talk to a large number of companies about the benefits of high-performance analytics and have seen a wide range of applications that have been built on top of the database.
The number one advance to date is massively parallel processing (MPP). Without it we could not possibly cope with the ever-increasing data volumes and analytic complexity that business applications demand. While Moore's Law, which observes that processing power roughly doubles every two years, still holds for most applications, it cannot keep pace in the data warehousing space, where data volumes double every year.
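The gap those two doubling rates create is easy to quantify. The short sketch below uses illustrative arithmetic only (the doubling periods are the rough figures cited above, not vendor measurements) to show why single-node compute falls further behind data growth every year, motivating the scale-out approach of MPP.

```python
# Illustrative arithmetic (not vendor data): compare single-node compute
# growth (doubling every two years, per Moore's Law) with data volume
# growth (doubling every year, as in data warehousing workloads).
def growth(doubling_period_years: float, years: int) -> float:
    """Growth factor after `years`, given a doubling period."""
    return 2 ** (years / doubling_period_years)

for year in range(0, 11, 2):
    compute = growth(2.0, year)  # Moore's-law-style compute growth
    data = growth(1.0, year)     # data volume doubling yearly
    print(f"year {year:2d}: compute x{compute:6.1f}, "
          f"data x{data:7.1f}, gap x{data / compute:5.1f}")
```

After a decade, data has grown 1024x while a single node's compute has grown only 32x, leaving a 32x gap that can only be closed by adding more nodes in parallel.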
There are a number of advances that compete for number two, including data warehouse appliances, interoperability standards, in-memory analytic processing, and cloud computing. However, if I had to choose only one, I would currently opt for Hadoop, which seems to be taking over some parts of the role of the enterprise data warehouse (EDW).
The Kognitio appliance is architected to allow users to analyze data, not just store it. The more complex the analytics, the better. For this reason, the Kognitio appliance has more central processing unit (CPU) cores per unit of data than any other data warehouse appliance on the market. Typically, our appliances ship with one CPU core per 20GB of data.
The Kognitio appliance delivers this enormous power using industry-standard commodity servers; there is no proprietary hardware involved. We package the hardware and software as a single entity: a completely optimized combination of software, servers, and storage.
The combination of WX2 and industry-standard servers (blade or rackmount) is purpose-built to handle analysis against terabytes of data quickly and simply. As a result, the appliances deliver extreme performance and scalability for all database applications, including data warehousing, business intelligence (BI), and online analytical processing (OLAP).
We currently offer three varieties of the appliance: Rapids, Rivers, and Lakes. Each variety represents a particular blend of disk and RAM to offer the right balance of performance and storage.
Pablo truly represents a new vision for creating and supporting OLAP environments, removing the headache of physical cube proliferation. As a feature of the WX2 analytic database, Pablo is ideal for organizations that have made a major commitment to OLAP-based analysis: it allows them to use OLAP visualization tools and get OLAP-class performance without the need to create and populate physical OLAP cubes. The underlying data is held as low-level transactional data in standard relational structures, but to the application layer it appears as a "virtual cube."
By virtualizing the cube structures, organizations still get a holistic view of their data but with minimal latency, since even large virtual cubes can be created in seconds. Moreover, users benefit from a single version of the truth: instead of contending with multiple physical cubes, each a snapshot that ages as soon as it is created, Pablo gives all users access to the same underlying data. Most importantly, analysts can get on with their analyses instead of getting bogged down in the overhead of cube creation, saving the company time and money.
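The general idea can be sketched with plain SQL. This is a minimal illustration of the "virtual cube" concept, not Pablo's actual (proprietary) mechanism, and the table and column names are hypothetical: the data lives only as transactional rows, while a view exposes an aggregated, cube-like slice that is computed fresh on every query.

```python
import sqlite3

# Sketch of the "virtual cube" idea: data stays in a low-level
# transactional table; a SQL view exposes an aggregated slice on demand.
# (Illustrative only; 'sales' and its columns are hypothetical names.)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, product TEXT, qty INTEGER, price REAL);
    INSERT INTO sales VALUES
        ('EMEA', 'widget', 10, 2.5),
        ('EMEA', 'gadget',  4, 7.0),
        ('APAC', 'widget',  6, 2.5);

    -- The "cube": recomputed per query, so it never goes stale and
    -- there is no physical cube to build or refresh.
    CREATE VIEW sales_cube AS
        SELECT region, product, SUM(qty * price) AS revenue
        FROM sales
        GROUP BY region, product;
""")
for row in conn.execute("SELECT * FROM sales_cube ORDER BY region, product"):
    print(row)
```

Because every query against the view reads the live transactional rows, all users see the same, current numbers, which is the "single version of the truth" property described above.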
Hadoop seems to be becoming the de facto technology for "Big Data" processing. While the advantages may be clear (no license fees, open source, etc.), the barriers to entry can be high. Organizations considering Hadoop must realize that implementing a Hadoop environment and maintaining its operation calls for considerable engineering skills and resources.
Despite the need for this large engineering investment, many organizations are still willing to take the plunge. However, the organizations that are more "Hadoop mature" have realized the critical point with the platform: although Hadoop can be engineered to do a good job of processing data, storing it, and providing routine reporting, it cannot be used as an ad hoc, interactive analytic platform, because its batch-oriented processing is too slow for interactive queries and it lacks sophisticated visualization tools. Those more savvy organizations are starting to combine Hadoop processing and storage clusters with purpose-built, fast analytic platforms such as in-memory databases.
Cloud-based data warehousing is emerging as a deployment model for large-scale analytics, and Kognitio has always offered it, helping businesses solve several issues that are commonplace today, most notably the inability to prove business value from data warehousing and analytics projects before spending large capital sums. By running analytics in the cloud, there is no overhead of implementing a data warehouse on site, and no need for an information technology (IT) department to service, maintain, and support the database. Instead, organizations can focus on meeting their business goals rather than getting bogged down with infrastructure issues.
The challenge is always getting access to the data in the underlying source systems and deciding which type of cloud model best suits the company. Is it acceptable to run analytic projects in the public cloud, or would it be better to run them in a private, trusted, or exclusive cloud? These are questions any company needs to address before opting for the cloud-based model. But there are certainly opportunities here for end-user organizations to do more business-focused work and drive their operations instead of getting stuck in the weeds with bits and bytes.
Kognitio has always been a proponent of in-memory analytics, and WX2 has always allowed data to be resident in memory for faster analytics. However, in-memory is only part of the story. Holding data in memory allows very fast access, but turning that into scalable performance for complex analytic workloads on large data volumes requires the ability to apply huge amounts of scalable processing power.
WX2 is truly massively parallel. This means that every core on every processor in every server works simultaneously on every query. So, if you increase the number of servers, WX2 will automatically reconfigure itself to make full use of every new core. It is through this intelligent use of RAM and CPU power that WX2 offers unprecedented speeds to access and query complex data sets.
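The execution pattern described above can be sketched in miniature. This is a toy illustration of MPP-style query execution, not WX2's internals: the data is partitioned across worker processes, every worker computes a partial aggregate on its own shard simultaneously, and a coordinator combines the partials into the final answer.

```python
from multiprocessing import Pool

def partial_sum(shard):
    """Each 'core' aggregates only its own partition of the data."""
    return sum(shard)

def mpp_sum(data, n_workers=4):
    """Toy MPP aggregate: partition, compute partials in parallel, combine."""
    shards = [data[i::n_workers] for i in range(n_workers)]
    with Pool(n_workers) as pool:
        partials = pool.map(partial_sum, shards)  # all workers run at once
    return sum(partials)  # coordinator combines the partial results

if __name__ == "__main__":
    values = list(range(1_000_000))
    # Same answer as sum(values), but each worker touched only its shard.
    print(mpp_sum(values))
```

Adding workers corresponds to adding servers in the description above: the data is simply re-partitioned across more shards, and each shard gets smaller, which is why this style of execution scales out.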
So, as the data explosion continues, organizations count on Kognitio and its in-memory analytic database solution to get comprehensive answers to their questions. Where other solutions cannot cope with overwhelming data volumes and force the user to rely on data samples, Kognitio has helped numerous customers survive the big data phenomenon—analyze whatever, whenever—and get answers to complex questions in sub-second response times. It is the in-memory and MPP feature set of the database that drives this performance.
I foresee an interesting landscape developing over the next decade, one in which large Hadoop clusters process and store every possible piece of data an organization can collect, and processed data sets are then sent to very high-performance in-memory databases to meet the demands of complex train-of-thought analysis.
I am a passionate skier and travel every year to both North America and France to ski. My other passion is soccer, having been a Manchester United supporter all my life.