Business intelligence (BI) 2.0 may have been overshadowed by all the excitement around its cooler cousin Web 2.0, but it cannot be ignored. It is time to take a down-to-earth look at a few recent advances that are making BI more accessible, affordable, and relevant to businesses than ever before. The most observable changes in BI over the past two to three years have occurred in the vendor landscape. "Megavendors" (such as Microsoft, SAP, IBM, and Oracle) are rising above independent vendors (Business Objects and Cognos). In addition to IBM's acquisition of Cognos, Oracle's purchase of Hyperion, and SAP's acquisition of Business Objects, the BI road map of the software giant Microsoft poses a threat to all vendors in the small to medium business (SMB) market. Behind the myriad changes in the BI marketplace, major innovations in BI technologies have been afoot.
In this article, we look at three recent advances in particular: search technologies, software as a service (SaaS), and operational BI. In order to understand how these technologies are being implemented and made available to organizations, we must look at the vendors that provide them. A complete survey of all vendors in these technology brackets is practically impossible; what we illustrate in this article is simply that innovation is still very much alive and well in BI.
The traditional BI solution comprises a data warehouse or data marts (individual business area–specific data stores for reporting and analysis) as its foundation. Data warehouses are designed for high performance in querying operations, and often contain summaries based on business needs. Extract, transform, and load (ETL) processes periodically bring data from operational data sources into the data warehouse, at a frequency that depends on latency needs. Between extraction from the sources and loading into the targets, the ETL process also cleanses and summarizes the data. Online analytical processing (OLAP) provides high-performance access to data in the data warehouse through the creation of cubes, which are data structures that aggregate data across multiple dimensions and provide the business user with an analytical environment. Reporting and querying environments provide access to canned or ad hoc enquiries into historical data in the data warehouse/marts.
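The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample rows, the field names, and the functions `transform` and `build_cube` are hypothetical, the "warehouse" is an in-memory list, and a real OLAP engine precomputes and indexes far more than this.

```python
from collections import defaultdict

# Hypothetical rows extracted from an operational sales system.
SOURCE_ROWS = [
    {"region": " east ", "product": "Shirt", "amount": "120.50"},
    {"region": "East", "product": "shirt", "amount": "79.50"},
    {"region": "West", "product": "Shoes", "amount": "200.00"},
    {"region": "West", "product": "Shirt", "amount": None},  # fails cleansing
]

def transform(rows):
    """Cleanse rows: normalize text fields and drop rows with no amount."""
    clean = []
    for row in rows:
        if row["amount"] is None:
            continue
        clean.append({
            "region": row["region"].strip().title(),
            "product": row["product"].strip().title(),
            "amount": float(row["amount"]),
        })
    return clean

def build_cube(rows, dimensions, measure):
    """Sum a measure grouped by the given dimensions (one slice of a cube)."""
    cube = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cube[key] += row[measure]
    return dict(cube)

# "Load" step: the target here is simply an in-memory list.
warehouse = transform(SOURCE_ROWS)

# Aggregates at two levels of detail, as a cube would expose them.
by_region_product = build_cube(warehouse, ("region", "product"), "amount")
by_region = build_cube(warehouse, ("region",), "amount")
```

Grouping the same cleansed rows by different sets of dimensions is what lets a business user move between summary and detail without re-querying the operational systems.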
These technologies have grown over several years, and although BI is now a mature field, new ideas and approaches are still being developed, as business models change and evolve around the world.
Searching for the Truth
"One version of the truth," albeit a redundant phrase, has been almost synonymous with BI. Getting to that one version, however, requires search and query operations. Every BI environment is data-intensive, and each action (whether a simple report or a complex analysis involving multiple dimensions) involves sifting through gigabytes (often terabytes) of data in order to arrive at an answer to a business question.
Searching in traditional BI environments involves creating structured data sources (data warehouses, data marts, OLAP cubes) and applying query mechanisms that understand the data structures. For instance, structured query language (SQL) and multi-dimensional expressions (MDX) are common ways of querying data that is stored in a data warehouse or OLAP cubes.
Innovations in the search space include high-performance searching across unstructured and disparate data sources, new storage mechanisms, new approaches to user experience, and advanced multilingual search capabilities. Organizations must contend with both a mix of structured and unstructured content and the high cost of maintaining a data warehouse and ETL processes, so new ways to search heterogeneous data are essential.
Endeca's MDEX™ technology promises a platform in which searches across data stored in relational databases, data warehouses, and unstructured content become possible. The technology is based on self-describing records where every field is an attribute-value pair. With this flexibility, a little-known fact stored in a document can become part of an analysis instantaneously. The guided summarization–based search assumes that the user does not know in advance what kind of queries can be asked. The approach widens the BI audience in an organization, as the user does not need to have a detailed understanding of underlying metadata or data models.
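To make the idea of self-describing records concrete, here is a minimal sketch in Python. It is not Endeca's implementation: the records, attribute names, and the functions `search` and `refinements` are hypothetical, and the "guidance" is reduced to counting which attribute values remain in a result set.

```python
from collections import Counter

# Hypothetical records: each is a free-form set of attribute-value pairs,
# so a database row and a fact pulled from a document can sit side by side.
RECORDS = [
    {"type": "order", "region": "East", "product": "Shirt"},
    {"type": "order", "region": "West", "product": "Shoes"},
    {"type": "memo", "region": "East", "topic": "discount"},
]

def search(records, **filters):
    """Return records whose attributes match every given filter."""
    return [r for r in records
            if all(r.get(attr) == value for attr, value in filters.items())]

def refinements(records):
    """Count the values of each attribute in a result set.

    The counts can be shown to the user as suggested next filters.
    """
    counts = {}
    for record in records:
        for attr, value in record.items():
            counts.setdefault(attr, Counter())[value] += 1
    return counts

east = search(RECORDS, region="East")
guide = refinements(east)
```

Because no fixed schema is assumed, the memo with its `topic` attribute shows up in the same result set as the orders, and the refinement counts tell the user which questions can be asked next.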
FAST is yet another innovator in search technology. FAST's Contextual Insight uses entities that define the scope of a search. The search combines these entities (such as name and location) to arrive at an answer to a user's question. FAST also supports a natural language processing component and advanced linguistic features. Companies that deal with large documents (publishing, news, media, etc.) in multiple languages can benefit from FAST's advanced language capabilities.
Information Builders combines the search technology of Google Appliance with its own BI data integration technologies. WebFOCUS Magnify allows the user to use the universally familiar keyword search to combine an organization's Web content with its enterprise data. This can provide insight into online business activity in conjunction with BI built on historical enterprise data.
SaaS: BI à louer (for Rent)
The high cost and prolonged implementation time involved in BI projects have made BI unaffordable to many SMBs. For companies facing challenges in implementing BI, SaaS eliminates the need to build data warehouses and OLAP cubes. For companies that have BI solutions in place, SaaS makes new BI functionality available to users at a low cost and with minimal implementation time.
In order to get BI using the SaaS model, data from various sources are sent to a hosted service, where ETL-type processes bring the data into a structured representation. Once an initial load of an organization's data is completed, customers of the hosted service provide incremental updates of data.
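The initial load followed by incremental updates can be pictured as a simple upsert. The sketch below is an assumption-laden illustration, not any vendor's interface: the function `apply_batch`, the `id` key, and the sample rows are all hypothetical, and the hosted store is just a dictionary.

```python
def apply_batch(store, batch):
    """Upsert a batch of rows into the hosted store, keyed by 'id'.

    Returns a tuple of (inserted, updated) counts.
    """
    inserted = updated = 0
    for row in batch:
        if row["id"] in store:
            updated += 1
        else:
            inserted += 1
        store[row["id"]] = row
    return inserted, updated

hosted_store = {}

# Hypothetical initial load sent when the customer signs up.
initial_load = [
    {"id": 1, "customer": "Acme", "total": 100.0},
    {"id": 2, "customer": "Birch", "total": 250.0},
]

# Hypothetical later delivery containing only new or changed rows.
incremental = [
    {"id": 2, "customer": "Birch", "total": 275.0},
    {"id": 3, "customer": "Cedar", "total": 80.0},
]

first = apply_batch(hosted_store, initial_load)
second = apply_batch(hosted_store, incremental)
```

Sending only new or changed rows after the first load keeps the volume of data transferred to the service small.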
SAS Solutions OnDemand offers analytic services specific to higher education, financial services organizations, and marketing. The OnDemand Business Intelligence package provides extensive reporting and analysis capabilities based on both relational data and OLAP cubes. SAS also offers a leasing option to use its application in house.
Business Objects OnDemand includes a data warehouse and ETL process. Those familiar with Business Objects will recognize the universe, which is available as the business interface to the data warehouse. Crystalreports.com serves as the client interface that hosts various report types and dashboards.
Oco's On-Demand Business Intelligence makes it possible for users to benefit from two proprietary technologies: Connect for ETL and Intelligent Data Schema for the process of mapping business entities into data warehouses. Retail-specific solutions are also offered on the SaaS model.
Host Analytics provides a business performance management service, which can also be purchased as a license. Individual services such as this can be options for companies that are looking to augment existing BI solutions.
In the kNow—Operational BI
Data integration latency can vary based on the nature of business needs. For traditional BI reporting and analysis based on historical data, latencies of days or even weeks are common. For operational reporting and analytic applications that require near real-time data, latencies of hours, minutes, or even seconds may be necessary. For instance, a forecasting application that predicts the re-order of merchandise based on sales and demand will require sales data that is at most a few hours old. A sudden increase in the sale of specific merchandise, due to a marketing initiative to offer discounts, can either trigger an automatic re-order action in the purchase order system or send an alert to the department responsible for re-ordering. Historical information is used to predict demand patterns; however, operational data is required to detect any anomalous activities in real (or near real) time so that low availability can be addressed immediately. Operational BI is also relevant in live reporting, where data needs to be offloaded from the operational system to avoid placing query burdens on application systems.
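The re-order scenario can be expressed as a small rule that compares fresh operational data with a historical baseline. The following Python sketch is hypothetical throughout: the function `check_reorder`, the `spike_factor` threshold, and the three action names are assumptions chosen for illustration, not the behavior of any particular product.

```python
from statistics import mean

def check_reorder(history, recent_sales, stock_on_hand, spike_factor=2.0):
    """Compare recent sales with the historical average and choose an action.

    history: past units sold per hour (the historical view).
    recent_sales: units sold in the latest hour (the operational view).
    stock_on_hand: units currently available.
    """
    baseline = mean(history)
    if recent_sales >= spike_factor * baseline:
        # Sales have spiked; decide how urgent the response must be.
        if stock_on_hand < recent_sales:
            return "reorder"   # stock cannot cover another hour at this rate
        return "alert"         # notify the purchasing department
    return "none"
```

For example, with a history of `[10, 12, 8]` units per hour, a latest hour of 30 units, and 20 units in stock, the rule returns `"reorder"`; with 100 units in stock it returns `"alert"` instead.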
The need to tie operational activities to BI content is becoming more important. It is essential, however, to keep in mind that not all data in a data warehouse needs to be current; a combination of real-time and analytical data integration is what best serves an organization's needs. Data cleansing has to be minimal (or nonexistent) with real-time data integration; as a result, data quality must be ensured at the point of its entry into the application systems.
IBM Cognos Now! is a product that includes a configured server with all the necessary components for operational BI. It is also offered as a hosted service through SaaS. A streaming server provides continuous data integration of transactional information with historical data. A business rules execution engine makes it possible to set up alerts based on operational activities. An analytic server includes engines for querying and analysis; aggregate data is stored in the server, eliminating the need for the data warehouse to be kept up-to-date.
Informatica's PowerCenter 8.5 specifically targets the need for real-time data. PowerCenter Real Time Edition combines batch processing features with new real-time capabilities and uses parallel processing techniques to address performance constraints. Data smart parallelism aligns the partitioning in PowerCenter with that of the database and maximizes the use of hardware resources for performance. Change data capture (CDC) provides a continuous data integration method that listens for changes in data sources to trigger the transfer into the data warehouse.
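The general idea behind change data capture can be sketched as follows. This is not how PowerCenter works internally; commercial CDC tools typically read database logs, whereas this simplified Python illustration compares two snapshots of a source table. The functions `capture_changes` and `apply_changes` and the sample data are hypothetical.

```python
def capture_changes(previous, current):
    """Diff two snapshots (dicts keyed by primary key) into change events."""
    events = []
    for key, row in current.items():
        if key not in previous:
            events.append(("insert", key, row))
        elif previous[key] != row:
            events.append(("update", key, row))
    for key in previous:
        if key not in current:
            events.append(("delete", key, None))
    return events

def apply_changes(warehouse, events):
    """Replay change events against the warehouse copy of the table."""
    for op, key, row in events:
        if op == "delete":
            warehouse.pop(key, None)
        else:
            warehouse[key] = row

# Hypothetical source table before and after some transactions.
before = {1: {"qty": 5}, 2: {"qty": 3}}
after = {1: {"qty": 4}, 3: {"qty": 9}}

warehouse_copy = dict(before)
events = capture_changes(before, after)
apply_changes(warehouse_copy, events)
```

Only the three changed rows travel to the warehouse, which is what allows the transfer to run continuously rather than in large periodic batches.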
All That to Say …
In conclusion, BI is at the forefront of innovations that take full advantage of recent advances in hardware and software technologies. Google's revolution has touched several areas of the software industry—BI is no exception. Advanced search techniques provide more intuitive ways to access complex enterprise data. Service-oriented business models make it possible for companies to embark on BI journeys on much smaller budgets and timelines. Insight available through BI is being converted into alerts and actions through new techniques in operational BI.
About the Author
Anna Mallikarjunan is a member of TEC's research & development team. She is responsible for the analysis and development of TEC's decision support software as well as tools for business intelligence (BI). She has over four years of business analysis, design, and development experience in several areas of BI, including data warehousing; extract, transform, and load (ETL); online analytical processing (OLAP); reporting; and custom application development.
Past positions Mallikarjunan has held include technical lead and applications development manager of a team of .NET, data warehousing, and BI professionals for a fashion retail company. In this role, she was responsible for the development, maintenance, and support of Windows and Web-based applications, as well as an operational data store, data marts, and BI applications.
Mallikarjunan holds a BSc in computer science from the University of Madras (India), and an MSc in computer science from Anna University in Madras, India.