The User's Undying Quest for Exploring and Discovering Info - Part 1

SAP AG and Endeca Technologies might not appear to have much in common at first glance, other than occasional partnering in some joint opportunities, and perhaps that SAP Ventures owns a piece of privately held Endeca. In the world of home appliances, SAP would be analogous to a tried-and-true refrigerator, but with the most advanced features in the market, such as a built-in TV set.

Such an appliance stores important food (i.e., data and transactions) and provides some important basic information and entertainment (i.e., news reports) to nearly 90,000 customers in over 120 countries. Indeed, SAP is the world's leading provider of business software, offering enterprise applications and services to companies of all sizes and in more than 25 industries for nearly four decades.

By the same token, the much smaller and younger Endeca would be analogous to a cool smartphone that gives users information and entertainment at the speed of light (or thought, or the speed of computing). The snazzy gadget will not only provide results but also suggest possibilities that users might not be aware of (or that they never thought of) before seeking some information.

Interestingly, Endeca derives its name from the German word “entdecken,” which means “to discover” (perhaps another common thread with SAP). Seriously speaking, Endeca offers innovative information access software that helps people explore, analyze, and understand complex and constantly changing information, guiding them to often unexpected insights and better decisions.

The Endeca Information Access Platform (IAP), built around a new class of access-optimized semi-structured database, powers applications that combine the simplicity of searching and browsing with the analytical power of business intelligence (BI). Since the company’s inception in 1999 in Cambridge, Massachusetts (US), over 600 leading global organizations have come to rely on Endeca’s platform, including ABN AMRO, Boeing, Cox Newspapers, the (US) Defense Intelligence Agency, ESPN, Barnes & Noble, Dell, Ford Motor Company, Hyatt, IBM, John Deere, The Library of Congress, and Texas Instruments.

The Tale of Two Events (with Similar Messages)

In mid-2009, these two quite different vendors (in terms of stature and target audience) held their own separate user conferences. Still, the two events revolved around similar themes. I attended Endeca Discover 2009 in person, since it took place in my neck of the woods (Boston).

Endeca got its start with information discovery projects in the public sector (the less we know, the better). While the vendor today also targets manufacturers and distributors (as can be seen from the abovementioned representative client roster), the lion’s share of Endeca customers are online retailers, Internet media companies, and publishers. The Endeca Discover conference has traditionally focused on those electronic commerce-oriented companies.

Against the backdrop of a dour economic situation everywhere, this love-fest event was, by contrast, buzzing with fervent activity and interest from existing and prospective customers. I even overheard some of Endeca’s professional services staffers commenting amongst themselves during the break: “if this attendance is the epitome of a recession (or even a depression), then how are we going to be able to deliver mushrooming projects when the economy improves?” Although privately held, the company is not that tight-lipped about its rapidly growing revenues, which are estimated at over US$100 million.

Endeca has pioneered a new software category that enables users not only to find easily what they are looking for, but also to discover, along the way, insights, information, and relationships across data that they never knew existed. There will be a separate series of in-depth articles on Endeca and its platform, but for now it suffices to say that Endeca’s “secret sauce” is its proprietary semi-structured database called MDEX (meta-relational index engine).

A Quick Taste of Endeca’s Secret Sauce

The semi-structured model helps overcome the drawbacks of traditional rigid, overarching relational schemas, which are too limited to handle diverse (structured and unstructured) and ever-changing data. OLAP (online analytical processing) cubes, for their part, might overcome relational databases’ inability to provide near-instantaneous analysis and display of large amounts of data, but they still cannot accommodate an ever-growing number of new (perhaps esoteric) data attributes or dimensions (e.g., “find all basketball point-guards that played in Europe during high school”).

The MDEX engine handles such requirements with relative ease via an extensible markup language (XML)-like data model of self-describing objects. Endeca ITL (information transformation layer) plays the extract, transform, load (ETL) role of importing data from disparate data sources. These data objects can come from multiple sources, such as structured content (e.g., a content management system [CMS], databases, etc.) and unstructured content (text documents and user-generated content [UGC] such as blogs, wikis, podcasts, multimedia files, etc.).
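To make the idea of self-describing objects more tangible, here is a minimal Python sketch. It is an illustration only and says nothing about how MDEX is actually built: the records, the attribute names, and the select helper are all hypothetical. The point is simply that each record carries its own attributes, so an esoteric new attribute can appear on some records without any schema change.

```python
from typing import Any

# Hypothetical self-describing records: each one carries its own attribute
# names, and no two records need to share the same set of attributes.
records = [
    {"name": "Player A", "sport": "basketball", "position": "point guard",
     "high_school_region": "Europe"},
    {"name": "Player B", "sport": "basketball", "position": "center"},
    {"name": "Player C", "sport": "basketball", "position": "point guard",
     "high_school_region": "North America", "college": "Example State"},
    {"title": "Season preview", "type": "story", "sport": "basketball"},
]


def select(items: list[dict[str, Any]], **criteria: Any) -> list[dict[str, Any]]:
    """Return the items whose attributes match every criterion.

    A record that lacks an attribute simply does not match; nothing breaks
    and no schema change is needed when a new attribute appears.
    """
    return [
        item for item in items
        if all(item.get(key) == value for key, value in criteria.items())
    ]


# "Find all basketball point guards that played in Europe during high school."
matches = select(records, sport="basketball", position="point guard",
                 high_school_region="Europe")
print([m["name"] for m in matches])  # prints ['Player A']
```

In a rigid relational schema, the high_school_region attribute would have required a new column (or table) before anyone could query on it; here it is just another key on the records that happen to have it.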

In addition, the software must be able to access rapidly changing data such as ESPN sports scores, news feeds, or an online store’s new catalog items and the availability (stock situation) of all products. One of Endeca’s landmark (and trademark) capabilities that MDEX enables is Guided Navigation™, or the ability of the search engine to return not only results, but also options for selecting further subsets within those results. The user might not even be aware that these options and relations exist.

Endeca is not based on the “rocket science” of some overly complex optimization algorithms. Neither is the platform trying to invent data based on, say, predictive analytics. According to the “you can't make cheese out of chalk” adage, if some combination of data attributes is not yet available, that is fine, and Endeca will not try to create results that do not exist just to impress and possibly mislead the user.

Conversely, if some relationships between data and related indexes exist, Endeca will return both the results and further suggestions (while breadcrumb trails are kept updated), and choices will either expand or narrow depending on the path that the user selects in a point-and-click manner. Simple as that, or, in other words: WYSIWYG (what you see is what you get). If you know how to order movies over Netflix or select channels on a JetBlue flight, you are ready to use Endeca.

For instance, NFL aficionados might search the ESPN portal for “Tom Brady” and get about 6,000 records as possible results. But on the left side, the site will offer search refinements, such as by type (i.e., stories, audio, photo, video), by date (e.g., last 7 days, last 30 days, last 365 days), by team, by columnist, etc. Each option will show in brackets the number of related records (further potential results that match the current search criteria).
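The refinement lists with record counts described above can be sketched in a few lines of Python. This is only a toy illustration of the general idea of faceted refinement, not Endeca’s implementation; the sample results, field names, and the refinements and refine helpers are made up for the example.

```python
from collections import Counter

# Hypothetical search results; field names and values are invented.
results = [
    {"title": "Game recap", "type": "story", "team": "Patriots"},
    {"title": "Post-game interview", "type": "video", "team": "Patriots"},
    {"title": "Draft analysis", "type": "story", "team": "Jets"},
    {"title": "Sideline gallery", "type": "photo", "team": "Patriots"},
]


def refinements(items, facets):
    """For each facet, count how many of the current results carry each value."""
    return {
        facet: Counter(item[facet] for item in items if facet in item)
        for facet in facets
    }


def refine(items, facet, value):
    """Narrow the current results to those matching one selected refinement."""
    return [item for item in items if item.get(facet) == value]


breadcrumbs = []  # the path of refinements the user has clicked so far

# Counts shown next to each option for the initial result set.
print(refinements(results, ["type", "team"]))

# The user clicks "story": the result set narrows, the breadcrumb trail
# grows, and the counts for the remaining options are recomputed.
results_after_click = refine(results, "type", "story")
breadcrumbs.append(("type", "story"))
print(refinements(results_after_click, ["team"]))
```

Because the counts are always computed from the current result set, an option with no matching records never appears, which is one simple way to avoid leading the user to an empty result page.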

For more informed sports fans (or even fanatics), ESPN administrators might use the page builder tool to create landing pages or topic pages. Namely, instead of the list of possible search results, the user is rather directed to a specially designed page for the query, i.e., the page dedicated to Tom Brady (the future Hall of Fame quarterback) or to the New England Patriots.

Small wonder, then, that Endeca’s online media customers (e.g., newspapers and magazines, professional knowledge providers, cable and TV, libraries, bookstores, and publishers) rave about real results. I’ve repeatedly heard examples of a fivefold increase in Web traffic, a 20 percent increase in page views (PVs), a 15 percent increase in subscription renewals, a 15 percent increase in search click-through rates (CTRs), and so on and so forth.

The New Fundamentals for the Future

The Endeca Discover 2009 conference started with an enlightening and spirited keynote from Paul Sonderegger, Endeca’s chief strategy officer. Entitled “The New Fundamentals for the Future,” the keynote produced a number of eye-opening facts, starting with the fact that 56 percent of US household wealth is in real estate and stock assets. Thus, on average, US household wealth has fallen by over 20 percent since mid-2007.

But these worried and broke ordinary folks are not exactly helpless, since they are at least “wired.” The bestselling book “Groundswell” by Forrester analysts Charlene Li and Josh Bernoff defines “groundswell” as follows:
“A social trend in which people use technologies to get the things they need from each other, rather than from traditional institutions like corporations.”

For most brick-and-mortar retailers, there were notable year-over-year decreases in same-store sales for the first three months of 2009. According to a listing of reported monthly same-store sales and Endeca’s research, 66 percent, 63 percent, and 71 percent of stores reported decreases in January, February, and March of 2009 respectively (compared with a year earlier).

Similar negative trends have been seen in predicted US advertising spend. Namely, Carat’s initial prediction from August 2008 was 3.1 percent growth in advertising; the forecast was then sharply revised in March 2009 to a 9.8 percent decline. Along similar lines, ZenithOptimedia predicted 2.6 percent growth in June 2008, and revised the forecast to a 6.2 percent decline in December 2008.

However, both online retail sales and online advertising spend continue to grow. Given the abovementioned groundswell effect of connected and informed consumers who can easily pass a verdict and disseminate news of any product or brand, there are three new fundamentals for the future that every retailer, manufacturer, and software vendor has to keep in mind.

The first is the atomization of user experience. Namely, according to the Long Tail theory, companies will compete on ever-smaller and more specific product footprints and capabilities to satisfy ever-smaller niches, and all in a plug-and-play manner. Think of Apple iPhone OS applications.

Certainly, some of these are seemingly frivolous or ridiculous, like the slew of fart applications or those that follow girlfriends’ menstrual cycles (and moods). There was even a despicable baby shaker application, which was eventually condemned and discontinued. On the other hand, there are some neat and useful applications such as Shazam, which lets iPhone users identify music tracks from the radio, find and buy them on the iTunes Store, and share the tags with friends.

The second core principle revolves around innovations to facilitate IT operations. To that end, one should expect even more use of concepts like virtualization, cloud computing, and service-oriented architecture (SOA).

Last but not least, there is the BT or “business technology” core principle, a term coined by Forrester. Namely, BT denotes pervasive technology use by casual users and end users, increasingly managed outside the direct control of IT departments. In other words, as in the abovementioned groundswell phenomenon, ordinary folks vie to “control their own destiny,” and require technology that can be used with minimal (if not zero) training. A recent post on the Forrester Blog for the CIO site says:
“How does BT show itself? Employees, customers, and partners are bringing Web 2.0 and social computing technologies into business processes; business leaders are directly contracting for online solutions and business process outsourcing; and users are configuring their own business solutions, using ERP applications from vendors like SAP or IT-provided platforms built from technologies like business process modeling (BPM).”

Part 2 of this blog series will explore how SAP is adapting to the fundamentals outlined above. One concrete example will be the recently unveiled SAP BusinessObjects Explorer product.

In the meantime, your comments and feedback with regard to the opinions and assertions expressed thus far are welcome. If you are an application user, how important are the aforementioned considerations in your software selections? For both users and vendors, what are your information discovery experiences (both in using and delivering a solution)?