Mainstream Enterprise Vendors Begin to Grasp Content Management
Part One: PCM System Attributes
P.J. Jakovljevic - 11/11/2004
Introduction
SAP's recent acquisition of the former catalog management vendor A2i and IBM's acquisition of the former product information management (PIM) vendor Trigo suggest that the mainstream enterprise platform and enterprise applications, or enterprise resource planning (ERP), vendors are adopting enterprise-wide product content management (PCM) approaches. These moves respond to the need for an effective master data management (MDM) system that addresses the widespread challenge of integrating data scattered across multiple systems, physical locations, and diverse trading partners. Thus, PCM and PIM would be the core parts of MDM solutions that manage any kind of master data and integrate seamlessly into a customer's existing enterprise architecture, ideally eliminating all data duplication and making centralized customer, supplier, or product information available to other applications across the organization.
SAP, IBM, and similar mainstream enterprise vendors need to solve the problems inherent to data residing in disparate systems, as enterprises are becoming painfully aware of the need to clean up their structured data and unstructured content in order to capitalize on more important efforts like regulatory compliance, globalization, demand aggregation, and supply chain streamlining, to name a few. To that end, these enterprise vendors also have to provide the ability to integrate emerging radio frequency identification (RFID) data into their software, as well as full support for web services-based provisioning and consumption of data and processes.
Yet the all-encompassing content management solution is still in an ever-evolving design stage, as vendors try to piece together comprehensive systems. As a result, there is a proliferation of (and subsequent confusion about) the pertinent terms and acronyms, such as enterprise content management (ECM), product content management (PCM), catalog management, product information management (PIM), records management (RM), product data management (PDM), enterprise data repositories (EDR), document management (DM), knowledge management (KM), web content management (WCM), digital asset management (DAM), enterprise information management (EIM), digital rights management (DRM), document imaging, workflow management (WM) or business process management (BPM), and many more.
Generally speaking, PCM (sometimes also called PIM) refers to a system for managing all types of information about finished products, and it is a further evolutionary step of catalog content management backed by workflow management. This is, however, different from ECM, which focuses more on document management and other unstructured editorial and web content, whereas PCM is more granular around individual data elements and focuses on highly structured product content. ECM encompasses many of the above-cited technologies used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. In other words, it allows the management of an organization's unstructured information (e.g., e-mails, photos, spreadsheets, and documents) wherever that information exists, whether stored in repositories or shuttled across networks, over the course of its existence or life cycle.
This is part one of a three-part note.
Part two will present background information and lessons learned.
Part three will address challenges.
Definition of PCM Systems
Returning to the management of structured, alphanumeric information, a PIM or PCM solution should be able to organize a company's product information, regardless of location, into a consolidated system of record, and to synchronize or distribute that information to any business partners that require it. Yet true PCM should mean more than a centralized repository that eliminates data duplication but offers only limited functionality. The repository must be capable of storing all product information, and the system must be more than a point solution or an island: it must also offer high-performance access to that information and include tightly integrated functionality that can drive all crucial enterprise initiatives.
First and foremost, the PCM should revolve around a single centralized repository of product information. It should be the "system of record" for all non-transactional product information and organizational intelligence about products, and eliminate data duplication and system redundancy across the enterprise. In effect, it should be the "ERP for product information" containing not only "rich product content", but also other types of related information, such as supplier information, as well as one or more supplier-specific sub-records of sourcing information for each product that allows the PCM to simultaneously drive both sell-side and buy-side initiatives. In other words, the rich product content managed by the PCM must be much more than simply transactional data about each product from the ERP or product master file (e.g., a part number, a description, and a price).
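To make the distinction between thin transactional data and rich product content concrete, the following is a minimal sketch of what one repository record might look like. The class names, field names, and sample values are all hypothetical and are not taken from any vendor's schema; the point is only that the ERP-style core (part number, description, price) sits alongside rich content and one or more supplier-specific sourcing sub-records.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class SupplierSourcing:
    """Buy-side sub-record: how one supplier sources a product (hypothetical)."""
    supplier_id: str
    supplier_part_number: str
    unit_cost: Decimal
    lead_time_days: int


@dataclass
class ProductRecord:
    """One product in the central repository (hypothetical layout)."""
    # Transactional core, as typically held in the ERP or product master file.
    part_number: str
    description: str
    price: Decimal
    # Rich, non-transactional content managed by the PCM.
    specifications: dict = field(default_factory=dict)
    marketing_text: str = ""
    image_files: list = field(default_factory=list)
    # Zero or more supplier-specific sourcing sub-records.
    sourcing: list = field(default_factory=list)

    def cheapest_source(self) -> Optional[SupplierSourcing]:
        """Return the lowest-cost sourcing sub-record, or None if there is none."""
        return min(self.sourcing, key=lambda s: s.unit_cost, default=None)


bearing = ProductRecord(
    part_number="BRG-6204",
    description="Deep groove ball bearing",
    price=Decimal("4.95"),
    specifications={"bore_mm": "20", "outer_diameter_mm": "47"},
    sourcing=[
        SupplierSourcing("SUP-A", "A-6204", Decimal("2.10"), 14),
        SupplierSourcing("SUP-B", "6204-B", Decimal("1.95"), 30),
    ],
)
```

Because the same record carries both the selling price and the supplier sub-records, a sell-side catalog and a buy-side sourcing tool could read from one place rather than from two copies of the data.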
This brings us to the notion of enterprise publishing (where some PCM systems will overlap with ECM), which aims to reduce the cost of creating, and to speed the deployment of, all the product-related information, including user manuals, sales collateral, and web sites, that makes up the complete product offering. In fact, rich product content must comprise all of the non-transactional product information within an organization, such as detailed parametric data on product specifications; merchandising text, high-resolution images, drawings, diagrams, and portable document format (PDF) files for various marketing and publishing requirements; a classification scheme for organizing the products into a searchable taxonomy of categories and subcategories with category-specific attributes; product relationships to represent selling relationships (such as up-sells, cross-sells, and accessories) and structural relationships (such as assemblies, kits, and bundles); parts usage information; and finally, various product-specific services for leveraging the rich product content, such as hotspot information for illustrated parts catalogs, without the need for a separate system.
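A taxonomy of categories and subcategories with category-specific attributes can be sketched as a simple tree. The version below is illustrative only: the class, the sample categories, and in particular the rule that a subcategory inherits the attributes of its ancestors are assumptions made for the example, not something the discussion above prescribes.

```python
class Category:
    """Node in a taxonomy tree (hypothetical model).

    Assumption: a subcategory inherits every attribute defined on its ancestors.
    """

    def __init__(self, name, attributes=(), parent=None):
        self.name = name
        self.own_attributes = list(attributes)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def all_attributes(self):
        """Attributes of the ancestors (root first) followed by this category's own."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.own_attributes

    def path(self):
        """Human-readable path from the root, e.g. 'Bearings > Ball bearings'."""
        prefix = self.parent.path() + " > " if self.parent else ""
        return prefix + self.name


bearings = Category("Bearings", ["bore_mm", "outer_diameter_mm"])
ball = Category("Ball bearings", ["ball_count"], parent=bearings)
roller = Category("Roller bearings", ["roller_length_mm"], parent=bearings)
```

Here `ball.all_attributes()` combines the two attributes shared by all bearings with the one attribute that only makes sense for ball bearings, which is the practical meaning of "category-specific".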
The term PIM has lately appeared more frequently in discussions of global data synchronization (GDS) and syndication because of a number of market initiatives that act as catalysts for change. For example, many large retailers, including Wal-Mart, Office Depot, The Home Depot, Target, Albertsons, and Safeway, have required their suppliers to synchronize product data via the European article number (EAN)/UCCnet registry and data synchronization services. Other catalysts include the Sunrise 2005 initiative, which seeks to standardize on a format for global product identification via a new 14-digit code, and the RFID initiatives in place to bring about the rapid adoption of new radio frequency tags on all products, so that they may be more easily tracked through manufacturing and retail environments.
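As an illustration of what handling a 14-digit product code involves, the sketch below assumes that the code in question is a global trade item number in its 14-digit form (GTIN-14), validated with the standard modulo-10 check digit in which weights of 3 and 1 alternate starting from the rightmost data digit. The function names are hypothetical, and the sample value is a commonly cited 13-digit EAN padded to 14 digits.

```python
def gtin_check_digit(data_digits: str) -> int:
    """Compute the modulo-10 check digit for the data digits of a GTIN.

    Weights alternate 3, 1, 3, ... starting from the rightmost data digit.
    """
    if not (data_digits.isascii() and data_digits.isdigit()):
        raise ValueError("data digits must be ASCII digits only")
    total = 0
    for position, char in enumerate(reversed(data_digits)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def to_gtin14(code: str) -> str:
    """Left-pad a shorter (8-, 12-, or 13-digit) code with zeros to 14 digits."""
    return code.zfill(14)


def is_valid_gtin14(code: str) -> bool:
    """True if code is 14 ASCII digits and the last digit is the correct check digit."""
    if len(code) != 14 or not (code.isascii() and code.isdigit()):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])


padded = to_gtin14("4006381333931")
```

Padding with leading zeros leaves the check digit unchanged, which is why older, shorter codes can be stored in a 14-digit field without being reissued.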
A full-fledged PCM system should additionally have no predetermined notion of the repository structure itself, but rather offer a fully flexible schema that can be tailored to meet the specific requirements of each enterprise and each vertical industry, and that can change over time. The PCM must be more than just a simple database application or end-user application, and more than just a standalone point solution that addresses a single functional requirement (such as UCCnet synchronization, paper print or web-based publishing, or illustrated parts catalogs). Rather, it must be a completely open system with both graphical user interface (GUI) tools for end users and multi-platform application programming interfaces (APIs) for programmatic access (e.g., via Java 2 Enterprise Edition [J2EE], Microsoft .NET, eXtensible Markup Language [XML], web services, and simple object access protocol [SOAP]), supporting both content authoring and runtime searching, and providing a horizontal platform for building best-of-breed vertical solutions.
Such a PCM system must also support all the leading middleware application stacks so that it can leverage and integrate with web application servers (WAS), enterprise application integration (EAI) servers, and portal servers. Also, rather than a fixed web-based user interface, it should provide a flexible presentation layer that can be completely customized and tailored to particular organizational requirements and to the needs of various vertical markets.
Finally, the PCM should be able to unify and harmonize product information stored within repositories across the enterprise, creating "a single copy of the truth" regardless of where the data resides. That is to say, the PCM must act as a centralized "hub" that plugs PCM functionality and high-performance access to highly structured product information into all enterprise initiatives, not only at the user level but also at the enterprise integration level, for plug-and-play coordination with other extended-ERP solutions, such as customer relationship management (CRM), product lifecycle management (PLM), supplier relationship management (SRM), and supply chain management (SCM), where vendors with broad offerings, like SAP or Oracle, should be glad to oblige their users.
Desired PCM System Attributes
Based on the above discussion, a proper PCM system, such as the one acquired by SAP, should have the following attributes:
- Powerful
product content aggregation and cleansing, management and editing of product
information, since the proper PCM system should do more than store data that
used to reside in another system. Instead, it must include powerful and extensive
capabilities for loading, restructuring, cleansing, normalizing, and transforming
source data from a variety of electronic sources, including text, Microsoft
Excel, Microsoft Access, structured query language
(SQL), and XML for both flat files and relational data.
- Classification into a taxonomy with category-specific attributes, since not only must the proper PCM system have a completely flexible schema, it must also support multiple classification schemes, user-defined taxonomy hierarchies of arbitrary depth with category-specific attributes, multiple simultaneous taxonomies, and drag-and-drop taxonomy editing capabilities that allow the taxonomy of the fully populated repository to be completely restructured and refined over time.
- Intelligent image management, since many systems can do no more than store an image as a binary large object (BLOB). By contrast, the proper PCM system must support intelligent image management with an understanding of all of the leading image formats, the ability to automatically transform images for different publishing purposes, and optimized high-performance image access and efficient image caching.
- Integrated high-performance product search engine, since the search mechanisms offered by traditional systems are not precise enough for searching product information. The full-fledged PCM system must hence include a fully integrated multidimensional search engine that is optimized for product search, with support not only for drill-down, parametric, and keyword search, but also for unit-of-measure search, partial or "contains" search, and other types of search. To that end, customers should be able to search for goods without knowing product codes, that is, in a "No part number, no problem" manner.
- Performance acceleration, with scalability up to millions of products, since traditional enterprise applications, such as ERP or CRM, are not optimized for heavy search and access loads. Similarly, a traditional relational database management system (DBMS) is slow on typical searches against large repositories, so relying on the "naked" DBMS is also a problem. Moreover, databases have not been architected well to manage large binary objects, since rows, columns, and SQL access are not suited to managing objects like frames of a video or pages of a document. Therefore, a proper PCM system must have a self-optimizing performance acceleration layer that is able to quickly serve up product information to users and other enterprise applications.
Most catalog solutions are simple database applications that layer a thin veneer of functionality over SQL and rely on SQL for all access to the data. SQL works well for retrieving a single record from among thousands or even millions. Yet retrieving, for example, several thousand records from among a few million, while limiting every dimension of the search so that users see only valid selections and valid values, requires a multi-table join.
Also, interactively browsing and sorting search results requires the use of cursors and temporary files, which further cripples the performance of a SQL-based DBMS. One such example would be a catalog of thirty thousand bearings with very intricate relationships governing which bearings can be sold with which other bearings, which requires a system to manage and automate those relationships.
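The drill-down behavior described above, in which users only ever see selections that still lead to results, can be sketched with a small in-memory inverted index. This is an illustration under stated assumptions rather than a description of any vendor's engine: the `FacetIndex` class, its method names, and the sample bearings are all hypothetical, and a production system would add persistence, ranges, and unit conversion.

```python
from collections import defaultdict


class FacetIndex:
    """In-memory inverted index: (attribute, value) -> set of product ids."""

    def __init__(self, products):
        # products maps a product id to a dict of attribute -> value.
        self.products = products
        self.postings = defaultdict(set)
        for product_id, attributes in products.items():
            for attribute, value in attributes.items():
                self.postings[(attribute, value)].add(product_id)

    def search(self, **filters):
        """Return the ids of products that match every attribute=value filter."""
        result = set(self.products)
        for attribute, value in filters.items():
            # Set intersection replaces the multi-table join.
            result &= self.postings.get((attribute, value), set())
        return result

    def valid_values(self, attribute, **filters):
        """Values of `attribute` that still yield results under the given filters."""
        matches = self.search(**filters)
        return sorted(
            {self.products[pid][attribute]
             for pid in matches if attribute in self.products[pid]}
        )


catalog = {
    "B1": {"kind": "ball", "bore_mm": 20, "seal": "open"},
    "B2": {"kind": "ball", "bore_mm": 25, "seal": "sealed"},
    "B3": {"kind": "roller", "bore_mm": 20, "seal": "sealed"},
}
index = FacetIndex(catalog)
```

After a user picks `kind="ball"`, calling `index.valid_values("seal", kind="ball", bore_mm=20)` would offer only the seal types that exist for 20 mm ball bearings, so no combination of choices can end in an empty result.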
- Cross-media publishing (web and paper or CD-ROM print), since the appropriate PCM system must drive all product content initiatives, including tightly integrated functionality not only for internal PCM, but also for multichannel syndication, deployment of searchable web catalogs, and print solutions for catalogs and other printed publications. The things that people expect in a paper catalog in terms of layout, structure, and tabular orientation of product records should also be deliverable to the web. Additionally, the system should provide the ability to slice and dice a single master catalog that may contain several million products into as many customized, virtual private, personalized subset catalogs as necessary, whereby each slice looks like a complete catalog, either to the user on the web or when published to paper.
- Database-driven
print catalogs, since a full-fledged PCM system that supports print catalog
publishing must do so in a way that is completely database-driven, meaning
it "pushes" product information into the page layouts, rather than simply
using the repository to store product information that was first entered directly
into the page layout application.
- The system
must support UCCnet synchronization, and also be able to syndicate product
information to multiple audiences, transforming it into a variety of industry-standard
and user-defined XML and delimited text formats, on an ad hoc and scheduled
basis.
- The system
must have an integrated workflow engine that can provide a framework for managing
product information in a collaborative environment, and can function standalone
or in conjunction with external workflow applications and systems.
- Cross-platform compatibility, as elaborated above, and enterprise scalability, since the appropriate PCM system must offer an n-tier architecture capable of easily integrating with various deployment architectures, including a full suite of security and encryption services as well as the ability to integrate with leading user directories, such as those based on the lightweight directory access protocol (LDAP). Finally, the PCM system must provide master and slave capabilities to enable a global 24-7 deployment consisting of both staging and publishing servers.
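The syndication attribute in the list above, in which the same product information is transformed into XML and delimited text for different audiences, can be sketched with the Python standard library alone. The function names, the element names, and the sample record are assumptions made for the example; real industry-standard formats define their own schemas.

```python
import csv
import io
import xml.etree.ElementTree as ET


def to_xml(products, root_tag="catalog"):
    """Serialize product dicts to an XML string (hypothetical element layout).

    Assumes every key other than "id" is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for product in products:
        item = ET.SubElement(root, "product", id=product["id"])
        for name, value in product.items():
            if name != "id":
                ET.SubElement(item, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def to_delimited(products, fields, delimiter="|"):
    """Serialize the chosen fields of each product to delimited text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        delimiter=delimiter,
        extrasaction="ignore",   # drop fields this audience did not ask for
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(products)
    return buffer.getvalue()


products = [{"id": "BRG-6204", "description": "Ball bearing", "bore_mm": 20}]
xml_feed = to_xml(products)
text_feed = to_delimited(products, ["id", "bore_mm"])
```

Choosing the field list per audience is what lets one master record feed several trading partners, each of which receives only the columns it has asked for.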
This concludes part one of a three-part note.