Business Intelligence and Identity Recognition: IBM's Entity Analytics

The cause of poor customer service ratings, ineffective marketing initiatives, faulty financial planning, and the increase in fraudulent activity can, in many cases, relate back to an organization's management of its data. As the data collected and stored in organizations has grown exponentially over the past few years, its proper management has become critical to the successful implementation of such business initiatives as product marketing and corporate planning. Additionally, as fraud and acts of terror receive greater attention, it has become essential to use data to identify people and their relationships with one another.

This article will define master data management (MDM) and explain how customer data integration (CDI) fits within MDM's framework. Additionally, this article will provide an understanding of how MDM and CDI differ from entity analytics, outline their practical uses, and discuss how organizations can leverage their benefits. Various applications of entity analytics, including examples of its application to different types of organizations, will be highlighted along with the benefits it offers organizations in such service industries as government, security, banking, and insurance.

Data Management—Its Broad Spectrum

MDM has emerged to provide organizations with the tools to manage data and data definitions effectively, in order to present a consistent view of the organization's data. In essence, MDM overcomes the silos of data created by different departments and provides an operational view of the information so that it may be leveraged by the entire organization. It focuses on the identification and management of reference data across the organization to create one consistent view of data. In practice, different subsets of MDM address separate aspects of an organization's needs.

The importance of MDM becomes evident when a customer service representative (CSR) cannot access customer information because of inconsistencies introduced by a corporate acquisition or a new system implementation, which may frustrate (or even alienate) the customer. Add to this the extra time the CSR spends locating the appropriate data, and the issue extends to wasted time and money.

CDI is a subset of MDM, and serves to consolidate the many views of a customer within the organization into one centralized structure. This data consolidation provides the CSR with the information required or the ability to link to the required information, which may include billing, accounts receivable, etc. Once the data is consolidated, references to each customer file are created and linked to one another, and the "best" record is assigned from the available information. Consequently, data inconsistencies that occur across disparate systems, such as multiple address formats, are cleansed based on defined business rules to create one version of customer data that will be viewed across multiple departments within the organization.
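
The consolidation step described above can be illustrated with a minimal Python sketch. It is not how any particular CDI product works: the record layout, the `customer_key` field (which assumes the records have already been linked to one customer), the abbreviation table, and the "most recently updated record wins" rule are all assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical records for one customer as seen by three departments.
records = [
    {"customer_key": "C-100", "source": "billing", "name": "Jane Doe",
     "address": "12 Main St.", "updated": "2005-03-01"},
    {"customer_key": "C-100", "source": "call_center", "name": "Jane Doe",
     "address": "12 main street", "updated": "2005-06-15"},
    {"customer_key": "C-100", "source": "shipping", "name": "J. Doe",
     "address": "12 Main Street", "updated": "2004-11-20"},
]

# Assumed business rule: expand common street abbreviations.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road"}

def cleanse_address(address):
    """Lowercase, drop periods, expand abbreviations, then title-case."""
    words = address.lower().replace(".", "").split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words).title()

def consolidate(records):
    """Return one 'best' record per customer, with links to every source."""
    by_customer = defaultdict(list)
    for rec in records:
        by_customer[rec["customer_key"]].append(rec)

    golden = {}
    for key, recs in by_customer.items():
        # Assumed survivorship rule: the most recently updated record wins.
        best = max(recs, key=lambda r: r["updated"])
        golden[key] = {
            "name": best["name"],
            "address": cleanse_address(best["address"]),
            "sources": sorted(r["source"] for r in recs),
        }
    return golden

golden = consolidate(records)
print(golden["C-100"])
```

Note that the two address variants that lose out are discarded in the consolidated view. That is the intended behavior for CDI.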

The creation of "one version of the truth" presents unique challenges to organizations. In many organizations, there are multiple views of the customer, such as accounts payable, call center, shipping, etc. Each profile may have the same customer name, but different addresses or other associated information such as unique customer numbers for each department, making it difficult to link one person to multiple processes. The difficulty comes when determining which view is the most correct. For example, if four versions of the same customer name and associated address exist, one version should be chosen from the four files to represent the most correct view in order to create a consolidated profile of that customer. The issue that arises here is that each department may have a different definition of "customer," making reconciliation of customer data an enormous task. For instance, organizations often profile their customers differently in systems across the organization, giving employees an incomplete view of the customer. The resolution of this issue allows the redundant or inaccurate customer records to be purged.

Aside from incomplete records, as the customer information is entered into the system multiple times, more silos are created, amplifying the problem. In addition to CSRs and employees having direct contact with the customer, marketing is another department that may have a different or incomplete view of the customer. This can translate into ineffective marketing campaigns and missed revenue opportunities. Although this last example may seem farfetched, the reality is that poor management of data within an organization affects the bottom line. CDI, when implemented properly, can not only reduce costs, but also increase sales, customer service ratings, and customer loyalty.

Data Complexity

As data becomes more complex, management strategies have been applied differently and used more widely to address not only organizational needs, but also those concerning fraud detection and security. IBM's Entity Analytics Solution (EAS) addresses the needs of such organizations as government agencies and financial and insurance institutions to combat fraud and terrorism by applying data management techniques in a different way than CDI. Essentially, the concept surrounding the EAS platform translates into "the more data collected, the better." Instead of discarding extra information, as CDI does, EAS takes the opposite direction: it aggregates, groups, and resolves identity attributes, making use of new, old, accurate, inaccurate, and seemingly irrelevant attributes. This helps with the development of pattern recognition. For example, if a person collects more than one social security check using two or more separate addresses, EAS will identify the fact that a particular individual collects multiple checks sent to various addresses, and will create an alert in the system.
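
To make the contrast with CDI concrete, here is a small Python sketch of the "keep everything" idea. It is an illustration under stated assumptions, not EAS itself: the `EntityStore` class, its method names, the entity identifiers, and the threshold of two addresses are all hypothetical.

```python
from collections import defaultdict

class EntityStore:
    """Keeps every observed address for an entity instead of one 'best' value."""

    def __init__(self, address_threshold=2):
        # Assumed rule: alert once an entity is paid at this many addresses.
        self.address_threshold = address_threshold
        self.addresses = defaultdict(set)  # entity id -> all addresses seen
        self.alerts = []

    def observe_payment(self, entity_id, address):
        """Record where a check was sent; alert when addresses accumulate."""
        known = self.addresses[entity_id]
        is_new = address not in known
        known.add(address)  # nothing is discarded, old or new
        if is_new and len(known) >= self.address_threshold:
            self.alerts.append((entity_id, sorted(known)))

store = EntityStore()
store.observe_payment("SSN-001", "12 Main Street")
store.observe_payment("SSN-001", "98 Oak Avenue")
store.observe_payment("SSN-002", "5 Elm Road")
print(store.alerts)
```

Only the first entity triggers an alert, because it is the only one whose checks go to more than one address. A CDI-style consolidation would instead have kept a single address and lost the signal.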

The ability to link individuals to multiple data sets and determine their interconnectivity supports the proactive identification of potentially fraudulent or criminal activity. IBM, with its acquisition of Language Analysis Systems (LAS), has started to address these needs through IBM Global Name Recognition. Instead of taking a business intelligence data integration or customer relationship management (CRM) customer data integration approach (whereby data cleansing activities take place to create one version of the truth), Entity Analytics takes the opposite approach, identifying recurring data patterns to address terrorism and fraud through its Terrorism and Fraud Intelligence Framework (T&FI). The software addresses the issues of searching and managing data on individuals across geographic regions, customers within financial institutions, and so on, to meet the demands of managing data sets from diverse cultures and geographic regions. This goes beyond name recognition to analyze how names are interconnected through the identification of recurring data patterns and entity connections. These connections are flagged based on rules created to identify suspicious transactions or behavior.

IBM Entity Analytics Software Offering

IBM Entity Analytics Solutions Global Name Recognition provides four modules (see figure 1 below) to enable organizations to identify people, relationships, and data patterns, and to share that information anonymously to identify potentially fraudulent or suspicious behavior. IBM's EAS consists of:

  • IBM Identity Resolution, which identifies an individual entity and connects the data associated to that individual across data silos;
  • IBM Relationship Resolution, which identifies non-obvious relationships to reveal social, professional, or criminal networks. This module also provides instant alerts once data connections are detected;
  • IBM Anonymous Resolution, which de-identifies sensitive data sets using proprietary preprocessing and one-way hashing to add layers of privacy, and links that data based on codes that enable entity relationship identification without violating individual privacy laws. Data is shared anonymously and remains with the data owner to ensure data security;
  • IBM Name Resolution, which includes name searching, a name variation generator, a name parser, a culture classifier, and gender identification. Global Name Recognition's primary use is to recognize customers, citizens, and criminals across multiple cultural variations of name data. A practical application of the name variation generator is to learn the different spellings of names across various geographical regions.

Figure 1. EAS's Identity & Relationship Recognition Platform, IBM 2005
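
The one-way hashing idea behind anonymous matching can be sketched in a few lines of Python. IBM's preprocessing and hashing are proprietary, so everything below is an assumption for illustration: the `normalize` function stands in for preprocessing, a keyed hash (HMAC-SHA256 from the standard library) stands in for the one-way hash, and `SHARED_KEY` is a hypothetical secret agreed upon by the participating parties.

```python
import hashlib
import hmac

# Hypothetical secret shared by the participating data owners.
SHARED_KEY = b"example-key-agreed-by-both-parties"

def normalize(value):
    """Stand-in for preprocessing: lowercase and collapse whitespace."""
    return " ".join(value.lower().split())

def anonymize(value, key=SHARED_KEY):
    """One-way keyed hash (HMAC-SHA256) of a normalized value."""
    digest = hmac.new(key, normalize(value).encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()

def shared_tokens(list_a, list_b):
    """Compare two data sets by token only; raw values stay with the owner."""
    tokens_a = {anonymize(v) for v in list_a}
    tokens_b = {anonymize(v) for v in list_b}
    return tokens_a & tokens_b

# Made-up identity strings (name and birth date) held by two organizations.
agency_list = ["John  Smith|1970-01-01", "Ann Lee|1982-07-04"]
bank_list = ["john smith|1970-01-01", "Raj Patel|1990-12-12"]

matches = shared_tokens(agency_list, bank_list)
print(len(matches))
```

A keyed hash is used here rather than a bare hash because names and birth dates are easy to guess; without a secret key, anyone could hash candidate values and compare them against the shared tokens.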

Government Use

Governments are obligated to spend taxpayers' dollars prudently in addition to protecting the public trust. This includes ensuring that they provide proper payments, services, and benefits to all social services recipients. Improper payments represent 10 percent or more of the total payout in social benefits. The US government issues over $6.6 billion (USD) in improper payments annually. The identification of relationships and data patterns and their associated entities can identify these data anomalies before fraudulent payments are issued, allowing money to be accurately channeled to the correct recipients.

In the aftermath of Hurricane Katrina, the US federal government distributed $1.2 billion (USD) in aid to individuals who submitted fraudulent claims, either by using the same name at multiple addresses, or by using multiple names at the same address. This example highlights the benefits of entity analytics over CDI in the detection of fraud. Where CDI attempts to reconcile the data into one correct version, EAS spots the multiple records and raises a flag to identify the discrepancies. Solutions such as EAS identify this type of activity beforehand, thus reducing the possibility of fraudulent claims.
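
Both claim patterns mentioned above (one name at several addresses, several names at one address) can be detected by indexing the claims in two directions. The following Python sketch uses made-up claim data and hypothetical field names; it illustrates the idea only and assumes names and addresses have already been normalized.

```python
from collections import defaultdict

# Made-up claims for illustration.
claims = [
    {"claim_id": 1, "name": "Pat Jones", "address": "7 Canal St"},
    {"claim_id": 2, "name": "Pat Jones", "address": "22 River Rd"},
    {"claim_id": 3, "name": "Sam Cole", "address": "40 Levee Ave"},
    {"claim_id": 4, "name": "Lee Marsh", "address": "40 Levee Ave"},
    {"claim_id": 5, "name": "Dana Fox", "address": "3 Bayou Ln"},
]

def flag_claims(claims):
    """Return {claim_id: [reasons]} for claims that match either pattern."""
    addresses_by_name = defaultdict(set)
    names_by_address = defaultdict(set)
    for claim in claims:
        addresses_by_name[claim["name"]].add(claim["address"])
        names_by_address[claim["address"]].add(claim["name"])

    flags = {}
    for claim in claims:
        reasons = []
        if len(addresses_by_name[claim["name"]]) > 1:
            reasons.append("same name, multiple addresses")
        if len(names_by_address[claim["address"]]) > 1:
            reasons.append("multiple names, same address")
        if reasons:
            flags[claim["claim_id"]] = reasons
    return flags

flags = flag_claims(claims)
for claim_id, reasons in sorted(flags.items()):
    print(claim_id, reasons)
```

A flag here means "review before paying," not "fraud": several legitimate claimants can share a household address, so a real rule set would weigh additional attributes before any payment is blocked.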

National Security and Terrorism Prevention

National security and terrorism prevention are major priorities for many countries. Identification of terrorists and individuals associated with known terrorists is crucial to safeguarding national security and to developing a list of potential security threats. For instance, the United States is currently using name recognition technology in the war on terrorism. The US Homeland Security agency used EAS to analyze Iraqi data sources in an effort to leverage data to help identify and gather relationship information during interrogations. Consequently, approximately 2,000 relationships of interest between intelligence agency personnel, service personnel, criminals, detainees, kin, tribal leaders, tribal members, and those interrogated were discovered. The detection of these relationships assisted in identifying and capturing potential terrorists before they committed acts of terror as well as in developing strategies based on potential areas of threat.

Additionally, governments are using EAS at an international level to help prevent terrorists and potential criminals from entering or exiting a country. The way an individual's surname is spelled can differ across geographic regions. Ordinarily, data inconsistencies of this nature may present one individual as multiple individuals based on the recorded inconsistencies within the different systems. With an EAS solution in place, the systems can match these data sets to find consistent elements, and link them to create a complete individual record, thereby turning multiple fictitious people into one entity. Additionally, IBM Anonymous Resolution, coupled with anonymous identification, helps protect individual privacy and adheres to international privacy laws.
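
A greatly simplified Python sketch of this kind of linking follows. It is far cruder than culture-aware name recognition: it assumes that a matching birth date plus a close surname spelling is enough to link two records, strips accents with the standard `unicodedata` module, and measures spelling similarity with `difflib.SequenceMatcher`. The records, field names, and the 0.85 threshold are hypothetical.

```python
import unicodedata
from difflib import SequenceMatcher

def name_key(name):
    """Lowercase a name and strip accents (for example, 'ü' becomes 'u')."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower().strip()

def same_person(rec_a, rec_b, threshold=0.85):
    """Link records when birth dates agree and surnames are spelled alike."""
    if rec_a["birth_date"] != rec_b["birth_date"]:
        return False
    key_a, key_b = name_key(rec_a["surname"]), name_key(rec_b["surname"])
    return SequenceMatcher(None, key_a, key_b).ratio() >= threshold

def resolve(records):
    """Greedily group records that appear to describe the same individual."""
    entities = []
    for rec in records:
        for entity in entities:
            if any(same_person(rec, member) for member in entity):
                entity.append(rec)
                break
        else:
            entities.append([rec])
    return entities

# Made-up records from three hypothetical border systems.
records = [
    {"system": "border_a", "surname": "Müller", "birth_date": "1975-02-10"},
    {"system": "border_b", "surname": "Mueller", "birth_date": "1975-02-10"},
    {"system": "border_c", "surname": "Muller", "birth_date": "1975-02-10"},
    {"system": "border_c", "surname": "Schmidt", "birth_date": "1975-02-10"},
]

entities = resolve(records)
print([len(entity) for entity in entities])
```

The three spellings of the first surname collapse into one entity, while the unrelated surname stays separate even though the birth date matches.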

Financial and Insurance Industry Applications

In both the banking and insurance industries, the need to identify and to track data patterns and entity relationships has become essential to detect potential fraud and money laundering activities. One example of such activities is the submission of forged mortgage applications marked as approved. Bank employees have used this technique to pocket millions of dollars by creating fake customers, changing small amounts of application data on approved forms, and pocketing the money. With the ability to match "like" forms by collecting and storing every piece of information, financial institutions can raise flags based on recurring data patterns, thereby decreasing the potential for fraud.
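
Matching "like" forms can be sketched as a field-by-field comparison between a new application and every stored one. The Python example below is an illustration only: the form fields, the identifiers, and the rule "flag anything that differs in at most two fields" are assumptions, and a production system would compare far more attributes.

```python
def differing_fields(form_a, form_b):
    """Return the names of the fields whose values differ between two forms."""
    all_fields = form_a.keys() | form_b.keys()
    return sorted(f for f in all_fields if form_a.get(f) != form_b.get(f))

def find_like_forms(new_form, stored_forms, max_differences=2):
    """Flag stored forms that match the new one in all but a few fields."""
    hits = []
    for form_id, stored in stored_forms.items():
        diff = differing_fields(new_form, stored)
        if len(diff) <= max_differences:
            hits.append((form_id, diff))
    return hits

# Made-up approved applications kept on file.
stored_forms = {
    "A-1001": {"applicant": "Maria Gomez", "property": "14 Hill Rd",
               "amount": 250000, "account": "111-222"},
    "A-1002": {"applicant": "Tom Reed", "property": "9 Lake Dr",
               "amount": 180000, "account": "333-444"},
}

# A new application that reuses an approved form with small changes.
new_form = {"applicant": "Mario Gomez", "property": "14 Hill Rd",
            "amount": 250000, "account": "999-888"}

hits = find_like_forms(new_form, stored_forms)
print(hits)
```

The new application is flagged against the first stored form because only the applicant name and payout account changed, which is the kind of small alteration described above.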

Balancing Trade-offs Between Security and Privacy

As analytics software becomes more entrenched in general use, questions arise as to whether its benefits in identifying criminals and terrorists outweigh its potential to infringe on personal privacy. Governments must strike a balance between effective management and analysis of information assets to recognize and preempt potential threats while ensuring the preservation and protection of citizens' personal privacy and civil rights. Citizens must also be confident that information under the care of the organizations entrusted to protect them is not re-tasked or re-purposed for missions beyond the scope of the mission for which it was gathered.

Data management represents an effective approach to striking this balance. Responsibly managed information analysis enables national security compliance through effective and accurate watch list reference, intuitive filtering, and know your customer (KYC) controls as designated by such regulatory guidelines as the USA PATRIOT Act and Bank Secrecy Act, and the international Basel Accords. This is done while providing a centralized source for managing personally identifiable information (PII) security, collective notification, opt out, and access controls resident in almost all privacy and regulatory requirements. Because the government is granted access only to data relating to known terrorists, the balance of the citizen data is not shared with the government. EAS software thus strikes this balance in a manner consistent with international and domestic privacy laws.

The ability to identify people, track their movements, and uncover interconnections in their relationships and social associations is imperative to help preempt potential security threats. In the financial and insurance industries, using these tools can reduce fraud and create an environment of proactive fraud detection. Although there remains the issue of personal security and the question as to whether the government has the right to capture so much information about so many people, the benefits of identifying and matching individuals based on their associations have proven advantageous in the detection and prevention of potential fraud and terrorist activity. Furthermore, financial, insurance, and security organizations may derive immediate benefits from such entity analytics software as IBM's Entity Analytics by proactively thwarting fraudulent and criminal activities, and saving time, money, and lives in the process.
