If your data does not reflect reality, the system can never be effective. In today's world of collaboration, showing a trading partner dirty data sends the wrong message and tears down the trust that a collaborative partnership requires.
What is dirty data?
When reality and the data in your system do not agree, you have dirty data. It may be as simple as having 1000 units of an inventory item on the shelf while the system says you have 900 or 1002. It may be that a customer's ship-to address is out of date. It may be that you have the same piece of information in two places (two applications or even two systems) and they do not agree, so one of the two, or maybe both, is dirty. One rule of having the same piece of data in two places is: "identical data isn't".
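The "same data in two places" case can be checked mechanically. The following is a minimal sketch, assuming each system can export its records as a simple key-to-value mapping; the function name, customer numbers, and addresses are hypothetical and used only for illustration.

```python
def find_disagreements(system_a, system_b):
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ.

    A key that is missing from one system is reported with None on that side.
    """
    disagreements = {}
    for key in sorted(set(system_a) | set(system_b)):
        a_value = system_a.get(key)
        b_value = system_b.get(key)
        if a_value != b_value:
            disagreements[key] = (a_value, b_value)
    return disagreements


# Hypothetical ship-to addresses held by two applications.
order_entry = {"C001": "12 Main St", "C002": "8 Oak Ave", "C003": "5 Elm Rd"}
billing = {"C001": "12 Main St", "C002": "80 Oak Ave"}

for customer, (a, b) in find_disagreements(order_entry, billing).items():
    print(customer, "| order entry:", a, "| billing:", b)
```

A comparison like this only shows where the two copies disagree. It cannot say which copy, if either, matches reality.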
How does it get dirty?
There are many reasons why a piece of data may be dirty. The data or the transaction
causing a change in the data may have been inaccurately recorded. In the case
of our inventory example, one of the potentially many transactions that change
the on-hand data could have been inaccurate. For example, a receipt may have
had an incorrect quantity.
Or the physical act may have been flawed. For example, an order called for 100
units to be picked and the transaction was recorded as 100, but the physical
picking resulted in more or less than 100 being picked. Perhaps worse, when
the material was placed in a location within the warehouse, the location was
incorrectly reported, resulting in two pieces of dirty data (one location says
it has some inventory but has none, and a second location says it has no inventory
but does have some).
The transaction may be correct, but the recording of the transaction is delayed
so that for a period of time, the data does not reflect reality. This timing
problem results in data being temporarily dirty. No harm, unless some decision
will be based upon the dirty data. Our systems do not have to be "real time"
but they do have to be "right time" to avoid decision making on dirty data.
A piece of data may be dirty because it was derived from dirty data. When a credit
check uses an incorrect number for the total amount outstanding, it is usually
because one of the outstanding invoices is dirty. The total amount outstanding
was derived from individual invoices.
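The way one dirty invoice flows into a derived total, and from there into a decision, can be sketched as follows. This is an illustration only: the invoice layout, the amounts, the credit limit, and the function names are all assumptions, not a description of any particular system.

```python
from decimal import Decimal


def total_outstanding(invoices):
    """Derive the total amount outstanding from the individual open invoices."""
    return sum((inv["amount"] for inv in invoices if inv["open"]), Decimal("0"))


def credit_check(invoices, order_amount, credit_limit):
    """Approve the order only if outstanding plus the new order fits the limit."""
    return total_outstanding(invoices) + order_amount <= credit_limit


# Reality: INV-2 has been paid. The system still shows it as open (dirty).
reality = [
    {"number": "INV-1", "amount": Decimal("4000"), "open": True},
    {"number": "INV-2", "amount": Decimal("3000"), "open": False},
]
system = [
    {"number": "INV-1", "amount": Decimal("4000"), "open": True},
    {"number": "INV-2", "amount": Decimal("3000"), "open": True},
]

limit = Decimal("10000")
order = Decimal("5000")
print("decision based on reality:", credit_check(reality, order, limit))
print("decision based on system: ", credit_check(system, order, limit))
```

The calculation itself is identical in both cases. Only the one dirty invoice differs, yet it is enough to change the derived total and reverse the credit decision.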
Software can sometimes be blamed for dirty data. If a flaw or bug exists in the software,
it may result in corruption to the data. In this case, the result is typically
that many pieces of data are dirty, all in the same way.
How do we know we have dirty data?
Dirty data is often a sleeping problem, one that can wake up at any time. Dirty data
does not always get detected; it may cause problems that are minor enough to remain
hidden. That does not mean the dirty data is not causing problems; it may be
that the data is well hidden. The dirty data may be used in a way that causes
other data to become dirty without being detected. For example, if a lead time
is incorrect, order calculations will also be incorrect, resulting in either
overstocking or stock-outs. The out-of-stock condition will alarm us, but
the overstocking will usually continue unnoticed.
Or the dirty data may create error conditions, alerting someone to the problem.
All too often, the error conditions are not reported and the data remains dirty.
Checks are used to detect some problems. Auditors send out verification letters
to customers or suppliers. Cycle counts or a physical inventory give us a precise
picture of reality, and the process allows us to compare the system to this reality.
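Comparing a cycle count to the system can be sketched in a few lines. This is a minimal sketch, assuming on-hand quantities are available as item-to-quantity mappings; the item numbers and the function name are hypothetical, and the quantities for the first item reuse the earlier example of 1000 units on the shelf against 900 in the system.

```python
def cycle_count_variances(system_on_hand, counted):
    """Return {item: counted - system} for counted items that disagree with the system.

    Items that were not counted in this cycle are ignored.
    """
    variances = {}
    for item, counted_qty in counted.items():
        system_qty = system_on_hand.get(item, 0)
        if counted_qty != system_qty:
            variances[item] = counted_qty - system_qty
    return variances


system_on_hand = {"A-100": 900, "B-200": 250, "C-300": 40}
counted = {"A-100": 1000, "B-200": 250}  # C-300 was not counted this cycle

print(cycle_count_variances(system_on_hand, counted))
```

A positive variance means the shelf holds more than the system says. A negative variance means the system overstates what is on hand.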
If we are lucky, we get feedback from users and trading partners telling us about
What is the impact of dirty data?
Dirty data may mean a dead-end to business value. Even worse, it can have a negative
impact on business value. Dirty data can cause minor problems or be catastrophic.
A catastrophic problem would be losing a customer or having to take a major
financial write-off due to inventory problems. Less of an impact is carrying
too much inventory (carrying too little can mean a loss of revenue). Even less
of an impact is an invoice going to the wrong department at a customer site
but the customer routing it correctly to fix your mistake, again and again.
The impression you leave with the customer is that you are out of control or
that you do not care.
If you are giving trading partners access to your data, what is the impression
that you are leaving? When you open your collaboration door, the trading partner
sees the inside of your company, good or bad. Internal problems can quickly
become external problems.
Even worse, if you convert dirty data to be used with a new system, what happens?
You will have problems with the new system, but it will be very difficult to
determine the cause of the problems: the dirty data or the system itself.
What can we do about dirty data?
We need to be on the lookout for dirty data. When it is detected, we need to both
fix the data and, more importantly, the problem that caused it to be dirty in
the first place. We need to seek out dirty data and fix it before it results
in business problems. Business Intelligence systems can help find dirty data
by putting it in front of people who can judge it best, in the form of information.
These people can locate logical inconsistencies (a number being too large or
too small, for example).
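A simple automated screen can support that human judgment by flagging values that are too large or too small to be plausible. The following is a minimal sketch; the field name, the item records, and the plausible range of 1 to 180 days are assumptions chosen for illustration, and real limits would have to come from people who know the business.

```python
def out_of_range(records, field, low, high):
    """Return the records whose `field` value falls outside the range [low, high]."""
    return [r for r in records if not (low <= r[field] <= high)]


# Hypothetical item master records with purchasing lead times in days.
items = [
    {"item": "A-100", "lead_time_days": 14},
    {"item": "B-200", "lead_time_days": 0},
    {"item": "C-300", "lead_time_days": 900},
]

suspects = out_of_range(items, "lead_time_days", low=1, high=180)
print([r["item"] for r in suspects])
```

A flagged value is only a candidate for review. Someone who knows the item still has to decide whether it is actually wrong.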
In the most extreme examples, we must undertake a cleansing process. We must proactively
seek out the correct information and take steps to correct the data. This may mean
a physical inventory, or a campaign to get all customers to validate their name
and address. This may mean a program to compare the information in two systems,
find where they disagree, and settle the data (and political problems) that
Dirty data causes problems large and small, catastrophic and insignificant. With today's IT budgets, big cash outlays for the acquisition of new hardware or software are limited. Maybe now is the time to use your resources to search out dirty data and fix the problem before your trading partners or auditors find it.
Olin Thompson, a principal of Process ERP Partners, has over 25 years of experience
as an executive in the software industry, with the last 17 in the process-industry
ERP, SCM, and e-business segments. Olin has been called "the
Father of Process ERP." He is a frequent author and an award-winning speaker
on topics of gaining value from ERP, SCM, e-commerce and the impact of technology
He can be reached at Olin@ProcessERP.com.