The Why of Data Collection

  • Written By: Olin Thompson
  • Published: November 3 2005


Data collection systems work. However, they mean an investment in technology. Before we can justify that investment, we need to understand why we may want to use a data collection system in place of people with clipboards.

What is data collection? In a general sense, it is the manual or automated acquisition of data. In practice, the term has come to mean various automated methods of data acquisition. Examples of data collection include time and attendance devices, bar code scanning, radio frequency identification (RFID), and sensor-based devices such as automated scales, flow meters, and case counters.

When deciding to invest in data collection, the first question to ask is "Why?" Our objective is never just "data collection" but rather a business objective such as lower inventory, higher quality, better customer service, or more accurate costing. We can often meet these objectives without a data collection system. For example, we can have people write data on clipboards and key it in later. Our data collection question is "Will the results of using automated data collection be better than they were when we used people with clipboards?"

A frequent motivator for data collection systems is visibility. Simply put, we want to know more, and know it sooner. One definition of visibility is the right information, accurate and available at the right time.


Errors happen. For example, the Grocery Manufacturers Association (GMA) reported that 36% of all orders have errors. At minimum, this means that one line item had the wrong quantity or even the wrong product shipped. These errors happen because people are people. People will continue to misread a number, miscount, record a different number than they intended, or have handwriting that others cannot understand. Other people will take those numbers and misread them, or key them into a later system incorrectly. Software edit checks can catch some of these errors, such as invalid item or location numbers, but not all. For example, a wrong but plausible quantity may be missed. Errors are not intentional; they are a product of human nature.
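The limits of software checks described above can be shown with a minimal sketch. The item codes, location codes, and the edit_check function are hypothetical, invented for illustration, and do not represent any particular system.

```python
# Hypothetical master data; these item and location codes are invented.
VALID_ITEMS = {"A100", "A200", "B300"}
VALID_LOCATIONS = {"R1-01", "R1-02"}


def edit_check(record):
    """Return the problems a simple edit check can detect in a keyed record."""
    problems = []
    if record["item"] not in VALID_ITEMS:
        problems.append("unknown item")
    if record["location"] not in VALID_LOCATIONS:
        problems.append("unknown location")
    if record["quantity"] <= 0:
        problems.append("quantity must be positive")
    return problems


# Suppose the clipboard said: 48 cases of item A100 in location R1-01.
mistyped_item = {"item": "A1O0", "location": "R1-01", "quantity": 48}   # letter O keyed for zero
transposed_qty = {"item": "A100", "location": "R1-01", "quantity": 84}  # 48 keyed as 84

print(edit_check(mistyped_item))   # the invalid item number is reported
print(edit_check(transposed_qty))  # nothing is reported; 84 is a plausible quantity
```

The transposed quantity passes every check because nothing in the record itself says 84 is wrong. Only capturing the count at the source, for example with a scanner or case counter, removes that class of error.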

Errors cost money, and the costs are real. If an error is detected soon enough, the fix may be as simple as sending someone out to determine the correct data and then correcting the record. Still, this process costs time and, therefore, money.

If the error goes undetected, the impact and thus the cost can be much greater. Some inventory errors mean that an item's balance or location is wrong. Some inventory errors are really two errors: one item's balance is overstated and another's is understated. In both cases, we make decisions based upon incorrect data. We may tell a customer that a product is out of stock when we have inventory or, worse, promise a customer a product that we do not have. We may also schedule production or place a purchase order when we already have inventory on the shelf.

What is the value of improved accuracy? It is very difficult to assign a number to it. Accountants like to talk about adjustments to inventory values, and that is important. However, the real impact of accuracy is found in the improved decisions that can be made and in the financial results of those decisions.


The phrase "time is money" is well known. A good question, however, is "When do we want to know?" "Immediately" is often the answer. To gain "real time" (an overused term which frequently means "soon enough"), data must be collected and entered into the system as the event happens. One example is forwarding case counts to the production control system as each case goes past the scanner. Another is recording the selection of inventory for a shipment through a bar code reader on the forklift, so that the scan can immediately update both the inventory and the order status. As an aside, when the need is termed "real time", we should ask what that really means. It often means "as soon as the person needs it", which could be once a shift or even once a month.

When looking at the time factor, check whether the processing systems can take advantage of it. Getting accurate data in a timely fashion is good. However, the value of this data depends on the system that will process it. For example, if we run a system once a day, real time data does not add any value to the output of that system. Perhaps the answer is to run the system more frequently than once a day. The question should be "If the system were run more frequently than once a day, would we make better decisions?"

For example, a pet food manufacturer wanted real time inventory from its plants. The objective was to do a better job of making shipping commitments to its customers. They later discovered that the sales department was still making commitments based upon a once a day reporting system that was run overnight. The added expense of a real time inventory and associated data collection did not create value for the company.
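The point that timely data is only as valuable as the system that processes it can be sketched with a deliberately simple model. It assumes that the information behind a decision is refreshed only as often as the slower of two steps: how often data is collected and how often the processing system runs. The function name and the hour figures are hypothetical.

```python
def effective_refresh_hours(collection_interval_hours, processing_interval_hours):
    """Hours between refreshes of the information a decision maker sees.

    Simplifying assumption: the slower of the two steps sets the pace.
    """
    return max(collection_interval_hours, processing_interval_hours)


# Hypothetical scenarios, in hours.
clipboard_nightly = effective_refresh_hours(8, 24)  # keyed once a shift, nightly run
realtime_nightly = effective_refresh_hours(0, 24)   # scanners, but still a nightly run
realtime_hourly = effective_refresh_hours(0, 1)     # scanners and an hourly run

print(clipboard_nightly, realtime_nightly, realtime_hourly)  # 24 24 1
```

Under this assumption, moving from clipboards to real time collection changes nothing while the nightly run remains, which is the situation the pet food manufacturer found itself in. The benefit appears only when the downstream system also runs more often.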


Collecting data costs money. A production or warehouse person who must manually record data is less productive in his or her primary role of making product or working with inventory. The step of later keying in this data (and then correcting errors) also costs time and money. These variable costs are incurred every minute of every day.

A data collection system avoids some of these labor costs, but that system still costs money. Production and warehouse people become more productive due to the greater efficiency in collecting the data, either by arming them with a device such as a bar code reader, or by making the data collection fully automated by using a case counter, flow meter, or other such instrument.

The costs associated with a data collection system are very different. A one-time investment in equipment, software, and training is required. The justification is frequently not a calculation of how many hours are saved, but the business impact of more data, higher quality data, and more timely data.
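The contrast between recurring manual costs and a one-time investment can be sketched as a simple payback calculation. All figures and the payback_months function are hypothetical, and the sketch covers only the labor portion of the justification, not the business impact of better data.

```python
def payback_months(one_time_investment, monthly_manual_cost, monthly_system_cost):
    """Months until recurring savings repay the one-time investment.

    Returns None when the system does not reduce the recurring cost,
    in which case labor savings alone never repay the investment.
    """
    monthly_saving = monthly_manual_cost - monthly_system_cost
    if monthly_saving <= 0:
        return None
    return one_time_investment / monthly_saving


# Hypothetical figures: equipment, software, and training cost 60,000;
# manual recording and keying cost 4,000 a month; running the system costs 1,000 a month.
print(payback_months(60_000, 4_000, 1_000))  # 20.0
```

A long payback on labor alone does not settle the question, because the larger part of the justification usually lies in the decisions that more, better, and more timely data makes possible.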


Data collection systems need to be justified, like any business investment. The justification comes from two sources: improved information and cost reduction or avoidance.

A data collection system can improve the value of existing or planned systems. The value of data collection is dependent upon these other systems. Think about the impact of these other systems receiving more accurate or timelier data. Will better data mean better decisions and therefore better business results?

Data collection systems can also cut the cost of gathering data. They can improve the productivity of production workers, for example, by giving them more time to add value to products rather than recording data. Can the proposed system improve productivity, cut cost, or avoid the need to add cost?

About the Author

Olin Thompson, a principal of Process ERP Partners, has over twenty-five years experience as an executive in the software industry with the last seventeen in process industry related enterprise resource planning (ERP), supply chain partnership (SCP), and e-business related segments. Thompson has been called "the Father of Process ERP." He is a frequent author and an award-winning speaker on topics of gaining value from ERP, SCP, and e-commerce, and the impact of technology on industry. He can be reached at
