Six Sigma (6σ) Quality is a popular approach to process improvement, particularly among
technology-driven companies such as Allied Signal, General Electric, Kodak
and Texas Instruments. Its objective is to reduce output variability through
process improvement, and/or to increase customer specification limits
through design for producibility (DfP), so that these specification limits
lie at more than "six" standard deviations, or σ's,
from the process mean (I'll explain the quotation marks later). In this
way, defect levels should be below 3.4 "defects per million opportunities"
for a defect, or "dpmo" for short.
Although originally introduced by Motorola in 1986 as a quality performance measurement,
6σ has evolved into a statistically oriented
approach to process improvement. It is deployed throughout an organization
using an army of champions and experts called "black belts," a title borrowed
from their martial arts counterparts. They command a rank-and-file made
up of teams focusing on the improvement of the organization's processes.
Just search the Internet for "six sigma" and you'll come up with several
informative descriptions of its history and current practice. The Six
Sigma Academy, a Motorola spin-off, provides consulting service to many
of the leading practitioners of this approach. What I want to focus on
here though is the 6σ metric itself, not the
concept or the approach.
I don't like the 6σ metric. As you'll see, it
fails to pass many of the tests that I've previously established for "good"
metrics and described in Part 1 of Metrics for the Order Fulfillment Process
[1]. In particular, it's neither simple to understand nor, in most applications,
an effective proxy for customer satisfaction. It does not have an optimum
value of zero. And, its definition is ambiguous and therefore easily gamed
because there is no accepted test for what to include as an "opportunity"
for a defect.
Publications Mentioned:
[1] Art Schneiderman, "Metrics for the Order Fulfillment Process," Journal
of Cost Management. Part 1: Vol. 10, No. 2, Summer 1996, p. 30. Part 2:
Vol. 10, No. 3, Fall 1996, p. 6.
What is an "opportunity"?
I've trained improvement teams, team leaders, and black belts for one of the aforementioned
companies in their 6σ metrics module. Once
they get through the distinction between defects vs. defectives and attribute
vs. variable data, the greatest difficulty that the trainees encounter
is in determining what constitutes an opportunity for a defect. Obviously,
by increasing the number of opportunities (the denominator of dpmo), you
can improve the metric, particularly if you include opportunities that
are not important to customers and consequently are not routinely checked
for conformance, thereby allowing their defects to go uncounted.
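To make the denominator game concrete, here is a minimal Python sketch. The function name dpmo and all of the figures (50 defects, 10,000 units, 20 versus 100 opportunities per unit) are hypothetical, chosen only to illustrate the arithmetic.

```python
def dpmo(defects: int, units: int, opportunities_per_unit: int) -> float:
    """Defects per million opportunities."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("need at least one opportunity")
    return 1_000_000 * defects / total_opportunities


# Hypothetical process: 50 defects found in 10,000 units,
# counting 20 opportunities per unit that customers care about.
honest = dpmo(defects=50, units=10_000, opportunities_per_unit=20)

# Same process, same 50 defects, but with 80 extra "opportunities" per unit
# that are never checked and so never contribute a counted defect.
padded = dpmo(defects=50, units=10_000, opportunities_per_unit=100)

print(f"dpmo with 20 opportunities per unit:  {honest:.0f}")
print(f"dpmo with 100 opportunities per unit: {padded:.0f}")
```

Nothing about the process or its output has changed between the two calls; only the count of opportunities has.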
This weakness can be overcome (but seldom is in practice) by applying an objective
weighting for defect severity in counting both opportunities and actual
defects. For example, critical defects, ones that make the output unusable
by the customer, get a weighting of one while inconsequential defects
get a weighting of zero. Cosmetic defects or ones that can be corrected
or compensated for have values in between, depending on the relative cost
of correction or their likely impact on the customer's repurchase decision.
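One way such a weighting might be applied is sketched below in Python. The class and function names, the three opportunity types, their weights (1.0, 0.25, 0.0) and their counts are all assumptions made up for illustration; in practice the weights would have to come from the customer.

```python
from dataclasses import dataclass


@dataclass
class OpportunityType:
    name: str
    weight: float        # 1.0 = critical, 0.0 = inconsequential (hypothetical scale)
    opportunities: int   # opportunities of this type in the sample
    defects: int         # defects of this type actually counted


def weighted_dpmo(types: list[OpportunityType]) -> float:
    """dpmo with both defects and opportunities weighted by severity."""
    weighted_opps = sum(t.weight * t.opportunities for t in types)
    weighted_defects = sum(t.weight * t.defects for t in types)
    if weighted_opps == 0:
        raise ValueError("no opportunities with non-zero weight")
    return 1_000_000 * weighted_defects / weighted_opps


sample = [
    OpportunityType("critical", 1.0, 100_000, 30),
    OpportunityType("cosmetic", 0.25, 100_000, 40),
    OpportunityType("inconsequential", 0.0, 800_000, 0),
]

unweighted = 1_000_000 * sum(t.defects for t in sample) / sum(
    t.opportunities for t in sample
)
weighted = weighted_dpmo(sample)

print(f"unweighted dpmo: {unweighted:.0f}")
print(f"weighted dpmo:   {weighted:.0f}")
```

Because zero-weight opportunities drop out of the denominator entirely, padding the count with things the customer does not care about no longer improves the number.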
A similar approach is taken in Failure Mode and Effect Analysis (FMEA) where
improvement priorities are set based on a combination of frequency of
occurrence, severity and detectability of candidate failure modes. I understand
that the TI flavor of 6σ does include this
type of logic. Where should the weightings come from? The customer of
the process, of course (but, more about this in a future installment in
this series, if there's sufficient interest). Current practice usually
leaves the choice of what constitutes an opportunity for a defect as a
subjective rather than an objective decision. This has proven to be a poor standard
for good metrics.
Is it really "six" σ?
Let's return to the metric itself. Once we've identified all of the appropriate
opportunities for defects and counted the actual number of them that fail
to meet specification, we're ready to calculate the metric. It's trivial
to determine the dpmo value, but what is the corresponding sigma value?
First, you'll have to find a table of values for the "one-sided tail of
a normal distribution." That should be easy, right?
Well, they're not that easy to find. Most textbooks or statistics tables end
at values of three or four sigma. Why? My guess is that up until recently
there was little need for knowing values above these levels. Practical
applications simply did not exist in our world. There's probably a profound
message for us there, if we look carefully. I've found such a table though
in the 1992 Motorola publication Six Sigma Producibility Analysis and
Process Characterization [3] by Mikel J. Harry and J. Ronald Lawson. Other
more recent 6σ sources always seem to reference
this one. Its Appendix C gives a value of 1.248 × 10⁻⁹ for 6σ.
But wait, what happened to the 3.4 × 10⁻⁶? Forgive my cynicism, but
here comes what looks to me like a little "sleight-of-hand." We are told
that there is a typical 1.5σ long-term drift
in most process means. To adjust for it, we need to subtract out this
1.5σ, so that we actually use the table entry
at 4.5σ to get to the adjusted short-term value:
that's 3.451 × 10⁻⁶. In other words, if we measure 3451 defects
in a billion opportunities, only one of them was caused by short-term
process variability. The other 3450 were caused by this mysterious long-term
drift in the mean, so we're not going to count them. We'll report that
our process is operating at 6σ. Got it? To
be honest though, in small print we will admit to the 1.5σ adjustment,
whether it's justifiable or not. To make it easier for us, tables are
provided that incorporate this adjustment, with the obligatory footnote.
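For readers without such a table at hand, the tail areas can be computed directly. The following is a minimal Python sketch using only the standard library; the function names upper_tail and reported_sigma are my own, and the 1.5 default simply encodes the adjustment described above. Values computed from the exact normal distribution (roughly 1 × 10⁻⁹ beyond 6σ and 3.4 × 10⁻⁶ beyond 4.5σ) differ slightly from the table entries quoted above, but the comparison between the two tails is the same.

```python
import math
from statistics import NormalDist


def upper_tail(z: float) -> float:
    """One-sided tail area of the standard normal distribution beyond z."""
    # erfc keeps precision far out in the tail, where 1 - cdf(z) loses digits.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def reported_sigma(dpmo: float, shift: float = 1.5) -> float:
    """Sigma level as conventionally reported: the z value matching the
    observed defect rate, plus the assumed long-term shift.
    Pass shift=0.0 to see the level without the adjustment."""
    defect_rate = dpmo / 1_000_000
    return NormalDist().inv_cdf(1.0 - defect_rate) + shift


tail_6 = upper_tail(6.0)    # what "six sigma" would mean with no adjustment
tail_45 = upper_tail(4.5)   # the entry actually used after the 1.5 sigma shift

print(f"tail beyond 6.0 sigma: {tail_6:.3e}")
print(f"tail beyond 4.5 sigma: {tail_45:.3e}")
print(f"ratio of the two:      {tail_45 / tail_6:.0f}")
print(f"3.4 dpmo is reported as {reported_sigma(3.4):.2f} sigma, "
      f"or {reported_sigma(3.4, shift=0.0):.2f} sigma without the shift")
```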
Well, I am aware of situations where there is a drift in the mean, caused
for example by tool wear or component aging, but I also know of processes
in which this phenomenon simply does not occur. And, why forgive this
long-term drift anyway, even when it does exist? Laser machining eliminates
tool wear; compensation circuits can adjust for component aging; and there's
a whole science of adaptive feedback systems that can sense and compensate
for various forms of both deterministic (like tool wear) as well as random
"non-stationarity," as the statisticians like to call this drift.
In a previous work-life, I spent many an evening atop beautiful Mt. Haleakala
in Hawaii peering through a large telescope at satellites streaking across
the sky. It was guided by a computerized tracking system that effectively
compensated for significant random image wander created by the intervening
atmospheric turbulence. So I know first hand that it can be done.
Furthermore, there is a conceptual problem created by the assumption that there is
a constant relationship between long-term drift in the mean and short-term
process variation. It implies that they both have a common root cause.
I can think of no theoretical reason why that should be true in any given
case, let alone be true in general. If instead it's based on empirical
observation, then I'd like to see the supporting data so I can draw my
own conclusion as to its general validity. It seems to me that this largely
undocumented long-term drift in the mean is as worthy a target for process
improvement as is reduction in short-term variation. And I don't buy the
argument that it's too complicated in general to analyze, so we'll just
use a universal approximation. Too much very valuable information is buried
by that concession, not to mention the undesirable behavior that it all
too often encourages.
My cynical symbiont would have loved to have been a fly on the wall when
this convenient "discovery" was made. Why convenient? Well, think about
it. If each unit produced has 100 opportunities for independent defects,
then without this 1.5σ adjustment 6σ
quality would mean that you would have only one defective unit in 10 million
output units produced! Banks would never make an error in processing loan
applications, semiconductor manufacturers would produce many products
that never have even a single defect throughout the product's entire lifecycle,
and call centers would correctly transfer each and every call the first
time and maintain this perfect performance over many decades. For nearly
all processes, that would be indistinguishable from the already un-sellable
concept of zero defects as a reasonable achievable goal.
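The arithmetic behind that claim can be checked with a short Python sketch. It assumes, as the example does, 100 independent opportunities per unit, and it uses exact normal tail areas rather than the quoted table values; the function names are hypothetical.

```python
import math


def upper_tail(z: float) -> float:
    """One-sided tail area of the standard normal distribution beyond z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def defective_unit_rate(p_defect: float, opportunities: int) -> float:
    """Probability that a unit has at least one defect, assuming each of its
    opportunities fails independently with probability p_defect."""
    # 1 - (1 - p)^n, written with log1p/expm1 so tiny p values stay accurate.
    return -math.expm1(opportunities * math.log1p(-p_defect))


without_shift = defective_unit_rate(upper_tail(6.0), opportunities=100)
with_shift = defective_unit_rate(upper_tail(4.5), opportunities=100)

print(f"no adjustment:   one defective unit in about {1 / without_shift:,.0f}")
print(f"1.5 sigma shift: one defective unit in about {1 / with_shift:,.0f}")
```

Without the adjustment the result is on the order of one defective unit in 10 million; with it, the figure drops to roughly one in a few thousand, which is a far more sellable target.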
Publications Mentioned:
[3] Mikel J. Harry and J. Ronald Lawson, "Six Sigma Producibility Analysis and Process
Characterization," Motorola, 1992.
Is 6σ a good goal for ALL processes?
So I for one don't buy this 1.5σ "free bonus"
even in cases where it may exist. But there are other critical problems
with the 6σ goal. I've argued repeatedly that
each metric has a limiting value determined by the process's enabling
technology and organizational structure. Absent process re-design, nothing
can be done to push the sigma level past this limiting value, or entitlement,
on a permanent basis. Individual heroics can create short-term gains beyond
this limit (as evidenced by the well-known Hawthorne Effect), but they
are not sustainable in the long-term.
The goal of 6σ for all processes requires an organizational
commitment to continuously re-design every one of them before their limit
is approached. Not only must the financial commitment be there, but also
the required new enabling technology and organizational flexibility. In
many situations, these commitments are unrealistic, unreasonable and/or
unsound. My personal bias is to focus on metrics that address the gap
between current and potential performance and on the rate at which
that gap is closing [2].
Consider also an old saying that we have in the System Dynamics world: "things
get worse before they get better." Its origins lie in the observation
that major changes usually create short-term disruptions that adversely
affect current performance. Process redesign almost always displays this
dynamic. If you are being rewarded on your 6σ
performance, past experience will discourage you from self-initiating
a process redesign since there is a good chance that it will initially
blow your 6σ performance. Short-term special
dispensation from the 6σ goal may be a prerequisite
for justifiable process redesign.
Furthermore, increasing technical and organizational complexity slows the rate of process
improvement. Combine this with the observation that complex processes
tend to have long cycle times compared to the time it takes for unpredictable
changes to occur in their environment and you're quickly led to the conclusion
that many important processes can never achieve 6σ
performance unless they are dysfunctionally over-simplified. This is how
chaos theory enters the picture. My view is that only routine, mature,
and very high unit volume processes should even be considered as potential
candidates to have 6σ as a goal.
Set a goal of 6σ to drive desired changes in the
wrong processes and you will only stifle innovation and encourage conservatism
and sub-optimization. As IT professionals seek to enable business initiatives
such as e-commerce, and develop new business models, they find innovation
and uncertainty to be inexorable partners. I've seen new product development
efforts seriously undermined as a result of this type of phenomenon. Instead,
if you must, set a process goal of xσ, where
x is dependent on process complexity and maturity. I would speculate that
x=3 might be closer to the right number for many important processes.
Another related perspective on this issue is in terms of process learning. As
a process approaches its limiting performance, learning declines in absolute
terms. An organization which has achieved 6σ
in all of its processes is an organization that has, in this sense, stopped
learning. In all cases that I can think of, when you stop learning, you
stop competing and we all know where that feedback loop leads.
Publications Mentioned:
[2] Art Schneiderman, "Setting Quality Goals," Quality Progress. April 1988,
p. 51.
Business Implication: What is the real effect on the bottom line?
Six Sigma Quality is often touted on the basis of its significant bottom
line impact. Some claim more than $1M per year per Black Belt in typical
cost savings. For example, according to a Motorola Six Sigma Presentation,
in 1996 it achieved 5.6σ performance (up from
4.2σ in 1986), $16B in cumulative manufacturing
cost savings and a reduction in Cost of Poor Quality from 15% in 1986
to a little over 5% of sales in 1996. I'm not sure where that number comes
from nor where the billions of dollars in resulting claimed savings went,
but I'd really like to see an independent audit so that I could understand
the basic assumptions used.
I would hope that the calculated savings net out the component of traditional
cost reduction, as captured, for example, by the historical cost experience
curve, so that the resulting number is truly reflective of the incremental
savings that are directly assignable to the 6σ
initiatives. It is always very tempting to attribute all benefits to the
current program, regardless of their true origins.
All too often, these "cost savings" estimates fail to recognize that many
apparently variable costs are in fact fixed or semi-fixed. They don't
really go away, but instead move elsewhere in the organization, at least
for the short term. Another common practice is the inclusion of profit
from new revenue which will be generated by the resources (people, equipment
and facilities, for example) freed-up by the process improvement. Unfortunately,
these estimates seldom consider total market potential or competitive
dynamics. Furthermore, there is rarely a closing of the loop to assure
that the predicted savings were actually achieved. I've heard more than
one improvement team query their sponsor with: "What level of savings
are you looking for?" Not surprisingly the chosen assumptions yield that
desired answer.
I would not be surprised at all to find that Darwinian rules develop over
time for the calculation of sigma levels in many organizations in order
to assure survival of only the fittest opportunities for inclusion. I've
been told of more than one case where a persistent defect has been dropped
from the calculation with the justification that "we can't be measured
on what we don't control." Try selling that argument to the customer.
What is also perplexing is that over the last five years Motorola's stock has
not outperformed the aggregate Electronic Equipment Industry of which
it is a member. One senior quality executive at Motorola told me that
the bulk of the 6σ savings had to be passed
on to customers in the form of price reductions, so they do not appear
on the bottom line. These two observations suggest that Motorola's competitors
have realized similar performance improvements, with or without the benefit
of the six sigma approach.
Also keep in mind that cost reduction by itself does not create significant
societal wealth. Its principal effect is to move wealth from one place
to another. The improvement in labor productivity only benefits society
if there are value-creating alternatives available for the surplused capital
and labor. Reduce equipment and raw materials usage and you reduce the
wealth of the equipment and raw materials suppliers. Societal wealth is
mostly created on the revenue side of the equation; by the creation of
new outputs that are of value to people. But 6σ
is of little use there. Just try applying it to processes having a significant
amount of creative content like product development or new process design.
So my bottom line is that the claimed financial benefits of improvement in
the 6σ metric are also unsubstantiated. This
undermines the assertion of its proponents that the results prove that
the metric really works. The true benefits of 6σ
are shared in common with the other flavors of TQM.
The hidden danger of the 6σ metric.
Why is all of this important? You could argue that I'm nitpicking and
that the real value of 6σ is in the concept
and approach, not the actual metric. But, non-financial performance measures
are increasingly becoming an important consideration in individuals' compensation
and promotion. Past performance along these dimensions even enters resource
allocation decisions. This arises from the over-riding objective of metrics:
to drive positive changes in individual and group behavior. But, if the
non-financial measures are inherently unsound, so too will be the decisions
to which they contribute. In my view, the 6σ
metric falls into this category of noise generating metrics.
The 6σ metric does have some redeeming characteristics
though:
- It is defect oriented.
- With the exception of identification of opportunities for a defect,
  it is reasonably well documented.
However, it has an overwhelming number of weaknesses as a metric. Let me summarize
them:
- Unless the opportunities are weighted by importance to the customer, it can
  be a poor surrogate for customer satisfaction because the metric can
  get better while customer satisfaction gets worse. How? By improvement
  of one type of defect at the numerical expense of a more important one
  (e.g., eliminate 10 unimportant defects while creating only 5 more important
  ones: net result, an apparent improvement of 5, with an obvious reduction
  in customer satisfaction). Note though that this refinement adversely
  affects the metric's simplicity requirement.
- Anyone who has taught the 6σ metric can testify
  to its complexity, even when the students are soon-to-be Black Belts.
  This complexity also violates the KISS principle of good metrics.
- The 1.5σ adjustment is unsupported and clearly
  is case dependent at best, thus making the metric inherently biased
  (it systematically overstates actual performance).
- Because of its ambiguity, it is easily gamed unless complemented by
  other, more valuable metrics. As a test, give two groups of knowledgeable
  people the independent job of identifying the opportunities for defects.
  It is likely that their lists will look very different. Although it
  is often touted as a universal metric that allows cross-process comparisons,
  this weakness significantly undermines that potential.
- Although it looks like variable data, it is based on attribute data
  (number of defects), which masks the degree to which the individual
  specifications fail to meet customer requirements. This breaks the link
  of the metric to its underlying root causes, unless the associated variable
  data is also measured and reviewed.
- It is based on the gap between current performance and zero defects
  rather than the process's limiting value. In doing this it fails to
  accommodate strategic decisions about process redesign priorities.
- As a goal, it fails to differentiate between processes of different
  complexity and maturity. It fails to recognize the role of chaos or
  exogenous unpredictability in some very important processes, for example
  forecasting, product development, resource allocation, and strategic
  planning.
Bottom line, as far as the 6σ metric is concerned:
forget it. Calculate defects and defect rates, along with their underlying
variable data, but don't bother trying to convert them to an arbitrary
sigma value.
Conclusion
In closing, don't get me wrong, I'm not saying that numerical goals, variation
reduction or DfX (aka Design for 6σ),
where X stands for the "abilities": producibility, testability, maintainability,
serviceability, recyclability, etc., are unimportant. I have always been
a big fan of Armand Feigenbaum, who described most of the 6σ
statistical concepts in the 1950s. His classic book, Total Quality
Control [4], became the bible and inspiration for the Japanese quality
movement and the source for the name TQC. What I am saying is that 6σ
is a poor metric. So my advice: use Six Sigma as the name for your version
of TQM, but don't track its numerical value or put it on your balanced
scorecard.
Publications Mentioned:
[4] Armand Feigenbaum, Total Quality Control (3rd edition; June
1985) American Society for Quality; ISBN: 0318132583
The Author:
Arthur M. Schneiderman, independent consultant, www.schneiderman.com.
From 1986 to 1993, Art was Vice President of Quality and Productivity
Improvement at Analog Devices, Inc., the leading manufacturer of precision
linear integrated circuits. Art was a Senior Examiner for the Malcolm
Baldrige National Quality Award and served on the Conference Board's US
Quality Council II. He is a tutor in the University of Limerick, Ireland's
Master of Quality Programme. He was on the 13-member design team for the
Center for Quality Management, a Boston based consortium of more than
100 companies and academic affiliates.
Before joining Analog Devices, Art spent six years as a consultant with Bain
& Company, an international consulting firm specializing in strategic
planning and implementation. Before that, he was Principal Research Scientist
at AVCO Everett Research Laboratory where over a fifteen-year period he
directed several aerospace R&D programs. Art also spent a year on the
research staff of the System Dynamics Group at the MIT Sloan School of
Management where he did research in economic dynamics.
Art is a graduate of MIT with a BS and MS in Mechanical Engineering and an
MS in Management from MIT's Sloan School of Management.