Questions tagged [quality-control]

Quality Control relates to statistical methods used for monitoring, maintaining, and improving either the statistical control of a process or its capability. Also use for questions about Six Sigma.

Quality Control is a subset of the statistical tools introduced by Sir Ronald A. Fisher, Joseph M. Juran, Philip B. Crosby, Walter Shewhart, W. Edwards Deming, and Genichi Taguchi. In general, the tools most often associated with quality control fall into one of two areas:

  1. Statistical Quality Control
  2. Process Capability

Statistical Quality Control is usually monitored via Statistical Process Control (SPC) methods such as Shewhart charts, also known as control charts. Control charts are modified time-series plots, typically showing a center line at some average value with control limits plotted three standard deviations above and below it. They do not include the tolerance band of the process. Such charts include, but are not limited to:

  • $\overline{X}-R$: (Average and range)
  • $\overline{X}-s$: (Average and standard deviation)
  • $IX-MR$: (Individual and Moving Range, sometimes $X-MR$)
  • $c$: (count of nonconformities, constant subgroup size)
  • $u$: (nonconformities per unit, varying subgroup size)
  • $p$: (proportion nonconforming, varying subgroup size)
  • $np$: (count of nonconforming units, constant subgroup size)
  • EWMA: (exponentially weighted moving average)
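As a concrete illustration of the control limits described above, here is a minimal Python sketch of an $\overline{X}$-$R$ chart computation. The data are simulated and the process values are hypothetical; the constants $A_2$, $D_3$, and $D_4$ are the published Shewhart-table values for subgroup size $n = 5$:

```python
import numpy as np

# Hypothetical example data: 20 subgroups of 5 measurements each.
rng = np.random.default_rng(0)
subgroups = rng.normal(loc=10.0, scale=0.2, size=(20, 5))

xbar = subgroups.mean(axis=1)        # subgroup averages
r = np.ptp(subgroups, axis=1)        # subgroup ranges (max - min)

xbar_bar = xbar.mean()               # grand average (center line)
r_bar = r.mean()                     # average range

# Published Shewhart control-chart constants for subgroup size n = 5.
A2, D3, D4 = 0.577, 0.0, 2.114

ucl_x = xbar_bar + A2 * r_bar        # upper control limit, X-bar chart
lcl_x = xbar_bar - A2 * r_bar        # lower control limit, X-bar chart
ucl_r = D4 * r_bar                   # upper control limit, R chart
lcl_r = D3 * r_bar                   # lower control limit, R chart

out_of_control = (xbar > ucl_x) | (xbar < lcl_x)
print(f"CL={xbar_bar:.3f}  UCL={ucl_x:.3f}  LCL={lcl_x:.3f}  "
      f"out-of-control points: {out_of_control.sum()}")
```

Note that the limits come from the within-subgroup variation (via $\overline{R}$), not from the specification limits; that is why a control chart never shows the tolerance band.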

Process Capability is often measured as some relationship between the process distribution and the engineering specification for the process. Charts for such analysis are usually based upon histograms with the target value, upper specification level, and lower specification level (T, USL, and LSL) superimposed. These indexes include, but are not limited to:

  • $C_p$: The best-case ratio of the specification width to the process spread.
  • $C_r$: The inverse of $C_p$.
  • $C_{a}$: A representation of the accuracy of the process.
  • $C_{pa}$: A variation of $C_{pk}$ for asymmetrical processes.
  • $C_{pk}$: $C_{p}$ modified by the factor $k$.
  • $C_{p-}$: The difference between $C_p$ and $C_{pk}$.
  • $C_M$: "Capability of the Machine;" it uses a wider range of possible process outcomes than $C_p$.
  • $C_{pm}$: Similar to $C_{pk}$, it relates the process in comparison to the spec. limits and a target value. It is most useful with asymmetrical tolerances.
  • $C_{pp}$: "Incapability Index" based upon $C_{pm}$ and similar to $C_r$.
  • $C_{pmk}$: $C_{pm}$ modified by the factor $k$.
  • $C_{pT}$: Replaces $\hat{\mu}$ in $C_{pk}$ with the target value $T$.
  • $Z_{bench}$: Represents capability as a $Z$-score and works with continuous or discrete data.
  • $Q_k$: When no tolerance range exists, the Mean Standard Error of the process can be used to evaluate this index.
  • $C_{p\omega}$: A weighted index which can be used to calculate approximations for $C_p$, $C_{pm}$, and $C_{pk}$.
  • $C_p\left ( u,v \right )$: An index which can calculate $C_p$, $C_{pm}$, $C_{pk}$, and $C_{pmk}$.
  • $C_{p\log}$: Similar in use to $C_p$, it can be used for lognormal distributions.
  • $C_{p(\ln)}$: An alternate to $C_{p\log}$; used for lognormal distributions.
  • $C_{pk(\ln)}$: A version of $C_{pk}$ used for lognormal distributions.
  • $C_s$: Useful for any skewed distribution.
  • $C_{npk}$: Useful for any distribution, as long as the parameters can be determined.
  • $C_f$: Used for proportions of non-conforming units.
  • $C\%$: Used with FTY/RTY data; capability of percent non-conforming.
  • $P_p$: A long-term version of $C_p$.
  • $P_{pk}$: A long-term version of $C_{pk}$.
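For the most common of these indexes the calculations are straightforward. Here is a minimal sketch, assuming normally distributed measurements; the data, specification limits, and target are all hypothetical:

```python
import numpy as np

# Hypothetical spec: LSL = 9.4, USL = 10.6, target T = 10.0.
rng = np.random.default_rng(1)
x = rng.normal(loc=10.05, scale=0.15, size=500)

lsl, usl, target = 9.4, 10.6, 10.0
mu, sigma = x.mean(), x.std(ddof=1)

cp = (usl - lsl) / (6 * sigma)                   # potential capability
cpk = min(usl - mu, mu - lsl) / (3 * sigma)      # penalizes off-center processes
cpm = (usl - lsl) / (6 * np.sqrt(sigma**2 + (mu - target)**2))  # Taguchi index
cr = 1 / cp                                      # capability ratio

print(f"Cp={cp:.2f}  Cpk={cpk:.2f}  Cpm={cpm:.2f}  Cr={cr:.2f}")
```

By construction $C_{pk} \le C_p$ and $C_{pm} \le C_p$, with equality only when the process mean sits exactly on center (for $C_{pk}$) or on target (for $C_{pm}$).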
148 questions
43 votes · 8 answers

How do I get people to take better care of data?

My workplace has employees from a very wide range of disciplines, so we generate data in lots of different forms. Consequently, each team has developed its own system for storing data. Some use Access or SQL databases; some teams (to my horror)…
Richie Cotton
18 votes · 2 answers

Quality assurance and quality control (QA/QC) guidelines for a database

Background I am overseeing the input of data from primary literature into a database. The data entry process is error prone, particularly because users must interpret experimental design, extract data from graphics and tables, and transform results…
David LeBauer
18 votes · 3 answers

Why isn't Bayesian statistics more popular for statistical process control?

My understanding of the bayesian vs frequentist debate is that frequentist statistics: is (or claims to be) objective or at least unbiased so different researchers, using different assumptions can still get quantitatively comparable results while…
nikie
9 votes · 3 answers

How to verify extremely low error rates

I am faced with trying to demonstrate through testing an extremely low error rate for a sensor (no more than 1 error in 1,000,000 attempts). We have limited time to conduct the experiment so we anticipate not being able to obtain more than about…
9 votes · 2 answers

Problems with Outlier Detection

In a blog post Andrew Gelman writes: Stepwise regression is one of these things, like outlier detection and pie charts, which appear to be popular among non-statisticians but are considered by statisticians to be a bit of a joke. I understand…
114
7 votes · 2 answers

"Random variable" vs. "random value" (when translating from Russian into English)

In the introductory part of a Russian document I was translating, there was a simplified description of a control chart-based quality management system. The description contained this sentence in Russian: Контрольные границы определяют предел… ("The control limits define the limit…")
CopperKettle
7 votes · 4 answers

What is the mathematically rigorous definition of chunky data?

When in the workplace, certain measurement-taking devices are subject to different numerical accuracy; in some cases, the accuracy can be pretty weak (i.e., to one or two significant values only). Thus, instead of data sets like this: $$\{0.012,…
7 votes · 1 answer

Probability of failure in a finite population

I regularly inspect finite populations for failures (we make custom products in batches of ~500-800). Currently, we inspect every product for failure, which is quite a bit of work. I want to reduce the number of samples we inspect by stating a…
6 votes · 1 answer

Measuring k-means clustering quality on training and test sets

I'm working on implementing streaming k-means in Mahout. The code is mostly done and we're talking about how to integrate the code. As part of the quality evaluations, I want to know how the clustering performs on the 20 newsgroups data…
Dan Filimon
6 votes · 2 answers

How to judge if a datapoint deviates substantially from the norm

This is Statistics 101, but I'm not a statistician and so can't seem to find the right technical jargon to google. My company collects data at discrete points through time. Today's datapoint is positioned somewhat differently to the others, and so…
5 votes · 2 answers

Dealing with zeros in a poisson regression

Our code goes through multiple stages of review. I wish to use the number of defects at an earlier stage of review as a "defect density" estimate for later stages. It sometimes happens that code has zero defects in the early stage of review. This is…
Xodarap
5 votes · 1 answer

What do the letter and the subscript in the Shewhart's control chart constants ($A_2, A_3, c_4, c_5, \ldots$) mean?

The expected value of the sample standard deviation is $$E(s) = c_4(n)\sigma$$ where $$ c_4(n) = \sqrt{2\over n-1}{\Gamma({n\over2})\over\Gamma({n-1\over2})} $$ The page on Wikipedia led me to believe that was related to the order of the…
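The constant $c_4(n)$ quoted in this excerpt can be evaluated directly from its definition; a minimal Python sketch (the sample sizes chosen are arbitrary):

```python
import math

def c4(n):
    """Bias-correction constant such that E(s) = c4(n) * sigma
    for the sample standard deviation of n normal observations."""
    return math.sqrt(2 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)

# c4 approaches 1 as n grows: s becomes nearly unbiased for sigma.
for n in (2, 5, 10, 25):
    print(n, round(c4(n), 4))
```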
Frank Vel
5 votes · 2 answers

What is the rationale for the rules for detecting an out of control process in Statistical Process Control?

Statistical Process Control (SPC) can be used to determine if a process is "in statistical control". A common tool for SPC is the "mean control chart" -- essentially a time series of sample means obtained from the process one seeks to analyze. A…
Demetri Pananos
5 votes · 2 answers

Algorithm to determine if a point is "too far from the average"

Long story short I have a collection of about 30 scripts that manipulate data sets and place them in a database. These scripts report their running times as well as any errors that occur to a separate database. I wrote another script that goes…
user974896
5 votes · 2 answers

Looking for aberrations in time based data

Looking at IO latency data for a storage array. When a drive is about to fail, one indication is an increase in the IO operations 'time to complete'. The array kindly provides this data in this format: Time Disk…
Jason