One-class KNN for Quality Control

Question

I've come across this paper:

https://uta-ir.tdl.org/uta-ir/bitstream/handle/10106/1827/Sukchotrat_uta_2502D_10083.pdf?sequence=1&isAllowed=y

which describes a k-Nearest Neighbors Data Description (kNNDD)-based control chart (p. 45).

First, the author describes the Local Outlier Factor (LOF) method and then the $K^2$ chart, where the control value is defined as the average Euclidean distance of a point from its k nearest neighbors.
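To make the control value concrete, here is a minimal sketch of the statistic as described above: the average Euclidean distance from a new observation to its k nearest neighbors in an in-control reference sample. The function name `knn_average_distance` and the toy data are assumptions for illustration, not taken from the paper, and the control limit itself is not shown.

```python
import math

def knn_average_distance(point, reference, k):
    """Average Euclidean distance from `point` to its k nearest
    neighbours in `reference` (hypothetical helper, not the paper's code)."""
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and len(reference)")
    # Distances to every reference point, smallest first.
    distances = sorted(math.dist(point, r) for r in reference)
    return sum(distances[:k]) / k

# Made-up in-control reference sample with two process variables.
reference = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]

in_control = knn_average_distance((0.5, 0.4), reference, k=3)
shifted = knn_average_distance((4.0, 4.0), reference, k=3)
print(in_control, shifted)  # the shifted point gets the larger value
```

If the monitored point is itself part of the reference sample, it would have to be excluded from its own neighbor list first; the sketch assumes it is a new observation.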

I can't really find a connection between the LOF algorithm and this control value. Am I wrong?

Has QUIT--Anony-Mousse · Answer 1 · 2017-02-04T18:38:21.367

3

At the very heart of LOF you will find "k-distance", the distance to the k-th nearest neighbor.

The idea of using the k-distance is older than LOF. And the range 10..50 may be a good choice for LOF, but the usual kNN outlier detection often works best for k=1.
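For reference, these are the standard LOF definitions side by side with the control value as the question describes it. The notation ($\mathrm{NN}_j(p)$ for the j-th nearest neighbor of $p$, $N_k(p)$ for its k-neighborhood) is an assumption of this sketch, not the paper's.

$$
\begin{aligned}
k\text{-dist}(p) &= d\bigl(p, \mathrm{NN}_k(p)\bigr) \\
\text{reach-dist}_k(p, o) &= \max\{\,k\text{-dist}(o),\; d(p, o)\,\} \\
\mathrm{lrd}_k(p) &= \left( \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \text{reach-dist}_k(p, o) \right)^{-1} \\
\mathrm{LOF}_k(p) &= \frac{1}{|N_k(p)|} \sum_{o \in N_k(p)} \frac{\mathrm{lrd}_k(o)}{\mathrm{lrd}_k(p)} \\
K^2(p) &= \frac{1}{k} \sum_{j=1}^{k} d\bigl(p, \mathrm{NN}_j(p)\bigr)
\end{aligned}
$$

Both scores are built from distances to the k nearest neighbors, but LOF normalizes by the density around those neighbors, while the $K^2$ value is a plain average of distances.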

edited Feb 04 '17 at 18:38

answered Feb 04 '17 at 18:32


Any idea why LOF is included in the paper? As you said, there is a small connection, but in my opinion it adds nothing to the description of the $K^2$ chart. – momomi Feb 04 '17 at 18:40
Probably he originally meant to compare to LOF, too. – Has QUIT--Anony-Mousse Feb 05 '17 at 09:06
I've tried LOF but it doesn't seem to work. There are too many observations with a score > 1. – momomi Feb 05 '17 at 09:11
1

Values *near* 1 are normal. Depending on your data set, outliers may start at 1.1, 2.0, 3.0, or 10... that is why you usually look at the M highest values only. But you should probably use a version adapted for time series rather than ignoring time (or worse, treating time as another attribute). – Has QUIT--Anony-Mousse Feb 05 '17 at 11:18
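A small sketch of the "look at the M highest values" advice, using a plain-Python LOF on toy data. The function `lof_scores`, the grid data, and the choice `M = 2` are all assumptions for illustration. The sketch takes exactly k neighbors per point (the original LOF definition includes all ties at the k-distance) and assumes there are no duplicate points.

```python
import math

def lof_scores(points, k):
    """Simplified LOF for a small list of points (illustrative only).
    Assumes no duplicate points and k < len(points)."""
    n = len(points)
    # For each point: its k nearest neighbours as (distance, index) pairs.
    neighbours = []
    for i in range(n):
        ranked = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )
        neighbours.append(ranked[:k])
    # k-distance: distance to the k-th nearest neighbour.
    k_distance = [nb[-1][0] for nb in neighbours]
    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbours[i]]
        lrd.append(k / sum(reach))
    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (k * lrd[i]) for i in range(n)
    ]

# Toy data: a 3x3 grid of "normal" points plus one isolated point (index 9).
points = [(float(x), float(y)) for x in range(3) for y in range(3)]
points.append((10.0, 10.0))

scores = lof_scores(points, k=3)

# Rank by score and inspect only the M highest instead of thresholding at 1.
M = 2
top_m = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:M]
for i in top_m:
    print(points[i], round(scores[i], 2))
```

On this data several grid points score slightly above 1 while the isolated point scores far higher, which is why ranking tends to be more informative than a fixed cutoff at 1.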
