DROCC: Deep Robust One-Class Classification
link to abstract: https://arxiv.org/abs/2002.12718
link to Github repo: https://github.com/microsoft/EdgeML/tree/master/examples/pytorch/DROCC
What is Anomaly Detection?
Anomaly detection is, as the name suggests, the task of detecting abnormal observations. Its goal is to find outliers, i.e., points that differ from the typical data.
Classical AD methods model the given normal data with simple functions, whereas deep-learning-based AD learns features from the data automatically.
Summary
DROCC stands for Deep Robust One-Class Classification.
Comparison with Deep SVDD: both methods minimize a classical one-class loss on the learned representations of the final layer, but Deep SVDD suffers from representation collapse.
DROCC is applicable to most domains without any additional side-information and is robust to representation collapse. It assumes that the points of the normal class lie on a well-sampled, locally linear, low-dimensional manifold.
Introduction
- robust to representation collapse by involving a discriminative component that is general and is empirically accurate on most standard domains
- typical data lies on a low-dimensional manifold that is well-sampled by the training data
- has a gradient ascent phase to adaptively add anomalous points to our training set
- also has a gradient descent phase that minimizes the classification loss by learning a representation, and a classifier on top of that representation, to separate typical points from the generated anomalous points
- automatically learns an appropriate representation, similar to Deep SVDD, but is robust to representation collapse, since mapping all points to the same value would lead to poor discrimination between normal and anomalous points
- summary: a method based on a low-dimensional manifold assumption on the positive class, using which it synthetically and adaptively generates negative instances, yielding a general and robust approach to AD. For the one-class classification problem, a low false-positive rate (FPR) on negatives is important
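The emphasis on a low FPR can be made concrete with a small score-thresholding sketch. This is illustrative only: the score distributions and the threshold below are made up, not taken from the paper; higher scores mean "more normal" and scores below the threshold are flagged as anomalous.

```python
import numpy as np

def fpr_tpr(scores_normal, scores_anomalous, threshold):
    # FPR: fraction of normal points wrongly flagged as anomalous.
    fpr = np.mean(scores_normal < threshold)
    # TPR (recall on anomalies): fraction of anomalies correctly flagged.
    tpr = np.mean(scores_anomalous < threshold)
    return fpr, tpr

# Synthetic detector scores: normal points score high, anomalies score low.
rng = np.random.default_rng(0)
scores_normal = rng.normal(loc=1.0, scale=0.5, size=1000)
scores_anomalous = rng.normal(loc=-1.0, scale=0.5, size=1000)

fpr, tpr = fpr_tpr(scores_normal, scores_anomalous, threshold=0.0)
```

A detector is only useful here if it catches most anomalies (high TPR) while rarely flagging normal points (low FPR); sweeping the threshold traces out the ROC curve used to evaluate one-class methods.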
Related Work
- Generative modeling (e.g. GAN-based methods): requires reconstruction of the entire input during the decoding step.
- deep one-class SVM (Deep SVDD): suffers from the representation collapse issue
- transformation-based methods: transformations are heavily domain dependent and hard to design for domains like time series; moreover, the suitability of a transformation varies with the structure of the typical points
- side-information-based AD: complementary to DROCC, which does not assume any side-information
DROCC
- hypothesis: the set of typical points S lies on a low dimensional locally linear manifold that is well-sampled
- outside a small radius around a training point, most points are anomalous
- manifolds (sets of points) are locally Euclidean, so the l2 distance can be used to compare points that are very close neighbors
- typical points are treated as positives and the generated anomalous points as negatives
- to solve the resulting saddle-point problem, DROCC uses a gradient descent-ascent technique
- algorithm: 3 steps of adversarial (gradient ascent) search are performed in parallel for each x in the batch
- allows any DNN architecture
- Similar to experiments with DeepSVDD, DeepSAD uses the hidden state of the final timestep as the representation in the one-class objective. An important aspect of training DeepSAD is the pretraining of the network as the encoder in an autoencoder. We also tuned this pretraining to ensure the best results.