Anomaly detection (AD) often focuses on detecting anomalous regions for industrial quality inspection and medical lesion examination. However, because these target scenarios are so specific, AD datasets are relatively small, and evaluation metrics remain deficient compared with classic vision tasks such as object detection and semantic segmentation. To fill these gaps, this work first constructs a large-scale, general-purpose COCO-AD dataset by extending COCO to the AD field. This enables fair evaluation and sustainable development of different methods on this challenging benchmark. Moreover, current metrics such as AU-ROC have nearly reached saturation on simple datasets, which prevents a comprehensive evaluation of different methods. Inspired by metrics in the segmentation field, we further propose several more practical threshold-dependent AD-specific metrics, i.e., mF1$^{.2}_{.8}$, mAcc$^{.2}_{.8}$, mIoU$^{.2}_{.8}$, and mIoU-max. Motivated by the high-quality reconstruction capability of GAN inversion, we propose a simple but more powerful InvAD framework to achieve high-quality feature reconstruction. Our method improves the effectiveness of reconstruction-based methods on the popular MVTec AD and VisA datasets and on our newly proposed COCO-AD dataset under a multi-class unsupervised setting, where only a single detection model is trained to detect anomalies from different classes. Extensive ablation experiments demonstrate the effectiveness of each component of InvAD.
The field of anomaly detection (AD) currently suffers from the following three issues:
1) The AD datasets are relatively small and specific to certain domains.
     ==> We extend the COCO dataset to general visual AD scenarios and propose a highly challenging benchmark to evaluate different methods fairly.
2) Existing evaluation criteria are not designed specifically for segmentation-based AD tasks, which limits the guidance the metrics can provide.
     ==> We propose four additional pixel-level metrics that are more in line with actual application scenarios and suggest reporting five average metrics to holistically evaluate the merits of different methods.
3) The results of existing methods under multi-class unsupervised anomaly detection tasks still need to be improved.
     ==> Drawing on the concept of GAN inversion, we design a novel InvAD framework and propose a Spatial Style Modulation (SSM) module to ensure input-dependent and high-quality reconstruction.
InvAD shows a significant improvement over the compared methods on the more challenging COCO-AD dataset. We also conduct experiments on the popular MVTec AD and VisA datasets, where it achieves clear improvements on multi-class unsupervised AD tasks.
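To make the idea of threshold-dependent pixel-level metrics in issue 2) more concrete, here is a minimal NumPy sketch for a single image. It is not the reference implementation. The function name `threshold_metrics`, the threshold grid (0.2 to 0.8 in steps of 0.1), the min-max normalisation of the score map, the use of plain pixel accuracy, and the reading of mIoU-max as the best IoU over the sweep are all assumptions made for illustration.

```python
import numpy as np


def threshold_metrics(score_map, gt_mask, thresholds=None):
    """Illustrative threshold-swept F1 / accuracy / IoU for one anomaly map.

    All definitions here are assumptions, not the paper's exact protocol.
    """
    if thresholds is None:
        # Assumed grid: 0.2, 0.3, ..., 0.8
        thresholds = np.linspace(0.2, 0.8, 7)

    scores = np.asarray(score_map, dtype=float).ravel()
    gt = np.asarray(gt_mask).astype(bool).ravel()

    # Assumption: min-max normalise scores to [0, 1] so thresholds are comparable.
    lo, hi = scores.min(), scores.max()
    scores = (scores - lo) / (hi - lo) if hi > lo else np.zeros_like(scores)

    f1s, accs, ious = [], [], []
    for t in thresholds:
        pred = scores >= t  # binarise the anomaly map at this threshold
        tp = np.sum(pred & gt)
        fp = np.sum(pred & ~gt)
        fn = np.sum(~pred & gt)
        tn = np.sum(~pred & ~gt)
        # max(..., 1) guards against division by zero on empty masks.
        f1s.append(2 * tp / max(2 * tp + fp + fn, 1))
        accs.append((tp + tn) / gt.size)  # assumption: plain pixel accuracy
        ious.append(tp / max(tp + fp + fn, 1))

    return {
        "mF1": float(np.mean(f1s)),
        "mAcc": float(np.mean(accs)),
        "mIoU": float(np.mean(ious)),
        "mIoU_max": float(np.max(ious)),  # assumption: best IoU over the sweep
    }


if __name__ == "__main__":
    scores = np.array([[0.0, 0.45], [1.0, 1.0]])
    mask = np.array([[0, 0], [1, 1]])
    print(threshold_metrics(scores, mask))
```

Unlike AU-ROC, which integrates over all operating points, these numbers depend on how well the anomaly map separates at concrete thresholds, which is closer to how a deployed detector would be used. In a multi-class benchmark one would presumably average such per-image or per-class values, which this sketch leaves out.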
Top: AD datasets comparison. Middle: AD metric comparison. Bottom: Structure of our InvAD.
@article{invad,
title={Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark},
author={Jiangning Zhang and Chengjie Wang and Xiangtai Li and Guanzhong Tian and Zhucun Xue and Yong Liu and Guansong Pang and Dacheng Tao},
journal={arXiv preprint arXiv:2404.10760},
year={2024}
}