Publication

Fast anomaly detection with locality-sensitive hashing and hyperparameter autotuning


Abstract(s)

This paper presents LSHAD, an anomaly detection (AD) method based on locality-sensitive hashing (LSH) that is capable of handling large-scale datasets. The algorithm is highly parallelizable, and its implementation in Apache Spark further increases its ability to handle very large datasets. Moreover, the algorithm incorporates an automatic hyperparameter tuning mechanism, so users do not have to perform costly manual tuning. LSHAD is novel in that both automated hyperparameter tuning and distributed execution are uncommon in AD techniques. Experiments with LSHAD across a variety of datasets point to state-of-the-art AD performance while handling much larger datasets than state-of-the-art alternatives. In addition, an evaluation of the tradeoff between AD performance and scalability shows that our method offers significant advantages over competing methods.
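The abstract does not describe how LSHAD works internally, so the following is only a minimal sketch of the general idea behind LSH-based anomaly scoring, not the paper's algorithm. It assumes Euclidean (p-stable) hashing and scores a point by how sparsely populated its hash buckets are. The function name `lsh_anomaly_scores` and the parameters `n_tables`, `n_hashes`, and `width` are hypothetical, and the sketch is single-machine Python with hand-picked hyperparameters: it includes neither the Apache Spark implementation nor the automatic tuning mentioned above.

```python
import math
import random
from collections import Counter


def lsh_anomaly_scores(points, n_tables=20, n_hashes=2, width=4.0, seed=0):
    """Return one score per point; higher means more isolated.

    Illustrative only: this is NOT the LSHAD algorithm from the paper.
    All names and default values here are assumptions.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    scores = [0.0] * len(points)

    for _ in range(n_tables):
        # Each table concatenates n_hashes p-stable hash functions
        # h(x) = floor((a . x + b) / width), with a ~ N(0, I), b ~ U[0, width).
        funcs = [
            ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
            for _ in range(n_hashes)
        ]
        keys = [
            tuple(
                math.floor((sum(a_j * x_j for a_j, x_j in zip(a, p)) + b) / width)
                for a, b in funcs
            )
            for p in points
        ]
        bucket_sizes = Counter(keys)
        # A point alone in its bucket contributes 1; a point in a crowded
        # bucket contributes a value close to 0.
        for i, key in enumerate(keys):
            scores[i] += 1.0 / bucket_sizes[key]

    # Average over tables so that scores lie in (0, 1].
    return [s / n_tables for s in scores]
```

Calling `lsh_anomaly_scores(data)` on a list of equal-length numeric vectors returns one score per vector, and points that rarely share a bucket with others receive scores near 1. Because each hash table is built independently, the per-table work could in principle be distributed, which is the property the abstract highlights; how LSHAD actually does this is not stated here.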

Keywords

Anomaly detection; Unsupervised learning; AutoML; Scalability; Big data

Publisher

Elsevier
