Anomaly Detection of Hadoop Log Data Using Moving Average and 3-Sigma


KIPS Transactions on Software and Data Engineering, Vol. 5, No. 6, pp. 283-288, Jun. 2016
DOI: 10.3745/KTSDE.2016.5.6.283
Keywords: Big data, Apache Hadoop, Apache Hive, Log Data, Anomaly Detection
Abstract

In recent years, there have been many research efforts on Big Data, and many companies have developed a variety of relevant products. Accordingly, we are now able to store and analyze large volumes of log data, which were difficult to handle in traditional computing environments. To handle the large volume of log data generated rapidly across multiple servers, in this paper we design a new data storage architecture that supports efficient analysis of such big log data through Apache Hive. We then design and implement anomaly detection methods, based on moving average and 3-sigma techniques, which identify the abnormal status of servers from log data. We also show the effectiveness of the proposed detection methods by demonstrating that they identify anomalies correctly. These results show that our anomaly detection is an effective approach for detecting anomalies in Hadoop log data.
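The abstract names the two techniques but does not give their parameters, so the following is only a minimal sketch of how a moving average can be combined with a 3-sigma rule, not the authors' implementation. The function name `detect_anomalies`, the window size of 5, the use of the population standard deviation, and the sample per-interval log counts are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def detect_anomalies(values, window=5, k=3.0):
    """Return indices of points lying more than k standard deviations
    from the moving average of the preceding `window` points.

    Hypothetical helper: the window size and k=3 (the 3-sigma rule)
    are illustrative defaults, not values taken from the paper.
    """
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]   # the previous `window` observations
        mu = mean(history)               # moving average
        sigma = pstdev(history)          # spread of the same window
        if abs(values[i] - mu) > k * sigma:
            anomalies.append(i)
    return anomalies


# Assumed example: number of log records per time interval on one server.
log_counts = [100, 102, 98, 101, 99, 100, 250, 101, 99, 100]
print(detect_anomalies(log_counts))  # [6]
```

In this sketch the spike at index 6 is flagged because it lies far outside the band formed by the five preceding values. The points after it are not flagged, since the spike itself widens the band while it remains inside the window; a practical detector would need to decide whether flagged points should be excluded from later windows.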



Cite this article
[IEEE Style]
S. Son, M. Gil, Y. Moon, H. Won, "Anomaly Detection of Hadoop Log Data Using Moving Average and 3-Sigma," KIPS Transactions on Software and Data Engineering, vol. 5, no. 6, pp. 283-288, 2016. DOI: 10.3745/KTSDE.2016.5.6.283.

[ACM Style]
Siwoon Son, Myeong-Seon Gil, Yang-Sae Moon, and Hee-Sun Won. 2016. Anomaly Detection of Hadoop Log Data Using Moving Average and 3-Sigma. KIPS Transactions on Software and Data Engineering, 5, 6, (2016), 283-288. DOI: 10.3745/KTSDE.2016.5.6.283.