Flow-level anomaly detection systems are now widely deployed in ISP networks to quickly detect large-scale anomalies such as worms, denial-of-service attacks, or flash crowds. Unfortunately, benchmark traces that would allow these systems to be evaluated systematically are available to neither research nor industry. In this paper, we identify three major problems that hinder a systematic evaluation of flow-level anomaly detection systems: (1) only very few backbone traffic traces are available to the research community due to privacy concerns of ISPs and their customers; (2) available traces do not contain anomalies of varying intensities, which are required for assessing the sensitivity of anomaly detection systems; and (3) available traces do not contain annotated anomalies, also referred to as ground truth. We discuss existing approaches that aim at overcoming these three problems and identify their drawbacks. We propose an alternative approach for generating benchmark evaluation traces, namely the synthetic generation of flow-level traffic traces, and discuss why and how this approach can solve the identified problems. The two main challenges of such an approach are to define normal and anomalous network behavior and to find realistic models describing normal and anomalous traffic at the flow level. We discuss our ideas for defining normal and anomalous traffic, and specify the framework for a novel flow traffic model targeted at anomaly detection. Finally, we provide an initial design for a synthetic flow trace generator.