Machine learning for intrusion detection: Modeling the distribution shift

Bassam Farran*, Craig Saunders, Mahesan Niranjan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper addresses two important issues that arise in formulating and solving computer intrusion detection as a machine learning problem, a topic that has attracted considerable attention in recent years, including a community-wide competition using a common data set known as the KDD Cup '99. The first problem we address is the size of the data set, 5×10^6 by 41 features, which makes conventional learning algorithms impractical. In previous work, we introduced a one-pass non-parametric classification technique called Voted Spheres, which carves up the input space into a series of overlapping hyperspheres. Training data seen within each hypersphere is used in a voting scheme during testing on unseen data. Secondly, we address the problem of distribution shift, whereby the training and test data may be drawn from slightly different probability densities while the conditional densities of class membership for a given datum remain the same. We adopt two recent techniques from the literature, density weighting and kernel mean matching, to enhance the Voted Spheres technique to deal with such distribution disparities. We demonstrate that substantial performance gains can be achieved using these techniques on the KDD Cup data set.
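
The following is a minimal illustrative sketch, not the authors' implementation, of how per-example importance weights could enter a sphere-based voting classifier of the kind the abstract describes. The abstract does not specify how spheres are created, what radius is used, or how votes are combined, so the fixed `radius`, the "open a new sphere when no existing sphere contains the point" rule, the nearest-sphere fallback, and the function names `fit_spheres` and `predict` are all assumptions made for illustration. The weights are taken as given; in a covariate shift setting they would be estimates of the ratio of test density to training density, obtained for example by density weighting or kernel mean matching.

```python
import numpy as np

def fit_spheres(X, y, weights, radius):
    """One pass over the training data (assumed construction rule).

    Each point adds its weight to the vote tally of every existing sphere
    that contains it; if no sphere contains it, it becomes a new centre.
    """
    centres, votes = [], []  # votes[k] maps label -> accumulated weight
    for x, label, w in zip(X, y, weights):
        hit = False
        for c, v in zip(centres, votes):
            if np.linalg.norm(x - c) <= radius:
                v[label] = v.get(label, 0.0) + w
                hit = True
        if not hit:
            centres.append(x)
            votes.append({label: w})
    return np.array(centres), votes

def predict(x, centres, votes, radius):
    """Sum the weighted votes of all spheres containing x.

    Falls back to the nearest sphere when x lies outside every sphere
    (an assumption, not taken from the paper).
    """
    dists = np.linalg.norm(centres - x, axis=1)
    inside = np.flatnonzero(dists <= radius)
    if inside.size == 0:
        inside = [int(np.argmin(dists))]
    tally = {}
    for k in inside:
        for label, w in votes[k].items():
            tally[label] = tally.get(label, 0.0) + w
    return max(tally, key=tally.get)

# Toy data: three nearby points and one distant point.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.0], [5.0, 5.0]])
y = [0, 0, 1, 1]
query = np.array([0.05, 0.0])

# Uniform weights: plain voting, no shift correction.
centres, votes = fit_spheres(X, y, [1.0, 1.0, 1.0, 1.0], radius=1.0)
print(predict(query, centres, votes, radius=1.0))

# Hypothetical importance weights that up-weight the third training point.
centres_w, votes_w = fit_spheres(X, y, [1.0, 1.0, 5.0, 1.0], radius=1.0)
print(predict(query, centres_w, votes_w, radius=1.0))
```

In this toy example the only difference between the two runs is the weight vector, which shows where a shift correction would plug in: the sphere construction stays one-pass, and only the vote tallies change.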

Original language: English
Title of host publication: Proceedings of the 2010 IEEE International Workshop on Machine Learning for Signal Processing, MLSP 2010
Pages: 232-237
Number of pages: 6
Publication status: Published - 2010
Event: 2010 IEEE 20th International Workshop on Machine Learning for Signal Processing, MLSP 2010 - Kittila, Finland
Duration: 29 Aug 2010 – 1 Sep 2010

Conference

Conference: 2010 IEEE 20th International Workshop on Machine Learning for Signal Processing, MLSP 2010
Country: Finland
City: Kittila
Period: 29/08/10 – 1/09/10
