Hack me if you can: Aggregating AutoEncoders for countering persistent access threats within highly imbalanced data

Sidahmed Benabderrahmane*, Ngoc Hoang, Petko Valtchev, James Cheney, Talal Rahwan

*Corresponding author for this work

Research output: Contribution to journal (Article, peer-reviewed)

Abstract / Description of output

Advanced Persistent Threats (APTs) are sophisticated, targeted cyberattacks designed to gain unauthorized access to systems and remain undetected for extended periods. To evade detection, APT cyberattacks deceive defense layers with breaches and exploits, thereby complicating exposure by traditional anomaly detection-based security methods. The challenge of detecting APTs with machine learning is compounded by the rarity of relevant datasets and the significant imbalance in the data, which makes the detection process highly burdensome. We present AE-APT, a deep learning-based tool for APT detection that features a family of AutoEncoder methods ranging from a basic one to a Transformer-based one. We evaluated our tool on a suite of provenance trace databases produced by the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004% of the data. The datasets span multiple operating systems, including Android, Linux, BSD, and Windows, and cover two attack scenarios. The outcomes showed that AE-APT has significantly higher detection rates compared to its competitors, indicating superior performance in detecting and ranking anomalies.
Original language: English
Pages (from-to): 926-941
Number of pages: 16
Journal: Future Generation Computer Systems
Volume: 160
Early online date: 2 Jul 2024
Publication status: E-pub ahead of print - 2 Jul 2024

Keywords / Materials (for Non-textual outputs)

  • anomaly detection
  • transformers
  • advanced persistent threats
  • attention mechanism
  • deep learning
  • cyber-security
