The dataset

The data comes primarily from the 1998 DARPA Intrusion Detection Evaluation Program, run by MIT Lincoln Labs. The dataset contains a variety of network events simulated in a military network environment. It is a TCP dump collected from the local area network of an Air Force environment, and it is peppered with multiple attacks.

In general, a typical TCP dump looks as follows:

The training dataset is about four gigabytes of compressed TCP dump data collected over seven weeks, and it consists of about five million network connections. Two weeks of test data of the same type were also collected, amounting to approximately two million connections.

The attacks in the data fall into the following categories:

  • Denial-of-service (DoS) attacks: A more advanced form of this attack is the distributed denial-of-service (DDoS) attack
  • Password-guessing attacks: These involve unauthorized access from a remote machine
  • Buffer overflow attacks: These involve unauthorized access to local superuser (root) privileges
  • Reconnaissance attacks: These involve surveillance and probing, such as port scanning
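
The four categories above can be turned into a simple lookup when labeling connection records. The following is a minimal sketch, not the book's own code: the label strings (`smurf`, `neptune`, `guess_passwd`, and so on) are assumed to follow the naming used in the KDD Cup 1999 data derived from this DARPA evaluation, and the names `ATTACK_CATEGORY`, `categorize`, and `category_counts` are hypothetical.

```python
from collections import Counter

# Hypothetical mapping from raw attack labels to the four categories above.
# The label names are an assumption (KDD Cup 1999 style); adjust them to
# whatever labels your copy of the data actually uses.
ATTACK_CATEGORY = {
    "smurf": "dos",
    "neptune": "dos",
    "guess_passwd": "password_guessing",
    "buffer_overflow": "buffer_overflow",
    "portsweep": "reconnaissance",
    "nmap": "reconnaissance",
}


def categorize(label: str) -> str:
    """Map a raw label to a category, 'normal', or 'unknown'."""
    # Assumed format: labels may carry whitespace and a trailing period.
    cleaned = label.strip().rstrip(".").lower()
    if cleaned == "normal":
        return "normal"
    return ATTACK_CATEGORY.get(cleaned, "unknown")


def category_counts(labels):
    """Count how many records fall into each category."""
    return Counter(categorize(label) for label in labels)


if __name__ == "__main__":
    sample = ["normal.", "smurf.", "guess_passwd.", "portsweep.", "smurf."]
    for category, count in category_counts(sample).items():
        print(category, count)
```

Returning `unknown` for unrecognized labels, rather than raising an error, keeps the sketch usable on data whose label set differs from the one assumed here.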