Importing packages 

We use standard machine learning and data science packages: numpy, pandas, and sklearn for the data and models, and matplotlib for visualization:

from time import time
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.model_selection import cross_val_score

To implement the isolation forest, we use the sklearn.ensemble package:

from sklearn.ensemble import IsolationForest
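As a quick illustration of how this class is used, here is a minimal sketch on synthetic data. The arrays X_inliers, X_outliers, and X_demo, and the parameter values (n_estimators=100, contamination=0.05, random_state=42), are assumptions chosen for illustration; they are not the settings applied to the KDD data in this chapter.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)

# Synthetic stand-in data (hypothetical): a tight cluster of inliers
# plus a handful of points scattered over a much wider area.
X_inliers = 0.3 * rng.randn(200, 2)
X_outliers = rng.uniform(low=-6, high=6, size=(10, 2))
X_demo = np.vstack([X_inliers, X_outliers])

# Illustrative parameter choices, not tuned values.
clf = IsolationForest(n_estimators=100, contamination=0.05, random_state=42)
clf.fit(X_demo)

labels = clf.predict(X_demo)            # +1 for inliers, -1 for outliers
scores = clf.decision_function(X_demo)  # lower values are more anomalous

print("flagged as outliers:", int((labels == -1).sum()))
print("mean score, inliers :", scores[:200].mean())
print("mean score, outliers:", scores[200:].mean())
```

Points that are easy to isolate with a few random splits receive lower scores, so the scattered points should score lower on average than the clustered ones.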

To measure performance, we use the ROC curve and the AUC (area under the curve), which we discuss in detail later in this chapter.
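To make the two metrics concrete before the fuller discussion, the sketch below computes them on a toy example. The arrays y_true and y_score are made-up values for illustration only; they are not results from the KDD data.

```python
import numpy as np
from sklearn.metrics import roc_curve, auc

# Toy ground truth (1 = anomaly) and toy anomaly scores (higher = more anomalous).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

# roc_curve sweeps a threshold over the scores and returns the false
# positive rate and true positive rate at each threshold.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# auc integrates the ROC curve with the trapezoidal rule.
roc_auc = auc(fpr, tpr)
print(roc_auc)  # 0.75
```

An AUC of 0.75 here means that in three of the four (anomaly, normal) pairs the anomaly received the higher score.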

The following code imports the relevant packages and loads the KDD data:

from sklearn.metrics import roc_curve, auc
from sklearn.datasets import fetch_kddcup99
%matplotlib inline

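# Load the 10% sample of the KDD Cup 1999 data, shuffled, with no subset filtering
dataset = fetch_kddcup99(subset=None, shuffle=True, percent10=True)
# Task description: http://www.kdd.org/kdd-cup/view/kdd-cup-1999/Tasks
X = dataset.data    # feature matrix
y = dataset.target  # label for each connection record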
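The targets returned by fetch_kddcup99 are byte strings such as b'normal.' or b'smurf.', while an ROC curve needs binary labels. One common way to bridge the two, shown here only as a sketch, is to mark every record that is not normal as an anomaly. The array y_demo is a small hypothetical stand-in for y so that the snippet runs without downloading the data, and the name y_binary is also an assumption.

```python
import numpy as np

# Hypothetical stand-in for dataset.target: an object array of byte-string labels.
y_demo = np.array([b'normal.', b'smurf.', b'normal.', b'neptune.'], dtype=object)

# 1 marks an attack (anything other than b'normal.'), 0 marks normal traffic.
y_binary = (y_demo != b'normal.').astype(int)
print(y_binary)  # [0 1 0 1]
```

The same expression applied to y would give labels suitable for roc_curve.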