How to do it...

Perform the following steps to find highly correlated attributes:

  1. Remove the features that are not numeric:
        > new_train = trainset[,! names(trainset) %in% c("churn", "international_plan", "voice_mail_plan")]
  1. Then, obtain the correlation between each pair of attributes:
        > cor_mat = cor(new_train)
  1. Next, we use findCorrelation from the caret package to search for highly correlated attributes with a cutoff of 0.75:
        > highlyCorrelated = findCorrelation(cor_mat, cutoff=0.75)
  1. We then obtain the names of the highly correlated attributes:
        > names(new_train)[highlyCorrelated]
        Output
        [1] "total_intl_minutes"  "total_day_charge"    "total_eve_minutes"   "total_night_minutes"
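
To see which pairs of attributes drive a result like the one above, it can help to list the correlated pairs directly. The following is a minimal base R sketch, not the algorithm that findCorrelation implements. It uses a small synthetic data frame (`toy`, with made-up values and a hypothetical charge rate of 0.17) as a stand-in for the churn data, and the names `toy`, `numeric_toy`, and `high_pairs` are illustrative assumptions:

```r
# Synthetic stand-in for the churn data (hypothetical values, not the real dataset)
set.seed(42)
n <- 200
day_minutes <- runif(n, 0, 350)
toy <- data.frame(
  total_day_minutes = day_minutes,
  total_day_charge  = 0.17 * day_minutes,  # assumed fixed rate, so charge tracks minutes exactly
  total_eve_minutes = runif(n, 0, 350),    # drawn independently of the day columns
  churn             = factor(sample(c("yes", "no"), n, replace = TRUE))
)

# Keep only the numeric columns, mirroring step 1
numeric_toy <- toy[, sapply(toy, is.numeric)]

# Pairwise correlation matrix, mirroring step 2
cor_mat <- cor(numeric_toy)

# List each pair (upper triangle only) whose absolute correlation exceeds the cutoff
cutoff <- 0.75
high <- which(abs(cor_mat) > cutoff & upper.tri(cor_mat), arr.ind = TRUE)
high_pairs <- data.frame(
  attr_1 = rownames(cor_mat)[high[, "row"]],
  attr_2 = colnames(cor_mat)[high[, "col"]],
  r      = cor_mat[high]
)
print(high_pairs)
```

Because the charge column in this sketch is a fixed multiple of the minutes column, that pair is the one flagged. findCorrelation goes a step further and suggests which member of each correlated pair to drop.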