Quite a long chapter, isn't it? But this chapter forms the core of anything you learn and implement in data science. Let us wrap up by summarizing the key takeaways:
- Data can be subsetted in several ways, including the .ix method and the [ ] method, and new columns can be created from existing ones.
- Random numbers can be generated with methods such as randint() in the random library of numpy and randrange() in Python's built-in random module. There are also methods like shuffle, which reorders a list in place, and choice, which randomly selects an element out of a list. randn() and uniform() generate random numbers following normal and uniform probability distributions, respectively. Random numbers can be used to run simulations and generate dummy data frames.
- The groupby() method creates a groupby object on which aggregate, transform, and filter operations can be applied. This is a good way to summarize data for each level of a categorical variable at once.
- A dataset can be split into training and testing sets using choice and shuffle. Scikit-learn has a ready-made method for this.

Wrangling data and bringing it into the form you desire is a big challenge before one proceeds to modelling. But once done, it opens up a plethora of insights and information to be discovered using predictive models. As Bob Marley said, "If it is easy, it won't be amazing; if it is amazing, it won't be easy."
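
To make these takeaways concrete, here is a minimal recap sketch. It is not code from the chapter: the column names, the seed, and the sizes are made up for illustration, and .loc is used for label-based selection because .ix has been removed from recent pandas releases. The 80/20 split at the end assumes the shuffle-based technique is being used to separate training and testing rows.

```python
import numpy as np
import pandas as pd

np.random.seed(42)  # arbitrary seed so the dummy data is reproducible
n = 100             # hypothetical number of rows

# Dummy data frame built from random numbers (column names are hypothetical).
df = pd.DataFrame({
    "group": np.random.choice(["a", "b", "c"], size=n),  # random category labels
    "score": np.random.randn(n),                         # standard normal draws
    "weight": np.random.uniform(0.0, 1.0, size=n),       # uniform draws in [0, 1)
    "count": np.random.randint(1, 10, size=n),           # integers in [1, 10)
})

# Subsetting: rows and columns by label with .loc, rows by condition with [ ].
subset = df.loc[:9, ["group", "score"]]  # .loc slices include the end label
heavy = df[df["weight"] > 0.5]

# Creating a new column from existing ones.
df["weighted_score"] = df["score"] * df["weight"]

# groupby followed by aggregate, transform, and filter.
summary = df.groupby("group")["score"].agg(["mean", "count"])
group_mean = df.groupby("group")["score"].transform("mean")
df["score_centered"] = df["score"] - group_mean
large_groups = df.groupby("group").filter(lambda g: len(g) >= 5)

# Splitting: shuffle the row positions in place, then cut at 80%.
positions = np.arange(n)
np.random.shuffle(positions)
cut = int(0.8 * n)
train = df.iloc[positions[:cut]]
test = df.iloc[positions[cut:]]
```

For the split, scikit-learn's train_test_split function (in sklearn.model_selection) is the ready-made alternative to shuffling positions by hand.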