Chapter 10

Why should we use a special stack of packages for data analysis?

Data analysis requires a fast and easy way to operate on multiple elements at once, a so-called vectorized approach. Python's scientific stack provides this through the numpy package, which performs fast array operations.
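
As a minimal sketch of what "vectorized" means here, the snippet below doubles a few values first with a plain Python loop and then with a single numpy expression. The prices values and variable names are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for illustration
prices = [10.0, 20.0, 30.0]

# Plain Python: the interpreter visits each element in turn
doubled_list = [p * 2 for p in prices]

# Vectorized: one expression is applied to the whole array at once
doubled_array = np.array(prices) * 2

print(doubled_list)
print(doubled_array)
```

Both approaches give the same values; the vectorized form simply states the operation once for the entire array.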

Why are NumPy computations so fast compared to normal Python?

NumPy is drastically faster than vanilla Python on numerical operations. The speedup comes from a different data representation: unlike standard Python collections, NumPy arrays require all elements to be of the same data type. Because of that, an array can be stored as one contiguous block of memory and processed by compiled code in a single pass, rather than element by element in the interpreter.
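
The single-data-type rule can be observed directly. The sketch below, using made-up values, shows that a Python list happily mixes types, while numpy converts mixed numeric input to one common type and stores it in fixed-size slots.

```python
import numpy as np

# A Python list can hold elements of different types
mixed_list = [1, 2.5, 3]
list_types = [type(x).__name__ for x in mixed_list]

# NumPy picks one dtype for the whole array; the integers are upcast to float
mixed_array = np.array(mixed_list)

# An all-integer input gets an integer dtype (the exact width is platform-dependent)
ints = np.array([1, 2, 3])

print(list_types)            # element types inside the list
print(mixed_array.dtype)     # one dtype shared by every element
print(mixed_array.itemsize)  # bytes used by each element
print(mixed_array.nbytes)    # total bytes: size * itemsize
```

Because every element has the same size, the position of any element in memory can be computed directly, which is part of what allows compiled routines to process the array efficiently.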

What is the use case and benefit of using Pandas over NumPy?

NumPy is built primarily around homogeneous numeric arrays. Pandas, on the other hand, adds first-class support for datetime, string, and categorical data. In addition, it has many helpful functions and operations that are useful for everyday data processing, such as groupby aggregation, resampling, and plotting.
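
A short sketch of the operations mentioned above, on a small made-up table; the column names (when, city, sales) and all values are hypothetical.

```python
import pandas as pd

# Hypothetical sales records mixing datetime, string, and numeric columns
df = pd.DataFrame({
    "when": pd.to_datetime(
        ["2021-01-01", "2021-01-01", "2021-01-02", "2021-01-03"]
    ),
    "city": ["Oslo", "Rome", "Oslo", "Rome"],
    "sales": [10, 20, 30, 40],
})

# A string column can be converted to a categorical one
city_as_category = df["city"].astype("category")

# groupby aggregation: total sales per city
by_city = df.groupby("city")["sales"].sum()

# Resampling: total sales per calendar day, using the datetime column as index
daily = df.set_index("when")["sales"].resample("D").sum()

print(by_city)
print(daily)
```

Plotting follows the same pattern (for example, calling .plot() on the daily series), but it requires matplotlib to be installed, so it is left out of this sketch.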

What does sklearn stand for?

sklearn is short for scikit-learn. "Scikit" stands for SciPy Toolkit, and the name reflects the project's origin as an add-on package for SciPy.
