Summary

Before diving into pandas, it is essential to install Python and pandas correctly, choose a suitable IDE, and set the right options. This chapter covered these topics and more. Here is a summary of the key takeaways:

  • Python 3.x is available, but many users still prefer version 2.7, which they regard as more stable and better suited to scientific computing.
  • Support and bug fixes for version 2.7 have now ended (Python 2.7 reached end of life on January 1, 2020).
  • Translating code from one version to the other is a breeze. Both versions can also be used side by side with the virtualenv package, which comes pre-installed with Anaconda.
  • Anaconda is a popular Python distribution that comes with 700+ libraries/packages and several popular IDEs, such as Jupyter and Spyder.
  • Python code is callable from, and usable in, other tools such as R, Azure ML Studio, H2O.ai, and Julia.
  • Some day-to-day data operations, such as breaking a large file into smaller chunks or reading a few lines of data, can also be performed from the command line/shell.
  • The default pandas options can be viewed and changed with the get_option() and set_option() functions. Options that can be changed include the maximum number of rows and columns displayed and the number of decimal places shown for float values.
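
The last takeaway can be illustrated with a minimal sketch. It assumes pandas is installed and uses the standard option names display.max_rows, display.max_columns, and display.precision. The specific values (20, 10, 3) and the small DataFrame are arbitrary choices for illustration, not settings recommended by the chapter.

```python
import pandas as pd

# Inspect the current display limits (defaults can differ between pandas versions).
print(pd.get_option("display.max_rows"))
print(pd.get_option("display.max_columns"))

# Change how many rows and columns are displayed, and how many
# decimal places are shown for float values. The values are illustrative.
pd.set_option("display.max_rows", 20)
pd.set_option("display.max_columns", 10)
pd.set_option("display.precision", 3)

# A small, made-up DataFrame to see the precision setting take effect.
df = pd.DataFrame({"x": [1.23456789, 2.3456789]})
print(df)

# Restore a single option to its default value.
pd.reset_option("display.precision")
```

These settings affect only how data is displayed; the underlying values in the DataFrame are not rounded or truncated.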

 

In the next chapter, we will broaden our scope beyond pandas and explore tools such as NumPy that enrich the capabilities of pandas in the Python ecosystem. It will be an exhaustive NumPy tutorial with real-life case studies.
