Title Page
Copyright and Credits
    Learn Python by Building Data Science Applications
About Packt
    Why subscribe?
Contributors
    About the authors
    About the reviewers
    Packt is searching for authors like you
Preface
    Who this book is for
    What this book covers
    To get the most out of this book
    Download the example code files
    Download the color images
    Code in Action
    Conventions used
    Get in touch
    Reviews
Section 1: Getting Started with Python
Preparing the Workspace
    Technical requirements
    Installing Python
    Downloading materials for running the code
    Installing Python packages
    Working with VS Code
    The VS Code interface
    Beginning with Jupyter Notebooks
    The Jupyter interface
    Pre-flight check
    Summary
    Questions
    Further reading
First Steps in Coding - Variables and Data Types
    Technical requirements
    Assigning variables
    Naming the variable
    Understanding data types
    Floats and integers
    Operations with self-assignment
    Order of execution
    Strings
    Formatting
    Format method
    F-strings
    Legacy formatting
    Formatting mini-language
    Strings as sequences
    Booleans
    Logical operators
    Converting the data types
    Exercise
    Summary
    Questions
    Further reading
Functions
    Technical requirements
    Understanding a function
    Interface functions
    The input function
    The eval function
    Variable properties
    The help function
    The type function
    The isinstance function
    dir
    Math
    abs
    The round function
    Iterables
    The len function
    The sorted function
    The range function
    The all and any functions
    The max, min, and sum functions
    Defining the function
    Default values
    Var-positional and var-keyword
    Docstrings
    Type annotations
    Refactoring the temperature conversion
    Understanding anonymous (lambda) functions
    Understanding recursion
    Summary
    Questions
    Further reading
Data Structures
    Technical requirements
    What are data structures?
    Lists
    Slicing
    Tuples
    Immutability
    Dictionaries
    Sets
    More data structures
    frozenset
    defaultdict
    Counter
    Queue
    deque
    namedtuple
    Enumerations
    Using generators
    Useful functions to use with data structures
    The sum, max, and min functions
    The all and any functions
    The zip function
    The map, filter, and reduce functions
    Comprehensions
    Summary
    Questions
    Further reading
Loops and Other Compound Statements
    Technical requirements
    Understanding if, else, and elif statements
    Inline if statements
    Using if in a comprehension
    Running code many times with loops
    The for loop
    itertools
    cycle
    chain
    product
    Enumeration
    The while loop
    Additional loop functionality – break and continue
    Handling exceptions with try/except and try/finally
    Exceptions
    try/except
    try/except/finally
    Understanding the with statements
    Summary
    Questions
    Further reading
First Script – Geocoding with Web APIs
    Technical requirements
    Geocoding as a service
    Learning about web APIs
    Working with HTTPS
    Working with the Nominatim API
    The requests library
    Starting to code
    Caching with decorators
    Reading and writing data
    Geocoding the addresses
    Moving code to a separate module
    Collecting NYC Open Data from the Socrata service
    Summary
    Questions
    Further reading
Scraping Data from the Web with Beautiful Soup 4
    Technical requirements
    When there is no API
    HTML in a nutshell
    Scraping with Beautiful Soup 4
    CSS and XPath selectors
    Developer console
    Scraping WWII battles
    Step 1 – Scraping the list of battles
    Unordered list
    Step 2 – Scraping information from the Wiki page
    Key information
    Additional information
    Step 3 – Scraping data as a whole
    Quality control
    Beyond Beautiful Soup
    Summary
    Questions
    Further reading
Simulation with Classes and Inheritance
    Technical requirements
    Understanding classes
    Special (dunder) methods
    __init__
    __repr__ and __str__
    Arithmetical and logical operations
    Equality/relationship methods
    __len__
    __getitem__
    __class__
    Inheritance
    Using super()
    Data classes
    Using classes in simulation
    Writing the base classes
    Writing the Island class
    Herbivore haven
    Harsh islands
    Visualization
    Summary
    Questions
    Further reading
Shell, Git, Conda, and More – at Your Command
    Technical requirements
    Shell
    Pipes
    Executing Python scripts
    Command-line interface
    Git
    Concept
    GitHub
    Practical example
    gitignore
    Conda
    Conda for virtual environments
    Conda and Jupyter
    Make
    Cookiecutter
    Summary
    Questions
Section 2: Hands-On with Data
Python for Data Applications
    Technical requirements
    Introducing Python for data science
    Exploring NumPy
    Beginning with pandas
    Trying SciPy and scikit-learn
    Understanding Jupyter
    Summary
    Questions
Data Cleaning and Manipulation
    Technical requirements
    Getting started with pandas
    Selection – by columns, indices, or both
    Masking
    Data types and data conversion
    Math
    Merging
    Working with real data
    Initial exploration
    Defining the scope of work to be done
    Getting to know regular expressions
    Parsing locations
    Geocoding
    Time
    Belligerents
    Understanding casualties
    Multilevel slicing
    Quality assurance
    Writing the file
    Summary
    Questions
    Further reading
Data Exploration and Visualization
    Technical requirements
    Exploring the dataset
    Descriptive statistics
    Data visualization with matplotlib (and its pandas interface)
    Aggregating the data to calculate summary statistics
    Resampling
    Mapping
    Declarative visualization with vega and altair
    Drawing maps with Altair
    Storing the Altair chart
    Big data visualization with datashader
    Summary
    Questions
    Further reading
Training a Machine Learning Model
    Technical requirements
    Understanding the basics of ML
    Exploring unsupervised learning
    Moving on to supervised learning
    k-nearest neighbors
    Linear regression
    Decision trees
    Summary
    Questions
    Further reading
Improving Your Model – Pipelines and Experiments
    Technical requirements
    Understanding cross-validation
    Exploring feature engineering
    Failed attempts
    Optimizing the hyperparameters
    Using a random forest model
    Tracking your data and metrics with version control
    Starting with data
    Adding code to the equation
    Metrics
    Summary
    Questions
    Further reading
Section 3: Moving to Production
Packaging and Testing with Poetry and PyTest
    Technical requirements
    Building a package
    Bringing your own package
    Using a package manager – pip and conda
    Creating a package scaffolding
    A few ways to build your package
    Trying out code with Poetry
    Adding actual code
    Defining dependencies
    Non-code resources
    Publishing the package
    Development workflow
    Testing the code so far
    Testing with PyTest
    Writing our own tests
    Automating the process with CI services
    Generating documentation with sphinx
    Installing a package in editable mode
    Summary
    Questions
    Further reading
Data Pipelines with Luigi
    Technical requirements
    Introducing the ETL pipeline
    Redesigning your code as a pipeline
    Building our first task in Luigi
    Connecting the dots
    Understanding time-based tasks
    Scheduling with cron
    Exploring the different output formats
    Writing to an S3 bucket
    Writing to SQL
    Expanding Luigi with custom template classes
    Summary
    Questions
    Further reading
Let's Build a Dashboard
    Technical requirements
    Building a dashboard – three types of dashboard
    Static dashboards
    Debugging Altair
    Connecting your app to the Luigi pipeline
    Understanding dynamic dashboards
    First try with panel
    Reading data from the database
    Creating an interactive dashboard in Jupyter
    Summary
    Questions
    Further reading
Serving Models with a RESTful API
    Technical requirements
    What is a RESTful API?
    Python web frameworks
    Building a basic API service
    Exploring service with OpenAPI
    Finalizing our naive first iteration
    Data validation
    Sending data in with POST requests
    Adding features to our service
    Building a web page
    Speeding up with asynchronous calls
    Deploying and testing your API loads with Locust
    Summary
    Questions
    Further reading
Serverless API Using Chalice
    Technical requirements
    Understanding serverless
    Getting started with Chalice
    Setting up a simple model
    Externalizing medians
    Building a serverless API for an ML model
    When we're still out of memory
    Building a serverless function as a data pipeline
    S3-triggered events
    Summary
    Questions
    Further reading
Best Practices and Python Performance
    Technical requirements
    Speeding up your Python code
    Rewriting the code with NumPy
    Specialized data structures and algorithms
    Dask
    Dask-ML
    Numba
    Concurrency and parallelism
    Different types of concurrency
    Two types of problems
    Before you start rewriting your code
    Using best practices for coding in your project
    Code formatting with black
    Measuring code quality with Wily
    Writing tests with hypothesis
    Beyond this book – packages and technologies to look out for
    Different Python flavors
    Docker containers
    Kubernetes
    Summary
    Questions
    Further reading
Assessments
    Chapter 1
    Chapter 2
    Chapter 3
    Chapter 4
    Chapter 5
    Chapter 6
    Chapter 7
    Chapter 8
    Chapter 9
    Chapter 10
    Chapter 11
    Chapter 12
    Chapter 13
    Chapter 14
    Chapter 15
    Chapter 16
    Chapter 17
    Chapter 18
    Chapter 19
    Chapter 20
Other Books You May Enjoy
    Leave a review - let other readers know what you think