Data Science: Mindset, Methodologies, and Misconceptions
by Zacharias Voulgaris PhD
Introduction
Part 1 Overview of Data Science and the Data Scientist’s Work
Chapter 1 What is Data Science?
Data Science vs. Business Intelligence vs. Statistics
Data Science
Business Intelligence
Statistics
Big Data, Machine Learning, and AI
Big Data
Machine Learning
AI – The Scientific Field, Not the Sci-fi Movie!
The Need for Data Scientists and the Products/Services Provided
What Does a Data Scientist Actually Do?
What Does a Data Scientist Not Do?
The Ever-growing Need for Data Science Professionals
Summary
Chapter 2 The Data Science Pipeline
Data Engineering
Data Preparation
Data Exploration
Data Representation
Data Modeling
Data Discovery
Data Learning
Information Distillation
Data Product Creation
Insight, Deliverance, and Visualization
Putting It All Together
Summary
Chapter 3 Data Science Methodologies
Predictive Analytics
Classification
Regression
Time-series Analysis
Anomaly Detection
Text Prediction
Recommender Systems
Content-based Systems
Collaborative Filtering
Non-negative Matrix Factorization (NMF or NNMF)
Automated Data Exploration Methods
Data Mining
Association Rules
Clustering
Graph Analytics
Dimensionless Space
Graph Algorithms
Other Graph-related Topics
Natural Language Processing (NLP)
Sentiment Analysis
Topic Extraction/Modeling
Text Summarization
Other NLP Methods
Other Methodologies
Chatbots
Artificial Creativity
Other AI-based Methods
Summary
Chapter 4 The Data Scientist’s Toolbox
Database Platforms
SQL-based Databases
NoSQL Databases
Graph-based Databases
Programming Languages for Data Science
Julia
Python
R
Scala
Which Language is Best for You?
The Most Useful Packages for Julia and Python
Other Data Analytics Software
MATLAB
Analytica
Mathematica
Visualization Software
Plot.ly
D3.js
WolframAlpha
Tableau
Data Governance Software
Spark
Hadoop
Storm
Version Control Systems (VCS)
Git
GitHub
CVS
Summary
Part 2 Setting the Stage for Data Analytics
Chapter 5 Data Science Questions and Hypotheses
Importance of Asking (the Right) Questions
Formulating a Hypothesis
Questions Related to Most Common Use Cases
Is Feature X Related to Feature Y?
Is Subset X Significantly Different to Subset Y?
Do Features X and Y Collaborate Well with Each Other for Predicting Variable Z?
Should We Remove X from the Feature Set?
How Similar are Variables X and Y?
Does Variable X Cause Variable Y?
Other Question Types
Questions Not to Ask
Summary
Chapter 6 Data Science Experiments and Evaluation of Their Results
The Importance of Experiments
How to Construct an Experiment
Experiments for Assessing the Performance of a Predictive Analytics System
A Matter of Confidence
Evaluating the Results of an Experiment
Summary
Chapter 7 Sensitivity Analysis of Experiment Conclusions
The Importance of Sensitivity Analysis
The Butterfly Effect
Global Sensitivity Analysis Using Resampling Methods
Bootstrapping
Permutation Methods
Jackknife
Monte Carlo
Local Sensitivity Analysis Employing “What If?” Questions
Some Useful Considerations on Sensitivity Analysis
Summary
Part 3 Common Errors in Data Science
Chapter 8 Programming Bugs
The Importance of Understanding and Dealing with Programming Bugs
Places You Usually Find Bugs
Types of Bugs Commonly Encountered
Some Useful Considerations on Programming Bugs
Summary
Chapter 9 Mistakes Through the Data Science Process
How Mistakes Differ From Bugs
Most Common Types of Mistakes
Choosing the Right Model
Value of a Mentor
Some Useful Considerations on Mistakes
Summary
Chapter 10 Handling Bugs and Mistakes
Strategies for Coping with Bugs
Strategies for Coping with High-level Mistakes
Preventing Erroneous Situations in the Pipeline
Types of Models
Evaluating the Data at Hand and Pairing It with a Model
Choosing the Right Model for a Classification Methodology
Combining Different Options in an Ensemble Setting
Other Considerations for Choosing the Right Model
Summary
Part 4 Other Aspects of Data Science
Chapter 11 The Role of Heuristics in Data Science
Heuristics as Information in the Making
Problems that Require Heuristics
Why Heuristics are Essential for an AI System
Applications of Heuristics in Data Science
Heuristics and Machine Learning Processes
Custom Heuristics and Data Engineering
Heuristics for Feature Evaluation
Other Applications of Heuristics
Anatomy of a Good Heuristic
Some Final Considerations on Heuristics
Summary
Chapter 12 The Role of AI in Data Science
Problems AI Solves
Types of AI Systems Used in Data Science
Deep Learning Networks
Autoencoders
Other Types of AI Systems
AI Systems Using Data Science
Computer Vision
Chatbots
Artificial Creativity
Other AI Systems Using Data Science
Some Final Considerations on AI
Summary
Chapter 13 Data Science Ethics
The Importance of Ethics in Data Science
Confidentiality Matters
Privacy
Data Anonymization
Data Security
Licensing Matters
Other Ethical Matters
Some Final Considerations on Ethics
Summary
Chapter 14 Future Trends and How to Remain Relevant
General Trends in Data Science
The Role of AI in the Years to Come
Big Data: Getting Bigger and More Quantitative
New Programming Paradigms
The Rise of Hadoop Alternatives
Other Trends
Remaining Relevant in the Field
The Versatilist Data Scientist
Data Science Research
The Need to Educate Oneself Continuously
Collaborative Projects
Mentoring
Summary
Final Words
Glossary
Index