Table of Contents for
Julia for Data Science
by Zacharias Voulgaris PhD
Introduction
CHAPTER 1: Introducing Julia
How Julia Improves Data Science
Data science workflow
Julia’s adoption by the data science community
Julia Extensions
Package quality
Finding new packages
About the Book
CHAPTER 2: Setting Up the Data Science Lab
Julia IDEs
Juno
IJulia
Additional IDEs
Julia Packages
Finding and selecting packages
Installing packages
Using packages
Hacking packages
IJulia Basics
Handling files
Creating a notebook
Saving a notebook
Renaming a notebook
Loading a notebook
Exporting a notebook
Organizing code in .jl files
Referencing code
Working directory
Datasets We Will Use
Dataset descriptions
Magic dataset
OnlineNewsPopularity dataset
Spam Assassin dataset
Downloading datasets
Loading datasets
CSV files
Text files
Coding and Testing a Simple Machine Learning Algorithm in Julia
Algorithm description
Algorithm implementation
Algorithm testing
Saving Your Workspace into a Data File
Saving data into delimited files
Saving data into native Julia format
Saving data into text files
Help!
Summary
Chapter Challenge
CHAPTER 3: Learning the Ropes of Julia
Data Types
Arrays
Array basics
Accessing multiple elements in an array
Multidimensional arrays
Dictionaries
Basic Commands and Functions
print(), println()
typemax(), typemin()
collect()
show()
linspace()
Mathematical Functions
round()
rand(), randn()
sum()
mean()
Array and Dictionary Functions
in
append!()
pop!()
push!()
splice!()
insert!()
sort(), sort!()
get()
keys(), values()
length(), size()
Miscellaneous Functions
time()
Conditionals
if-else statements
string()
map()
VERSION()
Operators, Loops and Conditionals
Operators
Alphanumeric operators (<, >, ==, <=, >=, !=)
Logical operators (&&, ||)
Loops
for-loops
while-loops
break command
Summary
Chapter Challenge
CHAPTER 4: Going Beyond the Basics in Julia
String Manipulation
split()
join()
Regex functions
ismatch()
match()
matchall()
eachmatch()
Custom Functions
Function structure
Anonymous functions
Multiple dispatch
Function example
Implementing a Simple Algorithm
Creating a Complete Solution
Summary
Chapter Challenge
CHAPTER 5: Julia Goes All Data Science-y
Data Science Pipeline
Data Engineering
Data preparation
Data exploration
Data representation
Data Modeling
Data discovery
Data learning
Information Distillation
Data product creation
Insight, deliverance, and visualization
Keep an Open Mind
Applying the Data Science Pipeline to a Real-World Problem
Data preparation
Data exploration
Data representation
Data discovery
Data learning
Data product creation
Insight, deliverance, and visualization
Summary
Chapter Challenge
CHAPTER 6: Julia the Data Engineer
Data Frames
Creating and populating a data frame
Data frames basics
Variable names in a data frame
Accessing particular variables in a data frame
Exploring a data frame
Filtering sections of a data frame
Applying functions to a data frame’s variables
Working with data frames
Altering data frames
Sorting the contents of a data frame
Data frame tips
Importing and Exporting Data
Accessing .json data files
Storing data in .json files
Loading data files into data frames
Saving data frames into data files
Cleaning Up Data
Cleaning up numeric data
Cleaning up text data
Formatting and Transforming Data
Formatting numeric data
Formatting text data
Importance of data types
Applying Data Transformations to Numeric Data
Normalization
Discretization (binning) and binarization
Binary to continuous (binary classification only)
Applying data transformations to text data
Case normalization
Vectorization
Preliminary Evaluation of Features
Regression
Classification
Feature evaluation tips
Summary
Chapter Challenge
CHAPTER 7: Exploring Datasets
Listening to the Data
Packages used in this chapter
Computing Basic Statistics and Correlations
Variable summary
Correlations among variables
Comparability between two variables
Plots
Grammar of graphics
Preparing data for visualization
Box plots
Bar plots
Line plots
Scatter plots
Basic scatter plots
Scatter plots using the output of t-SNE algorithm
Histograms
Exporting a plot to a file
Hypothesis Testing
Testing basics
Types of errors
Sensitivity and specificity
Significance and power of a test
Kruskal-Wallis tests
T-tests
Chi-square tests
Other Tests
Statistical Testing Tips
Case Study: Exploring the OnlineNewsPopularity Dataset
Variable stats
Visualization
Hypotheses
t-SNE magic
Conclusions
Summary
Chapter Challenge
CHAPTER 8: Manipulating the Fabric of the Data Space
Principal Components Analysis (PCA)
Applying PCA in Julia
Independent Components Analysis (ICA): most popular alternative to PCA
Feature Evaluation and Selection
Overview of the methodology
Using Julia for feature evaluation and selection using cosine similarity
Using Julia for feature evaluation and selection using DID
Pros and cons of the feature evaluation and selection approach
Other Dimensionality Reduction Techniques
Overview of the alternative dimensionality reduction methods
Genetic algorithms
Discernibility-based approach
When to use a sophisticated dimensionality reduction method
Summary
Chapter Challenge
CHAPTER 9: Sampling Data and Evaluating Results
Sampling Techniques
Basic sampling
Stratified sampling
Performance Metrics for Classification
Confusion matrix
Accuracy metrics
Basic accuracy
Weighted accuracy
Precision and recall metrics
F1 metric
Misclassification cost
Defining the cost matrix
Calculating the total misclassification cost
Receiver Operating Characteristic (ROC) Curve and related metrics
ROC Curve
AUC Metric
Gini Coefficient
Performance Metrics for Regression
MSE Metric and its variant, RMSE
SSE Metric
Other metrics
K-fold Cross Validation (KFCV)
Applying KFCV in Julia
KFCV tips
Summary
Chapter Challenge
CHAPTER 10: Unsupervised Machine Learning
Unsupervised Learning Basics
Clustering types
Distance metrics
Grouping Data with K-means
K-means using Julia
K-means tips
Density and the DBSCAN Approach
DBSCAN algorithm
Applying DBSCAN in Julia
Hierarchical Clustering
Applying hierarchical clustering in Julia
When to use hierarchical clustering
Validation Metrics for Clustering
Silhouettes
Clustering validation metrics tips
Effective Clustering Tips
Dealing with high dimensionality
Normalization
Visualization tips
Summary
Chapter Challenge
CHAPTER 11: Supervised Machine Learning
Decision Trees
Implementing decision trees in Julia
Decision tree tips
Regression Trees
Implementing regression trees in Julia
Regression tree tips
Random Forests
Implementing random forests in Julia for classification
Implementing random forests in Julia for regression
Random forest tips
Basic Neural Networks
Implementing neural networks in Julia
Neural network tips
Extreme Learning Machines
Implementing ELMs in Julia
ELM tips
Statistical Models for Regression Analysis
Implementing statistical regression in Julia
Statistical regression tips
Other Supervised Learning Systems
Boosted trees
Support vector machines
Transductive systems
Deep learning systems
Bayesian networks
Summary
Chapter Challenge
CHAPTER 12: Graph Analysis
Importance of Graphs
Custom Dataset
Statistics of a Graph
Cycle Detection
Julia the cycle detective
Connected Components
Cliques
Shortest Path in a Graph
Minimum Spanning Trees
Julia the MST botanist
Saving and loading graphs from a file
Graph Analysis and Julia’s Role in it
Summary
Chapter Challenge
CHAPTER 13: Reaching the Next Level
Julia Community
Sites to interact with other Julians
Code repositories
Videos
News
Practice What You’ve Learned
Some features to get you started
Some thoughts on this project
Final Thoughts about Your Experience with Julia in Data Science
Refining your Julia programming skills
Contributing to the Julia project
Future of Julia in data science
APPENDIX A: Downloading and Installing Julia and IJulia
APPENDIX B: Useful Websites Related to Julia
APPENDIX C: Packages Used in This Book
APPENDIX D: Bridging Julia with Other Platforms
Bridging Julia with R
Running a Julia script in R
Running an R script in Julia
Bridging Julia with Python
Running a Julia script in Python
Running a Python script in Julia
APPENDIX E: Parallelization in Julia
APPENDIX F: Answers to Chapter Challenges
Chapter 2
Chapter 3
Chapter 4
Chapter 5
Chapter 6
Chapter 7
Chapter 8
Chapter 9
Chapter 10
Chapter 11
Chapter 12
Chapter 13
Index