Preface

About

This section briefly introduces the authors, the coverage of this book, the technical skills you'll need to get started, and the hardware and software required to complete all of the included activities and exercises.

About the Book

Practical Machine Learning with R gives you the tools to solve a wide range of business problems: you start by forming a good problem statement, then select the most appropriate model to solve your problem, and finally ensure that you do not overtrain the model.

About the Authors

Brindha Priyadarshini Jeyaraman is a senior data scientist at AIDA Technologies. She has completed her M.Tech in knowledge engineering with a gold medal from the National University of Singapore. She has more than 10 years of work experience and she is an expert in understanding business problems, and designing and implementing solutions using machine learning. She has worked on several real data science projects in the insurance and finance domain. This book provides a great platform for her to share the knowledge she has gained over the past few years of working in data science and machine learning.

Ludvig Renbo Olsen, BSc in Cognitive Science from Aarhus University, is the author of multiple R packages, such as groupdata2 and cvms. With 4 years of R and Python experience, including working as a machine learning researcher at the Danish startup UNSILO, he is passionate about creating tools and tutorials for students and scientists. Guided by Effective Altruism, he intends to positively impact the world through his career.

Monicah Wambugu is the lead data scientist at a financial technology company that offers micro-loans by leveraging data, machine learning, and analytics to perform alternative credit scoring. She is a graduate student in the Master of Information Management and Systems program at the UC Berkeley School of Information. Monicah is particularly interested in how data science and machine learning can be used to design products and applications that respond to the behavioral and socio-economic needs of target audiences.

Description

With huge amounts of data being generated every moment, businesses need applications that apply complex mathematical calculations to data repeatedly and at speed. With machine learning techniques and R, you can easily develop these kinds of applications in an efficient way.

Practical Machine Learning with R begins by helping you grasp the basics of machine learning methods, while also highlighting how and why they work. You will understand how to get these algorithms to work in practice, rather than focusing on mathematical derivations. As you progress from one chapter to another, you will gain hands-on experience of building a machine learning solution in R. Next, using R packages such as rpart, randomForest, mice (Multiple Imputation by Chained Equations), and neuralnet, you will learn to implement algorithms including neural networks, decision trees, and linear and logistic regression. As you progress through the book, you'll delve into various machine learning techniques for both supervised and unsupervised learning approaches. In addition, you'll learn how to partition datasets, evaluate the results from each model, and compare the models.

By the end of this book, you will have gained expertise in solving your business problems, starting by forming a good problem statement, selecting the most appropriate model to solve your problem, and then ensuring that you do not overtrain it.

Learning Objectives

  • Define a problem that can be solved by training a machine learning model
  • Obtain, verify and clean data before transforming it into the correct format for use
  • Perform exploratory analysis and extract features from data
  • Build models for regression, classification and clustering
  • Evaluate the performance of a model with the right metrics
  • Solve a classification problem using the neuralnet package
  • Implement decision tree models using the randomForest library

Audience

If you are a data analyst, data scientist, or a business analyst who wants to understand the process of machine learning and apply it to a real dataset using R, this book is just what you need. Data scientists who use Python and want to implement their machine learning solutions using R will also find this book very useful. The book will also enable novice programmers to start their journey in data science. Basic knowledge of any programming language is all you need to get started.

Approach

Practical Machine Learning with R uses a practical and hands-on approach to teach all concepts. You will explore different machine learning algorithms with a project-based approach. By solving problems using concepts taught in the previous chapters, the book demystifies the complexity of machine learning and gives you the confidence to tackle even more challenging problems.

Minimum Hardware Requirements

For the optimal student experience, we recommend the following hardware configuration:

  • Processor: Intel Core i5 or equivalent
  • Memory: 4 GB RAM (8 GB preferred)
  • Storage: 16 GB available space

Software Requirements

You'll also need the following software installed in advance:

  • OS: Windows 7 SP1 64-bit, Windows 8.1 64-bit or Windows 10 64-bit, Ubuntu Linux, or a newer version of OS X
  • Browser
  • RStudio
  • R version 3.6 or later
  • R libraries as needed (mice, caret, rpart, groupdata2, cvms, neuralnet, NeuralNetTools, rPref, mlbench, knitr, interplot, doParallel, car, and so on)
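The software requirements above can be checked from the R console. The following is a minimal sketch, not part of the book's code bundle: the helper names meets_min_version and missing_packages are hypothetical, and the package vector covers only the libraries named explicitly in the list (the "and so on" entries are not included).

```r
# Packages named explicitly in the software requirements list
required_packages <- c("mice", "caret", "rpart", "groupdata2", "cvms",
                       "neuralnet", "NeuralNetTools", "rPref", "mlbench",
                       "knitr", "interplot", "doParallel", "car")

# Hypothetical helper: TRUE when the given R version meets the minimum
meets_min_version <- function(minimum = "3.6.0", current = getRversion()) {
  current >= package_version(minimum)
}

# Hypothetical helper: required packages that are not yet installed
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Report the state of the current R installation
cat("R version OK:", meets_min_version(), "\n")
cat("Missing packages:",
    paste(missing_packages(required_packages), collapse = ", "), "\n")
```

Any package reported as missing can then be installed before you start the exercises.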

Conventions

Code, database table names, file and folder names, file extensions, pathnames, URLs, user input, and Twitter handles are shown as follows: "The pre-loaded datasets of R can be viewed using the data() command."

A block of code is set as follows:

# Installing necessary packages
install.packages("mlbench")
install.packages("caret")

# Listing the datasets available in all installed packages
data(package = .packages(all.available = TRUE))

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "The Help tab will display all the information about the dataset."

Installation and Setup

We will be installing R, RStudio, and R packages.

Installing R

  1. R can be downloaded from https://www.r-project.org by clicking the CRAN download link, as shown below:
    Figure 0.1: Screenshot of download link
  2. Choose the mirror you want to download from.
  3. Then, based on the operating system that you use (Windows, Linux, or (Mac) OS X), select the relevant link:
    Figure 0.2: Screenshot of links based on operating system
  4. Select install R for the first time.
  5. Download the executable.
  6. Run the exe file and install R in a local directory, as shown below:
Figure 0.3: Selecting the setup location

Note

The current installation process is for Windows and will be similar for other operating systems.

Installing RStudio

  1. RStudio can be downloaded from https://www.rstudio.com/products/rstudio/download/. Choose the Free version.
  2. Based on your OS, choose the relevant executable from the list below and download it:
    Figure 0.4: Image caption in sentence case
  3. Run the executable and install it, as shown below:
    Figure 0.5: Image caption in sentence case
  4. After installation, open RStudio on your computer. The following dialog box should be displayed:
Figure 0.6: R studio dialog box

Installing Libraries

  1. Click on the Packages tab, as shown below:
    Figure 0.7: Packages in R Studio
  2. Select the installed packages that you want to use. Each selected package will be attached on the left, as shown below:
    Figure 0.8: Attaching the package MASS
  3. The functions of this library can now be used. To install packages that are not displayed in the list above, click on the Install option:
    Figure 0.9: Install option
  4. Type the name of the package you want to install. For instance, type ggplot2 and click Install:
Figure 0.10: Install packages pop-up

The status of the installation of the package can be viewed in the console.

Alternatively, you can run install.packages("packagename") in the console.
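The console route can be sketched as follows. This is an illustrative example rather than the book's own code: install_and_attach is a hypothetical helper that installs a package only when it is missing and then attaches it with library(), which is the console equivalent of ticking the package in the Packages tab.

```r
# Hypothetical helper (not from the book): install a package only when it is
# missing, then attach it so that its functions can be used.
install_and_attach <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  library(pkg, character.only = TRUE)
  # Return (invisibly) whether the package is now attached
  invisible(pkg %in% .packages())
}

# "stats" ships with every R installation, so this call never downloads anything
install_and_attach("stats")

# In an interactive session, the ggplot2 example above would become:
# install_and_attach("ggplot2")
```

Checking with requireNamespace() first avoids re-downloading a package that is already present each time a script is run.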

Installing the Code Bundle

Copy the code bundle for the class to the C:/Code folder (for Windows).

Additional Resources

The code bundle for this book is hosted on GitHub at: https://github.com/TrainingByPackt/Practical-Machine-Learning-with-R.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
