Index

Symbols & Numerics

%>% (pipe operator), 271-273

: (colon), interaction terms, 396

... (ellipsis), 157-159

= (equal sign), 19

$ (dollar sign)

referencing list elements, 77-79

shortened $ referencing, 78-79

\\ (double backslash), 21

/ (forward slash), 21

& operator, 144-145

[] (square brackets), 43

double square bracket referencing, 76-77

~ (tilde), formula relationships, 381

3D lattice graphics, 352-354

A

abline function, 389-390

acf function, 448

active bindings, 544

adding

columns, 266, 277-278

list elements, 79-80

rows, 278-279

aggregate function, 252

specifying variables, 254-256

using with a formula, 252-254

multiple return values, 253-254

summarizing by multiple variables, 252-253

summarizing multiple columns, 253

aggregating data, 246

data.table package, 280-282

dplyr package, 268-271

grouped data, 269-270

analysis of variance, comparing nested models, 395

anova function, 395

appending, 237-238. See also combining

applications, Shiny

server component, 564-566

sharing, 570

structure, 561-562

ui component, 562-564

apply functions, 181-195, 250-251

applying to data frames, 193-195

example, 184-186

lapply, 195-204

order of “apply” inputs, 201-203

using with vectors, 199-201

margin values, 183-184

multiple margins, 186-187

passing extra arguments to “applied” function, 188-191

sapply function, 204-208

returns, 205-207

tapply

multiple grouping variables, 209-210

multiple returns, 210-212

return values, 212

using with higher dimension structures, 187-188

xapply, 182

arguments, 116

apply function, 183

breaks, 111

defining, 132-133

ellipsis, passing graphical parameters, 159-161

for merge function, 238

named arguments, 131

arima function, 449

ARIMA models, in time series analysis, 448-451

arrange function, 263

arrays, 34, 58-60

creating, 58-60

subscripting, 60

as.numeric function, 121

assessing models, 382

abline function, 389-390

extractor functions, 385-386

interaction terms, 396-398

as list objects, 386-388

plot function, 383-385

predict function, 390-391

summary function, 382-383

assignment arrows, 19-20

attributes

of data frames, querying, 87

of lists, 72-73

of matrices, 52-54

of single mode data structures, comparing, 60-62

of vectors, 41-43

autocorrelations, in time series analysis, 448

axis limits, setting for plots, 294-295

B

bar charts, 291

Becker, Rick, 345

benchmarking, 457-458

bigmemory package, 282

binaries, installing packages from, 26

bivariate lattice graphics, 350-351

blank inputs, 44-45, 74

boxplots, 290

breaks argument, 111

bugs, reporting, 8

building packages, 471-472, 482-485

C

c function, creating vectors, 35-36

C++

incorporating code in R, 501-502

integrating with, 464-468

using R functions in, 467-468

capturing input definitions, 164-167

case sensitivity for file paths, 219

cast function, 245-246

categorical data, 108-112

cbind function, 49

censoring in survival analysis, 431-432

Chambers, John, 2, 535

character data

manipulating, 123-124

searching and replacing, 124-125

character value inputs, 48, 57-58, 76

checking

function inputs, 136, 155-157

multivalue inputs, 162-164

packages, 482-484

classes, 505-509

creating, constructor functions, 510-511

example of, 507-508

extending, 518

generics, 511-516

creating, 515-516

naming conventions, 512

methods

defining for arithmetic operators, 513-514

updating, 513

object orientation, 506-508

inheritance, 508

R6, 542-544

active bindings, 544

example of, 543-544

private members, 542

public members, 542

Reference Classes, 535-542

creating, 535-537

documenting, 542

methods, defining, 537-540

objects, copying, 540-542

removing, 510

S3, 509

creating, 509-511

documenting, 518

inheritance, 516-518

limitations of, 518-519

lists versus attributes, 514-515

naming conventions, 512

S4, 523-535

defining, 525-529

documenting, 534-535

inheritance, 532-534

methods, 529-530

multiple dispatch, 531-532

summary function and, 405

writing, 505

Cleveland, William, 345

clipboard, 219

closing graphics devices, 288

code

C++, incorporating in R, 501-502

improving efficiency

benchmarking, 457-458

initialization, 458-459

integrating with C++, 464-468

with memory management, 463-464

using alternative functions, 462-463

vectorization, 459-462

including in documents

LaTeX documents, 556

RMarkdown documents, 550-552

profiling, 456

quality of, 476-477

coef function, 385-386

coefficients from logistic regression, 419-420

colon (:), interaction terms, 396

color function, 288

colors, specifying, 288

column index, 55

columns

adding, 266, 277-278

referencing, 179-180

selecting, 264-266

selecting from data frames, 88

subscripting, 88-90

combining

data.tables, 279-280

lists, 80

plot types, 318-321

vectors, 49-51

comment blocks, 15

comparing

attributes and lists, 514-515

nested models, 393-395

R and C++, 465-466

reshape and reshape2 packages, 245

single mode data structures, 60-62

conferences, 6

confint function, 420

connecting

to Excel from R, 228

to R from Excel, 226

constructor functions, 510-511

continuation prompts, 15

continuous variables, creating factors, 111-112

contrast methods, 400

controlling

aesthetics in ggplot2 package, 322-324

layout, 305-308

grid layouts, 306-307

layout function, 307-308

strip headers, 363-364

styles for lattice graphics, 372-376

converting objects, 156-157

coordinate systems, 338-339

copying Reference Class objects, 540-542

core packages, 23

counting records, 281

covariates, in survival analysis, 436

coxph function, 438-439

CRAN, 7

METACRAN website, 24

navigating to, 573

packages

finding, 23-24

installing, 25-26

create function, 472-474

creating

arrays, 58-60

classes

constructor functions, 510-511

S3, 509-511

data frames, 86-87

data.tables, 273-274

date objects, 103-104

factors, 108-110

from continuous data, 111-112

functions, 130-136, 151-155

error messages, 152-153

warnings, 153-155

generics, 515-516

lattice graphs, 346-355

lists, 71-72

with element names, 71

empty lists, 69

non-empty lists, 70

matrices, 49-52

with a single vector, 51-52

package structure, 472-474

reactive functions, 567-568

Reference Classes, 535-537

sequence of integers, 37-38

sequence of numeric values, 38-39

sequence of repeated values, 39-41

tbl_df objects, 262-263

themes for lattice graphics, 374-376

time objects, 104-105

vectors, 35-41

with c function, 35-36

CSV files, reading, 220

custom functions

applying over dimensions, 191-192

passing extra arguments, 192-193

custom plots, 333-339

aes function, 333-336

coordinate systems, 338-339

ggplot function, 333

multiple data frames, 336-338

cut function, 111

D

data, including in packages, 494-496

data aggregation

aggregate function, 252

apply functions, 250-251

calculating differences from baseline, 257-258

“for” loops, 250

data argument (lm function), 381

data frames, 86-93

apply functions, 193-195

attributes, querying, 87

columns, selecting, 88

creating, 86-87

factors, creating, 108-110

graphing, 97-98

lapply function, 203-204

referencing as a matrix, 90-92

returning top and bottom of data, 93-94

sorting, 236-237

splitting, 197-199

subscripting, 92-93

summarizing, 96

viewing, 94-96

working with multiple, 336-338

“data” lattice graphics, 354-355

data munging, 235

data types, 33-34

factors, 108-112

manipulating levels, 110-111

numeric factors, 109

reordering, 110

DataCamp, 5

data.table package, 273-282

aggregation, 280-282

columns

adding, 277-278

renaming, 277-278

rows, adding, 278-279

setting a key, 274-275

subscripting, 275-276

data.tables

counting records, 281

creating, 273-274

merging, 279-280

date objects, creating, 103-104

dates

lubridate package, 107-108

manipulating, 105-106

DBI (database interface), 225-226

decomposition, in time series analysis, 443-445

defining

function arguments, 132-133

keys, 274-275

methods

for arithmetic operators, 513-514

for Reference Classes, 537-540

S4 classes, 525-529

S4 generics, 530-531

time zones, 105

deleting packages, 24

deparse function, 166

dependencies, 27

descending sorts, 237

DESCRIPTION file, 474-475

developing a test framework, 490-494

incorporating tests into packages, 493-494

test_that function, 490-493

devices (graphics)

closing, 287-288

creating, 287-288

devtools, building packages, 482-485

diagnostic plots, 383-385

comparing, 387-394

in GLM framework, 416

for time series analysis, 449-450

diff function, 106

difftime function, 106

dimensions

dropping, 56

functions, applying, 191-192

dimnames function, 53-54

distribution types, GLM framework, 412

distributions

hist function, 160-162

statistical distributions, 119-120

documentation. See also dynamic reporting; reporting

interactive documents, 569-570

package documentation, generating, 477-482

function headers, 478-480

help pages, 480-482

R Documentation, 5

R manuals, 4-5

Reference Classes, 542

S3 class system, 518

S4 class system, 534-535

vignettes

including in packages, 496-498

markdown notation, 499

writing, 498-501

double square bracket referencing, 76-77

dplyr package, 261-273

aggregation, 268-271

grouped data, 269-270

merge function, 267-268

mutate function, 266

pipe operator, 271-273

sorting, 263

subscripting, 264-266

with filter function, 264

with select function, 264-265

tbl_df objects, creating, 262-263

dropping dimensions, 56

duplicated function, 241-242

dynamic reporting, 547-548

LaTeX, 553-556

RMarkdown, 548-552

code chunks, including, 550-552

HTML files, building, 550

dynamic typing, 19

E

EARL (Effective Applications of the R Language) conference, 6

Eclipse, 13

efficiency of code, improving

benchmarking, 457-458

initialization, 458-459

integrating with C++, 464-468

with memory management, 463-464

profiling, 456

using alternative functions, 462-463

vectorization, 459-462

elements

extracting from named lists, 84

list elements

adding, 79-80

referencing, 76-79

ellipsis, 157-159

passing graphical parameters, 159-161

empty lists, creating, 69

errors

bugs, reporting, 8

returning, 152-153

escape sequences, 21

estimating survival function in survival analysis, 432-436

example

of apply function, 184-186

of classes, 507-508

of merge function, 239

of R6 class system, 543-544

Excel

connecting to R, 226

reading structured data, 226-227

XLConnect package, 228-231

exporting text files, 220

extending

classes, 518

packages, 489-490

extensions

to GLM framework, 422-423

to nonlinear models, 430

to survival analysis, 441

to time series analysis, 452

extracting elements from named lists, 84

extractor functions, 385-386

F

facet_grid function, 329-331

facet_wrap function, 331-332

factor variables

in linear models, 398-401

in logistic regression, 419

factors, 108-112

creating, 108-110

from continuous data, 111-112

manipulating levels, 110-111

numeric factors, 109

reordering, 110

ff package, 282

file.choose function, 217

filter function, 264

finding

duplicate values, 241-242

packages, 23-24

fitted function, 385-386

flow control, if/else statements, 136-146

& and | operators, 144-145

example, 145-146

mixing conditions, 143

multiple test values, 139-140

nested statements, 138-139

returning early, 145

reversing logical values, 142-143

summarizing to a single logical, 140-141

switching with logical input, 141-142

using one condition, 139

for function, 174-176

loop variable, 175-176

“for” loops, 174, 250

foreign package, 222

formulas, using with aggregate function, 252-254

multiple return values, 253-254

summarizing by multiple variables, 252-253

summarizing multiple columns, 253

fread function, 221

function keyword, 130-131

functions

abline, 389-390

acf, 448

aes, 333-336

aggregate, 252-254

aggregate function, specifying variables, 254-256

anova, 395

apply, 181-195, 250-251

applying to data frames, 193-195

example, 184-186

margin values, 183-184

multiple margins, 186-187

passing extra arguments to “applied” function, 188-191

using with higher dimension structures, 187-188

arguments, 116

defining, 132-133

named arguments, 131

arima, 449

arrange, 263

as.numeric, 121

c, creating vectors, 35-36

calling, 116

shortened argument calling, 161-162

cast, 245-246

cbind, 49

coef, 385-386

color, 288

confint, 420

constructor functions, 510-511

coxph, 438-439

create, 472-474

creating, 130-136, 151-155

cut, 111

deparse, 166

diff, 106

difftime, 106

dimnames, 53-54

distribution functions, 119-120

duplicated, 241-242

error handling, 462

error messages, creating, 152-153

extractor functions, 385-386

facet_grid, 329-331

facet_wrap, 331-332

file.choose, 217

filter, 264

fitted, 385-386

for, 174-176

loop variable, 175-176

fread, 221

gather, 247-248

gc, 464

get, 164

ggplot, 333

glm, 413

logistic regression, 418-419

methods for, 415-416

Poisson regression, 420-422

grep, 124-125

group_by, 269-271

gsub, 124-125

head, 93-94

help, 28-29

hist, 160-162

HoltWinters, 446-447

I, 404

ifelse, 461

if/else structure, 136-146

example, 145-146

mixing conditions, 143

multiple test values, 139-140

nested statements, 138-139

returning early, 145

reversing logical values, 142-143

summarizing to a single logical, 140-141

switching with logical input, 141-142

using one condition, 139

inputs

capturing, 164-167

checking, 136, 155-157, 162-164

ellipsis, 157-159

is.x, 122

lapply, 195-204

order of “apply” inputs, 201-203

using with data frames, 203-204

using with vectors, 199-201

layout, 307-308

legend, 302-304

length, 41-42, 53

library, 27

lines, 299-300

in nonlinear models, 428

lm, 380-381

methods for, 406-407

logRange, 155

ls.str, 18-19

mathematical functions, 117-118

matrix, 51-52

melt, 243-245

merge, 238-241, 267-268

inner joins, 240

outer joins, 240-241

missing data functions, 122-123

mode, 34

mutate, 266

names, 42-43, 386-388

naming, 132

nchar, 123

ncol, 53

nested calls, 41

nls, 423-425

nrow, 53

objects, 18

odbcConnectAccess, 224

order, 236-237

output, saving, 131

pacf, 448

panel functions, 365-371

par, 304-305

paste, 124, 157-158

plot, 291-299, 383-385

in GLM framework, 416

parameters, setting, 304-305

in proportional hazards regression, 439-441

in survival analysis, 434

in time series analysis, 442-443

plyr, 213

points, 299-300

predict, 390-391

in ARIMA models, 450-451

in logistic regression, 419

in nonlinear models, 428

in survival analysis, 435

in time series analysis, 447

qplot, 314-315

layers, 316

rbind, 50, 237-238

reactive, 566-568

read.table, 218

remove.packages, 24

rep, 39-41

replace, 122

resid, 385-386

return objects, 134-136

Rprof, 456

runif, 157

sapply, 204-208

returns, 205-207

save, 22

scoping rules, 133-134

searchpaths, 17-18

select, 264-265

self-starting, 427

separate, 249

seq, 38-39

split, 195-197

spread, 248

sqlColumns, 224

statistical summary functions, 118-119

stl, 443-445

stop, 152

structure, 129-130

substitute, 166

substring, 123

summary, 96, 382-383, 405

classes and methods, 405

in GLM framework, 415-416

with names function, 388

in survival analysis, 433-434

survfit, 433-434

in proportional hazards regression, 439-441

switch, 159

table function, 121

tail, 94

tapply, 208-213

multiple grouping variables, 209-210

multiple returns, 210-212

return values, 212

test_that, 490-493

text, 300-302

ts, 441-443

tsdiag, 449-450

update, 392-393

UseMethod, 512

warning, 153

warnings, 153-155

while, 180-181

window, 443

xapply, 182

G

garbage collection, 464

gather function, 247-248

Gaussian model fitting, 414

gc function, 464

generating

classes with constructor function, 510-511

documentation

with LaTeX, 553-556

with RMarkdown, 548-552

package documentation with roxygen headers, 477-482

function headers, 478-480

help pages, 480-482

reports, 547-548

generics, 511-516

creating, 515-516

multiple dispatch, 531-532

naming conventions, 512

S4, defining, 530-531

Gentleman, Robert, 3

get function, 164

ggplot function, 333

ggplot2 package, 313

aes function, 333-336

aesthetics, 321-329

controlling, 322-324

grouped data, 327-329

legend, 324-327

combining plot types, 318-321

custom plots, 333-339

coordinate systems, 338-339

working with multiple data frames, 336-338

ggplot function, 333

global themes, 340-341

legend layout, 341

paneling, 329-333

facet_grid function, 329-331

facet_wrap function, 331-332

philosophy of, 313-314

plots

changing, 317-320

as objects, 316-317

qplot function, 314-315

layers, 316

theme layers, 339-340

ggvis package, 342

GitHub, installing packages from, 26-27

GLM (Generalized Linear Model) framework

defined, 412-413

distribution types, 412

extensions, 422-423

Gaussian model fitting, 414

glm function, 413

logistic regression, 417-420

methods for, 415-416

Poisson regression, 420-422

glm function, 413

logistic regression, 418-419

methods for, 415-416

Poisson regression, 420-422

Global Environment. See workspaces

global themes, 340-341

graphical parameters, passing, 159-161

graphics

colors, 288

devices

closing, 288

creating, 287-288

ggplot2 package, 313

aes function, 333-336

aesthetics, 321-329

combining plot types, 318-321

custom plots, 333-339

ggplot function, 333

global themes, 340-341

legend layout, 341

philosophy of, 313-314

plots as objects, 316-317

qplot function, 314-315

theme layers, 339-340

ggvis package, 342

high-level graphics functions, plot, 291-299

lattice graphics, 345

3D, 352-354

bivariate, 350-351

“data” graphics, 354-355

graph options, 356-358

graph types, 347

graphs, creating, 346-355

groups of data, representing, 360-362

panels, 362-371

plotting multiple variables, 358-360

plotting subsets of data, 355

styles, controlling, 372-376

themes, creating, 374-376

transposing the axes, 351-352

univariate, 348-350

layout, controlling, 305-308

grid layouts, 306-307

layout function, 307-308

low-level graphics functions, 299-304

legend, 302-304

lines, 299-300

points, 299-300

text, 300-302

parameters, 304-305

trellis graphics, 345

univariate graphics, 289-291

graphing

bar charts, 291

data frames, 97-98

hist function, 160-162

Greek letters, adding to plots, 294

grep function, 124-125

grid layouts, 306-307

group_by function, 269-271

grouped data, 327-329

gsub function, 124-125

H

head function, 93-94

help function, 28-29

help pages, generating, 480-482

Help pane (RStudio), 28-29

high-level graphics functions, plot, 291-299

hist function, 160-162

histograms, 289

HoltWinters function, 446-447

Holt-Winters method, 446-447

HTML files, building, 550

I

I function, 404

IDEs (integrated development environments), 13

Eclipse, 13

Notepad++, 13

R GUI, 11-12

RStudio, 12-13

ifelse function, 461

if/else statements

& and | operators, 144-145

example, 145-146

mixing conditions, 143

multiple test values, 139-140

nested statements, 138-139

returning early, 145

reversing logical values, 142-143

structure, 136-146

summarizing to a single logical, 140-141

switching with logical input, 141-142

using one condition, 139

Ihaka, Ross, 3

Import Wizard, 218

importing text files, 218

improving code efficiency

benchmarking, 457-458

initialization, 458-459

integrating with C++, 464-468

with memory management, 463-464

profiling, 456

using alternative functions, 462-463

vectorization, 459-462

incorporating tests into packages, 493-494

independent variables

factor variables as, 398-401

indexed printing, 36

inheritance, 508

in S3, 516-518

in S4, 532-534

inhibiting formula interpretation, 404

initialization, 458-459

inner joins, 240

inputs

ellipsis, 157-159

function inputs

capturing, 164-167

checking, 136, 155-157, 162-164

order of “apply” inputs, 201-203

list subscripting inputs

blank inputs, 74

negative integer inputs, 75

positive integer inputs, 74-75

vector subscripting inputs, 44

blank inputs, 44-45

character values, 48

logical values, 46-47

negative integer, 45-46

positive integers, 45

installing

packages, 24-27

from binaries, 26

from CRAN, 25-26

from source, 26-27

R, 573

on Linux, 574-575

on Mac OS X, 574

on Windows, 573-574

RStudio, 577-578

Rtools on Windows, 575-577

integers, creating sequence of, 37-38

interaction terms, 396-398

interactive documents, 569-570

intercepts, removing, 381

is.x functions, 122

iteration, loops

“for” loops, 250

nested loops, 177-179

performance, 180

referencing data with, 176-177

“while” loops, 176-177

J

J function, 275-276

joins

inner joins, 240

merging data in dplyr package, 267-268

outer joins, 240-241

K

Kaplan-Meier estimates, 433-434

keys

defining, 274-275

numeric keys, 276-277

keywords, function, 130-131

knitr package, 548

L

lapply function, 195-204

order of “apply” inputs, 201-203

using with data frames, 203-204

using with vectors, 199-201

LaTeX, 548

dynamic reporting, 553-556

lattice graphics, 345

3D, 352-354

bivariate, 350-351

“data” graphics, 354-355

graph options

plot types and formatting, 357-358

title and axes, 356-357

graphs

creating, 346-355

types, 347

groups of data, representing, 360-362

panels, 362-371

controlling strip headers, 363-364

functions, 365-371

multiple “by” variables, 364-365

plotting multiple variables, 358-360

plotting subsets of data, 355

styles

controlling, 372-376

previewing, 373

themes, creating, 374-376

transposing the axes, 351-352

univariate, 348-350

layers in quick plots, 316

layout

controlling, 305-308

layout function, 307-308

grid layouts, 306-307

legend function, 302-304

length function, 41-42, 53

library function, 27

licenses for R packages, 475

limitations of S3, 518-519

linear models, 380-381

assumptions, 411-412

factor variables, 398-401

interaction terms, 396-398

methods for, 406-407

multiple linear regression

comparing nested models, 393-395

creating new models, 391-392

updating existing models, 392-393

variable transformations, 402-404

lines function, 299-300

in nonlinear models, 428

lines on plots, adding, 389-390

Linux

installing R, 574-575

installing Rtools, 575

list objects, models as, 386-388

listing

empty lists, creating, 69

non-empty lists, creating, 70

objects, 18-19

lists, 68-86

attributes, 72-73

combining, 80

creating, 71-72

with element names, creating, 71

elements

adding, 79-80

referencing, 76-79

motivation for using, flexible simulation, 83-84

named lists, 81-82

extracting elements from, 84

printing, 72, 85-86

subscripting, 73

blank inputs, 74

character value inputs, 76

logical value inputs, 75

negative integer inputs, 75

positive integer inputs, 74-75

subsetting, 73

unnamed lists, 81

lm function, 380-381

methods for, 406-407

loading packages, 27-28

logical values

as list subscripting input, 75

as matrix subscripting input, 56-57

reversing, 142-143

specifying, 36

as vector subscripting input, 46-47

logistic regression, 417-420

logRange function, 155

loop variable, 175-176

loops

in C++, 467

“for” loops, 174, 250

initialization, 458-459

nested loops, 177-179

performance, 180

referencing data with, 176-177

“while” loops, 174

low-level graphics functions, 299-304

legend, 302-304

lines, 299-300

points, 299-300

text, 300-302

ls.str function, 18-19

lubridate package, 107-108

M

Mac OS X

installing R, 574

installing RStudio, 577-578

installing Rtools, 575

mailing lists, 4

manipulating. See also sorting

character data, 123-124

dates, 105-106

factor levels, 110-111

times, 105-106

manuals, 4-5

margin values (apply function), 183-184

Markdown, 548. See also RMarkdown

masking, 27-28

mathematical functions, 117-118

matrices, 34, 49-58

attributes, 52-54

column index, 55

creating, 49-52

with a single vector, 51-52

dropping dimensions, 56

referencing data frames as, 90-92

subscripting, 55

character values, 57-58

logical values, 56-57

transposing, 50-51

matrix function, 51-52

melt function, 243-245

memory management, 463-464

merge function, 238-241, 267-268

inner joins, 240

outer joins, 240-241

merging data.tables, 279-280

METACRAN website, 24

methods, 512

defining for arithmetic operators, 513-514

for GLM framework, 415-416

for linear models, 406-407

parametric methods in survival analysis, 434-435

for Reference Classes, defining, 537-540

S4, 529-530

summary function and, 405

updating, 513

microbenchmark package, 457-458

Microsoft Excel. See Excel

missing data functions, 122-123

mode function, 34

models, 379

assessing, 382

abline function, 389-390

extractor functions, 385-386

interaction terms, 396-398

as list objects, 386-388

plot function, 383-385

predict function, 390-391

summary function, 382-383

GLM framework

defined, 412-413

distribution types, 412

extensions, 422-423

Gaussian model fitting, 414

glm function, 413

logistic regression, 417-420

methods for, 415-416

Poisson regression, 420-422

linear models, 380-381

assumptions, 411-412

factor variables, 398-401

interaction terms, 396-398

methods for, 406-407

variable transformations, 402-404

multiple linear regression

comparing nested models, 393-395

creating new models, 391-392

updating existing models, 392-393

nonlinear regression

assumptions, 423

extensions, 430

nls function, 423-425

Puromycin data example, 425-429

survival analysis, 430

censoring in, 431-432

estimating survival function, 432-436

extensions, 441

ovarian data frame example, 431

proportional hazards regression, 437-441

time series analysis

ARIMA models, 448-451

autocorrelations, 448

decomposition, 443-445

extensions, 452

smoothing, 446-447

ts function, 441-443

modes. See data types

motivation for using lists, flexible simulation, 83-84

multimode data structures, 36, 67-68

data frames, 86-93

apply functions, 193-195

attributes, querying, 87

columns, selecting, 88

columns, subscripting, 88-90

creating, 86-87

graphing, 97-98

lapply function, 203-204

referencing as a matrix, 90-92

returning top and bottom of data, 93-94

sorting, 236-237

splitting, 197-199

subscripting, 92-93

viewing, 94-96

working with multiple, 336-338

lists, 68-86

attributes, 72-73

creating, 71-72

with element names, creating, 71

empty lists, creating, 69

motivation for using, 83-84

named lists, 81-82

non-empty lists, creating, 70

printing, 72, 85-86

subscripting, 73

unnamed lists, 81

multiple dispatch, 531-532

multiple linear regression

comparing nested models, 393-395

creating new models, 391-392

updating existing models, 392-393

Murrell, Paul, 313

mutable objects, 538-539

mutate function, 266

N

named arguments, 131

named lists, 81-82

extracting elements from, 84

names function, 42-43, 386-388

NAMESPACE file, 475-476

naming

functions, 132

generics, 512

objects, 20

S3 classes, 512

variables, 241

navigating to CRAN, 573

nchar function, 123

ncol function, 53

negative integer inputs, 45-46, 75

nested calls, 41

nested loops, 177-179

nested models, comparing, 393-395

nicknames, 7

nls function, 423-425

non-empty lists, creating, 70

nonlinear regression

assumptions, 423

extensions, 430

nls function, 423-425

Puromycin data example, 425-429

Notepad++, 13

nrow function, 53

numeric factors, 109

numeric keys, 276-277

numeric values

creating sequence of, 38-39

simulating, 83-84

O

object orientation, 505-508

inheritance, 508

R and, 405-406

objects, 16-22. See also packages

converting, 156-157

date objects, creating, 103-104

listing, 18-19

mutable objects, 538-539

naming, 20

packages, 17

search path, 17-18

plots as, 316-317

Reference Class objects, copying, 540-542

removing from workspace, 20

return objects, 134-136

saving, 22

tbl_df objects, creating, 262-263

time objects, creating, 104-105

workspaces, 19-22

objects function, 18

odbcConnectAccess function, 224

online resources, 4-5

operating systems

Mac OS X

installing R, 574

installing RStudio, 577-578

installing Rtools, 575

Windows

building packages, 482

clipboard, 219

operators, 117-118

&, 144-145

arithmetic operators, defining methods for, 513-514

pipe, 248, 271-273

order function, 236-237

outer joins, 240-241

output of functions, saving, 131

ovarian data frame example (survival analysis), 431

P

pacf function, 448

packages, 7, 17, 23-28

bigmemory, 282

building, 471-472

with devtools, 482-485

checking, 482-484

code quality, 476-477

data, including, 494-496

data.table, 273-282

aggregation, 280-282

columns, adding, 277-278

columns, renaming, 277-278

merging data tables, 279-280

rows, adding, 278-279

setting a key, 274-275

subscripting, 275-276

deleting, 24

dependencies, 27

documentation, generating with roxygen headers, 477-482

dplyr, 261-273

aggregation, 268-271

merge function, 267-268

mutate function, 266

pipe operator, 271-273

sorting, 263

subscripting, 264-266

dplyr package, creating tbl_df objects, 262-263

extending, 489-490

ff, 282

finding, 23-24

foreign, 222

ggplot2, 313

aes function, 333-336

aesthetics, 321-329

combining plot types, 318-321

paneling, 329-333

philosophy of, 313-314

plots as objects, 316-317

qplot function, 314-315

ggplot2 package

ggplot function, 333

global themes, 340-341

legend layout, 341

theme layers, 339-340

ggvis package, 342

installing, 24-27, 485

from binaries, 26

from CRAN, 25-26

from source, 26-27

knitr, 548

lattice, 346

licenses, 475

loading, 27-28

lubridate, 107-108

masking, 28

METACRAN website, 24

microbenchmark, 457-458

proto, 544

Rcpp, 501-502

repositories, 23

reshape, 243

cast function, 245-246

melt function, 243-245

RODBC, 223-225

sas7bdat, 223

search path, 17-18

Shiny, 561-566

applications, 561-566

interactive documents, 569-570

reactive functions, 566-568

sharing applications, 570

structure, 472-476

creating, 472-474

DESCRIPTION file, 474-475

NAMESPACE file, 475-476

tests, incorporating, 493-494

tidyr, 246-249

gather function, 247-248

separate function, 249

spread function, 248

vignettes, 496-498

markdown notation, 499

writing, 498-501

XLConnect, 228-231

zoo, 123

Packages pane (RStudio), 24

paneling, 329-333

facet_grid function, 329-331

facet_wrap function, 331-332

with lattice graphics, 362-371

controlling strip headers, 363-364

functions, 365-371

multiple “by” variables, 364-365

par function, 304-305

parameters, setting for plotting functions, 304-305

parametric methods in survival analysis, 434-435

passing graphical parameters, 159-161

paste function, 124, 157-158

performance, loop performance, 180

pipe operator, 248, 271-273

plot function, 291-299, 383-385

in GLM framework, 416

paneling, facet_grid function, 329-331

parameters, setting, 304-305

in proportional hazards regression, 439-441

qplots, layers, 316

in survival analysis, 434

in time series analysis, 442-443

plots

custom plots, 333-339

aes function, 333-336

coordinate systems, 338-339

ggplot function, 333

multiple data frames, 336-338

diagnostic plots, 383-385

comparing, 387-394

in GLM framework, 416

for time series analysis, 449-450

lines on, adding, 389-390

in nonlinear models, 428-429

as objects, 316-317

paneling, 329-333

quick plots, 314-315

faceting, 333

layers, 316

symbols, 296-297

types, 298-299

changing, 317-320

types, combining, 318-321

plyr function, 213

points function, 299-300

Poisson regression, 420-422

positive integer inputs, 45, 74-75

POSIX functions, 105

pre-allocation, 458-459

predict function, 390-391

in ARIMA models, 450-451

in logistic regression, 419

in nonlinear models, 428

in survival analysis, 435

in time series analysis, 447

previewing lattice graphics styles, 373

printing

indexed printing, 36

lists, 72, 85-86

profiling code, 456

proportional hazards regression, 437-441

proto package, 544

Puromycin data example (nonlinear regression), 425-429

Q

qplot function, 314-315

faceting, 333

layers, 316

QQ plots, 289

quality of code, 476-477

querying

data frame attributes, 87

vector attributes, 41-43

quotes, 34

object naming conventions, 20

development of, 3, 7-8

installing, 573

on Linux, 574-575

on Mac OS X, 574

on Windows, 573-574

nicknames, 7

object orientation and, 405-406

resources, 4-6

syntax, 14-16

user events, 6

versions, 7-8

R

R Console, 14-15

R Consortium, 3, 5-6

R Development Core Team, 3

R Documentation, 5

R GUI, 11-12

R models. See models

R6 class system, 542-544

active bindings, 544

example of, 543-544

private members, 542

public members, 542

rbind function, 50, 237-238

Rcpp package, 464-468, 501-502

.RData format, 221

reading

CSV files, 220

structured data from Excel, 226-227

text files, 218-220

read.table function, 218

recommended packages, 23

records, counting, 281

re-creating simulated values, 120

Reference Classes, 535-542

creating, 535-537

documenting, 542

methods, defining, 537-540

objects, copying, 540-542

referencing

columns, 179-180

data frames as a matrix, 90-92

data with loops, 176-177

list elements, 76-79

with $, 77-79

double square bracket referencing, 76-77

regular expressions, 124, 182

relational databases, 223-226

DBI, 225-226

RODBC package, 223-225

remove.packages function, 24

removing

classes, 510

intercepts, 381

objects from workspace, 20

renaming columns, 277-278

reordering factors, 110

rep function, 39-41

repeated values, creating sequence of, 39-41

replace function, 122

reporting

bugs, 8

dynamic reporting, 547-548

LaTeX, 553-556

RMarkdown, 548-552

repositories

CRAN

METACRAN website, 24

packages, finding, 23-24

for packages, 23

representing groups of data, 360-362

reshape package, 243

cast function, 245-246

melt function, 243-245

resid function, 385-386

restoring R sessions, 221

restructuring, 242-249

with reshape package, 243

cast function, 245-246

melt function, 243-245

with tidyr package, 246-249

gather function, 247-248

spread function, 248

return objects, 134-136

returning error messages, 152-153

reversing logical values, 142-143

RExcel, 13

RMarkdown, dynamic reporting, 548-552

code chunks, including, 550-552

HTML files, building, 550

RODBC package, 223-225

rows, adding, 278-279

roxygen headers, generating documentation with, 477-482

function headers, 478-480

help pages, 480-482

Rprof function, 456

RStudio, 12-13

data frames, viewing, 94-96

Help pane, 28-29

Import Wizard, 218

installing, 577-578

packages, loading, 27-28

Packages pane, 24

script window, 132

sessions, restoring, 221

Source pane, 16

text files

importing, 218

reading, 218-220

Rtools, installing on Windows, 575-577

runif function, 157

S

S, development of, 1-3

S3 class system, 406, 509

classes, creating, 509-511

documenting, 518

inheritance, 516-518

limitations of, 518-519

lists versus attributes, 514-515

naming conventions, 512

S4 class system, 523-535

defining classes, 525-529

documenting, 534-535

generics, defining, 530-531

inheritance, 532-534

methods, 529-530

multiple dispatch, 531-532

sapply function, 204-208

returns, 205-207

Sarkar, Deepayan, 346

sas7bdat package, 223

save function, 22

saving

function output, 131

workspace objects, 22

workspaces, 221-222

scoping rules for functions, 133-134

script window (RStudio), 132

scripting, 16

search path, 17-18

masking, 28

searching and replacing character data, 124-125

searchpaths function, 17-18

select function, 264-265

selecting columns from data frames, 88

self-starting functions, 427

separate function, 249

seq function, 38-39

sequence of repeated values, creating, 39-41

server component of Shiny applications, 564-566

sharing Shiny applications, 570

Shiny package, 561-566

applications

server component, 564-566

sharing, 570

structure, 561-562

ui component, 562-564

interactive documents, 569-570

reactive functions, 566-568

shortened $ referencing, 78-79

simulated values, re-creating, 120

simulating numeric values, 83-84

single mode data structures, 34-35. See also multimode data structures

arrays, 58-60

creating, 58-60

comparing, 60-62

matrices, 49-58

attributes, 52-54

column index, 55

creating, 49-52

dropping dimensions, 56

subscripting, 55

transposing, 50-51

vectors, 35-49

attributes, 41-43

combining, 49-51

creating, 35-41

lapply function, 199-201

subscripting, 43-49

smoothing in time series analysis, 446-447

sorting

with arrange function, 263

data frames, 236-237

descending sorts, 237

Source pane (RStudio), 16

special characters, adding to plots, 294

specifying

colors, 288

logical values, 36

variables for aggregate function, 254-256

split function, 195-197

splitting data frames, 197-199

S-PLUS, 3

spread function, 248

sqlColumns function, 224

statistical distributions, 119-120

statistical models. See models

Statistical Sciences, Inc., 3

statistical summary functions, 118-119

missing data, 122-123

stl function, 443-445

stop function, 152

structure

of functions, 129-130

of if/else statements, 136-146

of R packages, 472-476

creating, 472-474

DESCRIPTION file, 474-475

NAMESPACE file, 475-476

of Shiny applications, 561-562

tidy structure, 243

structured data, reading from Excel, 226-227

styles for lattice graphics

controlling, 372-376

previewing, 373

subscripting, 60-62

arrays, 60

columns, 88-90

data frames, 92-93

data.tables, 275-276

with filter function, 264

lists, 73

blank inputs, 74

character value inputs, 76

logical values, 75

negative integer inputs, 75

positive integer inputs, 74-75

matrices, 55

character values, 57-58

logical values, 56-57

with select function, 264-265

vectors, 43-49

blank inputs, 44-45

character values, 48

logical values, 46-47

negative integers, 45-46

positive integers, 45

subsets of time series, 443

subsetting lists, 73

substitute function, 166

substring function, 123

summarizing data frames, 96

summary function, 96, 382-383, 405

classes and methods, 405

in GLM framework, 415-416

with names function, 388

in survival analysis, 433-434

survfit function, 433-434

in proportional hazards regression, 439-441

survival analysis, 430

censoring in, 431-432

estimating survival function, 432-436

extensions, 441

ovarian data frame example, 431

proportional hazards regression, 437-441

switch function, 159

symbols, plotting symbols, 296-297

syntax

comment blocks, 15

continuation prompts, 15

lists

named lists, 81-82

unnamed lists, 81

R Console, 14-15

T

table function, 121

tail function, 94

tapply function, 208-213

multiple grouping variables, 209-210

multiple returns, 210-212

return values, 212

Task Views, 23-24

tbl_df objects, creating, 262-263

test framework, developing, 490-494

incorporating tests into packages, 493-494

test_that function, 490-493

test_that function, 490-493

test-driven development, 494

text files, 217-223

exporting, 220

importing, 218

reading, 218-220

text function, 300-302

theme layers, 339-340

themes, creating for lattice graphics, 374-376

tidy data, 243

tidyr package, 246-249

gather function, 247-248

separate function, 249

spread function, 248

tilde (~), formula relationships, 381

time

lubridate package, 107-108

manipulating, 105-106

time objects, creating, 104-105

time series analysis

ARIMA models, 448-451

autocorrelations, 448

decomposition, 443-445

extensions, 452

smoothing, 446-447

ts function, 441-443

time zones, defining, 105

titles, labeling on plots, 293-294

transforming variables, 402-404

transposing matrices, 50-51

trellis graphics, 345

ts function, 441-443

tsdiag function, 449-450

U

ui component of Shiny applications, 562-564

univariate graphics, 289-291

lattice, 348-350

unnamed lists, 81

update function, 392-393

updating methods, 513

UseMethod function, 512

user events, 6

V

variables

continuous variables, creating factors, 111-112

factor variables

in linear models, 398-401

in logistic regression, 419

loop, 175-176

naming, 241

plotting, 358-360

specifying for aggregate function, 254-256

transforming, 402-404

univariate graphics, 289-291

lattice, 348-350

vectorization, 459-462

vectors, 15, 34-49

attributes, 41-43

combining, 49-51

creating, 35-41

with c function, 35-36

lapply function, 199-201

subscripting, 43-49

blank inputs, 44-45

character values, 48

logical values, 46-47

negative integers, 45-46

positive integers, 45

versions of R, 7-8

nicknames, 7

viewing data frames, 94-96

vignettes, 477

including in packages, 496-498

markdown notation, 499

writing, 498-501

Visualizing Data, 345

visualizing data frames, 97-98

W

warnings for functions, returning, 153-155

websites

METACRAN, 24

R Documentation, 5

R Project website, 3

which argument (plot function), 385

while function, 180-181

“while” loops, 174

white space, 45

Wickham, Hadley, 213, 242, 261, 313

window function, 443

Windows operating system

building packages, 482

clipboard, 219

installing R, 573-574

installing RStudio, 577-578

installing Rtools, 575

working directory, 21

workspaces, 19-22

objects

removing, 20

saving, 22

saving, 221-222

working directory, 21

writing

classes, 505

generics, 511-516

object orientation, 506-508

S3, 509

vignettes, 498-501

X

xapply function, 182

X-axis, labeling on plots, 293-295

Xcode, installing Rtools, 575

XLConnect package, 228-231

Y-Z

Y-axis, labeling on plots, 293-295

zoo package, 123
