Index

a priori algorithm
1. association rules
  1. minimum confidence
  2. Modeler results
  3. one antecedent
  4. two antecedents
  5. two-step process
2. frequent itemsets
ADABoost algorithm
1. final boosted classifier
2. initial base classifier
3. original dataset
4. second base classifier
5. third base classifier
adjusted cost matrix
1. bank loan
2. equivalent cost
3. false negative cost
4. false positive cost
5. retailer cost
analysis of variance (ANOVA)
1. Minitab results
2. MSTR
3. multiple regression model
4. R code
5. sample mean age
6. sum of squares
artificial neuron model
association rules
1. a priori property (see a priori algorithm)
2. affinity analysis
3. antecedent and consequent
4. business and research
5. categorical data
6. confidence and support
7. frequent itemsets
8. J-measure
9. lift ratio
10. market basket analysis
11. patterns and models
12. R code
13. strong rules
14. supervised/unsupervised learning
15. worst case scenario
attribute-relation file format (ARFF) file
back-propagation algorithm
1. cross validation termination
2. downstream node
3. error propagation
4. learning rate
5. momentum term
6. squared prediction error
7. upstream node
bagging model
1. algorithm for
2. bootstrap samples
3. vs. CART model
4. prediction method
5. R code
6. stable/unstable classification
balanced iterative reducing and clustering using hierarchies (BIRCH) clustering
1. bank loans data set
  1. cost matrix
  2. data sorting
  3. No Interest model
  4. With Interest model
2. CF/CF tree
  1. Additivity Theorem
  2. algorithm
  3. building process
  4. clustering sub-clusters
  5. definition
  6. one-dimensional toy data set
  7. radius
  8. tree structure
3. Modeler's two-step algorithm
4. optimal number of clusters
5. pseudo-F statistic method
6. R code
7. two-step clustering
baseline model
1. Captain Kirk's situation
2. regression model
Bayesian approach see also Naïve Bayes classifier
1. balancing data set
2. drawbacks
3. frequentist/classical approach
4. likelihood function
5. MAP method (see maximum a posteriori (MAP))
6. marginal distribution
7. MCMC methods
8. posterior distribution
9. posterior odds ratio
10. prior distribution
11. R code
Bayesian belief networks (BBNs)
1. clothing purchase
2. conditional probability
3. directed acyclic graph
4. joint probability distribution
5. prior probabilities
6. WEKA
  1. Explorer Panel
  2. positive and negative classification
  3. prior probabilities
  4. test set predictions
bias–variance trade-off
boosting model
1. ADABoost algorithm
  1. final boosted classifier
  2. initial base classifier
  3. original dataset
  4. second base classifier
  5. third base classifier
2. vs. CART model
3. R code
C4.5 algorithm
1. adult data set
2. candidate splits
3. capital gains
4. categorical variables
5. decision node A
6. entropy reduction
7. initial split
8. marital status
9. numerical variables
10. savings split
11. threshold partition
12. training data set
churn data set
1. account length
2. adult data set
3. age predictor
4. area code field
5. balanced data set
6. categorical variables
  1. clustered bar chart
  2. comparative pie chart
  3. directed web graph
  4. International Plan
  5. marginal distribution
  6. non-churners
  7. row percentages
  8. software packages
  9. two-way interaction
  10. voice mail plan
7. clustering analysis
  1. CART decision trees
  2. churn proportion
  3. contingency tables
  4. international plan people
  5. no-plan majority
  6. voice mail plan people
8. conditional independence
9. continuous predictor (see continuous predictor)
10. correlation coefficient
  1. account length
  2. matrix plot
  3. Minitab regression tool
  4. optimal solution
  5. p-values
  6. thresholds
11. customer service calls
12. data preparation
  1. contingency table
  2. HighDayEveMins_Flag variable
  3. voice mail messages
  4. z-score standardization
13. day minutes
14. dichotomous predictor (see dichotomous predictor)
15. education-num variable
16. field values
17. flag variables
18. hours-per-week
19. income overlay
20. International Plan
21. maximum a posteriori
  1. complement probabilities
  2. conditional probability
  3. International Plan
  4. joint conditional probabilities
  5. marginal and conditional probabilities
  6. posterior probabilities
  7. Voice Mail Plan
22. multivariate graphics
23. numerical predictors
  1. binning methods
  2. churn proportion
  3. churners vs. non-churners
  4. customer service call
  5. International Calls
  6. normalized and non-normalized histogram
  7. t-test
24. numerical variables
25. polychotomous predictor (see polychotomous predictor)
26. posterior odds ratio
27. vs. variables
28. visualization
29. voice mail plan
30. VoiceMail Plan adopters
classification and regression trees (CART)
1. adult data set
2. bank loans
3. candidate splits
4. capital gains
5. categorical variables
6. classification error
7. components
8. contingency table
9. cost matrix
10. data-driven misclassification costs
11. decision node A
12. decision node B
13. decision tree output
14. estimated revenue increase
15. evaluation measures
16. initial split
17. lift chart
18. marital status
19. maximum value
20. numerical variables
21. optimal split
22. scaled cost matrix
23. training data set
cluster feature (CF)
1. Additivity Theorem
2. building process
3. clustering sub-clusters
4. definition
5. one-dimensional toy data set
6. radius
7. tree structure
cluster validation
1. cross-validation
  1. loans data sets
  2. methodology
  3. prediction strength
  4. R code
2. loans data sets
3. methodology
4. prediction strength
5. pseudo-F statistic method
  1. clustering model
  2. distribution
  3. Iris data set
  4. R code
  5. SSB and SSE
6. R code
7. silhouette method
  1. cohesion/separation
  2. Iris data set
  3. mean silhouette
  4. positive/negative values
  5. R code
clustering analysis
1. CART decision trees
2. churn proportion
3. contingency tables
4. definition
5. hierarchical clustering
  1. agglomerative clustering
  2. complete-linkage clustering
  3. divisive clustering methods
  4. single-linkage clustering
6. international plan people
7. k-means clustering algorithm
  1. data points
  2. definition
  3. MSE
  4. processing steps
  5. pseudo-F statistic method
  6. SAS Enterprise Miner (see churn data set)
  7. statistics behavior
8. no-plan majority
9. R code
10. voice mail plan people
confidence interval
1. customer service call
2. lower bound
3. margin of error
4. population proportion
5. subgroup analyses
6. t-interval
7. upper bound
continuous predictor
1. categorical predictor
2. confidence intervals
3. day minute usage
4. deviance
5. p-value
6. test statistics
7. unit-increase interpretation
Cook's distance
correlation coefficient
1. account length
2. matrix plot
3. Minitab regression tool
4. optimal solution
5. p-values
6. PCA
7. thresholds
cost-benefit analysis
1. CART model
  1. contingency table
  2. cost matrix
  3. estimated revenue increase
  4. evaluation measures
  5. scaled cost matrix
2. cost matrix
3. decision invariance
  1. binary classifier
  2. scaling
4. direct cost
5. k-nary classification
  1. accuracy
  2. contingency table
  3. Loans data sets
  4. overall error rate
  5. predicted/actual categories
  6. sensitivity
6. Loans data set
  1. adjusted cost matrix
  2. assumptions
  3. CART model
  4. direct cost matrix
  5. simplified cost matrix
  6. strategies
7. opportunity cost
8. positive classification
  1. adjusted cost matrix
  2. C5.0 models
9. R code
10. rebalancing cost
  1. CART model
  2. confidence and positive confidence
  3. definition
  4. network models
11. trinary classification
  1. accuracy
  2. assumptions
  3. contingency table
  4. cost calculation
  5. cost matrix
  6. false negative
  7. false positive
  8. number of customers
  9. number of records
  10. overall error rate
  11. predicted/actual categories
  12. principal and interest
  13. true negative
  14. true positive
cross-industry standard process for data mining (CRISP-DM)
1. adaptive process
2. business understanding phase
3. business/research phase
4. clustering analysis
  1. BIRCH clustering algorithm
  2. cluster profiles
  3. cross-validation
  4. k-means clustering
5. data phase
6. data preparation phase
  1. deriving flag variable
  2. negative amounts
  3. product uniformity
  4. standardization
7. data understanding phase
  1. absolute pairwise correlation
  2. continuous predictors
  3. dataset, fields
  4. de-transformation
  5. lifestyle cluster types
  6. missing values
  7. predictors and response
  8. zip code fields
8. deployment phase
9. evaluation phase
10. modeling and evaluation strategy
  1. baseline model
  2. cost-benefit analysis
  3. high performance model
  4. input variables
  5. misclassification cost
  6. model voting
  7. processing steps
  8. profitable classification model
  9. propensity averaging
  10. rebalanced data set
11. modeling phase
12. principal components analysis
  1. data set partitioning
  2. input variables
  3. low communality predictors
  4. principal component profiles
  5. rotated component matrix
cross-validation
customer service calls (CSC) see polychotomous predictor
data balancing
data cleaning
1. age field
2. American zip code
3. data set
4. income field
5. marital status field
6. measures of center
  1. customer service calls
  2. measures of location
  3. measures of spread
  4. price/earning ratio
  5. standard deviation
7. missing data
  1. data imputation method
  2. field values
  3. frequency distribution
  4. random values
  5. replacement values
  6. variable brand
8. outliers
9. poverty
10. R code
11. transaction amount field
data imputation method
data preparation
1. contingency table
2. HighDayEveMins_Flag variable
3. voice mail messages
4. z-score standardization
data summarization
1. bivariate relationship
2. boxplot
3. discrete variable
4. levels of measurement
5. measures of center
6. measures of position
7. measures of variability
8. qualitative/quantitative variable
data transformation
1. binning methods
2. categorical variables
  1. reclassification
  2. region_num variable
  3. survey_response variable
3. correlated variables
4. decimal scaling
5. donation_dollar field
6. duplicate records
7. flag variables
8. ID fields
9. index field
10. min–max normalization
11. R code
12. unary variables
13. Z-score standardization
  1. inverse_sqrt (weight) transformation
  2. natural log transformation
  3. negative standardization
  4. normal probability plot
  5. normal Z distribution
  6. outliers
  7. positive standardization
  8. skewness
  9. square root transformation
  10. weighted data
data visualization
1. bar chart
2. bivariate relationship
3. cumulative frequency distribution
4. dotplot
5. frequency distribution
6. histogram
7. pie chart
8. skewness
9. stem-and-leaf display
data-driven misclassification costs see cost-benefit analysis
decision tree
1. C4.5 algorithm, information-gain
  1. adult data set
  2. candidate splits
  3. capital gains
  4. categorical variables
  5. decision node A
  6. entropy reduction
  7. initial split
  8. marital status
  9. numerical variables
  10. savings split
  11. threshold partition
  12. training data set
2. CART (see classification and regression trees (CART))
3. credit risk
4. decision rules
5. diverse attributes
6. R code
7. requirements
dichotomous predictor
1. reference cell coding
2. voice mail plan
dimension-reduction method
1. applications
2. factor analysis (see factor analysis)
3. houses data set
  1. median income
  2. predictor variables
4. multicollinearity
5. PCA (see principal components analysis (PCA))
6. R code
7. user-defined composites
  1. definition
  2. houses data set
  3. measurement error
  4. summated scales
direct cost matrix
distance function
1. age variable
2. Euclidean distance
3. min–max normalization
4. properties
5. Z-score standardization
EDA see exploratory data analysis (EDA)
ensemble methods
1. bagging model
  1. algorithm for
  2. bootstrap samples
  3. vs. CART model
  4. prediction method
  5. R code
  6. stable/unstable classification
2. bias-variance trade-off
3. boosting model
  1. adaptive boosting (see ADABoost algorithm)
  2. algorithm for
  3. vs. CART model
  4. R code
4. model voting
  1. alternative models
  2. contingency tables
  3. evaluative measures
  4. majority classification
  5. processing steps
  6. R code
  7. working test data set
5. prediction error
6. propensity averaging
  1. evaluative measures
  2. histogram model
  3. m base classifiers
  4. processing steps
exploratory data analysis (EDA)
1. churn data set (see churn data set)
2. data understanding phase
  1. absolute pairwise correlation
  2. de-transformation
  3. predictors and response
3. vs. hypothesis testing
4. R code
5. segmentation modeling
  1. capital gains/losses
  2. contingency tables
  3. overall error rate
factor analysis model
1. adult data set
  1. Bartlett's test
  2. correlation matrix
  3. factor loadings
  4. KMO statistics
  5. principal axis
2. factor rotation
  1. oblique rotation method
  2. orthogonal rotation
  3. percentage of variance
  4. rotated vectors
  5. unrotated vectors
  6. varimax rotation
flag variables
GAs see genetic algorithms (GAs)
gas mileage prediction
1. backward elimination
2. best subsets method
3. forward selection method
4. Mallows' C_p statistics
  1. predictors
  2. regression assumptions
5. stepwise selection regression
6. target variable MPG
generalized rule induction (GRI) method
genetic algorithms (GAs)
1. crossover operator
  1. definition
  2. multi-point crossover
  3. real-valued data
  4. uniform crossover
2. framework
3. mutation operator
4. neural networks
  1. backpropagation
  2. feed-forward nature
  3. learning method
  4. modified discrete crossover
  5. random shock mutation
  6. sum of squared errors
  7. topology and operation
5. R code
6. selection operator
  1. Boltzmann selection
  2. crowding phenomenon
  3. definition
  4. elitism
  5. fitness sharing
  6. rank selection
  7. sigma scaling
  8. tournament ranking
7. terminologies
8. WEKA
  1. AttributeSelectedClassifier
  2. class distribution
  3. initial population characteristics
  4. Preprocess tab
  5. WrapperSubsetEval evaluation method
gradient-descent method
graphical evaluation
1. gains charts
2. lift chart
3. profits charts
4. R code
5. response charts
6. return-on-investment charts
hierarchical clustering
1. agglomerative clustering
2. complete-linkage clustering
3. divisive clustering methods
4. single-linkage clustering
hypothesis testing
1. confidence interval
2. criminal trial, outcomes
3. null hypothesis
4. p-value
5. population proportion
6. standard error
7. treatment
indicator variable
1. cereals, y-intercepts
2. estimated nutritional rating
3. p-values
4. parallel planes
5. reference category
6. regression coefficient values
7. relative estimation error
8. shelf effect
instance-based learning
1. issues
2. sodium/potassium ratio
3. training data points
4. voting
k-means clustering algorithm
1. data points
2. definition
3. MSE
4. processing steps
5. pseudo-F statistic method
6. SAS Enterprise Miner (see churn data set)
7. statistics behavior
k-nary classification
1. accuracy
2. contingency table
3. Loans data sets
4. overall error rate
5. predicted/actual categories
6. sensitivity
k-nearest neighbor (KNN) algorithm
1. classification
  1. data set
  2. income bracket
2. ClassifyRisk data set
3. combination function
  1. simple unweighted voting
  2. weighted voting
4. cross-validation approach
5. database
6. distance function
  1. age variable
  2. Euclidean distance
  3. min–max normalization
  4. properties
  5. Z-score standardization
7. instance-based learning
  1. issues
  2. sodium/potassium ratio
  3. training data points
  4. voting
8. locally weighted averaging
9. Modeler results
10. outliers/unusual observations
11. R code
Kaiser–Meyer–Olkin (KMO) statistics
Kohonen networks
1. age and income data set
2. algorithm
3. CART decision tree model
4. cluster profiles
5. flag variables
6. International Plan adopters
7. mean analysis
8. numerical variables
9. R code
10. SOM
  1. architecture
  2. characteristic processes
  3. goal
  4. networks connection
11. topology
12. validation
13. variables distribution
14. VoiceMail Plan adoption
logistic regression model
1. conditional mean
2. disease vs. age
3. linear regression model
4. logit transformation
5. maximum-likelihood estimation
  1. confidence interval
  2. interpretation
  3. likelihood ratio test
  4. log-likelihood estimators
  5. mean square regression
  6. negative response
  7. parameters
  8. positive response
  9. saturated model
  10. Wald test, parameters
6. odds ratio (see odds ratio (OR))
7. R code
8. sigmoidal curve
9. training data set
  1. education variable
  2. marital status
10. WEKA
  1. explorer panel
  2. RATING field
  3. regression coefficients
  4. test set prediction
  5. training file
market basket analysis
Markov chain Monte Carlo (MCMC) methods
maximum a posteriori (MAP), churn data set
1. complement probabilities
2. conditional probability
3. International Plan
4. joint conditional probabilities
5. marginal and conditional probabilities
6. posterior probabilities
7. Voice Mail Plan
McKinsey Global Institute (MGI) report
1. association task
2. classification
  1. income bracket
  2. sodium/potassium ratio
3. clustering
4. continuous quality monitoring
5. CRISP-DM
  1. adaptive process
  2. business/research phase
  3. data phase
  4. deployment phase
  5. evaluation phase
  6. modeling phase
6. estimation model
7. factors
8. Forbes magazine
9. HMO
10. patterns and trends
11. prediction
12. problem solving, human process
13. profitable results
14. R code
15. software packages
16. tools
mean absolute error (MAE)
mean square error (MSE)
mean square treatment (MSTR)
missing data imputation
1. CART model
2. data weighting
3. flag variable
4. multiple regression model
5. R code
6. SEI formula
model evaluation techniques
1. classification task
  1. accuracy
  2. building and data model
  3. C5.0 model
  4. contingency table
  5. cost/benefit analysis
  6. error rate
  7. false negative
  8. false-negative rate
  9. false-positive
  10. false-positive rate
  11. financial lending firm
  12. gains chart
  13. income classification
  14. lift charts
  15. misclassification cost adjustment
  16. true negative
  17. true positive
2. description task
3. estimation and prediction tasks
  1. MAE
  2. MSE
  3. standard error of the estimate
4. R code
model voting process
1. alternative models
2. contingency tables
3. evaluative measures
4. majority classification
5. processing steps
6. R code
7. working test data set
multicollinearity
1. correlation coefficients
2. fiber variable
3. matrix plot
4. potassium variable
5. stability coefficient
6. user-defined composite
7. variable coefficients
8. variance inflation factor
multinomial data
1. chi-square test
2. expected frequency
3. observed frequency
4. R code
5. test statistics
multiple regression model
1. ANOVA table
2. coefficient of determination, R²
3. confidence interval
  1. mean value, y
  2. particular coefficient, β_i
4. estimation error
5. indicator variable
  1. cereals, y-intercepts
  2. estimated nutritional rating
  3. p-values
  4. parallel planes
  5. reference category
  6. regression coefficient values
  7. relative estimation error
  8. shelf effect
6. inference
  1. F-test
  2. t-test
7. multicollinearity
  1. correlation coefficients
  2. fiber variable
  3. matrix plot
  4. potassium variable
  5. stability coefficient
  6. user-defined composite
  7. variable coefficients
  8. variance inflation factor
8. nutritional rating vs. sugars
9. population
10. prediction interval
11. predictor variables
12. principal components
  1. Box–Cox transformation
  2. component values
  3. unrotated and rotated component weights
  4. varimax-rotated solution
13. R code
14. regression plane/hyperplane
15. slope coefficients
16. Spoon Size Shredded Wheat
17. SSR
18. three-dimensional scatter plot
19. variable selection method (see variable selection method)
Naïve Bayes classifier see also Bayesian approach
1. conditional independence
2. posterior odds ratio
3. predictor variables
4. WEKA
  1. ARFF
  2. conditional probabilities
  3. Explorer Panel
  4. load training file
  5. test set predictions
5. zero-frequency cells
neural network model
1. adult data set
2. artificial neuron model
3. back-propagation algorithm
  1. cross validation termination
  2. downstream node
  3. error propagation
  4. learning rate
  5. momentum term
  6. squared prediction error
  7. upstream node
4. combination function
5. data preprocessing
6. estimation and prediction
7. gradient-descent method
8. hidden layer
9. input and output encoding
  1. categorical variables
  2. dichotomous classification
  3. drawback
  4. min–max normalization
  5. thresholds
10. input layer
11. output layer
12. prediction accuracy
13. R code
14. real neuron
15. sensitivity analysis
16. sigmoid function
neural networks
1. backpropagation
2. feed-forward nature
3. learning method
4. modified discrete crossover
5. random shock mutation
6. sum of squared errors
7. topology and operation
odds ratio (OR)
1. assumptions
  1. capnet variable
  2. churn overlay
  3. customer service calls
2. continuous predictor (see continuous predictor)
3. dichotomous predictor (see dichotomous predictor)
4. estrogen replacement therapy
5. interpretation
6. polychotomous predictor (see polychotomous predictor)
7. relative risk
8. response variable
9. zero-count cell
overfitting
1. complexity model
2. provisional model
partitioning variable
PCA see principal components analysis (PCA)
polychotomous predictor
1. confidence interval
2. estimated probability
3. medium customer service call
4. reference cell encoding
5. standard error
6. Wald test
principal components analysis (PCA)
1. communality
2. component matrix
3. component size
4. component weights
5. coordinate system
6. correlation coefficient
7. correlation matrix
8. covariance matrix
9. data set partitioning
10. eigenvalues
11. eigenvectors
12. geographical component
13. housing median age
14. input variables
15. linear combination
16. low communality predictors
17. matrix plot
18. median income
19. multiple regression analysis
20. orthogonal vectors
21. principal component profiles
22. rotated component matrix
23. scree plot
24. standard deviation matrix
25. validation
26. variance proportion
profits charts
propensity averaging process
1. evaluative measures
2. histogram model
3. m base classifiers
4. processing steps
pseudo-F statistic method
1. clustering model
2. distribution
3. Iris data set
4. R code
5. SSB and SSE
regression modeling
1. ANOVA table
2. baseline model
3. Box–Cox transformation
4. cereals data set
5. coefficient of determination, r²
  1. data points
  2. distance and time estimation
  3. estimation error
  4. maximum value
  5. minimum value
  6. predicted score column
  7. prediction error
  8. predictor and response variables
  9. predictor information
  10. residual error
  11. sample variance
  12. standard deviation
  13. sum of squares regression
  14. sum of squares total
6. Cook's distance
7. correlation coefficient, r
  1. confidence interval
  2. linear correlation
  3. negative correlation
  4. positive correlation
  5. quantitative variables
8. dangers of extrapolation
  1. chocolate frosted sugar bombs
  2. observed and unobserved points
  3. policy recommendations
  4. prediction error
  5. predictor variable
9. end-user
  1. confidence interval
  2. prediction interval
10. field values
11. high leverage point
  1. characteristics
  2. distance vs. time
  3. hard-core orienteer
  4. mild outlier
  5. observation
  6. regression results
  7. standard error
12. inference
13. least-squares estimation
  1. error term
  2. estimated nutritional rating
  3. nutritional rating vs. sugar content
  4. prediction error
  5. statistics
  6. sum of squared errors
  7. y-intercept b₀
14. linearity transformation
  1. bulging rule
  2. log transformation
  3. point value vs. letter frequency
  4. response variable
  5. Scrabble®
  6. square root transformation
  7. standardized residual
15. normal probability plot
  1. Anderson–Darling (AD) statistics
  2. assumptions
  3. chi-square distribution
  4. distance vs. time
  5. horizontal zero line
  6. normal distribution
  7. p-value
  8. Rorschach effect
  9. uniform distribution
16. outliers
  1. Minitab
  2. nutritional rating vs. sugars
  3. positive and negative values
  4. standardized residuals
17. population regression equation
  1. assumptions
  2. bivariate observation
  3. constant variance
  4. true regression line
18. R code
19. regression equation
20. standard error
  1. mean square error
  2. standard deviation, response variable
  3. sum of squares regression
  4. sum of squares total
  5. time and distance calculation
21. t-test
  1. assumptions
  2. confidence interval
  3. null hypothesis
  4. nutritional rating vs. sugar content
  5. p-value method
  6. sampling distribution
response charts
return-on-investment (ROI) charts
scatter plot
segmentation modeling
1. clustering analysis
  1. CART decision trees
  2. churn proportion
  3. contingency tables
  4. international plan people
  5. no-plan majority
  6. voice mail plan people
2. exploratory analysis
  1. capital gains/losses
  2. contingency tables
  3. overall error rate
3. performance enhancement
4. processing steps
5. R code
SEI see standard error of the imputation (SEI)
self-organizing map (SOM)
1. architecture
2. characteristic processes
3. goal
4. networks connection
sigmoid function
silhouette method
1. cohesion/separation
2. Iris data set
3. mean silhouette
4. positive/negative values
5. R code
simplified cost matrix
squashing function
standard error of the imputation (SEI)
statistical inference
1. confidence interval
  1. customer service call
  2. lower bound
  3. margin of error
  4. population proportion
  5. subgroup analyses
  6. t-interval
  7. upper bound
2. crystal ball gazers
3. definition
4. hypothesis testing (see hypothesis testing)
5. point estimation
6. population parameters
7. R code
8. sample proportion
9. sampling error
statistical methods
stem-and-leaf display
sum of squares between (SSB)
sum of squares error (SSE)
sum of squares regression (SSR), multiple regression model
supervised methods
target variable
unsupervised methods
user-defined composites
1. definition
2. houses data set
3. measurement error
4. summated scales
variable selection method
1. all-possible-regression
2. backward elimination
3. best subsets method
4. forward selection
5. gas mileage data set (see gas mileage prediction)
6. partial F-test
7. stepwise regression
Waikato Environment for Knowledge Analysis (WEKA)

Bayesian belief networks
1. Explorer Panel
2. positive and negative classification
3. prior probabilities
4. test set predictions
explorer panel
genetic search algorithm
1. AttributeSelectedClassifier
2. class distribution
3. initial population characteristics
4. Preprocess tab
5. WrapperSubsetEval
Naïve Bayes
1. ARFF
2. conditional probabilities
3. Explorer Panel
4. load training file
5. test set predictions
RATING field
regression coefficients
test set prediction
training file
