1-grams, 211
10-Fold Cross-Validation, 220
2-grams, 213
2D Structure Descriptors, 315
3-grams, 213
3D Scatter Plot, 162, 166, 173
3D Structure Descriptors, 315
A-B-C Segments, 8
Accuracy, 94, 275
Adjusted Rand Index, 161
Advanced Analytics, 3
Affinity, 78
Affinity-Based Marketing, 77
Agglomerative Clustering, 159
Aggregation, 84
AML, 262
AML Data Import, 262
Analogy Reasoning, 6
Analysis of Variances (ANOVA), 294
Analytics, 3
Anomaly Detection, 395
ANOVA, 294
API, 207
Area Detection, 340
Area under the Curve, 92, 94, 96
Artificial Neural Network, 290, 291
Artificial Neural Network Learner, 316
ASCII, 211
Association Rule Mining, 97, 113, 114, 234, 235, 239
Association Rule Visualization, 116
Association Rules, 22, 100, 249, 284
Astronomy, 257
Astroparticle Physics, 257
Attribute Role, 13, 14, 288
Attribute Roles, 41, 86, 321
Attribute Selection, 12, 67, 109, 150, 242, 261, 264, 322
Attribute Value Type, 40, 82, 92, 111
Attribute Value Type Transformation, 82
Attribute Value Types, 14, 321
Attribute Weighting, 22, 264, 322
Attributes, 11, 12, 14
AUC, 92, 94, 96
Audio Recommender System, 121
Automated Text Classification, 194
Backward Elimination, 323
Bag-of-Words Model, 215
Bag-of-Words Representation, 213
Balanced Training Set, 89
Banking Industry, 77
Bayesian Personalized Ranking Matrix Factorization, 122
Beam Search, 325
Behavior, 84
Big Data, 4
Bigrams, 213
Binary Classification, 78, 91, 95, 275
Binomial Attribute, 82
Binomial Classification, 91
Binominal Classification, 275
Biological Activities, 313, 314
Biological Property, 311
Block Plot, 173
Boolean Attribute, 82
Bootstrap Validation, 326
Bootstrapping, 22, 271, 326
Business Understanding, 78
Carcinogenicity Prediction, 314
Carpal Tunnel Syndrome (CTS), 281, 285
CART, 150
Causal Relations, 6
Centroid, 159, 160
Changing Attribute Roles, 41, 288, 321
Changing the Attribute Value Type, 40
Channel Selection, 8
Character N-Grams, 210, 223
Characteristics, 11
Chemical Properties, 314
Chemical Structures, 313
Chemistry, 319
Chemoinformatic Model, 314
Chemoinformatic Models, 311
Chemoinformatic Prediction Model, 311, 315
Chemoinformatics, 311, 319
Churn Prediction, 7, 9, 94
Churn Prevention, 7, 9, 94
Class Imbalance, 88
Classification, 8, 22, 25, 33, 45, 53, 78, 145, 149, 208, 211, 214, 229, 272, 283, 288, 314, 350
Classification Accuracy, 275
Classification and Regression Tree (CART), 150
Classification of Images, 340, 344
Classification of Text, 207
Cluster Centroids, 250
Cluster Density, 160
Cluster Internal Validation, 182
Cluster Model, 250
Cluster Validity Measures, 159
Cluster Visualization, 166, 250
Clustering, 8, 22, 157, 158, 181, 234, 240, 242, 250, 284
Clustering Validity Measures, 157
Coincidence, 4
Collaborative Filtering, 141
Collaborative Filtering Recommender System, 120
Collaborative Recommender System, 127, 130
Comma Separated Values File, 101
Concept, 13
Confidence, 90, 275
Confidence Threshold, 90
Confusion Matrix, 95
Construction, 10
Content Filtering, 208
Content-Based Recommendation, 141
Content-Based Recommender System, 120, 132
Contingency Table, 89, 95
Conversion of Images, 338
Convert Images, 335
Correlation, 6, 22, 127, 322
Cosine Correlation, 127
Cosine Similarity, 135, 215
Cost-Based Performance Evaluation, 293
Covering Algorithm, 364
Credit Default Prediction, 8, 53
Credit Risk Scoring, 53
Credit Scoring, 8
Cross-Distance, 135
Cross-Industry Standard for Data Mining, 78
Cross-Marketing, 284
Cross-Selling, 9, 284
Cross-Validation, 22, 87, 95, 151, 202, 220, 263, 275, 290, 326
CSV File, 101
CSV File Import, 46, 58, 67, 101, 166, 217, 263, 315
CTS, 281, 285
Customer Behavior, 84, 97
Customer Churn Prevention, 94
Customer Insight, 8
Customer Lifetime Value, 9
Customer Loyalty, 9, 98
Customer Profile, 12
Customer Relationship, 86
Customer Retention, 9
Customer Segmentation, 8
Customer Service Process Automation, 9
Data, 15, 16
Data Cleaning, 86, 95
Data Cleansing, 111
Data Export, 22
Data Import, 21, 39, 46, 48, 58, 67, 82, 101, 166, 195, 217, 320
Data Import Wizard, 195
Data Loading Wizard, 263
Data Mining, 4
Data Preparation, 81, 95, 286
Data Preprocessing, 286, 321
Data Transformation, 22, 111
Data Type, 92
Data Types, 14, 82, 92
Data Understanding, 79
Data Warehouse, 79
Database, 21–23
Database Import, 105
Dataset, 16
Davies-Bouldin, 160
Document Frequency, 208
Decay Parameter, 292
Decision Support, 284
Decision Tree, 25, 27, 150, 272, 289, 316, 323, 341, 345
Decision Tree Induction, 25, 27
Decision Trees, 92
Demand Forecasting, 10
Deployment, 93, 95, 203
Detecting Text Message Spam, 193
Diabetes, 281, 282
Dimensionality Reduction, 339
Direct Mail, 77
Direct Mailing, 284
Direct Marketing, 8, 284
Direct Marketing Campaign Optimization, 8, 77
Discretization, 61, 87, 145
Distance Measure, 127
Distance-Based Decision Tree, 364
Document Frequency, 208, 236
Document Representation, 211
Document Vector, 222, 236
Document Vector Model, 213
Download, 19, 35, 45, 48, 58, 65, 126, 136, 138, 140, 141, 164, 167, 195, 216, 217, 234, 261, 262, 312, 334
Drug Design, 319
Drug Effect Prediction, 320
Dummy Coding, 93
E-Coli Data, 159, 161, 163, 167, 176
E-Commerce, 7
Edge Detection, 340, 350
Edge Enhancement, 340
Educational Data Mining, 143, 145, 181
Effect Coding, 93
Electronics, 10
Encoding, 211
Ensemble Classifier, 150
Ensemble of Classifiers, 272
Entropy, 28
Error Prediction, 8
ETL, 207, 228
Euclidean Distance, 215
Evaluating Feature Selection Algorithms, 264
Evaluating Feature Selection Stability, 267
Evaluating Feature Weighting Algorithms, 264
Evaluation, 22, 37, 48, 50, 61, 69, 87, 146, 148, 150, 202, 292, 325
Example, 14, 95
Example Selection, 363
Example Set, 15, 249, 347
Example Weights, 88
Examples, 13, 15
Excel File Import, 23, 67, 315
Export, 22
Export Images, 335
Extensions, 235, 334
Fact Table, 80
Factorization-Based Recommender System, 128
Failure Prediction, 8, 10
False Negatives, 95
False Positive Rate, 95
False Positives, 95, 275
Feature Extraction, 234, 334, 338, 339, 349
Feature Selection, 150, 261, 264, 322
Features, 11
Feature Selection Stability Validation, 267
Feed-Forward Backpropagation Neural Network, 291
Filter Examples, 271
Filtering Examples, 287
Finance Sector, 194
Financial Services, 7
Forward Selection, 264, 323
Fowlkes-Mallows Index, 161
FP-Growth, 113, 114, 239
Fraud Detection, 7
Frequency Distribution of Words, 250
Frequent Item Set, 113
Frequent Item Set Mining, 9, 239
Gaussian Blur, 335, 350
Gaussian Mixture Clusters, 162, 166
Generate Attributes, 269, 275
Generating Attributes, 82, 321
Glass Identification, 45
Global-Level Feature Extraction, 334, 339, 340, 344
Global-Level Features, 347
Graphical User Interface, 19, 235
GUI, 19, 235
Handling Missing Values, 322
Health Care Sector, 280
Hierarchical Clustering, 158
Hotel Review Analysis, 234
HSV, 333
HTML, 229
HTTP, 208, 228
Human Resources, 194
Hybrid Recommender System, 120, 135, 141
Hypothesis Test, 284, 294
ID Attribute, 14
Image Classification, 340, 344, 350
Image Combinations, 339
Image Conversion, 335, 338, 339
Image Data, 334, 339
Image Database, 335
Image Export, 335
Image Feature Extraction, 334, 338, 340, 349
Image Import, 335
Image Mining, 281, 333, 347, 349
Image Mining Extension for RapidMiner, 333, 334
Image Segmentation, 340, 341
Image Transformation, 339
Image Transformations, 339
IMMI Extension, 333, 334
Import, 21, 23, 217
Import CSV Files, 195, 197
Import Data, 82, 166
Import Data from Database, 105
Import Images, 335
Indicator Attributes, 85
Indicators, 11
Influence Factors, 6, 11
Information Gain, 28, 322
Installation, 19, 139, 164, 194, 234, 235, 261, 312, 334
Instance Selection, 363
Integration, 208
Intrusion Detection, 395
Item Recommendation, 122
Item Sets, 239
Iterating over a Set of Files, 337
Iterating over a Set of Images, 335, 336
Iteration, 148
Jaccard Index, 161, 268
Java Database Connectivity, 101
JDBC, 101
Join, 82, 135
k-Means, 158, 159, 167, 182
k-Means Clustering, 242
k-Medoids, 158, 182
k-Nearest Neighbor, 33, 45, 131
k-Nearest Neighbors, 122, 137, 289, 364
k-Nearest Neighbours, 208, 214, 226
k-NN, 33, 45, 122, 131, 137, 208, 214, 226, 289, 364
Kennard-Stone Sampling, 271
Knowledge Discovery from Textual Databases, 234
Kuncheva Index, 268
Label, 11, 13, 14, 86
Label Type Conversion, 321
Labeling, 337
Language Identification, 207, 209
Latent Features, 129
Learning Algorithm, 334
Learning Rate, 292
Leave-One-Out Validation, 326
Lift Chart, 90
Linear Regression, 93, 283, 289, 315, 317
Local Level Feature Extraction, 340
Local Maxima, 292
Local Minima, 292
Local Outlier Factor, 395
Local-Level Feature Extraction, 334, 339
Local-Level Features, 347
LOF, 395
Logging, 432
Logistic Model Tree, 150
Logistic Regression, 93, 323
Logistics, 8
Loop, 148, 263, 266
Loop Files, 217
Loop over Attributes, 322
Loop Parameters, 263, 266
Loyalty Cards, 99
M5 Prime, 317
Machine Failure Prediction, 10
Machine Failure Prevention, 10
Machine Learning Algorithm, 334
Machine Learning Research, 425
Machine Translation, 208
Macro Variables, 148, 336
Manufacturing Process Optimization, 10
Manufacturing, 10
Market Basket Analysis, 9, 97, 284
Marketing, 12
Marketing Campaign Optimization, 77
Markov Models, 214
Marketing, 8
Matrix Factorization, 122, 127, 128, 141
Maximum Relevance Minimum Redundancy Feature Selection, 261, 264
Media, 9
Medical Data Mining, 280
Meta Data, 15, 16
Meta-Learning, 425, 436
Missing Value Handling, 86, 111
Model, 16
Model Application, 203, 291
Model Updates, 131
Modeling, 16, 22, 25, 87, 288, 323
Molecular Descriptors, 311
Molecular Properties, 311, 320
Molecular Structure Formats, 311
Molecular Structures, 313
Momentum, 292
Monte Carlo Simulation, 269
Movie Recommender System, 121
MRMR Feature Selection, 261, 264, 265
Multi-Layer Neural Network, 291
Multi-Layer Perceptron, 291
Multiple Linear Regression Model, 315, 317
Music Recommender System, 121
MySQL, 106
N-Grams, 210, 213, 223, 240, 250
Naive Bayes, 53, 65, 149, 201, 214, 222, 289
Natural Language Processing, 208, 228
Nearest Neighbor, 33, 45
Nearest Neighbours, 208, 214, 226
Negative Example, 14
Neighborhood-Based Recommender System, 127
Network Analysis, 9
Neural Network, 92, 289–291
Neural Network Learner, 316
Neural Networks, 334
Neutrino Astronomy, 257
News Categorization, 194
News Filtering, 194
Next Best Action, 8
NLP, 208, 228
Normalization, 292
Nursery Data, 65
Object Detection, 335, 340
Online Analytical Processing, 3
Open Color Image, 336
Open Grayscale Image, 335
Operational Model, 16
Operator, 21
Opinion Mining, 8, 228
Optimization, 150
Optimize Parameters, 266
Optimizing Feature Selection and Machine Learning, 264
Optimizing Throughput Rates, 10
Outlier Detection, 7
Outlier Factor, 395
Over-Fitting, 150, 317, 322
PaDEL, 311, 320
PaDEL Extension for RapidMiner, 312
Parallelization, 436
Parameter Loop, 263, 266
Parameter Optimization, 93, 266
Partitional Clustering, 158
Patent Text Analysis, 10
PCA, 47, 50
Pearson Correlation, 127
Performance Evaluation, 38, 48, 50, 61, 69, 89, 265, 275, 292
Performance Measures, 124
Performance Metrics, 89, 95, 265
Permutation, 326
Personalized Recommender System, 120
Perspective, 19
Pharmaceutical Data Exploration Laboratory, 311
Pharmaceutical Industry, 280, 319
Physics, 257
Plotters, 173
PMML Extension for RapidMiner, 235
Point of Interest Detection, 340
Porter Stemmer, 135, 237
Ports, 22
Positive Example, 14
Precision, 89, 95, 125
Prediction, 275
Prediction of Carcinogenicity, 314
Predictive Accuracy, 275
Predictive Analytics, 8, 9
Predictive Maintenance, 10
Predictive Model, 25
Preventive Maintenance, 10
Price Prediction, 10
Principal Component Analysis, 47, 50
Probabilistic Classifier, 149
Process, 22, 23
Process Documents, 222, 223, 229, 235
Product Recommendation, 120
Product Recommendations, 9
Prototype Selection, 363
Prototype-Based Rules, 363
Pruning, 236
QSAR, 313
Quality Assurance, 10
Quality Optimization, 10
Quality Prediction, 10
Quantitative Structure-Activity Relationship, 313
R Console, 163
R Extension for RapidMiner, 235
R Packages, 164
R Script, 164, 169
Radial-Basis Function Kernel, 343
Rand Index, 161
Random Forest, 150, 265, 272, 323
Random Forest Learner, 316
Random Forests, 341
Ranking, 78, 121, 135, 215, 284
RapidAnalytics, 135, 138, 208, 228
RapidMiner, 19
RapidMiner Feature Selection Extension, 261
RapidMiner Image Mining Extension, 349
RapidMiner IMMI Extension, 349
RapidMiner Instance Selection and Prototype-Based Rules Extension, 363
RapidMiner ISPR Extension, 363
RapidMiner PaDEL Extension, 320
RapidMiner R Extension, 163
RapidMiner Recommender Extension, 121
RapidMiner Text Processing Extension, 194
RapidMiner Weka Extension, 150, 273
RapidMiner WhiBo Extension, 182
Rating, 121, 122
RBF, 343
Re-Balancing, 88
Reasoning by Analogy, 6
Recall, 89, 95
Receiver Operating Characteristics, 95
Recommender Performance Evaluation, 121
Recommender Performance Measures, 124
Recommender System, 119, 121, 141, 143
Recommender System Web Service, 138
Recommender Systems, 9
Redundancy, 322
Redundant Attributes, 87
Regression, 22, 283, 315, 317
Regular Attribute, 14
Regular Expression, 212
Regular Expressions, 430
Relative Validity Measures, 161
Relief, 322
Removing Useless Attributes, 323
Renaming Attributes, 321
Reporting Extension for RapidMiner, 235
Repository, 21, 239, 242, 249
Reputation Monitoring, 194
Retail, 7, 8, 97
RGB, 333, 347
Risk Analysis, 8
Risk Management, 8
ROC, 95
ROC Chart, 90, 91, 94
ROI Statistics, 343
Roles, 86
Rule-Based Model, 214
Running a Process, 242
Sales, 8, 12
Sampling, 88, 109, 271
SAR, 313
Saving a Process, 242
Saving Process Results, 249
Script, 438
SDF, 315
Segment-Level Feature Extraction, 334, 339, 340
Segment-Level Features, 347
Segmentation, 340
Select Attributes, 109, 145, 287
Select Examples, 271
Selecting Attributes, 82
Selecting Columns, 82
Selecting Examples, 287
Selecting Machine Learning Algorithms, 289, 425
Sensor Data, 10
Sentence Tokenization, 208, 212
Sentiment Analysis, 8, 208
Series Plot, 173, 176
Similarity, 22
Similarity Measure, 127, 335
Similarity Score, 135
Similarity-Based Content Recommendation, 134
Similarity-Based Model, 214, 226
Singular Value Decomposition, 137
SMILES, 315, 316
SMS, 193, 195, 197
Social Media Analysis, 208, 215
Spam Detection, 193, 195, 364
Sparse Data Format, 122, 124
SPR, 313
SQL, 106
SQL Database, 140
Star Schema, 80
Statistical Analysis, 294, 433
Stemming, 135, 208, 237
Stopword Filter, 237
Stopword Removal, 135, 137
Stratification, 88
Stratified Sampling, 271
Structure-Activity Relationship, 313
Structure-Property Relationship, 313
Subprocess, 21, 145, 275
Sum of Squares Item Distribution, 160
Supply Chain Management, 8, 10
Support Vector Clustering, 182
Support Vector Machine, 208, 214, 289, 316, 334, 341, 343
SVM, 208, 214, 289, 316, 334, 341, 343
t-test, 434
Target Attribute, 11, 13, 14
Target Property, 313
Target Variable, 11, 13
Teacher Assistant Evaluation Data, 35
Telecommunications, 7, 9
Term Frequency, 208, 236
Term N-Grams, 240, 250
Text Analysis, 10
Text Categorization, 207, 234
Text Classification, 193, 194, 199, 207, 234
Text Clustering, 234, 240, 242, 250
Text Data, 207, 233
Text Document Filtering, 194
Text Message Spam, 193
Text Mining, 10, 135, 193, 197, 207, 233, 234
Text Processing, 135, 200
Text Processing Extension for RapidMiner, 235
Text Representation, 211, 213
TF-IDF, 208, 214, 236
TFIDF Word Vector Representation, 135
Thermography, 281
Threshold, 90, 94
Time Series Analysis, 8
Time Series Forecasting, 8
Token, 212
Token Filter, 237
Token Length Filter, 137
Tokenization, 208, 212, 222, 237
Tokenization of Text Documents, 137
Tokenizing Text Documents, 135, 197
Trading Analytics, 8
Training, 96
Training Cycles, 291
Transform Cases, 222, 237
Transport, 8
Trend Analysis, 8, 10, 234
True Negative, 96
True Positive, 96
True Positive Rate, 96
True Positives, 275
Type Conversion, 93, 111, 321
Unicode, 208, 211
Unigrams, 211
Unstructured Data, 207, 233, 334, 339
Unsupervised Learning, 158
Up-Selling, 9
Update, 235
Updates, 234
URL, 229
User-Item Matrix, 141
UTF-8, 208, 211
Utility Matrix, 122
Validation, 22, 37, 48, 87, 146, 148, 150, 202, 220, 292, 318, 325
Value Type, 82, 92, 111
Value Type Conversion, 321
Value Type Transformation, 82, 111
Value Types, 14
Variables, 11, 336
Video Recommender System, 121, 126, 134
View, 19
Virtual Drug Screening, 319
Visualization, 116, 157, 173
Wallace Indices, 161
Web Mining, 208
Web Mining Extension for RapidMiner, 208, 229
Web Page Language Identification, 228
Web Services, 138, 208, 228
Weight Attribute, 14
Weighted Regularized Matrix Factorization, 122
Weka, 273
Weka Extension for RapidMiner, 235
Word Frequency, 197
Word List, 135, 197, 198, 229, 249, 250
Word Stemming, 208, 237
Word Vector, 135, 197, 198, 222
Wrapper Validation, 150
Wrapper X-Validation, 263–265
X-Validation, 22, 87, 95, 151, 202, 220, 263, 275, 290, 326
XML, 208, 235
Y-Randomization, 326