Course Code:
deepmclrg
Duration:
14 hours
Course Outline:
MACHINE LEARNING
1: Introducing Machine Learning
- The origins of machine learning
- Uses and abuses of machine learning
- Ethical considerations
- How do machines learn?
- Abstraction and knowledge representation
- Generalization
- Assessing the success of learning
- Steps to apply machine learning to your data
- Choosing a machine learning algorithm
- Thinking about the input data
- Thinking about types of machine learning algorithms
- Matching your data to an appropriate algorithm
- Using R for machine learning
- Installing and loading R packages
- Installing an R package
- Installing a package using the point-and-click interface
- Loading an R package
- Summary
2: Managing and Understanding Data
- R data structures
- Vectors
- Factors
- Lists
- Data frames
- Matrices and arrays
- Managing data with R
- Saving and loading R data structures
- Importing and saving data from CSV files
- Importing data from SQL databases
- Exploring and understanding data
- Exploring the structure of data
- Exploring numeric variables
- Measuring the central tendency – mean and median
- Measuring spread – quartiles and the five-number summary
- Visualizing numeric variables – boxplots
- Visualizing numeric variables – histograms
- Understanding numeric data – uniform and normal distributions
- Measuring spread – variance and standard deviation
- Exploring categorical variables
- Measuring the central tendency – the mode
- Exploring relationships between variables
- Visualizing relationships – scatterplots
- Examining relationships – two-way cross-tabulations
- Summary
3: Lazy Learning – Classification Using Nearest Neighbors
- Understanding classification using nearest neighbors
- The kNN algorithm
- Calculating distance
- Choosing an appropriate k
- Preparing data for use with kNN
- Why is the kNN algorithm lazy?
- Diagnosing breast cancer with the kNN algorithm
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Transformation – normalizing numeric data
- Data preparation – creating training and test datasets
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Transformation – z-score standardization
- Testing alternative values of k
- Summary
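A minimal parallel sketch of this module's five-step kNN workflow, written in Python with scikit-learn rather than the R used in class; scikit-learn's bundled Wisconsin breast cancer data stands in for the course file, and the k values tried are illustrative:

```python
# Sketch of the kNN workflow: normalize, split, train, evaluate, vary k.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler

X, y = load_breast_cancer(return_X_y=True)

# Transformation - normalize numeric features to the [0, 1] range
X = MinMaxScaler().fit_transform(X)

# Data preparation - create training and test datasets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=1)

# Steps 3-5 - train, evaluate, then test alternative values of k
for k in (1, 5, 11, 21):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(X_train, y_train)
    print(f"k={k:2d}  accuracy={knn.score(X_test, y_test):.3f}")
```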
4: Probabilistic Learning – Classification Using Naive Bayes
- Understanding naive Bayes
- Basic concepts of Bayesian methods
- Probability
- Joint probability
- Conditional probability with Bayes' theorem
- The naive Bayes algorithm
- The naive Bayes classification
- The Laplace estimator
- Using numeric features with naive Bayes
- Example – filtering mobile phone spam with the naive Bayes algorithm
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Data preparation – processing text data for analysis
- Data preparation – creating training and test datasets
- Visualizing text data – word clouds
- Data preparation – creating indicator features for frequent words
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Summary
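A hedged Python sketch of this module's spam-filtering workflow (the module itself works in R); the tiny inline corpus is illustrative only, and `alpha=1.0` below is scikit-learn's form of the Laplace estimator discussed above:

```python
# Text classification with naive Bayes: count features plus Laplace smoothing.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

train_texts = ["win a free prize now", "call me when you get home",
               "free entry in a weekly draw", "are we still on for lunch"]
train_labels = [1, 0, 1, 0]  # 1 = spam, 0 = ham

# Data preparation - turn raw text into word-count features
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# alpha=1.0 is the Laplace estimator: no word ever gets zero probability
model = MultinomialNB(alpha=1.0)
model.fit(X_train, train_labels)

X_new = vectorizer.transform(["free prize draw"])
print(model.predict(X_new))  # expect [1], i.e. spam
```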
5: Divide and Conquer – Classification Using Decision Trees and Rules
- Understanding decision trees
- Divide and conquer
- The C5.0 decision tree algorithm
- Choosing the best split
- Pruning the decision tree
- Example – identifying risky bank loans using C5.0 decision trees
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Data preparation – creating random training and test datasets
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Boosting the accuracy of decision trees
- Making some mistakes more costly than others
- Understanding classification rules
- Separate and conquer
- The One Rule algorithm
- The RIPPER algorithm
- Rules from decision trees
- Example – identifying poisonous mushrooms with rule learners
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Summary
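An illustrative Python analogue of the risky-loan example; the module trains C5.0 trees in R, while scikit-learn's tree is CART-based, so treat this as the same workflow rather than the same algorithm. The generated data and weights are stand-ins:

```python
# Decision tree with cost-sensitive weighting (analogue of a C5.0 cost matrix).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical stand-in data: 1 = loan default, 0 = repaid
X, y = make_classification(n_samples=1000, weights=[0.7, 0.3], random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=7)

# class_weight plays the role of the C5.0 cost matrix: a missed default
# (false negative) is made more costly than a false alarm
tree = DecisionTreeClassifier(max_depth=5, class_weight={0: 1, 1: 4}, random_state=7)
tree.fit(X_train, y_train)
print(f"test accuracy: {tree.score(X_test, y_test):.3f}")
```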
6: Forecasting Numeric Data – Regression Methods
- Understanding regression
- Simple linear regression
- Ordinary least squares estimation
- Correlations
- Multiple linear regression
- Example – predicting medical expenses using linear regression
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Exploring relationships among features – the correlation matrix
- Visualizing relationships among features – the scatterplot matrix
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Model specification – adding non-linear relationships
- Transformation – converting a numeric variable to a binary indicator
- Model specification – adding interaction effects
- Putting it all together – an improved regression model
- Understanding regression trees and model trees
- Adding regression to trees
- Example – estimating the quality of wines with regression trees and model trees
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Step 3 – training a model on the data
- Visualizing decision trees
- Step 4 – evaluating model performance
- Measuring performance with mean absolute error
- Step 5 – improving model performance
- Summary
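A Python sketch of the regression ideas in this module: ordinary least squares plus the Step 5 model-specification tricks (a non-linear term and an interaction effect). The synthetic medical-expenses data and coefficients are illustrative, not the course dataset:

```python
# OLS regression with an added squared term and an interaction effect.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
age = rng.uniform(18, 64, 500)
smoker = rng.integers(0, 2, 500)
bmi = rng.normal(30, 6, 500)
expenses = 250 * age + 30000 * smoker + 500 * smoker * bmi + rng.normal(0, 2000, 500)

# Model specification - add a non-linear age term and a smoker*bmi interaction
X = np.column_stack([age, smoker, bmi, age ** 2, smoker * bmi])
model = LinearRegression().fit(X, expenses)
print("R^2:", round(model.score(X, expenses), 3))
print("coefficients:", model.coef_.round(1))
```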
7: Black Box Methods – Neural Networks and Support Vector Machines
- Understanding neural networks
- From biological to artificial neurons
- Activation functions
- Network topology
- The number of layers
- The direction of information travel
- The number of nodes in each layer
- Training neural networks with backpropagation
- Modeling the strength of concrete with ANNs
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Understanding Support Vector Machines
- Classification with hyperplanes
- Finding the maximum margin
- The case of linearly separable data
- The case of non-linearly separable data
- Using kernels for non-linear spaces
- Performing OCR with SVMs
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Summary
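A parallel sketch of the OCR example in Python; the module trains its SVM in R, and scikit-learn's bundled digits dataset stands in for the letter data. The RBF kernel below corresponds to the module's move from linear to non-linear kernel spaces:

```python
# SVM classification of handwritten digits with a Gaussian (RBF) kernel.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The RBF kernel maps pixel features into a non-linear space before
# fitting the maximum-margin hyperplane
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
print(f"test accuracy: {clf.score(X_test, y_test):.3f}")
```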
8: Finding Patterns – Market Basket Analysis Using Association Rules
- Understanding association rules
- The Apriori algorithm for association rule learning
- Measuring rule interest – support and confidence
- Building a set of rules with the Apriori principle
- Example – identifying frequently purchased groceries with association rules
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Data preparation – creating a sparse matrix for transaction data
- Visualizing item support – item frequency plots
- Visualizing transaction data – plotting the sparse matrix
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Sorting the set of association rules
- Taking subsets of association rules
- Saving association rules to a file or data frame
- Summary
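The module builds its rules in R; this minimal pure-Python sketch just shows how the two interest measures, support and confidence, are computed for a single candidate rule {milk} → {bread} over a toy set of transactions:

```python
# Support and confidence for one association rule, computed by hand.
transactions = [
    {"milk", "bread", "butter"},
    {"milk", "bread"},
    {"bread", "beer"},
    {"milk", "eggs"},
]

def support(itemset):
    """Fraction of transactions containing every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)

lhs, rhs = {"milk"}, {"bread"}
rule_support = support(lhs | rhs)          # P(milk and bread)
confidence = rule_support / support(lhs)   # P(bread | milk)
print(f"support={rule_support:.2f} confidence={confidence:.2f}")
```

The Apriori principle then prunes the search: any itemset whose support falls below the threshold cannot appear in a frequent superset, so its extensions are never evaluated.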
9: Finding Groups of Data – Clustering with k-means
- Understanding clustering
- Clustering as a machine learning task
- The k-means algorithm for clustering
- Using distance to assign and update clusters
- Choosing the appropriate number of clusters
- Finding teen market segments using k-means clustering
- Step 1 – collecting data
- Step 2 – exploring and preparing the data
- Data preparation – dummy coding missing values
- Data preparation – imputing missing values
- Step 3 – training a model on the data
- Step 4 – evaluating model performance
- Step 5 – improving model performance
- Summary
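A hedged Python sketch of the k-means workflow (the module works in R); synthetic blobs stand in for the teen social-media data, and k=5 is illustrative:

```python
# k-means clustering after z-score standardization.
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

X, _ = make_blobs(n_samples=300, centers=5, random_state=42)

# z-score standardization keeps any one feature from dominating the distances
X = StandardScaler().fit_transform(X)

km = KMeans(n_clusters=5, n_init=10, random_state=42)
labels = km.fit_predict(X)
print("cluster sizes:", [list(labels).count(c) for c in range(5)])
```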
10: Evaluating Model Performance
- Measuring performance for classification
- Working with classification prediction data in R
- A closer look at confusion matrices
- Using confusion matrices to measure performance
- Beyond accuracy – other measures of performance
- The kappa statistic
- Sensitivity and specificity
- Precision and recall
- The F-measure
- Visualizing performance tradeoffs
- ROC curves
- Estimating future performance
- The holdout method
- Cross-validation
- Bootstrap sampling
- Summary
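A sketch of this module's performance measures computed in Python on a pair of hypothetical label vectors; the course computes the same quantities in R:

```python
# Confusion matrix, kappa, sensitivity/precision/recall, and the F-measure.
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, f1_score,
                             precision_score, recall_score)

actual    = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predicted = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

print("confusion matrix:\n", confusion_matrix(actual, predicted))
print("accuracy :", accuracy_score(actual, predicted))
print("kappa    :", round(cohen_kappa_score(actual, predicted), 3))
print("precision:", precision_score(actual, predicted))
print("recall   :", recall_score(actual, predicted))   # a.k.a. sensitivity
print("F-measure:", round(f1_score(actual, predicted), 3))
```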
11: Improving Model Performance
- Tuning stock models for better performance
- Using caret for automated parameter tuning
- Creating a simple tuned model
- Customizing the tuning process
- Improving model performance with meta-learning
- Understanding ensembles
- Bagging
- Boosting
- Random forests
- Training random forests
- Evaluating random forest performance
- Summary
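The module tunes models with R's caret package; an equivalent hedged sketch in Python uses grid search over a random forest, combining the module's two themes, automated parameter tuning and ensemble learning. The parameter grid is illustrative:

```python
# Automated parameter tuning of a random forest ensemble via grid search.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=3)

param_grid = {"n_estimators": [100, 300], "max_features": ["sqrt", None]}
search = GridSearchCV(RandomForestClassifier(random_state=3),
                      param_grid, cv=5, scoring="accuracy")
search.fit(X, y)
print("best parameters:", search.best_params_)
print("cross-validated accuracy:", round(search.best_score_, 3))
```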
DEEP LEARNING WITH R
1: Getting Started with Deep Learning
- What is deep learning?
- Conceptual overview of neural networks
- Deep neural networks
- R packages for deep learning
- Setting up reproducible results
- Neural networks
- The deepnet package
- The darch package
- The H2O package
- Connecting R and H2O
- Initializing H2O
- Linking datasets to an H2O cluster
- Summary
2: Training a Prediction Model
- Neural networks in R
- Building a neural network
- Generating predictions from a neural network
- The problem of overfitting data – the consequences explained
- Use case – build and apply a neural network
- Summary
3: Preventing Overfitting
- L1 penalty
- L1 penalty in action
- L2 penalty
- L2 penalty in action
- Weight decay (L2 penalty in neural networks)
- Ensembles and model averaging
- Use case – improving out-of-sample model performance using dropout
- Summary
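This module demonstrates the penalties with R packages; purely as an illustration, the same two ideas expressed in Python with Keras: an L2 weight-decay penalty on each layer plus dropout between layers. The layer sizes and penalty strength are arbitrary:

```python
# L2 weight decay and dropout as overfitting controls in a small network.
from tensorflow import keras
from tensorflow.keras import layers, regularizers

model = keras.Sequential([
    layers.Input(shape=(20,)),
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2 penalty
    layers.Dropout(0.5),  # randomly silence half the units each update
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```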
4: Identifying Anomalous Data
- Getting started with unsupervised learning
- How do auto-encoders work?
- Regularized auto-encoders
- Penalized auto-encoders
- Denoising auto-encoders
- Training an auto-encoder in R
- Use case – building and applying an auto-encoder model
- Fine-tuning auto-encoder models
- Summary
5: Training Deep Prediction Models
- Getting started with deep feedforward neural networks
- Common activation functions – rectifiers, hyperbolic tangent, and maxout
- Picking hyperparameters
- Training and predicting new data from a deep neural network
- Use case – training a deep neural network for automatic classification
- Working with model results
- Summary
6: Tuning and Optimizing Models
- Dealing with missing data
- Solutions for models with low accuracy
- Grid search
- Random search
- Summary
DEEP LEARNING WITH PYTHON
I Introduction
1 Welcome
- Deep Learning The Wrong Way
- Deep Learning With Python
- Summary
II Background
2 Introduction to Theano
- What is Theano?
- How to Install Theano
- Simple Theano Example
- Extensions and Wrappers for Theano
- More Theano Resources
- Summary
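A minimal symbolic-computation example against the legacy Theano API, close in spirit to this chapter's demonstration: declare symbolic variables, build an expression, compile it to a function, and evaluate it with concrete numbers:

```python
# Theano's define-then-compile style: symbolic graph first, numbers later.
import theano
import theano.tensor as T

a = T.dscalar("a")
b = T.dscalar("b")
c = a + b

f = theano.function([a, b], c)  # compile the expression graph
print(f(1.5, 2.5))  # 4.0
```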
3 Introduction to TensorFlow
- What is TensorFlow?
- How to Install TensorFlow
- Your First Examples in TensorFlow
- Simple TensorFlow Example
- More Deep Learning Models
- Summary
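A minimal TensorFlow example in the spirit of this chapter; the book targeted early TensorFlow's session API, but in TensorFlow 2 the same computation runs eagerly with no session:

```python
# The simplest TensorFlow computation: add two constant tensors.
import tensorflow as tf

a = tf.constant(1.5)
b = tf.constant(2.5)
c = tf.add(a, b)
print(float(c))  # 4.0
```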
4 Introduction to Keras
- What is Keras?
- How to Install Keras
- Theano and TensorFlow Backends for Keras
- Build Deep Learning Models with Keras
- Summary
5 Project: Develop Large Models on GPUs Cheaply In the Cloud
- Project Overview
- Setup Your AWS Account
- Launch Your Server Instance
- Login, Configure and Run
- Build and Run Models on AWS
- Close Your EC2 Instance
- Tips and Tricks for Using Keras on AWS
- More Resources For Deep Learning on AWS
- Summary
III Multilayer Perceptrons
6 Crash Course In Multilayer Perceptrons
- Crash Course Overview
- Multilayer Perceptrons
- Neurons
- Networks of Neurons
- Training Networks
- Summary
7 Develop Your First Neural Network With Keras
- Tutorial Overview
- Pima Indians Onset of Diabetes Dataset
- Load Data
- Define Model
- Compile Model
- Fit Model
- Evaluate Model
- Tie It All Together
- Summary
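A condensed sketch of this chapter's load / define / compile / fit / evaluate sequence. Random data stands in for the Pima Indians CSV so the snippet is self-contained; with the real file you would load 8 input features and 1 binary label:

```python
# A first Keras network: define, compile, fit, and evaluate.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

rng = np.random.default_rng(7)
X = rng.random((768, 8))         # stand-in for the 8 input features
y = rng.integers(0, 2, 768)      # stand-in for the onset-of-diabetes label

# Define and compile the model (a small 12-8-1 fully connected topology)
model = keras.Sequential([
    layers.Input(shape=(8,)),
    layers.Dense(12, activation="relu"),
    layers.Dense(8, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Fit the model, then evaluate it
model.fit(X, y, epochs=10, batch_size=10, verbose=0)
loss, acc = model.evaluate(X, y, verbose=0)
print(f"accuracy: {acc:.3f}")
```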
8 Evaluate The Performance of Deep Learning Models
- Empirically Evaluate Network Configurations
- Data Splitting
- Manual k-Fold Cross Validation
- Summary
9 Use Keras Models With Scikit-Learn For General Machine Learning
- Overview
- Evaluate Models with Cross Validation
- Grid Search Deep Learning Model Parameters
- Summary
10 Project: Multiclass Classification Of Flower Species
- Iris Flowers Classification Dataset
- Import Classes and Functions
- Initialize Random Number Generator
- Load The Dataset
- Encode The Output Variable
- Define The Neural Network Model
- Evaluate The Model with k-Fold Cross Validation
- Summary
11 Project: Binary Classification Of Sonar Returns
- Sonar Object Classification Dataset
- Baseline Neural Network Model Performance
- Improve Performance With Data Preparation
- Tuning Layers and Neurons in The Model
- Summary
12 Project: Regression Of Boston House Prices
- Boston House Price Dataset
- Develop a Baseline Neural Network Model
- Lift Performance By Standardizing The Dataset
- Tune The Neural Network Topology
- Summary
IV Advanced Multilayer Perceptrons and Keras
13 Save Your Models For Later With Serialization
- Tutorial Overview
- Save Your Neural Network Model to JSON
- Save Your Neural Network Model to YAML
- Summary
14 Keep The Best Models During Training With Checkpointing
- Checkpointing Neural Network Models
- Checkpoint Neural Network Model Improvements
- Checkpoint Best Neural Network Model Only
- Loading a Saved Neural Network Model
- Summary
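A hedged sketch of the checkpointing pattern: a ModelCheckpoint callback that keeps only the best weights seen so far. The filename and monitored metric are common choices, not necessarily the chapter's exact ones:

```python
# Keep only the best model seen during training.
from tensorflow.keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint("weights.best.keras",
                             monitor="val_accuracy",
                             save_best_only=True,
                             verbose=1)

# Pass the callback to fit(), e.g.:
# model.fit(X, y, validation_split=0.33, epochs=150,
#           batch_size=10, callbacks=[checkpoint])
```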
15 Understand Model Behavior During Training By Plotting History
- Access Model Training History in Keras
- Visualize Model Training History in Keras
- Summary
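A sketch of plotting the History object that fit() returns; "loss" and "val_loss" are the standard history keys when a validation split is supplied:

```python
# Plot training vs. validation loss from a Keras History object.
import matplotlib.pyplot as plt

def plot_history(history):
    plt.plot(history.history["loss"], label="train loss")
    plt.plot(history.history["val_loss"], label="validation loss")
    plt.xlabel("epoch")
    plt.ylabel("loss")
    plt.legend()
    plt.show()

# history = model.fit(X, y, validation_split=0.33, epochs=150, verbose=0)
# plot_history(history)
```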
16 Reduce Overfitting With Dropout Regularization
- Dropout Regularization For Neural Networks
- Dropout Regularization in Keras
- Using Dropout on the Visible Layer
- Using Dropout on Hidden Layers
- Tips For Using Dropout
- Summary
17 Lift Performance With Learning Rate Schedules
- Learning Rate Schedule For Training Models
- Ionosphere Classification Dataset
- Time-Based Learning Rate Schedule
- Drop-Based Learning Rate Schedule
- Tips for Using Learning Rate Schedules
- Summary
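A sketch of the drop-based schedule in Keras: halve the learning rate every 10 epochs. The constants are illustrative:

```python
# Drop-based learning rate schedule via a Keras callback.
import math
from tensorflow.keras.callbacks import LearningRateScheduler

def step_decay(epoch):
    initial_rate, drop, epochs_drop = 0.1, 0.5, 10
    return initial_rate * math.pow(drop, math.floor(epoch / epochs_drop))

lr_schedule = LearningRateScheduler(step_decay)
# model.fit(X, y, epochs=50, callbacks=[lr_schedule])
```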
V Convolutional Neural Networks
18 Crash Course In Convolutional Neural Networks
- The Case for Convolutional Neural Networks
- Building Blocks of Convolutional Neural Networks
- Convolutional Layers
- Pooling Layers
- Fully Connected Layers
- Worked Example
- Convolutional Neural Networks Best Practices
- Summary
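A minimal stack of the building blocks named above, convolutional, pooling, and fully connected layers, wired together in Keras for 28x28 grayscale inputs (MNIST-shaped); the layer sizes are illustrative:

```python
# The three CNN building blocks composed into one small classifier.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(28, 28, 1)),
    layers.Conv2D(32, kernel_size=(3, 3), activation="relu"),  # convolutional layer
    layers.MaxPooling2D(pool_size=(2, 2)),                     # pooling layer
    layers.Flatten(),
    layers.Dense(128, activation="relu"),                      # fully connected layer
    layers.Dense(10, activation="softmax"),
])
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```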
19 Project: Handwritten Digit Recognition
- Handwritten Digit Recognition Dataset
- Loading the MNIST dataset in Keras
- Baseline Model with Multilayer Perceptrons
- Simple Convolutional Neural Network for MNIST
- Larger Convolutional Neural Network for MNIST
- Summary
20 Improve Model Performance With Image Augmentation
- Keras Image Augmentation API
- Point of Comparison for Image Augmentation
- Feature Standardization
- ZCA Whitening
- Random Rotations
- Random Shifts
- Random Flips
- Saving Augmented Images to File
- Tips For Augmenting Image Data with Keras
- Summary
21 Project: Object Recognition in Photographs
- Photograph Object Recognition Dataset
- Loading The CIFAR-10 Dataset in Keras
- Simple CNN for CIFAR-10
- Larger CNN for CIFAR-10
- Extensions To Improve Model Performance
- Summary
22 Project: Predict Sentiment From Movie Reviews
- Movie Review Sentiment Classification Dataset
- Load the IMDB Dataset With Keras
- Word Embeddings
- Simple Multilayer Perceptron Model
- One-Dimensional Convolutional Neural Network
- Summary
VI Recurrent Neural Networks
23 Crash Course In Recurrent Neural Networks
- Support For Sequences in Neural Networks
- Recurrent Neural Networks
- Long Short-Term Memory Networks
- Summary
24 Time Series Prediction with Multilayer Perceptrons
- Problem Description: Time Series Prediction
- Multilayer Perceptron Regression
- Multilayer Perceptron Using the Window Method
- Summary
25 Time Series Prediction with LSTM Recurrent Neural Networks
- LSTM Network For Regression
- LSTM For Regression Using the Window Method
- LSTM For Regression with Time Steps
- LSTM With Memory Between Batches
- Stacked LSTMs With Memory Between Batches
- Summary
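A hedged sketch of the window method with an LSTM: each sample is a window of the previous 3 observations and the target is the next value. The synthetic sine series stands in for the chapter's dataset, and the window size and network size are illustrative:

```python
# LSTM regression on a sliding window over a univariate series.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

series = np.sin(np.linspace(0, 20, 200))
window = 3
X = np.array([series[i:i + window] for i in range(len(series) - window)])
y = series[window:]
X = X.reshape((X.shape[0], window, 1))  # [samples, time steps, features]

model = keras.Sequential([
    layers.Input(shape=(window, 1)),
    layers.LSTM(4),
    layers.Dense(1),
])
model.compile(loss="mean_squared_error", optimizer="adam")
model.fit(X, y, epochs=20, batch_size=16, verbose=0)
print("next value forecast:", model.predict(X[-1:], verbose=0)[0, 0])
```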
26 Project: Sequence Classification of Movie Reviews
- Simple LSTM for Sequence Classification
- LSTM For Sequence Classification With Dropout
- LSTM and CNN For Sequence Classification
- Summary
27 Understanding Stateful LSTM Recurrent Neural Networks
- Problem Description: Learn the Alphabet
- LSTM for Learning One-Char to One-Char Mapping
- LSTM for a Feature Window to One-Char Mapping
- LSTM for a Time Step Window to One-Char Mapping
- LSTM State Maintained Between Samples Within A Batch
- Stateful LSTM for a One-Char to One-Char Mapping
- LSTM with Variable Length Input to One-Char Output
- Summary
28 Project: Text Generation With Alice in Wonderland
- Problem Description: Text Generation
- Develop a Small LSTM Recurrent Neural Network
- Generating Text with an LSTM Network
- Larger LSTM Recurrent Neural Network
- Extension Ideas to Improve the Model
- Summary