Friday, February 22, 2019

Course Outline Summary for Convolutional Neural Network Section Udacity Deep Learning Nanodegree

Course Outline & Summary

This section reviews the materials covered in the Udacity nanodegree. You might find it useful to review this section before project 2.
Section 3 Convolutional Neural Network (CNN)
  • Lesson 1 Part 9 (3.1.9): loading images, displaying images, ReLU activation
  • Lesson 1 Part 10 (3.1.10): training loop and batch calculation. The trainloader loads data in batches. Sometimes we accumulate the loss inside the loop, and sometimes we report the average loss (computed outside the training loop). Cross entropy loss is a two-step calculation (log softmax, then negative log-likelihood): it takes in raw scores (logits) and, by default, outputs the loss AVERAGED over the batch, not the summed loss.
  • (3.1.11) Jupyter notebook for MLP
  • (3.1.12)
  • (3.1.13) One Solution: define the MLP layers. Because the model returns a vector of scores, use max() to find the top score and map it to the predicted class. There’s a practical code snippet for calculating per-class accuracy and the class distribution (to check whether classes are evenly distributed).
    Best practice: think about when to apply an activation function and when not to. Training loss should decrease over time. Use model.eval() during inference, then switch back to model.train() for training.
    Lesson 5 Part 10 (3.5.10): Cross Entropy Loss, Log Softmax, log loss. NLLLoss() averages the loss over the mini-batch. Training process.
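The two-step cross entropy calculation mentioned in 3.1.10 and 3.5.10 can be sketched by hand in plain Python. This is only an illustration of what the notes describe (log softmax, then negative log-likelihood, then averaging over the mini-batch), not PyTorch's implementation; the function names and the sample scores are made up for the example.

```python
import math

def log_softmax(scores):
    """Step 1: turn raw scores (logits) into log-probabilities."""
    m = max(scores)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_sum for s in scores]

def nll(log_probs, target):
    """Step 2: negative log-likelihood of the correct class."""
    return -log_probs[target]

def cross_entropy(batch_scores, batch_targets):
    """Average the per-example losses over the mini-batch."""
    losses = [nll(log_softmax(s), t) for s, t in zip(batch_scores, batch_targets)]
    return sum(losses) / len(losses)

# Hypothetical mini-batch: 2 examples, 3 classes.
scores = [[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]]
targets = [0, 1]

avg_loss = cross_entropy(scores, targets)
# Multiply by the batch size to recover the summed loss when accumulating.
total_loss = avg_loss * len(targets)
print(f"average loss: {avg_loss:.4f}, summed loss: {total_loss:.4f}")
```

Because the loss is averaged, a running total across batches should add avg_loss times the batch size, then divide by the dataset size at the end of the epoch.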
  • (3.1.14) Model Evaluation: how many epochs should we train before the model starts overfitting, and how do we know? Split the data into training, validation, and test sets! It’s an important concept in practice. The model only looks at the training set during training and weight updates. After each epoch, the model is evaluated against the validation set (note: some deep learning materials call this the test set). The model never performs backpropagation on the validation set. The validation set tells us whether the model is generalizing well. The test set (note: some materials call this the validation set) is withheld until the very end, after all training is done. It checks the accuracy of the trained model. The withheld dataset is the best simulation we have on hand of data the model has never seen before.
  • (3.1.15) Validation Loss: choose a percentage, then do random subset sampling to partition the dataset. Turn the dataset into a list of indices, shuffle them, then choose which indices go into the validation subset. Use SubsetRandomSampler() (skip to the code snippet section to see the documentation). Use the validation set to decide programmatically when to stop training.
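The index-based split described in 3.1.15 might look roughly like the sketch below. The random TensorDataset is a hypothetical stand-in for the real training set, and valid_size = 0.2 and batch_size = 20 are assumed values, not ones taken from the course.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, SubsetRandomSampler

# Hypothetical stand-in for the training set: 100 samples, 4 features, 3 classes.
train_data = TensorDataset(torch.randn(100, 4), torch.randint(0, 3, (100,)))

valid_size = 0.2  # fraction of the training data held out for validation
num_train = len(train_data)

# Turn the dataset into shuffled indices, then slice off the validation part.
indices = torch.randperm(num_train).tolist()
split = int(valid_size * num_train)
valid_idx, train_idx = indices[:split], indices[split:]

# Each sampler draws only from its own index list.
train_loader = DataLoader(train_data, batch_size=20,
                          sampler=SubsetRandomSampler(train_idx))
valid_loader = DataLoader(train_data, batch_size=20,
                          sampler=SubsetRandomSampler(valid_idx))

print(len(train_idx), len(valid_idx))
```

Both loaders wrap the same dataset object; only the samplers differ. A common way to stop programmatically is to save the model whenever the validation loss reaches a new minimum and keep the last saved checkpoint.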
  • (3.2) Cloud Computing with AWS, and how to get Udacity AWS credit
    (3.3) Transfer Learning
    (3.3.1) Intro to transfer learning using pre-trained CNN architectures such as the VGG-16 model, with its 1000-class output, and ResNet
    (3.3.2) Visualizes the VGG architecture (feature extraction layers followed by linear layers before the output). Only the final layers need to be trained. Illustrates where transfer learning happens: in the last couple of layers.
  • 3.3.2 Useful layers: a Convolutional Neural Network (CNN) has a hierarchical feature-extraction architecture and removable classification layers, which makes it suitable for transfer learning. The convolutional layers extract features and patterns. If the available dataset is small but similar to the ImageNet dataset, we can use this pre-trained architecture for our project.
  • 3.3.3 A very, very detailed section! Lots of content here. What to do when the dataset is large and different from ImageNet? Example: transfer learning with Inception by Sebastian Thrun and Stanford University partners to classify skin cancer. The last densely connected layer was removed and a new fully connected layer was added with an output size we define, one output for each disease class. Random weight initialization was used for the final layer. The rest of the weights were initialized from the pre-trained weights. The entire network was re-trained in the end. What to do in the four scenarios of new data:
  1. New data set is small, new data is similar to original training data.
  2. New data set is small, new data is different from original training data.
  3. New data set is large, new data is similar to original training data.
  4. New data set is large, new data is different from original training data.
  • 3.3.4 VGG Model & Classifier: we will train the last 3 fully connected layers. Since the final layer is newly added, with the number of classes relevant to the new dataset, this process is called TRAINING. The 2nd and 3rd to last layers were there before, so training them again is called FINE TUNING. Check if CUDA is available. Use the PyTorch ImageFolder class, which assumes the following convention: folder names are the correct label names, e.g. all sunflower images should be in the sunflower folder. The VGG model expects 224x224 images as input; use transforms.RandomResizedCrop(224) to prep inputs. The DataLoader class loads data in BATCHES. How to access specific VGG16 layers and fully connected layers. Print out in_features and out_features.
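The freeze-and-replace pattern from 3.3.4 can be sketched on a small model. TinyNet below is a hypothetical stand-in that only mimics the features/classifier layout of a pre-trained network, so the example runs without downloading VGG-16 weights; n_classes = 5 and the SGD learning rate are assumptions as well.

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    """Hypothetical stand-in with a VGG-like features/classifier layout."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.classifier = nn.Sequential(
            nn.Linear(8 * 4 * 4, 32),
            nn.ReLU(),
            nn.Linear(32, 1000),  # original 1000-class output
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

model = TinyNet()

# Freeze the feature extractor so its weights are not updated.
for param in model.features.parameters():
    param.requires_grad = False

# Inspect the last layer, then swap it for one sized to the new dataset.
n_classes = 5  # assumed number of classes in the new dataset
last_layer = model.classifier[-1]
print(last_layer.in_features, last_layer.out_features)
model.classifier[-1] = nn.Linear(last_layer.in_features, n_classes)

# Only pass the parameters that still require gradients to the optimizer.
trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(trainable, lr=0.001)

out = model(torch.randn(2, 3, 224, 224))  # batch of two 224x224 RGB images
print(out.shape)
```

With torchvision's VGG-16 the same idea applies to its features and classifier attributes, where the final linear layer sits at classifier[6].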
  • 3.4.2 How to initialize constant weights, and the shortfall of constant weight initialization. def __init__(self, hidden_1, hidden_2, constant_weight=None): … if constant_weight is not None: … # constant_weight is an optional parameter. Use nn.init.constant_(variable_to_set, value_to_set), e.g. nn.init.constant_(m_bias, 0)
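One way the optional constant_weight parameter from 3.4.2 could be wired up is sketched below. The layer sizes, the default hidden dimensions, and the loop over self.modules() are assumptions for illustration; only the __init__ signature and nn.init.constant_ come from the notes.

```python
import torch
from torch import nn

class Net(nn.Module):
    def __init__(self, hidden_1=256, hidden_2=128, constant_weight=None):
        super().__init__()
        self.fc1 = nn.Linear(28 * 28, hidden_1)
        self.fc2 = nn.Linear(hidden_1, hidden_2)
        self.fc3 = nn.Linear(hidden_2, 10)

        # Optional: overwrite the default initialization with constants.
        if constant_weight is not None:
            for m in self.modules():
                if isinstance(m, nn.Linear):
                    nn.init.constant_(m.weight, constant_weight)
                    nn.init.constant_(m.bias, 0)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        return self.fc3(x)

model = Net(constant_weight=1.0)

# Shortfall: every hidden unit computes the same value for a given input,
# so the units also receive the same gradient and cannot learn different features.
hidden = torch.relu(model.fc1(torch.rand(1, 28 * 28)))
same = torch.allclose(hidden, hidden[:, :1].expand_as(hidden))
print("all hidden units identical:", same)
```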
  • 3.5.5 Defining & Training an autoencoder. One part compresses, the other unzips. Initialize a NN with two fully connected layers, one for encoding and one for decoding. Their dimensions are (input, encoding_dim) and (encoding_dim, input), so that they can be connected and the result is comparable to the input. The criterion compares the input image with the output image.
    3.5.6 Test the autoencoder by looking at its output image. Reshape images back to the original MNIST shape: output = output.view(batch_size, 1, 28, 28)
  • 3.5.6 A simple solution: observe where the training loss decreases drastically versus slowly; this is one way to check how the model is doing. Compare the original images to the reconstructions by displaying them. Flatten the image before passing it to the autoencoder, then reshape the reconstruction to 28x28 again.
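A minimal sketch of the linear autoencoder from 3.5.5 and 3.5.6 follows. The encoding_dim of 32, the ReLU and sigmoid activations, and MSELoss as the criterion are assumptions, and the random tensor is a stand-in for a batch of MNIST images.

```python
import torch
from torch import nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=28 * 28, encoding_dim=32):
        super().__init__()
        self.encoder = nn.Linear(input_dim, encoding_dim)  # compress
        self.decoder = nn.Linear(encoding_dim, input_dim)  # reconstruct

    def forward(self, x):
        x = torch.relu(self.encoder(x))
        # Sigmoid keeps outputs in [0, 1], matching normalized pixel values.
        return torch.sigmoid(self.decoder(x))

model = Autoencoder()
criterion = nn.MSELoss()  # compares the reconstruction with the input

batch_size = 4
images = torch.rand(batch_size, 1, 28, 28)  # random stand-in for MNIST images

flat = images.view(batch_size, -1)  # flatten to (batch_size, 784)
output = model(flat)
loss = criterion(output, flat)  # the target is the input itself

# Reshape the reconstruction back to image form for display.
output = output.view(batch_size, 1, 28, 28)
print(output.shape, loss.item())
```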
  • 3.5.7 Learnable upsampling: rather than using a linear layer, we can use convolutional layers, which preserve spatial information. The encoder now becomes a hierarchical structure of CNN layers that typically downsample (for example with max pooling). How do we go from the compressed representation to the reconstruction? We want to reverse the downsampling by upsampling (unpooling). One option is an interpolation technique such as nearest neighbors, which just copies the existing values. But the network can also learn how to upsample an image effectively with a transpose convolutional layer. These are sometimes dubbed de-convolutional layers and have learnable parameters. Despite the name, they do not undo a convolution.
  • 3.5.8 Transpose Convolution: the math behind the transpose convolutional layer
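The difference between fixed and learnable upsampling in 3.5.7 and 3.5.8 can be seen by comparing shapes and parameter counts. The channel counts and the 7x7 input are assumed values; a kernel size of 2 with stride 2 is one common setting that doubles the height and width.

```python
import torch
from torch import nn

x = torch.randn(1, 4, 7, 7)  # hypothetical compressed feature map

# Fixed upsampling: nearest-neighbor interpolation just copies values.
nearest = nn.Upsample(scale_factor=2, mode="nearest")

# Learnable upsampling: a transpose convolution with trainable weights.
t_conv = nn.ConvTranspose2d(4, 16, kernel_size=2, stride=2)

up_fixed = nearest(x)
up_learned = t_conv(x)

# Output size of the transpose convolution with no padding:
# (in_size - 1) * stride + kernel_size = (7 - 1) * 2 + 2 = 14
n_params = sum(p.numel() for p in t_conv.parameters())

print(up_fixed.shape, up_learned.shape, n_params)
```

Both layers turn 7x7 maps into 14x14 maps, but only the transpose convolution has parameters to train (4 x 16 x 2 x 2 weights plus 16 biases here).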
