
Monday, October 31, 2016

Udacity Machine Learning Nanodegree Udacity Connect Intensive Syllabus


  • 1. Model evaluation and validation
    • 1.1 STATISTICAL ANALYSIS
    • 1.2 DATA MODEL
    • 1.3 EVALUATION AND VALIDATION
    • 1.4 MANAGING ERROR AND COMPLEXITY
    • 1.5 PROJECT

  • 1.3 EVALUATION AND VALIDATION
    • 1.3.1 TRAINING AND TESTING
      • 1.3.1.1 Benefit of testing
      • 1.3.1.2 Train / Test Split in sklearn
        • Useful concepts: train_test_split function
    • 1.3.2 EVALUATION METRICS
      • 1.3.2.1 Metrics
      • 1.3.2.2 Classification and Regression
        • Useful concepts: Categorical data vs continuous data
      • 1.3.2.3 Classification metrics
        • Useful concepts: discrete predictions
      • 1.3.2.4 Accuracy 
        • Useful concepts: accuracy is the proportion of items classified or labeled correctly, i.e. number of items labeled correctly / total number of items; in sklearn, my_model.score(X_test, y_test). Shortcomings of accuracy: it can mislead when the data is skewed (e.g. the Enron data set, where one class is much smaller than the other), or when you need to err on the side of innocence or of guilt.
      • Picking the Most Suitable Metric
        • Concept: information asymmetry
      • Confusion Matrix
        • Concept: if you care about asymmetric errors, you may want to shift the decision boundary up or down to include certain results
      • Decision Tree: confusion matrix
      • Precision and Recall
      • Equation for Precision
        • Concept: precision = true positives / (true positives + false positives)
      • Equation for Recall
        • Concept: recall = true positives / (true positives + false negatives)
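The evaluation items in this outline (train/test split, accuracy, confusion matrix, precision and recall) can be tried together in a few lines. The following is only a minimal sketch: the synthetic data from make_classification, the decision tree, and the 25% test size are my own assumptions, not the course's data or settings. Note that train_test_split lives in sklearn.model_selection in current scikit-learn; older releases exposed it from sklearn.cross_validation.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary data (an assumption; it stands in for the course data sets).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Hold out 25% of the rows so the model is scored on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

my_model = DecisionTreeClassifier(random_state=0)
my_model.fit(X_train, y_train)
y_pred = my_model.predict(X_test)

# Accuracy: proportion of test items labeled correctly.
accuracy = my_model.score(X_test, y_test)

# For binary labels [0, 1], ravel() yields the counts in this order.
tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()

# precision = tp / (tp + fp), recall = tp / (tp + fn)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

print("accuracy:", accuracy)
print("confusion matrix counts (tn, fp, fn, tp):", tn, fp, fn, tp)
print("precision:", precision, "recall:", recall)
```

Comparing the printed precision and recall with the counts from the confusion matrix is a quick way to check the two equations by hand.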



Precision vs Recall
F1 Score
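
The F1 score is the harmonic mean of precision and recall, so it is high only when both are high. A small sketch in plain Python follows; the helper name f1_from_counts and the example counts are hypothetical, chosen only to illustrate the arithmetic.

```python
def f1_from_counts(tp, fp, fn):
    """Return (precision, recall, f1) from raw confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
precision, recall, f1 = f1_from_counts(tp=8, fp=2, fn=8)
print(precision, recall, f1)
```

With these counts precision is 0.8 and recall is 0.5, and F1 lands between the two but closer to the lower value.
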
Regression metrics

Mean Absolute Error
Mean Squared Error
Regression Scoring Function
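
The regression metrics listed above can be written out directly. This is a sketch in plain Python with made-up numbers; I am assuming that "Regression Scoring Function" refers to the R² score (coefficient of determination), which is what sklearn regressors return from score(). The sklearn.metrics equivalents are mean_absolute_error, mean_squared_error and r2_score.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between truth and prediction."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences; penalizes large errors more."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Illustrative values only.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
print(mae, mse, r2)
```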

Managing Error and Complexity
Cause of Error
Error due to bias
Linear Learner, Quadratic Data (programming learning curve)
Error due to Variance - Precision and Overfitting
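
The "Linear Learner, Quadratic Data" case can be reproduced as a rough sketch: a straight line fitted to quadratic data keeps a large error no matter how it is fitted, which is error due to bias. The data, noise level and helper name training_mse below are assumptions made for illustration, using NumPy's polyfit rather than the course's own code.

```python
import numpy as np

# Noisy quadratic data (hypothetical; the noise scale is an arbitrary choice).
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 3.0, 50)
y = x ** 2 + rng.normal(scale=0.5, size=x.shape)


def training_mse(degree):
    """Fit a polynomial of the given degree and return its training MSE."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    return float(np.mean(residuals ** 2))


mse_linear = training_mse(1)     # too simple for the data: high bias
mse_quadratic = training_mse(2)  # matches the shape of the data
print(mse_linear, mse_quadratic)
```

The linear fit's error stays far above the noise level, while the quadratic fit's error is close to it.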

Representative Power of a Model

1.1 Curse of Dimensionality
1.2 Curse of Dimensionality Two
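
One rough way to see the curse of dimensionality: if each feature is split into a fixed number of bins, the number of cells that the data has to cover grows exponentially with the number of features. The helper cells_to_cover below is a hypothetical illustration, not something from the course.

```python
def cells_to_cover(bins_per_feature, n_features):
    """Number of grid cells when every feature is split into equal bins."""
    return bins_per_feature ** n_features


# With 10 bins per feature, each extra feature multiplies the cell count by 10.
for n_features in (1, 2, 3, 10):
    print(n_features, cells_to_cover(10, n_features))
```
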
Learning Curves and Model Complexity
1.1 Learning Curves
1.2 Learning Curves II
1.3 Ideal Learning Curves
1.4 Model Complexity
1.5 Learning Curves and Model Complexity
1.6 Practical Use of Model Complexity
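
Learning curves and model complexity can be explored with sklearn's learning_curve function. The sketch below is only an assumption about how one might set this up: the noisy sine data, the decision tree regressor, and the two max_depth values are my choices for illustration. Scores are R², the default for regressors.

```python
import numpy as np
from sklearn.model_selection import learning_curve
from sklearn.tree import DecisionTreeRegressor

# Hypothetical 1-D regression data: a noisy sine wave.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 6.0, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.2, size=200)


def mean_learning_curve(max_depth):
    """Return training sizes plus mean train/validation scores per size."""
    sizes, train_scores, val_scores = learning_curve(
        DecisionTreeRegressor(max_depth=max_depth, random_state=0),
        X,
        y,
        train_sizes=np.linspace(0.1, 1.0, 5),
        cv=5,
    )
    # Average over the 5 cross-validation folds.
    return sizes, train_scores.mean(axis=1), val_scores.mean(axis=1)


# A low-complexity model and a high-complexity model on the same data.
sizes, shallow_train, shallow_val = mean_learning_curve(max_depth=1)
_, deep_train, deep_val = mean_learning_curve(max_depth=10)

for n, s_tr, s_va, d_tr, d_va in zip(
    sizes, shallow_train, shallow_val, deep_train, deep_val
):
    print(n, round(s_tr, 3), round(s_va, 3), round(d_tr, 3), round(d_va, 3))
```

A large gap between training and validation scores suggests error due to variance, while two low scores that sit close together suggest error due to bias.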

Section Syllabus

  1. Supervised Learning
    1. Regression
      1. Continuous supervised learning

