Friday, June 12, 2020

Regularization in Machine Learning, Deep Learning

Regularization can prevent overfitting and can potentially make an algorithm converge faster and perform better. It is useful in deep learning tasks with neural networks. Regularization acts on the loss function (cost function) by adding an extra penalty term. The form of the penalty term depends on the regularization method, but in every case it penalizes the weight parameters, so it is a function of w.

Two common regularization methods:
  • Lasso 
    • Uses L1-norm
  • Ridge
    • Uses L2-norm
A trick to remember which norm goes with which method: the letter L comes before the letter R, so Lasso uses the L1 norm and Ridge uses the L2 norm.

One of the two is more likely to produce sparse solutions, driving one or more coefficients to exactly zero. Which one do you think it is?
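
One way to check your guess is a one-weight toy problem. The sketch below assumes a squared-error loss 0.5 * (w - a)^2 on a single weight w, where a is the optimum without any penalty. Under that assumption both penalized problems have closed-form minimizers. The function names, the example values, and lam = 0.5 are made up for illustration and do not come from any library.

```python
def l1_minimizer(a: float, lam: float) -> float:
    """Minimize 0.5 * (w - a)**2 + lam * abs(w) (soft-thresholding)."""
    if a > lam:
        return a - lam
    if a < -lam:
        return a + lam
    return 0.0  # any a with |a| <= lam is snapped to exactly zero


def l2_minimizer(a: float, lam: float) -> float:
    """Minimize 0.5 * (w - a)**2 + 0.5 * lam * w**2 (uniform shrinkage)."""
    return a / (1.0 + lam)


# Hypothetical unregularized optima for four independent weights.
unregularized = [3.0, 0.4, -0.2, -2.5]
lam = 0.5  # hypothetical regularization strength

l1_weights = [l1_minimizer(a, lam) for a in unregularized]
l2_weights = [l2_minimizer(a, lam) for a in unregularized]

print("L1 penalty:", l1_weights)
print("L2 penalty:", l2_weights)
```

In this toy setting the L1 penalty sets the two small weights to exactly zero, while the L2 penalty only scales every weight toward zero without reaching it.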

Quiz: which formula is Lasso? Which one is Ridge?
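
The two candidates can be written in a common textbook form as follows. The notation is an assumption rather than something defined in this post: L(w) is the unregularized loss, lambda >= 0 is the regularization strength, and w_j are the individual weights.

```latex
\text{(a)}\quad J(w) = L(w) + \lambda \sum_{j} w_j^{2}
\qquad
\text{(b)}\quad J(w) = L(w) + \lambda \sum_{j} \lvert w_j \rvert
```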

  • Regularization penalizes overly complex models
  • Large weights make the penalty term larger, so smaller weights are preferred
    • Larger weights cost more
  • Regularized loss = regular_loss_function + extra_penalty_term(lambda, weights)
    • The extra penalty term depends on the weights and on the lambda parameter, which sets the regularization strength
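
The bullet points above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function names, the example weights, the base loss of 1.0, and lam = 0.1 are all made up for the example. Note that Ridge is usually written with the squared L2 norm, which is what l2_penalty computes here.

```python
def l1_penalty(weights, lam):
    """L1-norm penalty: lam times the sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """Squared L2-norm penalty: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(base_loss, weights, lam, penalty):
    """regular_loss_function + extra_penalty_term(lambda, weights)."""
    return base_loss + penalty(weights, lam)


base_loss = 1.0       # hypothetical value of the unregularized loss
lam = 0.1             # hypothetical regularization strength (lambda)
small = [0.5, -0.5]   # hypothetical small weights
large = [5.0, -5.0]   # same direction, ten times larger

for name, penalty in [("L1", l1_penalty), ("L2", l2_penalty)]:
    print(name,
          regularized_loss(base_loss, small, lam, penalty),
          regularized_loss(base_loss, large, lam, penalty))
```

With the same base loss, the larger weights always produce the higher total, and setting lam to 0 recovers the unregularized loss.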
