Friday, March 8, 2019

ROC Curve Basics Cheatsheet

A receiver operating characteristic (ROC) curve plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings (source: Wikipedia). It shows how well the classifier separates the two classes at each threshold.

Also known as the sensitivity vs. (1 - specificity) curve: TPR is the sensitivity, and FPR equals 1 - specificity.

The horizontal axis plots the False Positive Rate. The vertical axis plots the True Positive Rate.

The benchmark is a random guess: the 45-degree line, which has an area under the curve of 0.5. In the best scenario, a perfect split, the area under the curve is 1.

There are two "extreme" points. If we classify everything as positive, (TPR, FPR) = (1, 1). If we classify nothing as positive, (TPR, FPR) = (0, 0), because there are no true positives and no false positives: the true positive rate is zero / all positives and the false positive rate is zero / all negatives.

True Positive Rate
True positive rate = true positives / all actual positives = TP / (TP + FN)

False Positive Rate
False positive rate = false positives / all actual negatives = FP / (FP + TN)
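
As a rough illustration of the two definitions, here is a minimal Python sketch that counts the confusion-matrix cells at a single threshold. The function name `rates`, the toy labels and scores, and the convention that a score at or above the threshold counts as positive are all assumptions made for this example.

```python
def rates(y_true, y_score, threshold):
    """Return (TPR, FPR), treating scores >= threshold as predicted positive."""
    tp = fp = fn = tn = 0
    for label, score in zip(y_true, y_score):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1  # actual positive, predicted positive
        elif label == 1:
            fn += 1  # actual positive, predicted negative
        elif predicted_positive:
            fp += 1  # actual negative, predicted positive
        else:
            tn += 1  # actual negative, predicted negative
    tpr = tp / (tp + fn)  # true positives / all actual positives
    fpr = fp / (fp + tn)  # false positives / all actual negatives
    return tpr, fpr


# Hypothetical toy data: 3 positives and 3 negatives.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

tpr, fpr = rates(y_true, y_score, threshold=0.5)
print(tpr, fpr)
```

At a threshold of 0.5 this toy data has 2 of 3 positives and 1 of 3 negatives predicted positive, so TPR is 2/3 and FPR is 1/3.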

To plot the entire curve, compute the (FPR, TPR) pair at as many thresholds as possible.
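
One way to sketch this sweep in Python is to use every distinct score as a threshold, plus an infinite threshold so the curve starts at (0, 0). The helper name `roc_points` and the toy data are assumptions for illustration. This is a simple quadratic-time version, not how a library would necessarily implement it.

```python
import math


def roc_points(y_true, y_score):
    """Return a list of (FPR, TPR) points, one per candidate threshold."""
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    # Infinite threshold: nothing is classified positive, giving (0, 0).
    # Lowest score as threshold: everything is positive, giving (1, 1).
    thresholds = [math.inf] + sorted(set(y_score), reverse=True)
    points = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, y_score) if s >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical toy data: 3 positives and 3 negatives.
y_true = [1, 1, 0, 1, 0, 0]
y_score = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

points = roc_points(y_true, y_score)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The resulting list runs from (0, 0) to (1, 1) and can be passed to any plotting tool, with FPR on the horizontal axis and TPR on the vertical axis.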

The area under the curve (AUC) is an important metric because it summarizes the whole curve in a single number.
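
The area can be approximated from a list of (FPR, TPR) points with the trapezoidal rule. The sketch below is a minimal version. The function name `auc` and the two hand-written curves are assumptions for illustration only.

```python
def auc(points):
    """Trapezoidal area under a curve given as (FPR, TPR) points."""
    points = sorted(points)  # order by FPR, then TPR
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2  # area of one trapezoid
    return area


# The 45-degree line of a random guess.
diagonal = [(0.0, 0.0), (1.0, 1.0)]
# A perfect split: straight up to TPR = 1, then across.
perfect = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

print(auc(diagonal))  # 0.5
print(auc(perfect))   # 1.0
```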
