The ROC curve (receiver operating characteristic curve) is a graphical illustration that can be used to visualize and compare the quality of binary classifiers.
Recall that for a classification experiment we can build the confusion matrix. It contains the numbers of true positives (TP), false positives (FP), false negatives (FN) and true negatives (TN). For the ROC curve we need the TP rate, TP / (TP + FN), on the y-axis and the FP rate, FP / (FP + TN), on the x-axis. As the TP rate is also called sensitivity and the FP rate is equal to 1-specificity, the ROC curve is also called the sensitivity vs (1-specificity) plot.
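As a minimal sketch of how the two axes are obtained from the confusion matrix, the following computes both rates. The function name `rates` and the counts are made up for illustration.

```python
def rates(tp, fp, fn, tn):
    """Return (TP rate, FP rate) from confusion-matrix counts."""
    tpr = tp / (tp + fn)  # sensitivity: share of actual positives found
    fpr = fp / (fp + tn)  # 1 - specificity: share of actual negatives flagged
    return tpr, fpr

# Hypothetical counts: 50 actual positives, 50 actual negatives
tpr, fpr = rates(tp=40, fp=10, fn=10, tn=40)
print(tpr, fpr)  # 0.8 0.2
```

The pair (fpr, tpr) is the point this classifier occupies in the plot: here x = 0.2, y = 0.8.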
We define the ROC space to be the unit square with the FP rate on the x-axis and the TP rate on the y-axis:
Now run your classifier, calculate sensitivity and specificity, and plot the data point corresponding to your test in the ROC space. The closer it is to the upper left corner, the better your classifier is. If you have a classifier that predicts 100% correctly, congratulations: you will find your classifier directly at (0,1).
Now to create the ROC curve, begin with the examples that are easiest to classify as positive. Then include more examples step by step. Every time the classifier classifies an example correctly as positive, the line goes up; every time the classifier classifies an example incorrectly as positive, the line moves to the right (for small datasets you will actually not get a curve, but a zigzag line).
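The stepwise construction can be sketched in a few lines. This assumes the classifier outputs a score per example (higher means "more likely positive") and that no two scores are tied. The function name `roc_points` and the sample data are invented for illustration.

```python
def roc_points(labels, scores):
    """Return the ROC curve as a list of (FP rate, TP rate) points.

    labels: 1 for actual positives, 0 for actual negatives.
    scores: classifier scores, higher = more confidently positive.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Start with the examples that are easiest to classify as positive.
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i in order:
        if labels[i] == 1:
            tp += 1  # correctly classified as positive: line goes up
        else:
            fp += 1  # incorrectly classified as positive: line moves right
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical test set with three positives and three negatives
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
points = roc_points(labels, scores)
for fpr, tpr in points:
    print(f"{fpr:.2f} {tpr:.2f}")
```

With only six examples the result is the zigzag line mentioned above: two steps up, one to the right, one up, then two to the right, ending at (1,1).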
What can you read from the ROC curve?
A random classifier would create equal proportions of true positives and false positives (independent of the number of actual positives and negatives). So a random classifier would yield the yellow line, the diagonal from (0,0) to (1,1).
The perfect classifier would not create any false positives, so it reaches sensitivity 1 while specificity is still 1 (false positive rate = 0). This would be the green line in the ROC space, running from (0,0) up to (0,1) and then across to (1,1).
A non-trivial classifier would lie somewhere in between (red line). Note that a bad classifier would lie below the random line and could be turned into a better classifier by simply inverting its predictions.
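To see why inverting the predictions helps, consider what happens to the confusion matrix: every predicted positive becomes a predicted negative and vice versa, so TP swaps with FN and FP swaps with TN. The sketch below uses made-up counts and hypothetical helper names.

```python
def rates(tp, fp, fn, tn):
    """Return (TP rate, FP rate) from confusion-matrix counts."""
    return tp / (tp + fn), fp / (fp + tn)

def invert(tp, fp, fn, tn):
    """Confusion-matrix counts after flipping every prediction."""
    # Old false negatives are now true positives, old true negatives
    # are now false positives, and so on.
    return fn, tn, tp, fp  # new (tp, fp, fn, tn)

# Hypothetical bad classifier: below the random line (TP rate < FP rate)
bad = (10, 30, 40, 20)
print(rates(*bad))           # (0.2, 0.6)
print(rates(*invert(*bad)))  # (0.8, 0.4)
```

The point (0.6, 0.2) below the diagonal is mirrored to (0.4, 0.8) above it.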
What else can be said about the ROC curve:
- Unlike the cumulative gains chart, the ROC curve does not depend on the density of the positives in the test dataset.
- Sometimes the area under the ROC curve ("Area under the Curve", AUC) is determined in order to summarize a classifier in a single comparable number. A bad classifier would have an AUC close to 0.5 (random classifier), while a good classifier would have a value close to 1.
Note that this approach to comparing classifiers has been questioned in recent machine learning research, among other reasons because AUC seems to be a quite noisy measure.
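For illustration, the AUC of a curve given as a list of points can be computed with the trapezoidal rule. This is one common way to do it, sketched under the assumption that the points are (FP rate, TP rate) pairs sorted from left to right. The function name `auc` is chosen here for the example.

```python
def auc(points):
    """Area under a piecewise-linear curve of (FP rate, TP rate) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Trapezoid between two neighbouring points; vertical steps
        # (x0 == x1) contribute nothing.
        area += (x1 - x0) * (y0 + y1) / 2
    return area

random_line = [(0.0, 0.0), (1.0, 1.0)]
perfect_line = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(auc(random_line))   # 0.5
print(auc(perfect_line))  # 1.0
```

The two extremes reproduce the values quoted above: 0.5 for the random classifier and 1 for the perfect one.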