Sensitivity and Specificity


Apart from accuracy and precision there are other measures for classification models. Today we will focus on another pair of metrics, called sensitivity and specificity. Like accuracy and precision they are also numbers between 0 and 1, and higher values are better.
A perfect classification model would have 100% sensitivity and 100% specificity.

Before defining these values, we recall the confusion matrix of a binary classifier:

                          Predicted Positive        Predicted Negative
Actual Positive           True Positives (TP)       False Negatives (FN)
Actual Negative           False Positives (FP)      True Negatives (TN)

Now

Sensitivity = True Positives / Actual Positives = TP / (TP + FN)

In other words, sensitivity describes the probability that an actual positive is recognized as such by the model; therefore sensitivity is also often called the true positive rate.

Analogously

Specificity = True Negatives / Actual Negatives = TN / (TN + FP)

In other words, specificity describes the probability that an actual negative is recognized as such by the model; therefore specificity is also often called the true negative rate.
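
Both rates can be computed directly from the four cells of the confusion matrix. Here is a minimal Python sketch; the function name compute_rates and its arguments are illustrative choices, not part of any particular library:

def compute_rates(tp, fn, tn, fp):
    """Return sensitivity (true positive rate) and specificity
    (true negative rate) from confusion matrix counts."""
    sensitivity = tp / (tp + fn)  # recognized positives / all actual positives
    specificity = tn / (tn + fp)  # recognized negatives / all actual negatives
    return sensitivity, specificity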



As an example, let's assume we have a binary classifier for cat and dog pictures, with cats as the positive class. We test it with 100 pictures, of which 50 are cat pictures and 50 are dog pictures. Our classifier, however, erroneously classifies 6 cats as dogs and 2 dogs as cats.
We would have the following confusion matrix:

                    Predicted Cat    Predicted Dog    Total
Actual Cat                44                6           50
Actual Dog                 2               48           50

The sensitivity would be 44/50 = 88%, the specificity 48/50 = 96%.
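
The same numbers can be reproduced programmatically. The following sketch assumes scikit-learn is installed and encodes cats as 1 (the positive class) and dogs as 0; these label choices are arbitrary and only used for illustration:

from sklearn.metrics import confusion_matrix

# Actual labels: 50 cats (1) followed by 50 dogs (0)
y_true = [1] * 50 + [0] * 50
# Predictions: 44 cats correct, 6 cats misclassified as dogs,
#              2 dogs misclassified as cats, 48 dogs correct
y_pred = [1] * 44 + [0] * 6 + [1] * 2 + [0] * 48

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)  # 44 / 50 = 0.88
specificity = tn / (tn + fp)  # 48 / 50 = 0.96
print(sensitivity, specificity)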


Where are these values used?
A common way to compare the quality of different classifiers is the receiver operating characteristic curve, or ROC curve, in which the true positive rate (sensitivity) is plotted against the false positive rate (1 - specificity). It owes its rather complicated name to its first use during World War II, where such curves were used to detect enemy objects on the battlefield. Every test result or confusion matrix represents one point on the ROC curve. But this will be a separate post...
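
As a small preview, here is a sketch of how such a curve could be plotted with scikit-learn and matplotlib, assuming the classifier outputs a score or probability per example; the labels and scores below are made-up placeholder values, not real results:

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve

# Placeholder data: true labels (1 = positive) and classifier scores;
# in practice these come from your own model.
y_true  = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

fpr, tpr, thresholds = roc_curve(y_true, y_score)

plt.plot(fpr, tpr, marker="o")            # each threshold gives one point on the curve
plt.plot([0, 1], [0, 1], linestyle="--")  # diagonal = random guessing
plt.xlabel("False positive rate (1 - specificity)")
plt.ylabel("True positive rate (sensitivity)")
plt.title("ROC curve")
plt.show()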




