Managing Alert Rates
From FraudWiki
A fraud classifier will generate alerts. This article discusses ways of managing the alert rate. It is recommended that you read the article on Measuring Performance before this one. Much of the algebra can be skipped.
Definitions
Alert Rate
The Alert Rate (AR) is simply the number of alerts raised by a classifier within a period of time, usually 24 hours.
If we know the total number of transactions per day and the average numbers of fraudulent and legitimate transactions, then the relationship between AR, the False Positive Ratio (FPR) and the True Positive Fraction (TPF) follows from a little algebra:
Given
and
then we have
or
where
So, for a given AR and k, a minimal FPR will give a maximal TPF.
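One way to write out this algebra, as a sketch under assumptions that this article does not state explicitly: let TP and FP be the daily counts of true and false positive alerts, let F be the average number of frauds per day, take the False Positive Ratio to be FPR = FP/TP, and read k as F. The symbols TP, FP and F and this reading of k are assumptions made for illustration.

```latex
\begin{align}
AR  &= TP + FP \\
FPR &= \frac{FP}{TP}, \qquad TPF = \frac{TP}{F} \\
AR  &= TP\,(1 + FPR) = F \cdot TPF \cdot (1 + FPR) \\
TPF &= \frac{AR}{k\,(1 + FPR)}, \qquad k = F
\end{align}
```

Under these assumptions, with AR and k fixed, TPF grows as FPR shrinks. For instance, with AR = 200 alerts per day, k = 50 frauds per day and FPR = 4, we get TPF = 200 / (50 × 5) = 0.8.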
True Positive Alerts
It is useful for the following discussion to define the True Positive Alert Fraction (TPA) as the fraction of the total number of alerts that are correct.
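So we have
$$TPA = \frac{TP}{TP + FP}$$
where TP and FP are the numbers of true positive and false positive alerts. Taking the False Positive Ratio to be the number of false positives per true positive, FPR = FP/TP, it is easy to see how this relates to the False Positive Ratio as
$$TPA = \frac{1}{1 + FPR}$$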
For more information on True Positives, False Positives and the False Positive Ratio, see Measuring Performance.
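As a small illustration of these definitions, the sketch below computes TPA from counts of true and false positive alerts and compares it with 1 / (1 + FPR). It assumes the False Positive Ratio is FP/TP (false alerts per correct alert). The function names and the counts are hypothetical.

```python
def true_positive_alert_fraction(tp: int, fp: int) -> float:
    """Fraction of all alerts (tp + fp) that are correct."""
    alerts = tp + fp
    if alerts == 0:
        raise ValueError("no alerts raised, TPA is undefined")
    return tp / alerts


def false_positive_ratio(tp: int, fp: int) -> float:
    """False alerts per correct alert (assumed definition: FP / TP)."""
    if tp == 0:
        raise ValueError("no true positives, FPR is undefined")
    return fp / tp


# Hypothetical day: 40 correct alerts and 160 false alerts.
tp, fp = 40, 160
tpa = true_positive_alert_fraction(tp, fp)  # 40 / 200
fpr = false_positive_ratio(tp, fp)          # 160 / 40
print(tpa, 1 / (1 + fpr))                   # the two values should agree
```

With these counts, one alert in five is correct, which is the same as saying there are four false alerts for every correct one.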
Percentage of Correct Alerts
A useful way of looking at alert rates is to plot the TPA against the True Positive Fraction (percentage of fraud caught) as the threshold is varied from 0 to 1:
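The points for such a plot could be produced along the following lines. This is a minimal sketch, assuming the classifier outputs a score in [0, 1] and that a transaction raises an alert when its score is at or above the threshold. The function name, scores and labels are hypothetical.

```python
def tpa_vs_tpf(scores, labels, thresholds):
    """Return a list of (threshold, TPF, TPA) tuples.

    scores: classifier outputs in [0, 1]
    labels: 1 for fraud, 0 for legitimate
    A transaction raises an alert when its score >= threshold.
    """
    total_fraud = sum(labels)
    curve = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        alerts = tp + fp
        if alerts == 0 or total_fraud == 0:
            continue  # TPA or TPF is undefined at this threshold
        curve.append((t, tp / total_fraud, tp / alerts))
    return curve


# Hypothetical scores and labels, for illustration only.
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
thresholds = [i / 10 for i in range(11)]

curve = tpa_vs_tpf(scores, labels, thresholds)
for t, tpf, tpa in curve:
    print(f"threshold={t:.1f}  TPF={tpf:.2f}  TPA={tpa:.2f}")
```

Plotting the second column against the third gives the TPF versus TPA curve. Thresholds at which no alerts are raised are skipped, because TPA is undefined there.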
Choosing an Alert Threshold
If we plot the True Positive Alerts (TPA) against threshold we may get a graph like the one below:
You will see that there is an optimal threshold at which the fraction of correct alerts is maximal. At this point the quality of the system, measured as the percentage of alerts that are correct, is at its best. As the threshold is reduced, the total number of alerts increases but so does the number of false positives, and so the quality, as we have defined it, falls. As the threshold is increased above the optimal point a more complicated effect starts to assert itself: the total number of alerts is reduced, but the quality of the remaining alerts is also reduced because of the sparseness and quality of the training data.
This latter effect can be dramatically reduced by pre-processing and cleaning the training data; the existence of an optimal point is an artifact of the quality of the training data. In real life, however, systems often have to work with very noisy data. Unsupervised outlier detection techniques can help improve the overall performance of the system, but plotting this graph will often still suggest an optimal threshold.
In practice, we often need to select a threshold that gives as few false positives as possible while giving the best detection (true positives) that other external constraints allow. This is the same as choosing the threshold with the highest fraction of true positive alerts (TPA). However, it should be borne in mind that the detection rate, or True Positive Fraction, at this threshold may not be as high as required. To detect more fraud we necessarily have to compromise on quality and expect more false positives. Ultimately, the threshold chosen must also be based on the resources available to process alerts.
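The trade-off described here can be sketched as a constrained search. This is an illustration, not a prescribed method: among thresholds that keep the number of alerts within a processing capacity and meet a minimum detection rate, it picks the one with the highest TPA. The function name, the `max_alerts` and `min_tpf` parameters and the data are hypothetical.

```python
def choose_threshold(scores, labels, thresholds, max_alerts, min_tpf=0.0):
    """Pick the threshold with the highest TPA subject to two constraints:
    at most max_alerts alerts, and a detection rate (TPF) of at least min_tpf.
    Returns (threshold, tpf, tpa), or None if no threshold qualifies.
    """
    total_fraud = sum(labels)
    best = None
    for t in thresholds:
        # Labels of the transactions that would raise an alert at threshold t.
        flagged = [y for s, y in zip(scores, labels) if s >= t]
        alerts = len(flagged)
        if alerts == 0 or alerts > max_alerts or total_fraud == 0:
            continue
        tp = sum(flagged)
        tpf = tp / total_fraud
        tpa = tp / alerts
        if tpf < min_tpf:
            continue
        # Prefer higher TPA; break ties in favour of higher TPF.
        if best is None or (tpa, tpf) > (best[2], best[1]):
            best = (t, tpf, tpa)
    return best


# Hypothetical one-day sample of scores and labels (1 = fraud).
scores = [0.05, 0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.75, 0.90, 0.95]
labels = [0, 0, 0, 1, 0, 0, 1, 1, 0, 1]
thresholds = [i / 10 for i in range(11)]

# Suppose the team can process 5 alerts and wants to catch 75% of fraud.
result = choose_threshold(scores, labels, thresholds, max_alerts=5, min_tpf=0.75)
print(result)
```

Raising `min_tpf` or lowering `max_alerts` can leave no qualifying threshold, in which case the function returns `None` and one of the constraints has to be relaxed.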


