Managing Alert Rates

A fraud classifier generates alerts. This article discusses ways of managing the alert rate. It is recommended that you read the article on Measuring Performance before reading this one. Much of the algebra can be skipped!

Definitions

Alert Rate

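The Alert Rate (AR) is simply the number of alerts raised by a classifier within a period of time, usually 24 hours. In the algebra below it is expressed as a fraction of the total number of transactions in that period.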

If we know the total number of transactions per day and the average numbers of fraudulent (N_F) and legitimate (N_L) transactions, then the relationship between AR, the False Positive Ratio (FPR) and the True Positive Fraction (TPF) follows from a little algebra:

Given FPR = \frac{FP}{TP} = \frac{FPF}{TPF} \cdot \frac{N_L}{N_F} and AR = \frac{TP + FP}{N_F + N_L}, with TPF = \frac{TP}{N_F} and FPF = \frac{FP}{N_L},

then we have AR = TPF \cdot \frac{1 + FPR}{1 + k} or TPF = AR \cdot \frac{1 + k}{1 + FPR} where k = \frac{N_L}{N_F}

So, for a given AR and k, a minimal FPR will give a maximal TPF.
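The relationship can be checked numerically. The following is a minimal sketch, assuming daily counts are available; the function name `alert_metrics` and the figures are made up for illustration and do not come from any real system.

```python
def alert_metrics(tp, fp, n_fraud, n_legal):
    """Return (AR, TPF, FPR, k) from daily counts.

    tp, fp           -- true-positive and false-positive alert counts
    n_fraud, n_legal -- numbers of fraudulent and legitimate transactions
    Assumes tp > 0 and n_fraud > 0, otherwise the ratios are undefined.
    """
    ar = (tp + fp) / (n_fraud + n_legal)  # alerts as a fraction of all transactions
    tpf = tp / n_fraud                    # fraction of fraud caught
    fpr = fp / tp                         # false alerts per correct alert
    k = n_legal / n_fraud                 # legitimate-to-fraud ratio
    return ar, tpf, fpr, k


# Illustrative (made-up) daily figures.
ar, tpf, fpr, k = alert_metrics(tp=60, fp=240, n_fraud=100, n_legal=99_900)

# The identity AR = TPF * (1 + FPR) / (1 + k) should hold up to rounding.
assert abs(ar - tpf * (1 + fpr) / (1 + k)) < 1e-12
```

With these figures AR is 0.003 and k is 999. Holding both fixed, lowering FPR from 4 to 2 would raise TPF from 0.6 to 1.0, which is the point made above.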

True Positive Alerts

It is useful for the following discussion to define the True Positive Alert Fraction (TPA) as the fraction of the total number of alerts that are correct.

So we have TPA=\frac{TP}{TP+FP}

It is easy to see how this relates to the False Positive Ratio as TPA=\frac{1}{1+FPR}
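For readers who want the intermediate step, it can be written out by dividing the numerator and denominator by TP. Substituting the result into the expression for TPF given under Alert Rate also shows how TPA links the alert rate to the detection rate. Only symbols already defined in this article are used.

```latex
TPA = \frac{TP}{TP + FP} = \frac{1}{1 + \frac{FP}{TP}} = \frac{1}{1 + FPR}
\qquad\Rightarrow\qquad
TPF = AR \cdot \frac{1 + k}{1 + FPR} = AR \cdot (1 + k) \cdot TPA
```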

For more information on True Positives, False Positives and the False Positive Ratio, see Measuring Performance.


Percentage of Correct Alerts

A useful way of looking at alert rates is to plot the TPA against the True Positive Fraction (percentage of fraud caught) as the threshold is varied from 0 to 1:

The sample graph shows how the TPA changes with different overall detection rates (TPF). On the left of the graph (where the alert threshold is very high) almost 50% of alerts are correct but very little of the overall fraud is captured. At the other extreme, 80% or more of fraud can be detected at the cost of many false alerts.

This is a good example of how selecting a threshold is a compromise between the alert rate and the fraction of fraud detected.

Image:PercentCorrectAlerts.png
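A curve of this kind could be produced from scored historical transactions roughly as follows. This is a sketch under stated assumptions: scores lie between 0 and 1, an alert is raised when the score is at or above the threshold, and the function name `tpa_vs_tpf` and the sample data are invented for illustration.

```python
def tpa_vs_tpf(scores, labels, thresholds):
    """Return a list of (threshold, TPF, TPA) tuples.

    scores     -- classifier outputs in [0, 1], one per transaction
    labels     -- 1 for fraud, 0 for legitimate, in the same order
    thresholds -- candidate thresholds; an alert is raised when score >= threshold
    Assumes the sample contains at least one fraudulent transaction.
    """
    n_fraud = sum(labels)
    curve = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        if tp + fp == 0:
            continue  # no alerts at this threshold, so TPA is undefined
        curve.append((t, tp / n_fraud, tp / (tp + fp)))
    return curve


# Tiny made-up sample: eight transactions, three of them fraudulent.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]
labels = [1, 0, 1, 0, 0, 1, 0, 0]
for t, tpf, tpa in tpa_vs_tpf(scores, labels, [0.9, 0.7, 0.3]):
    print(f"threshold={t:.1f}  TPF={tpf:.2f}  TPA={tpa:.2f}")
```

Plotting the TPA values against the TPF values gives the graph described above. Thresholds that raise no alerts are skipped because TPA has no meaning there.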


Choosing an Alert Threshold

If we plot the True Positive Alerts (TPA) against threshold we may get a graph like the one below:

You will see that there is an optimal threshold at which the fraction of correct alerts is maximal. At this point the quality of the system is at its best in terms of the percentage of alerts that are correct. As the threshold is reduced, the total number of alerts increases but so does the number of false positives, and so the quality, as we have defined it, falls. As the threshold is increased above the optimal point a more complicated effect starts to assert itself: the total number of alerts is reduced, but the quality of the remaining alerts is also reduced because of the sparseness and quality of the training data.

This latter effect can be dramatically reduced by pre-processing and cleaning the training data; the existence of an optimal point is an artifact of the quality of the training data. However, in real life systems often have to work with very 'noisy' data. Unsupervised outlier detection techniques can help improve the overall performance of the system, but plotting this graph will often still suggest an optimal threshold.

In practice, we often need to select a threshold that gives as few false positives as possible while achieving the best detection rate (true positives) that other external constraints allow. This is the same as seeking the highest True Positive Alert Fraction (TPA). However, it should be borne in mind that the detection rate, or True Positive Fraction, at this threshold may not be as high as required. To detect more fraud we necessarily have to compromise on quality and expect more false positives. Ultimately, the threshold chosen must also be based on the resources available to process alerts.
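One way to make the resource constraint concrete is to treat it as a daily alert budget and pick the threshold that catches the most fraud within it. The sketch below is only one possible reading of the trade-off described above; the function name `choose_threshold`, the tuple layout and the operating points are all hypothetical.

```python
def choose_threshold(curve, max_alerts_per_day):
    """Pick the operating point that catches the most fraud within an alert budget.

    curve -- list of (threshold, alerts_per_day, tpf, tpa) tuples measured on
             historical data (hypothetical layout, chosen for this sketch)
    max_alerts_per_day -- number of alerts the review team can process
    Returns the chosen tuple, or None if no threshold fits the budget.
    """
    affordable = [p for p in curve if p[1] <= max_alerts_per_day]
    if not affordable:
        return None
    # Highest TPF first; ties are broken in favour of the higher TPA.
    return max(affordable, key=lambda p: (p[2], p[3]))


# Made-up operating points: (threshold, alerts per day, TPF, TPA).
curve = [
    (0.9, 50, 0.20, 0.45),
    (0.7, 200, 0.55, 0.40),
    (0.5, 600, 0.75, 0.25),
    (0.3, 2000, 0.85, 0.10),
]
best = choose_threshold(curve, max_alerts_per_day=500)
```

With a budget of 500 alerts per day this selects the threshold of 0.7. A team that valued alert quality over coverage could instead maximise TPA subject to a minimum acceptable TPF.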


