Link Analysis
From FraudWiki
Link analysis has become an important tool in anti-terrorism and anti-money-laundering work. It provides a means of finding relationships between individual transactions or groups.
Simple Link Analysis
As an example, consider a large company's PBX system, which logs each call in a database record generally referred to as the Call Data Record (CDR). The CDR contains information such as the source extension, the destination extension and the call duration. If we wanted to find the groups of people in the company who talk to each other, we could adopt the simple procedure outlined below.
We define the matrix $d$, whose element $d_{ij}$ is the sum of the durations of all calls originating at extension $i$ and terminating at extension $j$.
We then define a function that measures the strength of the relationship between two extensions as:
$$s_{ij} = d_{ij} + d_{ji}$$
which is just the sum of the call durations from i to j plus the sum of the call durations from j to i.
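As a minimal sketch of the two definitions above, the duration matrix can be accumulated from the CDRs and the strength read off from it. The record layout (source, destination, duration in seconds), the function names and the sample data are assumptions made for illustration only.

```python
from collections import defaultdict


def duration_matrix(cdrs):
    """Sum the call durations for each (source, destination) extension pair.

    Each CDR is assumed to be a (source, destination, duration) tuple.
    """
    d = defaultdict(float)
    for src, dst, duration in cdrs:
        d[(src, dst)] += duration
    return d


def strength(d, i, j):
    """Durations from i to j plus durations from j to i."""
    return d.get((i, j), 0.0) + d.get((j, i), 0.0)


# Hypothetical sample records.
cdrs = [
    ("101", "102", 60),
    ("102", "101", 30),
    ("101", "103", 10),
    ("101", "102", 15),
]
d = duration_matrix(cdrs)
print(strength(d, "101", "102"))
```

Storing only the pairs that actually occur keeps the structure sparse, which matters because most pairs of extensions in a large company never call each other.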
We wish to discover groups where all members within the group have a strong relationship with all other members in that group.
Assume we have N extensions and have a set G of M groups. The strength of membership of extension i to some group
could therefore be defined as:
where
and
Similarly, if we take two groups we can compute the strength of the relationship between the groups as:
If we say that initially every extension is a group containing only itself, with a strength of membership of 1, then the grouping algorithm is:
- for each group, compute the strength of the relationship between itself and all the other groups (only the upper triangle of the matrix is needed, as the relationship is symmetrical)
- merge the two groups with the strongest relationship (to speed convergence we could merge the top three or four groups at the start)
- compute the variance of the strength of membership of the merged group
- if the variance is below a chosen threshold, repeat from the first step; otherwise stop
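The steps above can be sketched roughly as follows. The formulas for membership strength and for group-to-group strength are not reproduced in this text, so the sketch substitutes two assumed stand-ins: the mean pairwise strength between two groups, and each member's total strength to the rest of its group scaled by the largest such total. It also assumes that the merge which exceeds the threshold is discarded. All function names and the sample data are hypothetical.

```python
from itertools import combinations
from statistics import pvariance


def pair_strength(d, i, j):
    """Durations from i to j plus durations from j to i."""
    return d.get((i, j), 0.0) + d.get((j, i), 0.0)


def group_strength(d, a, b):
    """ASSUMED stand-in: mean pairwise strength between groups a and b."""
    return sum(pair_strength(d, i, j) for i in a for j in b) / (len(a) * len(b))


def membership_strengths(d, group):
    """ASSUMED stand-in: each member's total strength to the other members,
    scaled by the largest total in the group (1.0 for a singleton)."""
    if len(group) == 1:
        return [1.0]
    totals = [sum(pair_strength(d, i, j) for j in group if j != i) for i in group]
    peak = max(totals)
    return [t / peak if peak else 0.0 for t in totals]


def group_extensions(d, extensions, threshold):
    # Initially every extension is a group containing only itself.
    groups = [frozenset([e]) for e in extensions]
    while len(groups) > 1:
        # Unordered pairs only, since the relationship is symmetrical.
        a, b = max(combinations(groups, 2),
                   key=lambda p: group_strength(d, p[0], p[1]))
        if group_strength(d, a, b) == 0:
            break  # no related groups remain
        merged = a | b
        if pvariance(membership_strengths(d, merged)) >= threshold:
            break  # ASSUMED: the merge that fails the test is discarded
        groups = [g for g in groups if g not in (a, b)] + [merged]
    return groups


# Hypothetical summed durations, keyed by (source, destination).
d = {("101", "102"): 60, ("102", "101"): 40,
     ("103", "104"): 40, ("102", "103"): 1}
groups = group_extensions(d, ["101", "102", "103", "104"], threshold=0.05)
print(groups)
```

With this sample data the two tightly connected pairs are merged, and the attempt to join them through the single short call between 102 and 103 is rejected because the membership strengths of the combined group vary too much.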
For the application described the algorithm is adequate, but it is not suitable for large datasets: the computational requirement scales as ~$N^2$. Where there are a large number of data points (extensions, in this case) to analyse, it may be better to partition the data and analyse each partition separately. The groups found in each partition could then be collected and analysed together.
Common Data Method
This technique is used by the FinCEN Artificial Intelligence System (FIAS), which is described in full in [ref 1]. We give a brief summary here, as the basic principle is very simple and elegant.
See Also
