In 2012, Google’s neural network taught itself to recognise cats. Since then, there have been significant advancements in artificial intelligence and machine learning. So why do I still feel nervous when someone asks, “How does your FMS identify new frauds we don’t know about?”
Well, let’s put this into perspective. To recognise cats, Google combined a large team of the brightest minds with 16,000 computer processors (CPUs) and a lot of time. The people who ask me my least favourite question aren’t thinking on that kind of scale. Furthermore, although some cats are fraudsters, many are not.
Machine learning is all about giving (a lot of) data to a computing device and asking it to learn something from it – hopefully something useful or interesting. Broadly speaking, there are two approaches to learning: supervised and unsupervised. Supervised learning is analogous to attending a class where the teacher teaches you geometry; unsupervised is like giving someone a tin can and asking them to discover something interesting (like how many baked beans it will hold or the value of the constant π).
Putting this in our context, supervised learning might help you to discover new instances of known frauds (you give the computer previous examples to learn from); unsupervised learning might help you discover frauds you were previously unaware of.
It is not unreasonable to expect that the application of unsupervised learning will uncover new frauds. But it is unreasonable to expect that it will only uncover new frauds. Put simply, you are just as likely to learn about baked beans as you are about the value of π.
To demonstrate this, I applied a clustering technique to the data of 5,347 roaming subscribers. To make it easier, I only considered their outgoing voice call records. After a fair amount of effort, I came up with five clusters (see below).
According to the algorithm, most subscribers (5,158) behave similarly. The 139 black dots are anomalies: that is, they are subscribers who behave differently from anyone else. Are the 139 subscribers fraudsters? Or are they your best subscribers? Or perhaps they are baked beans? The answer, of course, is that I don’t know.
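For readers who want to see how a clustering run can produce both clusters and "black dot" anomalies, here is a minimal sketch. It assumes a density-based method (DBSCAN), which is my choice for illustration and not necessarily the technique behind the figures above. The two features (calls per day, average minutes per call), the toy data, and the `eps` and `min_pts` values are all hypothetical.

```python
import math
from collections import Counter

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, 2, ... for clusters, -1 for anomalies."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbours = region_query(points, i, eps)
        if len(neighbours) < min_pts:
            labels[i] = NOISE          # may be reclaimed later as a border point
            continue
        cluster += 1                   # i is a core point: start a new cluster
        labels[i] = cluster
        queue = list(neighbours)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point of the current cluster
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbours = region_query(points, j, eps)
            if len(j_neighbours) >= min_pts:
                queue.extend(j_neighbours)  # j is also core: keep expanding
    return labels

# Hypothetical subscribers as (calls per day, average minutes per call).
typical = [(c, m) for c in range(1, 11) for m in range(1, 11)]    # 100 subscribers
heavy = [(c, m) for c in range(30, 35) for m in range(1, 6)]      # 25 subscribers
oddballs = [(60, 40), (80, 2), (15, 55)]                          # 3 isolated subscribers
subscribers = typical + heavy + oddballs

labels = dbscan(subscribers, eps=1.5, min_pts=4)
sizes = dict(Counter(labels))
print(sizes)  # label -1 holds the anomalies
```

The algorithm only says that the three isolated subscribers are unlike the rest. It says nothing about whether they are fraudsters, high spenders, or baked beans.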
With enough time and effort, I could investigate the 139 subscribers (or the subscribers of other small clusters) and draw my conclusions. Luckily, though, I know something about this data. I know there is a single fraudster (which can be found by analysis of a few calls); and that fraudster is one of the 139. Does this mean that my efforts were successful? No, of course not. By the time I have investigated 139 subscribers, the one fraudster will be long gone – or at a minimum taken you for a lot of cash.
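To put rough numbers on that workload, here is a back-of-the-envelope sketch using the figures above. It assumes the 139 flagged subscribers are investigated in random order with no further prioritisation, which is a simplification of mine.

```python
from fractions import Fraction

total_subscribers = 5347
flagged = 139        # anomalies reported by the clustering
fraudsters = 1       # known to be among the flagged subscribers

# Share of flagged subscribers that are actually fraudulent.
precision = Fraction(fraudsters, flagged)

# How much the clustering narrowed the search compared with checking everyone.
narrowing = Fraction(total_subscribers, flagged)

# Mean position of a single item in a random ordering of n items is (n + 1) / 2.
expected_checks = Fraction(flagged + 1, 2)

print(f"precision: {float(precision):.2%}")
print(f"search narrowed by a factor of about {float(narrowing):.1f}")
print(f"expected investigations before finding the fraudster: {expected_checks}")
```

Under those assumptions, fewer than 1 in 100 flagged subscribers is a fraudster, and on average 70 investigations are needed before the right one turns up.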
There are also some other things I didn’t tell you. By changing the parameters of the algorithm or adding/deleting characteristics of the subscribers, I get a completely different makeup of clusters. Simply put, machine learning is easy; getting good results is not.
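The sensitivity to parameters is easy to reproduce even on a toy example. The sketch below is a deliberately simple one-dimensional grouping rule, not the algorithm I used for the figures: it splits sorted values wherever the gap between neighbours exceeds `max_gap`, and treats groups smaller than `min_size` as anomalies. The function name, the data, and both parameters are hypothetical.

```python
def gap_clusters(values, max_gap, min_size):
    """Split sorted values wherever neighbours differ by more than max_gap.

    Groups with fewer than min_size members are reported as anomalies.
    """
    groups, current = [], []
    for v in sorted(values):
        if current and v - current[-1] > max_gap:
            groups.append(current)
            current = []
        current.append(v)
    if current:
        groups.append(current)
    clusters = [g for g in groups if len(g) >= min_size]
    anomalies = [v for g in groups if len(g) < min_size for v in g]
    return clusters, anomalies

# Hypothetical daily call minutes for eleven subscribers.
minutes = [1, 2, 3, 4, 5, 9, 10, 11, 20, 21, 40]

results = {}
for max_gap, min_size in [(1, 2), (4, 2), (10, 2), (20, 2), (1, 3)]:
    clusters, anomalies = gap_clusters(minutes, max_gap, min_size)
    results[(max_gap, min_size)] = (len(clusters), len(anomalies))
    print(f"max_gap={max_gap:>2} min_size={min_size}: "
          f"{len(clusters)} clusters, {len(anomalies)} anomalies")
```

The same eleven subscribers produce anywhere from one to three clusters and from zero to three anomalies, depending only on the two settings. None of those answers is more "correct" than the others without knowing what you are looking for.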
Searching for something in a haystack when you don’t know you’re looking for a needle takes a lot of time and effort. You will find the needle in the end and you might find some other interesting things along the way, but it isn’t as easy as you expect.