Chapter 6. Danske Bank Case Study Details

In this chapter, we take a closer look at how Danske Bank is achieving high-impact business outcomes by fighting fraud with machine intelligence.

As discussed in Chapter 5, Danske Bank was struggling to mitigate fraud by using legacy detection systems. With a low 40% fraud detection rate and a 99.5% rate of false positives, it was clear the bank needed to modernize its defenses. To do this, it made a strategic decision to apply new analytic techniques, including AI, to better identify instances of fraud while reducing false positives in real time.

In partnership with Teradata Consulting, the bank was able to develop analytic solutions that take advantage of its unique data and provide a significant improvement over its previous rules-based engine, reducing false-positive detections of fraud by 60% with machine learning (with expectations to reach as high as 80% using deep learning), and increasing the true-positive detection rate by 50%, as illustrated in Figure 6-1.

Figure 6-1. Danske Bank business outcomes enabled by AI

The Project, the Tools, and the Team

The project evolved through several phases. First, the team laid the foundation for the analytics platform and got the data plumbing in place that would serve the machine and deep learning models. Then, the team trained, tested, and deployed machine learning models before moving on to the deep learning track. As of this writing, the deep learning models are showing great promise at Danske Bank, and the team is working on putting them into production.

The data science lab brought together a variety of tools for training, validating, and promoting the machine and deep learning models after they were proven:

  • The lab used both CPUs and NVIDIA GPUs to process the data.

  • They employed a variety of software frameworks, including the following:

    • Anaconda Python distribution for some simpler models

    • Spark and Hive for data preparation and wrangling

    • TensorFlow and Keras for building deep learning models

    • Tracking and deployment software like Git

Getting the Right Data in Place

When the team kicked off the project, it began by building out the data layer, making sure it had access to the right kind and quality of data that it needed. The team also ensured that it had the right features in place to train the machine learning models.

Right away, the team faced a significant challenge. When the team came on site, the bank had only a very limited set of accurate data to work with, and the team quickly established that it needed to improve both the quality and the volume of that data before it could develop accurate models.

To get more data, the team first needed to identify and extract historical fraud cases to use as positive examples. Because these cases were logged in unstructured Excel sheets, the team extracted them through a semi-manual process that combined regular-expression matching with manual review. Though the work was tedious, it had one benefit: by examining thousands of cases during the extraction, the team gained a much better understanding of what typical fraud schemes look like.
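
To give a sense of what this extraction looks like in practice, the following Python sketch scans Excel sheets for fraud-case identifiers using a regular expression. The file name, sheet layout, and pattern are purely illustrative assumptions; the team's actual logs and matching rules are not described in the text.

    import re
    import pandas as pd

    # Illustrative only: the pattern, capture groups, and file layout are assumptions.
    CASE_PATTERN = re.compile(
        r"(?P<case_id>FRAUD-\d+).*?account\s+(?P<account>\d{10})", re.IGNORECASE
    )

    def extract_fraud_cases(path):
        """Scan every cell of every sheet and collect rows matching the pattern."""
        sheets = pd.read_excel(path, sheet_name=None, dtype=str)
        records = []
        for sheet_name, frame in sheets.items():
            for cell in frame.fillna("").to_numpy().ravel():
                match = CASE_PATTERN.search(str(cell))
                if match:
                    records.append({"sheet": sheet_name, **match.groupdict()})
        return pd.DataFrame(records)

    # Candidate matches would still go through manual review before being used
    # as positive labels. The file name below is a placeholder.
    # cases = extract_fraud_cases("fraud_log_2016.xlsx")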

Then, the team had to reconstruct every sender-to-receiver transaction from the previous three years out of its respective subtransactions, because a single transaction between two parties passes through a variety of intermediate accounts depending on its type and origin. This reconstruction was no small feat: billions of rows of data sat outside the normal business logic of the bank's real-time transaction systems, and from them the team had to identify the relevant transactions and apply the right rules to stitch them back together.
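
In a Spark-based environment like the one the team used for data preparation, the reconstruction step might look roughly like the PySpark sketch below. It assumes each subtransaction carries a common transfer identifier, which glosses over the rule-based linking the team actually had to perform; all table and column names are hypothetical.

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical schema: transfer_id, sender, receiver, amount, booked_at.
    spark = SparkSession.builder.appName("reconstruct-transactions").getOrCreate()

    # Keep only subtransactions booked within the previous three years.
    subtx = spark.table("staging.subtransactions").filter(
        F.col("booked_at") >= F.add_months(F.current_date(), -36)
    )

    # Roll the intermediate legs of each transfer up into a single transaction.
    transactions = (
        subtx.groupBy("transfer_id")
        .agg(
            F.min("booked_at").alias("initiated_at"),
            F.first("sender", ignorenulls=True).alias("sender"),
            F.last("receiver", ignorenulls=True).alias("receiver"),
            F.sum("amount").alias("amount"),   # net amount across intermediate hops
            F.count("*").alias("n_hops"),      # number of intermediate legs
        )
    )
    transactions.write.mode("overwrite").saveAsTable("analytics.transactions_3y")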

As the third step, the team matched the fraud cases against the billions of reconstructed transactions so that it could train on an accurately labeled dataset, one that recorded which transactions were fraudulent and which were not.
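
Conceptually, this matching step is a large left join between the reconstructed transactions and the extracted fraud cases, with unmatched rows labeled as legitimate. The sketch below continues the hypothetical schema from the previous example; the real matching logic was considerably more involved.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("label-transactions").getOrCreate()
    transactions = spark.table("analytics.transactions_3y")

    # Hypothetical table of extracted fraud cases: one row per confirmed case.
    fraud_cases = (
        spark.table("analytics.extracted_fraud_cases")
        .select("account", F.to_date("case_date").alias("case_date"))
        .withColumn("is_fraud", F.lit(1))
    )

    # Left-join on assumed keys; anything without a match is labeled legitimate.
    labeled = (
        transactions.alias("t")
        .join(
            fraud_cases.alias("f"),
            on=(F.col("t.sender") == F.col("f.account"))
            & (F.to_date(F.col("t.initiated_at")) == F.col("f.case_date")),
            how="left",
        )
        .withColumn("is_fraud", F.coalesce(F.col("f.is_fraud"), F.lit(0)))
    )
    labeled.write.mode("overwrite").saveAsTable("analytics.labeled_transactions")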

Finally, the team had enough accurate data to work with and could train the models.

Ensemble Modeling and Champion/Challenger

There are a variety of ways to ensure that you're getting the best output from your advanced analytics models. One of them is ensemble modeling: running two or more related but distinct analytical models and then synthesizing the results into a single score. Combining multiple models in this way helps reduce noise, bias, and variance in your output and delivers superior predictive power, the advanced analytics version of "all of us are smarter than one of us." Several well-known techniques, such as bagging and boosting algorithms, can further refine this method to enhance accuracy.
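
The text does not name the bank's ensembling toolkit, but the idea is easy to illustrate with scikit-learn: a soft-voting ensemble averages the probability scores of a boosted tree model and a logistic regression model (the same families the team later combined) into a single score. The synthetic data below merely stands in for engineered transaction features.

    import numpy as np
    from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    # Synthetic stand-in for engineered transaction features and fraud labels.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 20))
    y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=5000) > 1.5).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

    # Soft voting averages each model's predicted probabilities into one score.
    ensemble = VotingClassifier(
        estimators=[
            ("gbt", GradientBoostingClassifier(random_state=0)),
            ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ],
        voting="soft",
    )
    ensemble.fit(X_train, y_train)
    print("held-out accuracy:", ensemble.score(X_test, y_test))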

Another time-tested methodology for improving analytics outcomes is champion/challenger, in which advanced analytics systems compare models in real time to determine which one or ones are most effective. At Danske Bank, for example, challenger models process data in real time to see which traits are most likely to indicate fraud. When a model dips below a certain predefined threshold, it is fed more data or augmented with additional features. And when a challenger outperforms the reigning champion, it is promoted to champion, pulling the whole system toward better fraud prediction. Continual retraining helps the highest-performing models retain their accuracy.
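
The details of Danske Bank's champion/challenger implementation are not published, but the promotion logic can be sketched in a few lines of Python: every model's recent performance is monitored, underperforming models are flagged for retraining or more data, and a challenger that beats the current champion is promoted. The threshold and metric below are placeholders.

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class ScoredModel:
        name: str
        predict: Callable          # the model's scoring function
        recent_auc: float = 0.0    # rolling performance on recent labeled traffic

    def update_champion(champion: ScoredModel,
                        challengers: List[ScoredModel],
                        retrain_threshold: float = 0.80) -> ScoredModel:
        """Flag drifting models for retraining; promote the best challenger if it
        beats the current champion. Purely illustrative logic."""
        for model in [champion, *challengers]:
            if model.recent_auc < retrain_threshold:
                print(f"{model.name}: below {retrain_threshold:.2f}, schedule retraining")
        best = max(challengers, key=lambda m: m.recent_auc, default=champion)
        return best if best.recent_auc > champion.recent_auc else champion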

Together, these methods enable increased prediction power and provide other benefits, such as making sure the system always returns an answer, even if one model takes too long to score.

Working with the Past, Building the Future

After the team successfully trained the machine learning models, an ensemble of boosted decision tree and logistic regression models, it encountered another hurdle as it moved to put them into production. As a 146-year-old institution, Danske Bank had decades of transactions running through a mainframe server, and it was going to be a challenge to get the models into production while meeting latency requirements on the bank's existing infrastructure.

The bank needed an architecture that would allow the models to run across the millions of daily transactions. To make that possible, the team augmented the bank's infrastructure with an analytics platform that could promote the models to production and be reused in future applications across various domains. This was an add-on to what was already running; the rest of the bank's architecture was left in place.

Moving the ML Models into Live Production

With the advanced analytics platform built and only three months into the engagement, the team was able to go into shadow production of the machine learning models. A shadow production phase was necessary to help stakeholders become familiar with the model and determine whether it needed retraining before it went live.
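
A shadow deployment typically means the new model scores every transaction and its outputs are logged, while the decision that reaches customers still comes from the existing system. The following sketch shows that pattern in outline; legacy_engine and ml_model are placeholders rather than actual components of the bank's platform.

    import json
    import logging
    import time

    logger = logging.getLogger("shadow")

    def handle_transaction(tx: dict, legacy_engine, ml_model) -> bool:
        """Score with the shadow model and log the result, but act only on the
        legacy engine's verdict. Illustrative only."""
        start = time.perf_counter()
        shadow_score = ml_model.predict_proba([tx["features"]])[0][1]
        logger.info(json.dumps({
            "tx_id": tx["id"],
            "shadow_score": shadow_score,
            "latency_ms": round(1000 * (time.perf_counter() - start), 2),
        }))
        # During shadow production the live decision still comes from the rules engine.
        return legacy_engine.is_fraud(tx)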

After three months in shadow production, the machine learning framework was hardened, and it was expanded to run over multiple datacenters with continual monitoring. Nine months after the start of the project, the machine learning model was ready to be put into live production, where it saw impressive results.

The models were a significant improvement over the former rules-based system, with the rate of false positives reduced by 50%. This removed half the workload of investigators. However, many instances of fraud were still going undetected.

The machine learning models, however, had to view each transaction atomically. They could not ingest information about sequences of events, let alone correlations across channels, features, dependencies, and time series, clues that would certainly help pinpoint more instances of fraud. These are areas where deep learning excels.

In the next phase of the project, Danske Bank integrated deep learning software with GPU appliances to try to capture the remaining cases of fraud and achieve an even lower rate of false positives.

From Machine Learning to Deep Learning

As it moved on to deep learning, the team was able to use the analytics platform it had built during the machine learning phase to test and validate different kinds of neural network architectures. Figure 6-2 is a receiver operating characteristic (ROC) curve that shows the baseline legacy rules engine, the lift associated with classic machine learning, and the even higher lift stemming from an ensemble of deep learning models.

Figure 6-2. ROC showing dramatic lift resulting from Deep Learning models when compared to Classic Machine Learning and legacy Rules Engine
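
A comparison like the one in Figure 6-2 is produced by scoring the same held-out transactions with each model and tracing the true-positive rate against the false-positive rate at every threshold. The scikit-learn sketch below shows the mechanics; the label and score arrays are placeholders for real model evaluations.

    import matplotlib.pyplot as plt
    from sklearn.metrics import roc_auc_score, roc_curve

    def plot_roc(y_true, scored_models):
        """scored_models: mapping of model label -> array of fraud scores."""
        for label, scores in scored_models.items():
            fpr, tpr, _ = roc_curve(y_true, scores)
            plt.plot(fpr, tpr, label=f"{label} (AUC={roc_auc_score(y_true, scores):.3f})")
        plt.plot([0, 1], [0, 1], linestyle="--", label="chance")
        plt.xlabel("False-positive rate")
        plt.ylabel("True-positive rate")
        plt.legend()
        plt.show()

    # Placeholder usage with hypothetical evaluation arrays:
    # plot_roc(y_true, {"classic ML ensemble": scores_ml, "deep learning ensemble": scores_dl})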

Over the course of four two-week sprints, the team tested a variety of deep learning models, including convolutional networks, Long Short-Term Memory (LSTM) networks, and generative models. One proved especially effective: the ResNet architecture, a type of convolutional neural network typically used for object detection and computer vision.

This might be surprising, considering the team was not working with pictures or video. Rather, it was using traditional table sets and structured data. So how (and why) did the team make this work?

Visualizing Fraud

The deep learning architectures that specialize in visual detection and object recognition are some of the most advanced, even surpassing human performance at identifying and labeling objects, and doing so much more quickly and reliably. Another benefit of convolutional neural networks is that they require relatively little preprocessing compared to other image-classification algorithms, so less manual work is needed for the network to determine what's relevant.

The team was interested in using these architectures because of their maturity. To do so, it devised a way to turn a table of data into an image: taking raw features as input and clustering correlated features along the x-axis, with the y-axis representing time, to provide a two-dimensional view of each transaction. The team then fed this view into the model, which was able to detect new relationships and complex patterns in the data. Figure 6-3 demonstrates how features such as frequency of transactions, merchant location, relative size of transaction, and others are transformed from a tabular layout into a matrix.

Figure 6-3. Converting tabular data to a matrix in order to conform to deep neural network input requirements
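
The text does not spell out exactly how the features were clustered, but one plausible way to implement the transformation is sketched below: hierarchical clustering orders correlated features next to each other along the x-axis, and the most recent time steps are stacked along the y-axis, producing a fixed-size matrix per transaction. All parameters are illustrative.

    import numpy as np
    import pandas as pd
    from scipy.cluster.hierarchy import leaves_list, linkage
    from scipy.spatial.distance import squareform

    def to_matrix(history: pd.DataFrame, n_steps: int = 32) -> np.ndarray:
        """history: one row per time step (most recent last), one column per
        raw feature. Returns an n_steps x n_features matrix."""
        # Order columns so correlated features sit next to each other (x-axis).
        corr = history.corr().fillna(0).to_numpy()
        distance = squareform(1 - corr, checks=False)
        order = leaves_list(linkage(distance, method="average"))
        ordered = history.iloc[:, order].to_numpy(dtype="float32")
        # Keep the last n_steps rows (the y-axis is time), zero-padding if short.
        matrix = np.zeros((n_steps, ordered.shape[1]), dtype="float32")
        rows = min(n_steps, len(ordered))
        matrix[n_steps - rows:] = ordered[-rows:]
        return matrix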

This matrix is similar to a digital image, meaning it better conforms to the input required by neural networks. Figure 6-4 shows the model output, with fraudulent transactions appearing more red in hue when compared to bona fide transactions.

Figure 6-4. Visualizing the convolutional neural network output for fraud detection
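
A scaled-down Keras model in the spirit of that architecture is shown below: a couple of residual blocks over the single-channel transaction matrices, ending in a sigmoid that outputs a fraud probability. Layer sizes and depth are illustrative, not the bank's production network.

    import tensorflow as tf
    from tensorflow.keras import layers

    def residual_block(x, filters):
        """Two convolutions plus a skip connection, the basic ResNet building block."""
        shortcut = x
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same")(x)
        if shortcut.shape[-1] != filters:          # match channel count for the skip
            shortcut = layers.Conv2D(filters, 1, padding="same")(shortcut)
        x = layers.Add()([x, shortcut])
        return layers.Activation("relu")(x)

    def build_model(n_steps=32, n_features=40):
        # Input: the transaction matrix, treated as a single-channel "image".
        inputs = layers.Input(shape=(n_steps, n_features, 1))
        x = layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
        x = residual_block(x, 32)
        x = residual_block(x, 32)
        x = layers.GlobalAveragePooling2D()(x)
        outputs = layers.Dense(1, activation="sigmoid")(x)   # fraud probability
        model = tf.keras.Model(inputs, outputs)
        model.compile(optimizer="adam", loss="binary_crossentropy",
                      metrics=[tf.keras.metrics.AUC()])
        return model

    model = build_model()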

The net result was a 20% reduction in false-positive rate—a significant improvement over traditional machine learning models.

Visualizing and Interpreting Deep Learning Models

Deep learning can provide a significant advantage over machine learning in some domains. However, it does come with its own challenges. In particular, it can be difficult to understand how deep learning algorithms make decisions.

When using deep learning for financial transactions, however, model interpretability is crucial for a number of reasons:

  • Investigators have less work to do if they understand why a model made a particular decision, because they know what to look for as they are examining possibly fraudulent transactions. They can also gain insight into why the fraud is happening, which is information that can be very useful. Finally, interpretability increases trust in the model’s results, helping with its adoption and integration into current processes.

  • For Danske Bank, interpretability is also necessary for compliance with the EU’s General Data Protection Regulation (GDPR), which requires that a financial institution be able to explain how it used a customer’s data. If the bank is found not to be in compliance, it could face very heavy financial penalties.

  • Interpretability is also important for building customer happiness and trust, because it lets the bank give customers a satisfactory answer as to why a transaction was declined.

To approach the problem of interpreting its deep learning models, the team deployed open source work out of the University of Washington called Local Interpretable Model-Agnostic Explanations (LIME).

LIME (introduced in Chapter 4) is a system that lets you examine the key characteristics of a model at the point of decision, so that you can see which specific features triggered a particular result. If you have multiple models running, LIME is also helpful for comparing them and finding out which features each one relied on in order to judge performance.
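
The snippet below shows LIME's tabular explainer in its simplest form, using a toy model and synthetic features in place of the bank's real ensemble. The output is a short list of feature conditions and weights explaining one individual score.

    import numpy as np
    from lime.lime_tabular import LimeTabularExplainer
    from sklearn.ensemble import GradientBoostingClassifier

    # Toy data and model standing in for the bank's real features and ensemble.
    rng = np.random.default_rng(0)
    X_train = rng.normal(size=(2000, 8))
    y_train = (X_train[:, 0] + X_train[:, 1] > 1).astype(int)
    feature_names = [f"feature_{i}" for i in range(8)]
    model = GradientBoostingClassifier(random_state=0).fit(X_train, y_train)

    explainer = LimeTabularExplainer(
        training_data=X_train,
        feature_names=feature_names,
        class_names=["legitimate", "fraud"],
        mode="classification",
    )

    # Explain a single transaction: which features pushed its score toward fraud?
    explanation = explainer.explain_instance(
        data_row=X_train[0],
        predict_fn=model.predict_proba,
        num_features=5,
    )
    print(explanation.as_list())   # (feature condition, weight) pairs for this decision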

Though LIME is certainly a step forward, this problem is far from solved. Visualizing and interpreting deep learning models is important and ongoing work. For fraud detection, it is crucial to be able to see that fraud events match human expectations based on experience and history, and that the model is treating fraudulent transactions using different mechanisms than nonfraudulent ones.

A Platform for the Future

Through its partnership with Teradata Consulting, Danske Bank was able to build a fraud detection system that made autonomous decisions in real time while staying aligned with the bank’s procedures, security, and high-availability requirements. The solution also provided new levels of detail, such as time series and sequences of events, to better assist the bank with its fraud investigations. With it, the bank’s engineers, data scientists, lines of business, and investigative officers were able to collaborate to uncover fraud.

Though this chapter emphasizes the technology behind the solution, the organizational and change management considerations of this project were equally essential for the solution’s success. The project leaders fully understood and embraced change-management best practices, beginning with small wins to prove the value and viability of the AI solution, socializing it across stakeholders, and moving from there to full implementation and operationalization.

For Danske Bank, building and deploying an analytic solution tailored to its specific needs and data sources delivered more value than an off-the-shelf model could have, not least because no off-the-shelf product could provide fraud detection at the level of its custom solution.

With its enhanced capabilities, the solution is now ready to be used across other business areas of the bank to deliver additional value, and the bank is well-poised to continue using its data in innovative ways to deliver value to its customers.
