Background
Deposit products are some of the most fundamental products that retail banks provide. In recent years, events such as the collapse of Silicon Valley Bank (SVB) in March 2023 have highlighted the need for developing a deeper understanding of the deposit base. The objective of this study is to improve the prediction quality of deposit product migration through customer segmentation using synthetic transaction data.
Synthetic data
The synthetic data was generated by a state space model using public economic data. Each data point represents a synthetic customer’s net flows in three asset classes: current deposits, savings deposits, and investment products. Each synthetic customer belongs to a group with its own distinct sensitivities to different economic factors.
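As a rough illustration of the idea, a minimal sketch of a linear state space generator is shown below. It is not the model used in the study: the function name `simulate_customer`, the AR(1)-style state update, and every parameter (`persistence`, `state_noise`, `obs_noise`, the sensitivity weights) are assumptions made for this example only.

```python
import random


def simulate_customer(factors, sensitivities, persistence=0.8,
                      state_noise=0.1, obs_noise=0.05, seed=0):
    """Simulate net flows for one hypothetical synthetic customer.

    factors:       one list of economic factor values per period.
    sensitivities: three rows (current, savings, investment), each a list
                   of weights over the factors. Customers in the same
                   group would share these weights.
    Returns one [current, savings, investment] net-flow triple per period.
    """
    rng = random.Random(seed)
    state = [0.0, 0.0, 0.0]  # latent flow level per asset class
    flows = []
    for u in factors:
        new_state = []
        for k in range(3):
            # State equation: persistence of the previous state plus the
            # group's response to this period's economic factors.
            drive = sum(w * x for w, x in zip(sensitivities[k], u))
            new_state.append(persistence * state[k] + drive
                             + rng.gauss(0.0, state_noise))
        state = new_state
        # Observation equation: observed net flow is the state plus noise.
        flows.append([s + rng.gauss(0.0, obs_noise) for s in state])
    return flows


# Example: one factor (say, an interest rate change) over four periods,
# for a group that moves money from current deposits into savings.
example = simulate_customer(
    factors=[[0.5], [0.5], [0.0], [-0.5]],
    sensitivities=[[-1.0], [1.0], [0.0]],
)
```

Giving each group its own `sensitivities` matrix is what would make the groups distinguishable to a clustering algorithm in a setup like this.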
Approach
Two time series clustering frameworks were tested: Dynamic Time Warping (DTW) based clustering and Not Too Deep (N2D) clustering. The resulting clusterings were evaluated on how well they correspond to the true labels and on how they affect forecasts of deposit flows.
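For readers unfamiliar with DTW, the sketch below shows the textbook dynamic-programming form of the distance between two univariate series. It is an illustrative assumption, not the project's implementation: the function name `dtw_distance` and the absolute-difference local cost are chosen for this example, and the study may have used a library version, a warping window, or a multivariate variant.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences.

    Uses absolute difference as the local cost and allows the standard
    three moves (match, insertion, deletion) with no warping window.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned with an earlier b too
                cost[i][j - 1],      # b[j-1] aligned with an earlier a too
                cost[i - 1][j - 1],  # both series advance together
            )
    return cost[n][m]


# Two series with the same shape but different timing align at zero cost.
same_shape = dtw_distance([0, 1, 2], [0, 1, 1, 2])
```

A matrix of pairwise distances computed this way can be passed to any clustering method that accepts precomputed distances, which is one common way to build a DTW-based clustering.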
Implementation
Most of the project was run on Databricks with Python. Public economic data was fetched through the APIs provided by Tilastokeskus and the ECB Data Portal. The N2D autoencoder was implemented using PyTorch, while the underlying clustering was done mainly with scikit-learn.
Additional Resources
More details of the study can be found in Aaltodoc.