Michele Starnini
Senior Researcher, CENTAI
Michele Starnini is currently a Senior Researcher at CENTAI. Previously, he was a researcher at the ISI Foundation and a James S. McDonnell fellow at the University of Barcelona. He earned a Ph.D. in Statistical Physics from the Polytechnical University of Catalonia. He has published more than 40 scientific papers in high-impact journals in physics, sociology, biology, and computer science. Some of his work has been covered by the national and international press, as well as in podcasts. He has led and managed several national, European, and American research projects.
His research focuses on the analysis and modeling of large data sets to understand emerging socio-economic phenomena, such as the effects of opinion polarization on information diffusion over social networks. On a more theoretical level, he studies the unfolding of stochastic processes on heterogeneous, networked populations. He has also promoted industrial applications of his research, developing anti-money-laundering algorithms for monitoring financial transaction networks. To these ends, he applies analytic, numerical, and data-driven approaches, including statistical analysis, data mining, agent-based modeling, and machine learning.
Challenges and Perspectives in Machine Learning for Anti Money Laundering (MLxAML)
Money laundering involves concealing the origin of illegal assets through seemingly legitimate transactions, fueling corruption, organized crime, and terrorism. Despite worldwide efforts against it, money laundering may involve 2% to 5% of the world's gross domestic product. Anti-money laundering (AML) controls are thus of paramount importance for financial institutions. Historically, AML has been implemented through rule-based approaches, which flag suspicious transactions based on fixed thresholds or high-risk countries. These methods have downsides, such as the need for constant rule updates, difficulty handling unstructured data, and a tendency to produce numerous false positives. Machine learning (ML) algorithms can help overcome these limitations. ML algorithms are generally divided into supervised methods, which learn from labeled examples in a training set and classify new data (the testing set) into different categories, and unsupervised approaches, which aim to group unlabeled data into distinct clusters characterized by unique features or patterns. However, obtaining reliable labeled data in AML presents several challenges, such as heavily imbalanced data (suspicious transactions are typically rare), the uncertain and time-intensive nature of manual labeling, and the potential for human bias. Unsupervised approaches such as network analysis are well suited to identifying key actors and patterns of relationships within financial transactions, represented as temporal graphs. Smurfing, a money-laundering technique that involves breaking up large amounts of money into multiple small transactions, can be detected by simple graph mining algorithms. Likewise, graph neural networks (GNNs), an ML model designed for analyzing graph-structured data, could uncover suspicious patterns and anomalies. Yet, the application of ML in AML comes with challenges, such as data bias and interpretability.
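As a rough illustration of the kind of simple graph mining mentioned above for smurfing, the sketch below flags accounts that receive several small transfers from distinct senders within a short time window (a "fan-in" pattern). It is not the method presented in the talk: the function name, the tuple layout of the transactions, and all thresholds are hypothetical choices made only for this example.

```python
from collections import defaultdict


def find_fan_in_smurfing(transactions, small_limit=10_000, min_senders=3,
                         min_total=20_000, window=7):
    """Flag receivers that collect many small transfers in a short window.

    transactions: iterable of (sender, receiver, amount, day) tuples.
    All thresholds are illustrative assumptions, not regulatory values.
    """
    # Keep only "small" transfers, grouped by the receiving account.
    incoming = defaultdict(list)
    for sender, receiver, amount, day in transactions:
        if amount < small_limit:
            incoming[receiver].append((day, sender, amount))

    flagged = {}
    for receiver, items in incoming.items():
        items.sort(key=lambda item: item[0])  # order by day
        start = 0
        for end in range(len(items)):
            # Shrink the window until it spans at most `window` days.
            while items[end][0] - items[start][0] > window:
                start += 1
            chunk = items[start:end + 1]
            senders = {sender for _, sender, _ in chunk}
            total = sum(amount for _, _, amount in chunk)
            if len(senders) >= min_senders and total >= min_total:
                flagged[receiver] = {"senders": sorted(senders), "total": total}
                break
    return flagged


# Toy data: X receives three small transfers in three days,
# Y receives one large transfer, Z receives small transfers months apart.
transactions = [
    ("A", "X", 9_000, 1), ("B", "X", 9_500, 2), ("C", "X", 8_000, 3),
    ("D", "Y", 50_000, 1), ("E", "Y", 500, 2),
    ("A", "Z", 9_000, 1), ("B", "Z", 9_000, 30), ("C", "Z", 9_000, 60),
]
suspicious = find_fan_in_smurfing(transactions)
print(suspicious)
```

On this toy data only account X is flagged. A real system would also need to handle fan-out patterns, intermediaries, and transfers spread across several institutions.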
Explainable Artificial Intelligence (XAI) can mitigate the challenges of data bias and interpretability: it provides transparency by clarifying AI model decisions, which aids compliance with regulatory requirements. XAI supports model validation, enhances investigation processes, and helps assess transaction risks. It promotes the involvement of human experts in evaluating ML results, leading to better decision-making and continuous model improvement. XAI's clear explanations aid communication with stakeholders and can help reduce false positives, making AML processes more effective, ethical, and aligned with regulations.
Finally, financial institutions currently monitor transactions for suspicious activities in a siloed way. This approach is ineffective, as financial transactions can form interconnected networks that span multiple financial entities and cross borders. Criminals take advantage of this complexity by operating in groups that exploit the resulting blind spots. Cooperation among financial institutions is thus pivotal for AML, but information sharing must safeguard privacy. Privacy-Enhancing Technologies (PETs) such as synthetic data generation (creating realistic data sets for AML testing), the differential privacy framework (protecting the privacy of single individuals in data sets), and federated learning (sharing trained ML models without sharing data) could help in this regard.
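To make the differential privacy idea above more concrete, here is a minimal sketch of the standard Laplace mechanism applied to a counting query, for example the number of flagged accounts an institution reports. The function name, the example data, and the choice of epsilon are assumptions made for illustration and are not taken from the talk.

```python
import random


def private_count(flags, epsilon, rng=None):
    """Return a noisy count of True entries using the Laplace mechanism.

    Adding or removing one individual changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon is added.
    """
    rng = rng or random.Random()
    true_count = sum(1 for flag in flags if flag)
    scale = 1.0 / epsilon
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical example: three of five accounts were flagged.
flags = [True, False, True, True, False]
noisy = private_count(flags, epsilon=1.0, rng=random.Random(0))
print(noisy)
```

Smaller values of epsilon add more noise and give stronger privacy. The reported count is then only approximately correct, which is the trade-off institutions would have to weigh when sharing such statistics.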
To sum up, ML is a powerful tool for AML, even considering the scarcity of reliable labeled data. Graph-based approaches are particularly useful for detecting suspicious patterns in financial transactions. However, applying ML in AML brings challenges related to data bias and interpretability, which can be addressed through XAI. Since cooperation among financial institutions is key, PETs can help them share information securely.