Prof. Bart Baesens gave a talk on “Using AI for Fraud Detection: Recent Research Insights and Emerging Opportunities”

Typically, organizations lose around five percent of their revenue to fraud. In this presentation, we explore advanced AI techniques to address this issue. Drawing on our recent research, we begin by examining cost-sensitive fraud detection methods, such as CS-Logit, which integrates the economic imbalances inherent in fraud detection into the optimization of AI models. We then move on to data engineering strategies that enhance the predictive capabilities of both the data and the AI models through intelligent instance and feature engineering. We also delve into network data, showcasing our innovative research methods, such as Gotcha and CATCHM, for effective data featurization. A significant focus is placed on Explainable AI (XAI), which demystifies high-performance AI models used in fraud detection, aiding the development of effective fraud prevention strategies. We provide practical examples from various sectors, including credit card fraud, anti-money laundering, insurance fraud, tax evasion, and payment transaction fraud. Furthermore, we discuss the overarching issue of model risk, which encompasses everything from data input to AI model deployment. Throughout the presentation, the speaker will discuss his recent research, conducted in partnership with leading global financial institutions such as BNP Paribas Fortis, Allianz, ING, and Ageas.
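Cost-sensitive learning of the kind CS-Logit embodies can be illustrated by weighting each instance's logistic loss by its misclassification cost, so that missing a fraud case (a costly false negative) is penalized more heavily than raising a false alarm. The sketch below is a minimal, hypothetical illustration of that idea, not the published CS-Logit algorithm; the cost values `cost_fn` and `cost_fp` and the toy data are assumptions for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_sensitive_logit(X, y, cost_fn=10.0, cost_fp=1.0, lr=0.1, n_iter=500):
    """Logistic regression whose loss weights each instance by the assumed
    economic cost of misclassifying it (missed fraud vs. false alarm)."""
    w = np.zeros(X.shape[1])
    # fraud cases (y = 1) carry the (higher) false-negative cost
    weights = np.where(y == 1, cost_fn, cost_fp)
    for _ in range(n_iter):
        p = sigmoid(X @ w)
        grad = X.T @ (weights * (p - y)) / len(y)  # gradient of weighted log-loss
        w -= lr * grad
    return w

# toy data: one informative feature plus a bias column
rng = np.random.default_rng(0)
X = np.column_stack([rng.normal(size=200), np.ones(200)])
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 1.0).astype(float)
w = cost_sensitive_logit(X, y)
```

Raising `cost_fn` shifts the learned decision boundary so that more borderline cases are flagged as fraud, trading extra false alarms for fewer costly misses.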

Welcoming two new PhD students to the group!

The group for risk, reliability, and resilience informatics warmly welcomes the following PhD students to the team:

Tao Wang and Xinru Zhang.

Research project funded by the Young Scientists Fund of National Natural Science Foundation of China

The malfunction of deep learning systems in safety-critical applications (e.g., aerospace) can lead to devastating outcomes. Ensuring the trustworthiness of deep learning systems in high-stakes decision settings is therefore an imperative problem. This project aims to accommodate three elements: the heterogeneous sources of risk pertaining to the input data of deep learning systems in the open world; the multi-source uncertainty inherent in model reliability for individual predictions; and the coupled relationship between input-data risk and the model reliability tailored to each individual prediction. Together, these form the basis for an uncertainty quantification-based method for trustworthiness modeling of deep learning systems. We believe that the proposed effort will make substantial contributions to the development of novel and effective theories and models for enhancing the trustworthiness of deep learning systems, offer new insights into trustworthiness modeling in open environments, and boost the advancement of trustworthy deep learning systems.

Research paper accepted by Decision Support Systems

The conventional aggregated performance measure (e.g., mean squared error) computed over the whole dataset does not provide the desired safety and quality assurance for each individual prediction made by a machine learning model in risk-sensitive regression problems. In this paper, we propose an informative indicator $\mathcal{R}(\bm{x})$ to quantify model reliability for individual prediction (MRIP) for the purpose of safeguarding the usage of machine learning (ML) models in mission-critical applications. Specifically, we define the reliability of an ML model with respect to its prediction on an individual input $\bm{x}$ as the probability that the difference between the model prediction and the actual observation falls within a small interval when the input varies within a small range subject to a preset distance constraint, namely $\mathcal{R}(\bm{x}) = P(|y^* - \hat{y}^*| \leq \varepsilon \mid \bm{x}^* \in B(\bm{x}))$, where $y^*$ denotes the observed target value for the input $\bm{x}^*$, $\hat{y}^*$ denotes the model prediction for $\bm{x}^*$, and $\bm{x}^*$ is an input in the neighborhood of $\bm{x}$ subject to the constraint $B(\bm{x}) = \{\bm{x}^* \mid \|\bm{x}^* - \bm{x}\| \le \delta\}$. The developed MRIP indicator $\mathcal{R}(\bm{x})$ provides a direct, objective, quantitative, and general-purpose measure of “reliability”, or the probability of success of the ML model on each individual prediction, by fully exploiting the local information associated with the input $\bm{x}$ and the ML model. Next, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework to directly learn the relationship between $\bm{x}$ and its MRIP $\mathcal{R}(\bm{x})$, enabling the reliability estimate $\mathcal{R}(\bm{x})$ to be provided instantly for any unseen input.
Third, we propose an information gain-based approach to determine a threshold value pertaining to $\mathcal{R}(\bm{x})$ in support of decisions on when to accept or abstain from relying on the ML model prediction. Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets reveal that the developed ML-based framework for MRIP estimation performs robustly in improving the reliability estimate of individual predictions, and the MRIP indicator $\mathcal{R}(\bm{x})$ thus provides an essential safety net when adopting ML models in risk-sensitive environments.
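For intuition, the defining probability $\mathcal{R}(\bm{x}) = P(|y^* - \hat{y}^*| \leq \varepsilon \mid \bm{x}^* \in B(\bm{x}))$ can be approximated by Monte Carlo sampling in the neighborhood $B(\bm{x})$. The sketch below is an illustrative stand-in, not the paper's two-stage framework; `truth` plays the role of the (normally unknown) data-generating function, and all parameter values are assumptions.

```python
import numpy as np

def mrip_monte_carlo(model, truth, x, eps=0.1, delta=0.05, n_samples=2000, seed=0):
    """Monte Carlo estimate of R(x) = P(|y* - yhat*| <= eps | x* in B(x)),
    sampling x* uniformly from the ball {x* : ||x* - x|| <= delta}."""
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # uniform sampling in an L2 ball: random direction times scaled radius
    dirs = rng.normal(size=(n_samples, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = delta * rng.uniform(size=n_samples) ** (1.0 / d)
    xs = x + dirs * radii[:, None]
    errs = np.abs(truth(xs) - model(xs))
    return float(np.mean(errs <= eps))

# sanity checks with a known data-generating function
truth = lambda xs: np.sin(xs).sum(axis=1)
r_perfect = mrip_monte_carlo(truth, truth, np.array([0.5, 1.0]))  # model == truth
biased = lambda xs: truth(xs) + 0.5                               # constant bias > eps
r_biased = mrip_monte_carlo(biased, truth, np.array([0.5, 1.0]))
```

Because each estimate requires thousands of model evaluations per input, directly learning the map from $\bm{x}$ to $\mathcal{R}(\bm{x})$, as the paper's two-stage framework does, is what makes instant reliability estimates practical.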

Research paper accepted by Mechanical Systems and Signal Processing

Blind deconvolution (BD) has been demonstrated to be an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD's appealing adaptability and mathematical interpretability, a significant challenge persists: how can BD be effectively integrated with fault-diagnosing classifiers? This issue is intricate because the traditional BD method is designed solely for feature extraction, with its own optimizer and objective function; when BD is combined with a downstream deep learning classifier, the different learning objectives easily come into conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault diagnosis. Towards this goal, we first develop a time- and frequency-domain neural BD that employs neural networks to implement conventional BD, thereby facilitating seamless integration of BD and the deep learning classifier for co-optimization of model parameters. The neural BD incorporates two filters: (i) a time-domain quadratic filter that uses quadratic convolutional networks to extract periodic impulses, and (ii) a frequency-domain linear filter, composed of a fully connected neural network, to amplify discrete frequency components. Next, we develop a unified framework built upon a deep learning classifier to guide the learning of the BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, the $l_2/l_4$ norm, and a cross-entropy loss to jointly optimize the BD filters and the deep learning classifier. In so doing, the fault labels are fully exploited to direct BD to extract class-discriminative features amidst strong noise. To the best of our knowledge, this is the first work of its kind in which BD is successfully integrated with a deep learning classifier for bearing fault diagnosis.
Experimental results from three different datasets highlight that ClassBD outperforms other state-of-the-art methods under noisy conditions. The source code of this paper is available at https://github.com/asdvfghg/ClassBD.
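The three ingredients of the physics-informed loss are straightforward to state numerically. The sketch below is an assumed, simplified rendering (plain NumPy, one signal, fixed weights `a`, `b`, `c`), not the authors' implementation: kurtosis rewards impulsive fault signatures, the $l_2/l_4$ ratio rewards sparsity, and cross-entropy ties the filters to the class labels.

```python
import numpy as np

def kurtosis(x):
    """Impulsiveness measure: fourth moment over squared second moment."""
    x = x - x.mean()
    return np.mean(x ** 4) / (np.mean(x ** 2) ** 2 + 1e-12)

def l2_l4_ratio(x):
    """l2/l4 norm ratio; smaller values indicate a sparser, more impulsive signal."""
    return np.linalg.norm(x, 2) / (np.linalg.norm(x, 4) + 1e-12)

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

def physics_informed_loss(x, probs, label, a=1.0, b=1.0, c=1.0):
    # kurtosis enters negatively: minimizing the loss maximizes impulsiveness
    return -a * kurtosis(x) + b * l2_l4_ratio(x) + c * cross_entropy(probs, label)

rng = np.random.default_rng(1)
noise = rng.normal(size=4096)       # broadband background noise
impulsive = noise.copy()
impulsive[::256] += 10.0            # periodic fault-like impulses on top of noise
loss = physics_informed_loss(impulsive, np.array([0.7, 0.3]), 0)
```

On the toy signals above, the impulsive signal scores higher kurtosis and a lower $l_2/l_4$ ratio than pure noise, which is exactly the direction the BD filters are pushed toward.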

Research paper accepted by Nature Communications

Industrial enterprises are prominent sources of contaminant discharge on the planet, and regulating their operations is vital for sustainable development. However, accurately tracking contaminant generation at the firm level remains an intractable global issue due to significant heterogeneities among enormous numbers of enterprises and the absence of a universally applicable estimation method. This study addresses the challenge by focusing on hazardous waste (HW), known for its severely harmful properties and the difficulty of monitoring it automatically, and develops a data-driven methodology that predicts HW generation from wastewater big data in a uniform and lightweight manner. The idea is grounded in the availability of wastewater big data, whose widespread collection by automatic sensors enables depiction of heterogeneous enterprises, and in the reasonable assumption that wastewater and HW generation are correlated. We simulate this relationship by designing a generic framework that jointly uses representative variables from diverse sectors, exploits a data-balance algorithm to address the long-tail data distribution, and incorporates causal discovery to screen features and improve computational efficiency. To illustrate our approach, we applied it to 1024 enterprises across 10 sectors in Jiangsu, a highly industrialized province in China. Validation results demonstrate the model's high fidelity (R² = 0.87) in predicting HW generation from 4,260,593 daily wastewater data records.
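A core difficulty the framework addresses is the long-tail distribution of HW generation across enterprises. As a hedged, toy-scale illustration (not the paper's data-balance algorithm or causal feature screening), fitting in log space is one simple way to tame such a tail; the synthetic `flow` and `hw` variables below are invented for the example.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination on the original scale."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# synthetic long-tailed data: a skewed wastewater-style predictor driving HW
rng = np.random.default_rng(2)
flow = rng.lognormal(mean=2.0, sigma=1.0, size=500)
hw = 0.3 * flow * np.exp(rng.normal(scale=0.2, size=500))

# fit in log space to tame the long tail, then predict on the original scale
X = np.column_stack([np.log1p(flow), np.ones(500)])
coef, *_ = np.linalg.lstsq(X, np.log1p(hw), rcond=None)
pred = np.expm1(X @ coef)
score = r_squared(hw, pred)
```

Without the log transform, the few extreme-tail enterprises dominate the squared-error fit; the paper's dedicated data-balance algorithm pursues the same goal in a more principled way.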

Research paper accepted by Risk Analysis

In this paper, we develop a generic framework for systematically encoding causal knowledge, manifested in the form of a hierarchical causality structure and qualitative (or quantitative) causal relationships, into neural networks to facilitate sound risk analytics and decision support via causally aware intervention reasoning. The proposed methodology for establishing the causality-informed neural network (CINN) follows a four-step procedure. In the first step, we explicate how causal knowledge in the form of a directed acyclic graph (DAG) can be discovered from observational data or elicited from domain experts. Next, we categorize the nodes in the constructed DAG, which represents causal relationships among observed variables, into several groups (e.g., root nodes, intermediate nodes, leaf nodes), and align the architecture of CINN with the causal relationships specified in the DAG while preserving the orientation of each existing causal relationship. Beyond the dedicated architecture design, causal knowledge is also embodied in the design of the loss function, where both intermediate and leaf nodes are treated as target outputs to be predicted by CINN. In the third step, we propose to incorporate domain knowledge on stable causal relationships into CINN; the injected constraints on causal relationships act as guardrails to prevent unexpected behaviors of CINN. Finally, the trained CINN is exploited to perform intervention reasoning, with emphasis on estimating the effect that policies and actions can have on system behavior, thus facilitating risk-informed decision making through comprehensive “what-if” analysis. Two case studies are used to demonstrate the substantial benefits enabled by CINN in risk analytics and decision support.
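The core architectural idea, aligning network connectivity with the DAG, can be sketched by masking weights so that each non-root variable is computed only from its causal parents. The toy DAG, weights, and loss below are assumptions for illustration, not the CINN implementation.

```python
import numpy as np

# hypothetical 4-variable DAG: x0 -> x2, x1 -> x2, x2 -> x3
# adj[i, j] = 1 means variable i is a causal parent of variable j
adj = np.zeros((4, 4))
adj[0, 2] = adj[1, 2] = adj[2, 3] = 1.0

rng = np.random.default_rng(3)
W = rng.normal(size=(4, 4)) * adj   # weights exist only along DAG edges

def cinn_forward(x, W):
    """Propagate values node by node: each non-root node is computed only
    from its causal parents, preserving the orientation of every edge."""
    vals = x.astype(float).copy()
    for j in range(4):
        parents = np.nonzero(adj[:, j])[0]
        if parents.size:                       # intermediate or leaf node
            vals[j] = np.tanh(vals[parents] @ W[parents, j])
    return vals

def cinn_loss(x, targets, W):
    """Both the intermediate node (x2) and the leaf node (x3) are supervised."""
    vals = cinn_forward(x, W)
    return np.mean((vals[2:] - targets) ** 2)

x = np.array([0.5, -0.3, 0.0, 0.0])
loss = cinn_loss(x, np.array([0.1, 0.2]), W)
```

The weight mask is what enforces the guardrail: there is simply no parameter through which information could flow against the orientation of a causal edge.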

Research paper accepted by IEEE Transactions on Industrial Informatics

Accurate and reliable prediction of bearing remaining useful life (RUL) is crucial to the prognostics and health management (PHM) of rotating machinery. Despite the rapid progress of data-driven methods, the generalizability of data-driven models remains an open issue. In this paper, we tackle this challenge by resolving the feature misalignment problem that arises when extracting features from raw vibration signals. Towards this goal, we introduce a logarithmic cumulative transformation (LCT) operator consisting of a cumulative, a logarithmic, and another cumulative transformation for feature extraction. In addition, we propose a novel method to estimate the reliability associated with each RUL prediction by integrating a linear regression model and an auxiliary exponential model. The linear regression model rectifies bias in the neural network's point predictions, while the auxiliary exponential model fits the differential slopes of the linear models and generates the upper and lower bounds used to build the reliability indicator. The proposed approach, comprising LCT, an attention GRU-based encoder-decoder network, and reliability evaluation, is validated on the FEMTO-ST dataset. Computational results demonstrate the superior performance of the proposed approach over several other state-of-the-art methods.
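The LCT operator as described, a cumulative transformation, a logarithm, then another cumulative transformation, can be sketched in a few lines. Applying it to the absolute signal and using `log1p` to keep the logarithm finite are assumptions of this sketch, not details from the paper.

```python
import numpy as np

def lct(signal):
    """Logarithmic cumulative transformation: cumulative sum, logarithm,
    then another cumulative sum. The absolute value keeps the logarithm's
    argument nonnegative; log1p keeps it finite at zero."""
    c1 = np.cumsum(np.abs(signal))
    logged = np.log1p(c1)
    return np.cumsum(logged)

rng = np.random.default_rng(4)
vib = rng.normal(size=1024)   # stand-in for a raw vibration signal
feat = lct(vib)
```

Because both cumulative stages accumulate nonnegative quantities, the output is a monotone trajectory regardless of where in the run-to-failure record the window starts, which illustrates why such a transform can help with feature alignment.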

Dr. Xiaoge Zhang delivered a talk on “Enhancing the Performance of Neural Networks Through Causal Discovery and Integration of Domain Knowledge” at Sichuan University, China

In this talk, I will present a generic methodology for encoding the hierarchical causal structure among observed variables into a neural network to improve its prediction performance. The proposed causality-informed neural network (CINN) takes three coherent steps to systematically map structural causal knowledge into the layer-to-layer design of the neural network while strictly preserving the orientation of every causal relationship. In the first step, CINN discovers causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to avoid combinatorial search. In the second step, the discovered hierarchical causal structure among observed variables is encoded into the neural network through a dedicated architecture and a customized loss function. By categorizing variables as root, intermediate, and leaf nodes, the hierarchical causal DAG is translated into a CINN with a one-to-one correspondence between nodes in the DAG and units in the CINN, while maintaining the relative order among these nodes. Regarding the loss function, both intermediate and leaf nodes in the DAG are treated as target outputs during CINN training to drive co-learning of the causal relationships among the different types of nodes. In the final step, as multiple loss components emerge in CINN, we leverage the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational experiments across a broad spectrum of UCI datasets demonstrate substantial advantages of CINN in prediction performance over other state-of-the-art methods. In addition, we conduct an ablation study that incrementally injects structural and quantitative causal knowledge into the neural network to demonstrate their respective roles in enhancing prediction performance.
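The gradient-projection step in the final stage follows the PCGrad idea: when two task gradients conflict (negative inner product), the interfering component is removed by projecting one gradient onto the normal plane of the other. A minimal sketch with toy gradient vectors:

```python
import numpy as np

def project_conflicting(g1, g2):
    """PCGrad-style projection: if the two task gradients conflict
    (negative inner product), strip from g1 its component along g2."""
    dot = g1 @ g2
    if dot < 0:
        g1 = g1 - (dot / (g2 @ g2)) * g2
    return g1

g_leaf = np.array([1.0, 0.0])    # toy gradient of the leaf-node loss
g_inter = np.array([-1.0, 1.0])  # toy gradient of an intermediate-node loss
g_proj = project_conflicting(g_leaf, g_inter)
```

After projection the two gradients are orthogonal, so the update for one supervised node no longer directly undoes progress on the other.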

Research paper accepted by Reliability Engineering and Systems Safety

Risk management often involves retrofit optimization to enhance the performance of buildings against extreme events, but it may entail huge upfront mitigation costs. Existing stochastic optimization frameworks can be computationally expensive, may require explicit programming, and are often not intelligent. Hence, an intelligent risk optimization framework is proposed herein for building structures by developing a deep reinforcement learning-enabled actor-critic neural network model. The proposed framework has two parts: (1) a performance-based environment to assess mitigation costs and uncertain future consequences under hazards, and (2) a deep reinforcement learning-enabled risk optimization model for performance enhancement. The performance-based environment takes mitigation alternatives as input and returns consequences and retrofit costs as output through several steps, including hazard assessment, damage assessment, and consequence assessment. The risk optimization is performed by integrating the performance-based environment with actor-critic deep neural networks to simultaneously reduce retrofit costs and uncertain future consequences under seismic hazards. For illustration, the proposed framework is applied to a portfolio of numerous building structures to demonstrate the new paradigm for intelligent risk optimization. The performance of the proposed method is also compared with genetic optimization, deep Q-networks, and proximal policy optimization.
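The actor-critic loop at the heart of such a framework can be sketched on a toy one-step retrofit problem: the actor is a softmax policy over discrete retrofit alternatives, and the critic is a baseline value estimate that reduces gradient variance. All costs, consequences, and hyperparameters below are invented for illustration and are not from the paper.

```python
import numpy as np

# hypothetical one-step retrofit problem: pick one of three retrofit levels;
# reward = -(upfront retrofit cost + sampled seismic-loss consequence)
costs = np.array([0.0, 2.0, 5.0])       # assumed retrofit costs
mean_loss = np.array([10.0, 5.0, 0.5])  # assumed expected hazard consequences

def sample_reward(action, rng):
    consequence = rng.normal(mean_loss[action], 1.0)  # uncertain future loss
    return -(costs[action] + consequence)

def train_actor_critic(n_steps=5000, lr_actor=0.05, lr_critic=0.1, seed=0):
    rng = np.random.default_rng(seed)
    logits = np.zeros(3)   # actor: softmax policy over retrofit alternatives
    baseline = 0.0         # critic: value estimate of the single state
    for _ in range(n_steps):
        p = np.exp(logits - logits.max())
        p /= p.sum()
        a = rng.choice(3, p=p)
        r = sample_reward(a, rng)
        advantage = r - baseline
        baseline += lr_critic * advantage      # critic update
        grad = -p
        grad[a] += 1.0                         # d log pi(a) / d logits
        logits += lr_actor * advantage * grad  # actor update
    return logits

logits = train_actor_critic()
best = int(np.argmax(logits))   # alternative balancing cost and consequence
```

In the full framework, the scalar reward above is replaced by the performance-based environment's hazard, damage, and consequence assessments, and the tabular actor and critic become deep neural networks.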