Research paper accepted by IEEE Transactions on Reliability

Deep learning shows great potential for bearing fault diagnosis, but its effectiveness is severely limited by the highly imbalanced data prevalent in real-world industrial settings, where fault events are extremely rare. This paper proposes a novel method for imbalanced bearing fault diagnosis that combines class-aware supervised contrastive learning with a quadratic network backbone. The integrated approach, named CCQNet, counters the effects of highly skewed data distributions by improving feature representation and classification fairness. Comprehensive experiments show that CCQNet substantially outperforms existing methods on imbalanced data, particularly at high imbalance ratios such as 50:1. This study provides an effective and innovative solution for imbalanced bearing fault diagnosis. The source code is available at https://github.com/yuweien1120/CCQNet for public evaluation.
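The quadratic network backbone mentioned above replaces the inner product of a conventional neuron with a quadratic function of the input. A minimal sketch of one common quadratic-neuron formulation, with hypothetical parameter names and the activation function omitted (the paper's exact formulation may differ):

```python
def quadratic_neuron(x, wr, br, wg, bg, wb, c):
    """Quadratic neuron: y = (wr.x + br) * (wg.x + bg) + wb.(x*x) + c.

    Unlike a conventional neuron (a single inner product), two linear
    terms are multiplied and a power term is added, so each neuron has
    a quadratic rather than linear decision surface.
    """
    dot = lambda w, v: sum(wi * vi for wi, vi in zip(w, v))
    return (dot(wr, x) + br) * (dot(wg, x) + bg) \
        + dot(wb, [xi * xi for xi in x]) + c
```

With `wr` selecting the first input and `wg` the second, the neuron reduces to their product, illustrating the multiplicative interaction a linear neuron cannot express.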

Research paper accepted by Reliability Engineering & System Safety

Although machine learning (ML) and deep learning (DL) methods are increasingly used for anomaly detection in industrial cyber-physical systems (ICPSs), their adoption is hindered by concerns about model trustworthiness, especially high false alarm rates (FARs). Excessive false alarms overwhelm operators, cause unnecessary shutdowns, and reduce operational efficiency. This study addresses these challenges by proposing a novel framework that integrates ML-based anomaly detectors with conformal prediction (CP), a model-agnostic uncertainty quantification technique. To handle distribution shifts in time-series data, our framework incorporates a temporal quantile adjustment method with a sliding calibration set, ensuring statistical guarantees on predefined FARs. A rejection mechanism is further integrated by excluding significant anomalies from the calibration set, improving detection capability while maintaining FAR guarantees. For real-time anomaly monitoring, two P-value-based indicators generated from CP are developed to track anomalous trends and enhance model interpretability. The framework is evaluated by comparing several baseline ML and DL methods to their conformalized counterparts using a public ICPS dataset. Comparative results based on Precision, Recall, F1, and AUROC validate the framework’s compatibility with various ML models and its effectiveness in improving anomaly detection performance by reducing false alarms and guaranteeing FARs across a range of predefined values.
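The FAR guarantee in conformal anomaly detection rests on a simple p-value construction over a calibration set of nominal-behavior scores. A minimal sketch under standard split-conformal assumptions (the paper's sliding calibration set and temporal quantile adjustment are not shown):

```python
def conformal_p_value(score, calibration_scores):
    """Conformal p-value: the (adjusted) fraction of calibration scores
    at least as extreme as the test score. Raising an alarm whenever
    p <= alpha bounds the false alarm rate by alpha, provided the
    calibration and test data are exchangeable."""
    n_extreme = sum(1 for s in calibration_scores if s >= score)
    return (1 + n_extreme) / (1 + len(calibration_scores))
```

Usage would look like `alarm = conformal_p_value(s, calib) <= alpha`; tracking the p-values over time gives the kind of trend indicator the abstract describes.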

Research paper accepted by IEEE Transactions on Systems, Man, and Cybernetics: Systems

Neural networks that overlook the underlying causal relationships among observed variables pose significant risks in high-stakes decision-making contexts due to concerns about the robustness and stability of model performance. To tackle this issue, we present a general approach for embedding the hierarchical causal structure among observed variables into a neural network to inform its learning. The proposed methodology, termed causality-informed neural network (CINN), exploits hierarchical causal structure learned from observational data as a structurally informed prior to guide the layer-to-layer architectural design of the neural network, while maintaining the orientation of causal relationships in the discovered causal graph. The proposed method involves three steps. First, CINN mines causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to circumvent the combinatorial nature of DAG learning. Second, we encode the discovered hierarchical causal graph among observed variables into the neural network via a dedicated architecture and loss function. By classifying observed variables in the DAG as root, intermediate, and leaf nodes, we translate the hierarchical causal DAG into CINN by creating a one-to-one correspondence between DAG nodes and certain CINN neurons. For the loss function, both intermediate and leaf nodes in the DAG are treated as target outputs during CINN training, facilitating the co-learning of causal relationships among the observed variables. Finally, as multiple loss components emerge in CINN, we leverage the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational studies indicate that CINN outperforms several state-of-the-art methods across a broad range of datasets. In addition, an ablation study that incrementally incorporates structural and quantitative causal knowledge into the neural network highlights the pivotal role of causal knowledge in enhancing the neural network’s prediction performance.
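The gradient-projection step described above can be sketched as follows: when two task gradients conflict (negative inner product), one is projected onto the normal plane of the other so the tasks stop pulling the shared weights in opposing directions. A minimal sketch with plain-list gradients (function name hypothetical):

```python
def project_conflicting(g_i, g_j):
    """If g_i conflicts with g_j (negative inner product), remove from
    g_i its component along g_j; otherwise return g_i unchanged."""
    dot = sum(a * b for a, b in zip(g_i, g_j))
    if dot >= 0:
        return list(g_i)  # no conflict: leave the gradient as is
    norm_sq = sum(b * b for b in g_j)
    # subtract the projection of g_i onto g_j
    return [a - (dot / norm_sq) * b for a, b in zip(g_i, g_j)]
```

After projection, the adjusted gradient is orthogonal to the gradient it conflicted with, so applying it no longer increases the other task's loss to first order.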

Research paper accepted by IEEE Transactions on Emerging Topics in Computational Intelligence

Equipping deep learning models with principled uncertainty quantification (UQ) has become essential for ensuring their reliable performance in the open world. To handle uncertainty arising from two prevalent sources in open-world settings, distribution shift and out-of-distribution (OOD) data, this paper presents a unified uncertainty-informed approach for quantifying and managing the risks these factors pose to the dependable functioning of deep learning models. Toward this goal, we propose leveraging a principled UQ approach, the Spectral-normalized Neural Gaussian Process (SNGP), to quantify the epistemic uncertainty associated with model predictions. Unlike other UQ methods in the literature, SNGP is characterized by two unique properties: (1) it applies spectral normalization to the weights of the neural network’s hidden layers to preserve the relative distances among data points during data transformations; and (2) it replaces the traditional output layer of the neural network with a Gaussian process to enable distance-aware uncertainty estimation. Based on SNGP’s uncertainty estimate, we apply Youden’s index to determine an optimal threshold for categorizing the uncertainty into distinct levels, thereby enabling decision-makers to make uncertainty-informed decisions. Two datasets of varying scale are used to demonstrate how the proposed method facilitates risk assessment and management of deep learning models in open environments. Computational results reveal that the proposed method achieves prediction performance comparable to Monte Carlo dropout and deep ensembles. Importantly, it outperforms these two methods by providing computationally efficient, consistent, and principled uncertainty estimates under no distribution shift, distribution shift, and OOD conditions.
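Youden's index selects the cut-off that maximizes J = TPR - FPR over candidate thresholds on the uncertainty score. An illustrative sketch with binary labels, flagging a sample when its score meets the threshold (data shapes hypothetical):

```python
def youden_threshold(scores, labels):
    """Return the cut-off t (flag when score >= t) that maximizes
    Youden's J = TPR - FPR, together with the best J value."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tpr = sum(s >= t and y == 1 for s, y in zip(scores, labels)) / n_pos
        fpr = sum(s >= t and y == 0 for s, y in zip(scores, labels)) / n_neg
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return best_t, best_j
```

The resulting threshold splits predictions into low- and high-uncertainty levels; in practice one would compute TPR/FPR from a full ROC sweep rather than this brute-force loop.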

Research paper accepted by Reliability Engineering & System Safety

Multi-state systems (MSSs) are widely used for modeling the behavior of engineering applications in which the system and its components can occupy more than two distinct states. Physics-informed neural networks (PINNs) offer a viable solution for characterizing the dynamic state evolution of MSSs. However, existing methods predominantly rely on uniformly sampled collocation points across the problem domain when training PINNs. Although some residual-based active learning methods exist, they are inherently static and local, and often fail to capture a crucial aspect of PINN training: identifying and accurately modeling the “critical transition regions” within the problem domain. To address this fundamental challenge, we treat the PINN as a dynamical system and introduce a novel active learning method grounded in chaos theory to identify regions of the problem domain that are highly sensitive to initial conditions. Specifically, our method quantifies the degree of chaos at candidate collocation points by introducing small perturbations and using the PINN’s forward propagation to simulate the dynamic evolution of both the original and perturbed collocation points. Collocation points that exhibit pronounced chaotic behavior, in that their evolutionary trajectories diverge rapidly following perturbation, are identified as the system’s most unstable regions and the most valuable for PINN training. By prioritizing these dynamically unstable points, our method directs the PINN to focus its learning on accurately delineating the boundaries of state transitions, thereby significantly enhancing the accuracy of reliability analysis. Experimental results on multiple benchmark partial differential equations (PDEs) and several MSSs demonstrate that, compared with other PINN learning schemes, our method achieves superior accuracy and computational efficiency in MSS reliability assessment.
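The chaos-based selection can be illustrated with a finite-difference sensitivity proxy: perturb a candidate collocation point, propagate both points through the network, and rank candidates by how fast the outputs diverge. A deliberately simplified sketch (the paper simulates full evolutionary trajectories rather than a single forward pass; `model`, `eps`, and the function names are illustrative assumptions):

```python
def divergence_score(model, x, eps=1e-3):
    """Rate at which the model output diverges under a small input
    perturbation; larger values mark more 'chaotic' collocation points."""
    y_base = model(x)
    y_pert = model([xi + eps for xi in x])
    return abs(y_pert - y_base) / eps

def select_collocation_points(model, candidates, k):
    """Keep the k candidate points with the largest divergence scores."""
    return sorted(candidates,
                  key=lambda x: divergence_score(model, x),
                  reverse=True)[:k]
```

Points near sharp state transitions yield large scores and are sampled preferentially, concentrating the training budget where the solution changes fastest.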

Research paper accepted by INFORMS Journal on Computing

Accurate and reliable prediction has profound implications for a wide range of applications, such as hospital admissions, inventory control, and route planning. In this study, we focus on one instance of the spatio-temporal learning problem, traffic prediction, to demonstrate an advanced deep learning model developed for making accurate and reliable predictions. Despite significant progress in traffic prediction, few studies have incorporated both explicit (e.g., road network topology) and implicit (e.g., causality-related traffic phenomena and the impact of exogenous factors) traffic patterns simultaneously to improve prediction performance. Meanwhile, the variable nature of traffic states necessitates quantifying the uncertainty of model predictions in a statistically principled way; however, extant studies offer no provable guarantee on the statistical validity of confidence intervals in reflecting their actual likelihood of containing the ground truth. In this paper, we propose an end-to-end traffic prediction framework that leverages three primary components to generate accurate and reliable traffic predictions: dynamic causal structure learning for discovering implicit traffic patterns from massive traffic data, a causally-aware spatio-temporal multi-graph convolution network (CASTMGCN) for learning spatio-temporal dependencies, and conformal prediction for uncertainty quantification. In particular, CASTMGCN fuses several graphs that characterize different important aspects of traffic networks (including the physical road structure, time-lagged causal effects, and contemporaneous causal relationships) with an auxiliary graph that captures the effect of exogenous factors on the road network. On this basis, a conformal prediction approach tailored to spatio-temporal data is further developed for quantifying the uncertainty in node-wise traffic predictions over varying prediction horizons. Experimental results on two real-world traffic datasets of varying scale demonstrate that the proposed method outperforms several state-of-the-art models in prediction accuracy; moreover, it generates more efficient prediction regions than several other methods while strictly satisfying the statistical validity of coverage.
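The coverage guarantee comes from the standard split-conformal construction: a held-out calibration set yields a residual quantile that widens each point prediction into an interval. A minimal sketch for a single scalar prediction (the paper's spatio-temporal tailoring and node-wise treatment are not shown):

```python
import math

def split_conformal_interval(point_pred, calib_abs_residuals, alpha=0.1):
    """Interval point_pred +/- q, where q is the ceil((n+1)(1-alpha))-th
    smallest absolute calibration residual. Under exchangeability the
    interval covers the ground truth with probability >= 1 - alpha."""
    n = len(calib_abs_residuals)
    rank = min(math.ceil((n + 1) * (1 - alpha)), n)
    q = sorted(calib_abs_residuals)[rank - 1]
    return point_pred - q, point_pred + q
```

"More efficient prediction regions" in the abstract means a smaller half-width q at the same nominal coverage level.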

Research paper accepted by Transportation Research Part E

Understanding causal relationships between traffic states throughout the system is of great significance for enhancing traffic management and optimization in urban traffic networks. Unfortunately, few studies in the literature have systematically analyzed the causal structure characterizing the evolution of traffic states over time or gauged the importance of traffic nodes from a causal perspective, particularly in the context of large-scale traffic networks. Moreover, the dynamic nature of traffic patterns necessitates a robust method for reliably discovering causal relationships, a requirement often overlooked in existing studies. To address these issues, we propose a Spatio-Temporal Causal Structure Learning and Analysis (STCSLA) framework for analyzing large-scale urban traffic networks at a mesoscopic level through a causal lens. The proposed framework comprises three main components: decomposition of spatio-temporal traffic data into localized traffic subprocesses; Bayesian information criterion-guided spatio-temporal causal structure learning, combined with temporal-dependency-preserving sampling, for deriving a reliable causal graph that uncovers time-lagged and contemporaneous causal effects; and several causality-oriented indicators for identifying causally critical nodes, mediator nodes, and bottleneck nodes in traffic networks. Experimental results on both a synthetic dataset and a real-world Hong Kong traffic dataset demonstrate that the proposed STCSLA framework accurately uncovers time-varying causal relationships and identifies key nodes that play various causal roles in influencing traffic dynamics. These findings underscore the potential of the proposed framework to improve traffic management and provide a comprehensive causality-driven approach for analyzing urban traffic networks.

Review paper on AI system reliability accepted by Journal of Reliability Science and Engineering

As the potential applications of AI continue to expand, a central question remains unresolved: will users trust and adopt AI-powered technologies? Since AI’s promise hinges closely on perceptions of its trustworthiness, guaranteeing the reliability and trustworthiness of AI plays a fundamental role in fostering its broad adoption in practice. However, the theories, mathematical models, and methods in reliability engineering and risk management have not kept pace with the rapid technological progress in AI. As a result, the lack of essential components (e.g., reliability, trustworthiness) in the resulting models has emerged as a major roadblock to regulatory approval and widespread adoption of AI-powered solutions in high-stakes decision environments, such as healthcare, aviation, finance, and nuclear power plants. To fully harness AI’s power for automating decision making in these safety-critical applications, it is essential to manage expectations for what AI can realistically deliver and to build appropriate levels of trust. In this paper, we focus on the functional reliability of AI systems developed through supervised learning and discuss the unique characteristics of AI systems that necessitate specialized reliability engineering and risk management theories and methods for creating functionally reliable AI systems. Next, we thoroughly review five prevalent engineering mechanisms in the existing literature for approaching functionally reliable and trustworthy AI: uncertainty quantification (UQ), comprising model-based UQ and model-agnostic conformal prediction; failure prediction; learning with abstention; formal verification; and knowledge-enabled AI. Furthermore, we outline several research challenges and opportunities related to the development of reliability engineering and trustworthiness assurance methods for AI systems. Our research aims to deepen the understanding of reliability and trustworthiness issues associated with AI systems, and to inspire researchers in risk and reliability engineering and beyond to contribute to this area of emerging importance.

Research paper accepted by European Journal of Operational Research

It is common for multiple firms, such as manufacturers, retailers, and third-party insurers, to coexist and compete in the aftermarket for durable products. In this paper, we study price competition in a partially concentrated aftermarket where one firm offers multiple extended warranty (EW) contracts while the others each offer a single one. The demand for EWs is described by the multinomial logit model. We show that, at equilibrium, such an aftermarket behaves like a combination of monopoly and oligopoly. Building upon this base model, we further investigate sequential pricing games for a durable product and its EWs to accommodate the ancillary nature of after-sales services. We consider two scenarios: one where the manufacturer (as the market leader) sets product and EW prices simultaneously, and another where these decisions are made sequentially. Our analysis demonstrates that offering EWs incentivizes the manufacturer to lower the product price, thereby expanding the market potential for EWs. Simultaneous product-EW pricing leads to a price concession on EWs compared with sequential pricing, effectively reducing the intensity of competition in the aftermarket. Overall, the competitiveness of an EW hinges on its ability to deliver high value to consumers at a low marginal cost to its provider. While our focus is on EWs, the proposed game-theoretical pricing models apply broadly to other ancillary after-sales services.
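Under the multinomial logit demand used in the paper, each EW contract's market share follows the standard logit formula with an outside (no-purchase) option. A minimal sketch with a linear utility specification; the utility form and parameter names here are illustrative assumptions, not the paper's exact model:

```python
import math

def mnl_shares(utilities):
    """MNL choice probabilities over the offered EW contracts, with an
    outside (no-purchase) option whose utility is normalized to zero."""
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)  # the 1.0 is the outside option
    return [w / denom for w in weights]

def ew_share(value, price, b=1.0):
    """Share of a single EW with illustrative utility u = value - b * price."""
    return mnl_shares([value - b * price])[0]
```

Because shares depend on utilities only through differences, an EW gains share either by delivering more value or by undercutting on price, which is the value-versus-marginal-cost trade-off the abstract highlights.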

Research paper accepted by IEEE Transactions on Automation Science and Engineering

The demand for disruption-free fault diagnosis of mechanical equipment in a constantly changing operating environment poses a great challenge to deploying data-driven diagnosis models in practice. Existing continual learning-based diagnosis models require a large number of labeled samples to adapt to new diagnostic tasks and fail to account for the diagnosis of heterogeneous fault types across different machines. In this paper, we use a representative class of mechanical equipment, rotating machinery, as an example and develop an uncertainty-aware continual learning framework (UACLF) to provide a unified interface for fault diagnosis of rotating machinery under various dynamic scenarios: the class continual scenario, the domain continual scenario, and both. The proposed UACLF takes three steps to tackle fault diagnosis of rotating machinery with homogeneous and heterogeneous faults in dynamic environments. First, an inter-class classification loss function and an intra-class discrimination loss function are devised to extract informative feature representations from the raw vibration signal for fault classification. Second, an uncertainty-aware pseudo-labeling mechanism is developed to select those unlabeled fault samples to which pseudo labels can be assigned confidently, thus expanding the training samples for faults arising in the new environment. Third, an adaptive prototypical feedback mechanism is used to sharpen the decision boundary of fault classification and reduce the model’s misclassification rate. Experimental results on three datasets suggest that the proposed UACLF outperforms several alternatives in the literature on fault diagnosis of rotating machinery across various working conditions and different machines.
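The uncertainty-aware pseudo-labeling step can be illustrated with a simple entropy gate: a pseudo label is assigned only when the predictive distribution is sufficiently peaked. A minimal sketch, where the threshold `tau` and the abstain-as-None convention are illustrative assumptions rather than the paper's exact mechanism:

```python
import math

def confident_pseudo_labels(prob_rows, tau=0.5):
    """For each predictive distribution, assign the argmax class as a
    pseudo label if its entropy falls below tau; otherwise abstain (None)."""
    labels = []
    for probs in prob_rows:
        entropy = -sum(p * math.log(p) for p in probs if p > 0)
        if entropy < tau:
            labels.append(max(range(len(probs)), key=probs.__getitem__))
        else:
            labels.append(None)  # too uncertain: keep the sample unlabeled
    return labels
```

Confidently pseudo-labeled samples then augment the training set for faults arising in the new environment, while uncertain ones are held back rather than risking label noise.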