Research paper accepted by IEEE Transactions on Reliability

Deep learning shows great potential for bearing fault diagnosis, but its effectiveness is severely limited by the highly imbalanced data that prevail in real-world industrial settings, where fault events are extremely rare. This paper proposes a novel method for imbalanced bearing fault diagnosis that combines class-aware supervised contrastive learning with a quadratic network backbone. This integrated approach, named CCQNet, is designed to counter the effects of highly skewed data distributions by improving feature representation and classification fairness. Comprehensive experiments show that CCQNet substantially outperforms existing methods in handling imbalanced data, particularly at high imbalance ratios such as 50:1. The source code for this paper is available at https://github.com/yuweien1120/CCQNet for public evaluation.

Research paper accepted by Reliability Engineering & System Safety

Although machine learning (ML) and deep learning (DL) methods are increasingly used for anomaly detection in industrial cyber-physical systems (ICPS), their adoption is hindered by concerns about model trustworthiness, especially high false alarm rates (FARs). Excessive false alarms overwhelm operators, cause unnecessary shutdowns, and reduce operational efficiency. This study addresses these challenges by proposing a novel framework that integrates ML-based anomaly detectors with conformal prediction (CP), a model-agnostic uncertainty quantification technique. To handle distribution shifts in time-series data, our framework incorporates a temporal quantile adjustment method with a sliding calibration set, ensuring statistical guarantees on predefined FARs. A rejection mechanism is further integrated by excluding significant anomalies from the calibration set, improving detection capability while maintaining FAR guarantees. For real-time anomaly monitoring, two p-value-based indicators generated from CP are developed to track anomalous trends and enhance model interpretability. The framework is evaluated by comparing several baseline ML and DL methods with their conformalized counterparts on a public ICPS dataset. Comparative results based on Precision, Recall, F1, and AUROC validate the framework’s compatibility with various ML models and its effectiveness in improving anomaly detection performance by reducing false alarms and guaranteeing FARs across a range of predefined values.
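As a rough illustration of the ideas in this abstract, the sketch below computes a conformal p-value for each incoming anomaly score against a sliding calibration window and leaves flagged points out of that window. It is a minimal sketch under stated assumptions, not the paper's framework: the class name, the window size, and the plain split-conformal p-value are illustrative choices, and the paper's temporal quantile adjustment is not reproduced here.

```python
from collections import deque

import numpy as np


class SlidingConformalDetector:
    """Hypothetical helper: conformal anomaly flagging with a sliding calibration set."""

    def __init__(self, calibration_scores, alpha=0.05, window=500):
        # alpha plays the role of the predefined false alarm rate (FAR).
        self.alpha = alpha
        # The deque drops the oldest score once `window` scores are stored.
        self.calibration = deque(calibration_scores, maxlen=window)

    def p_value(self, score):
        cal = np.array(list(self.calibration), dtype=float)
        # Share of calibration scores at least as extreme as the new score.
        return (1 + np.sum(cal >= score)) / (len(cal) + 1)

    def update(self, score):
        p = self.p_value(score)
        is_anomaly = bool(p <= self.alpha)
        # Rejection step (assumed form): flagged points never enter the calibration set.
        if not is_anomaly:
            self.calibration.append(score)
        return p, is_anomaly


rng = np.random.default_rng(0)
detector = SlidingConformalDetector(rng.normal(size=200), alpha=0.05, window=200)
p_normal, flag_normal = detector.update(0.0)   # typical score
p_large, flag_large = detector.update(10.0)    # far outside the calibration range
```

Here the anomaly score can come from any detector, which is what makes the wrapper model-agnostic; only the ranking of the new score among recent normal scores matters.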

Research paper accepted by IEEE Transactions on Systems, Man, and Cybernetics: Systems

Neural networks that overlook the underlying causal relationships among observed variables pose significant risks in high-stakes decision-making contexts because of concerns about the robustness and stability of model performance. To tackle this issue, we present a general approach for embedding the hierarchical causal structure among observed variables into a neural network to inform its learning. The proposed methodology, termed causality-informed neural network (CINN), exploits the hierarchical causal structure learned from observational data as a structurally informed prior to guide the layer-to-layer architectural design of the neural network while maintaining the orientation of causal relationships in the discovered causal graph. The proposed method involves three steps. First, CINN mines causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to circumvent the combinatorial nature of DAG learning. Second, we encode the discovered hierarchical causal graph among observed variables into the neural network via a dedicated architecture and loss function. By classifying observed variables in the DAG as root, intermediate, and leaf nodes, we translate the hierarchical causal DAG into CINN by creating a one-to-one correspondence between DAG nodes and certain CINN neurons. For the loss function, both intermediate and leaf nodes in the DAG are treated as target outputs during CINN training, facilitating the co-learning of causal relationships among the observed variables. Finally, as multiple loss components emerge in CINN, we leverage the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational studies indicate that CINN outperforms several state-of-the-art methods across a broad range of datasets. In addition, an ablation study that incrementally incorporates structural and quantitative causal knowledge into the neural network is conducted to highlight the pivotal role of causal knowledge in enhancing the neural network’s prediction performance.
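The gradient-projection step mentioned in the abstract can be sketched as follows. This is a minimal NumPy sketch of the generic idea only: whenever two task gradients have a negative inner product, the conflicting component is projected out before the gradients are summed. The function name and the random ordering of the other tasks are assumptions for illustration, and how CINN applies the step to its own loss components is not shown.

```python
import numpy as np


def project_conflicting(grads, seed=0):
    """Sum task gradients after projecting out pairwise conflicts (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    grads = [np.asarray(g, dtype=float) for g in grads]
    projected = [g.copy() for g in grads]
    for i, g_i in enumerate(projected):
        others = [j for j in range(len(grads)) if j != i]
        # Visit the other tasks in random order.
        for j in rng.permutation(others):
            g_j = grads[j]
            dot = g_i @ g_j
            if dot < 0:
                # Conflict: remove from g_i its component along g_j (in place).
                g_i -= dot / (g_j @ g_j) * g_j
    return np.sum(projected, axis=0)


# Two conflicting task gradients: their inner product is -1.
combined = project_conflicting([[1.0, 0.0], [-1.0, 1.0]])
# Two orthogonal gradients are left unchanged and simply summed.
unchanged = project_conflicting([[1.0, 0.0], [0.0, 1.0]])
```

In the first call, each gradient ends up orthogonal to the other task's original gradient, so an update along the combined direction does not directly undo either task's progress.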

Prof. Zhisheng Ye delivered a talk on “Optimal Abort Policy for Mission-Critical Systems under Imperfect Condition Monitoring”

While most on-demand mission-critical systems are engineered to be reliable to support critical tasks, occasional failures may still occur during missions. To increase system survivability, a common practice is to abort the mission before an imminent failure. We consider optimal mission abort for a system whose deterioration follows a general three-state (normal, defective, failed) semi-Markov chain. The failure is assumed self-revealed, while the healthy and defective states have to be predicted from imperfect condition monitoring data. Due to the non-Markovian process dynamics, optimal mission abort for this partially observable system is an intractable stopping problem. For a tractable solution, we introduce a novel tool of Erlang mixtures to approximate non-exponential sojourn times in the semi-Markov chain. This allows us to approximate the original process by a surrogate continuous-time Markov chain whose optimal control policy can be solved through a partially observable Markov decision process (POMDP). We show that the POMDP optimal policies converge almost surely to the optimal abort decision rules when the Erlang rate parameter diverges. This implies that the expected cost by adopting the POMDP solution converges to the optimal expected cost. Next, we provide comprehensive structural results on the optimal policy of the surrogate POMDP. Based on the results, we develop a modified point-based value iteration algorithm to numerically solve the surrogate POMDP. We further consider mission abort in a multi-task setting where a system executes several tasks consecutively before a thorough inspection. Through a case study on an unmanned aerial vehicle, we demonstrate the capability of real-time implementation of our model, even when the condition-monitoring signals are generated with high frequency.
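To illustrate why Erlang mixtures can stand in for non-exponential sojourn times, the sketch below uses one classical construction: a mixture of Erlang(k, rate) distributions with a common rate and weights F(k/rate) - F((k-1)/rate), where F is the target sojourn-time CDF. It is an assumption-laden sketch, not the construction used in the talk. The Weibull target, the truncation level, and all function names are illustrative.

```python
import math


def weibull_cdf(t, shape=2.0, scale=1.0):
    """Example non-exponential sojourn-time CDF (illustrative choice)."""
    return 1.0 - math.exp(-((t / scale) ** shape)) if t > 0 else 0.0


def poisson_pmf(n, mean):
    if mean == 0:
        return 1.0 if n == 0 else 0.0
    return math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))


def erlang_mixture_cdf(t, target_cdf, rate, k_max):
    """CDF at t of the Erlang mixture with weights F(k/rate) - F((k-1)/rate)."""
    total = 0.0
    tail = 1.0  # P(N >= 0) for N ~ Poisson(rate * t)
    for k in range(1, k_max + 1):
        weight = target_cdf(k / rate) - target_cdf((k - 1) / rate)
        # After this subtraction, tail = P(N >= k), the Erlang(k, rate) CDF at t.
        tail -= poisson_pmf(k - 1, rate * t)
        total += weight * max(tail, 0.0)
    return total


def max_error(rate, grid):
    k_max = int(6 * rate)  # the Weibull(2, 1) mass beyond t = 6 is negligible
    return max(
        abs(erlang_mixture_cdf(t, weibull_cdf, rate, k_max) - weibull_cdf(t))
        for t in grid
    )


grid = [0.2 * i for i in range(1, 11)]
error_low_rate = max_error(5.0, grid)
error_high_rate = max_error(50.0, grid)
```

The approximation error shrinks as the common rate grows, mirroring the convergence statement in the abstract. The price is a larger number of exponential phases in the surrogate continuous-time Markov chain.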

Congratulations to Jingxiao LIAO on passing his PhD oral defense!

In recent years, deep learning has achieved significant success in various fields, including natural language processing, autonomous driving, and computer vision. In the realm of prognostics and health management (PHM) for rolling bearings in rotating machinery—such as aero engines, wind turbines, and high-speed trains—numerous intelligent PHM methodologies have emerged to provide accurate and adaptable machinery fault diagnostics and prognostics. However, methodologically speaking, there is no one-size-fits-all approach. It is widely acknowledged that these data-driven approaches still possess considerable limitations, hindering their widespread adoption in industrial settings.

Three primary challenges persist: (1) the lack of interpretability in deep learning methods, particularly in machinery fault diagnosis, where diagnostic models must be transparent to foster trust in the results and inform maintenance decisions; (2) the limited generalizability and reliability of bearing remaining useful life (RUL) prediction models. When training data is scarce, even under identical operating conditions and with the same bearing types, current RUL models demonstrate suboptimal accuracy. In addition, ensuring the reliability of RUL predictions is an important consideration for making informed maintenance decisions in real-world scenarios; and (3) the difficulty in deploying intelligent diagnosis models to edge devices, which hinders their integration into real-world industrial settings.

Therefore, this dissertation aims to address these challenges by establishing a paradigm that integrates traditional signal processing with modern deep learning methods. We formally define this approach as signal processing-empowered neural networks, which synthesize the complementary strengths of both domains. This framework provides three key advantages: (1) integrating rigorous signal processing theory to improve model interpretability; (2) leveraging the robust feature representation capabilities of signal processing techniques to enhance the generalizability of deep learning models, together with an auxiliary exponential model to quantify the reliability of RUL predictions; and (3) enabling faster computation and greater accuracy, thereby facilitating the deployment of lightweight models on edge devices. The research contents are summarized as follows:

Research paper accepted by IEEE Transactions on Emerging Topics in Computational Intelligence

Equipping deep learning models with principled uncertainty quantification (UQ) has become essential for ensuring their reliable performance in the open world. To handle uncertainty arising from two prevalent sources in open-world settings, distribution shift and out-of-distribution (OOD) inputs, this paper presents a unified uncertainty-informed approach for quantifying and managing the risks these factors pose to the dependable function of deep learning models. Toward this goal, we propose leveraging a principled UQ approach, the Spectral-normalized Neural Gaussian Process (SNGP), to quantify the epistemic uncertainty associated with model predictions. Unlike other UQ methods in the literature, SNGP is characterized by two unique properties: (1) applying spectral normalization to the weights of the neural network’s hidden layers to preserve the relative distances among data points during data transformations; (2) replacing the traditional output layer of neural networks with a Gaussian process to enable distance-aware uncertainty estimation. Based on SNGP’s uncertainty estimate, we apply Youden’s index to determine an optimal threshold for categorizing the uncertainty into distinct levels, thereby enabling decision-makers to make uncertainty-informed decisions. Two datasets of varying scale are used to demonstrate how the proposed method facilitates risk assessment and management of deep learning models in the open environment. Computational results reveal that the proposed method achieves prediction performance comparable to Monte Carlo dropout and deep ensemble methods. Importantly, the proposed approach outperforms the other two methods by providing computationally efficient, consistent, and principled uncertainty estimation under no distribution shift, distribution shift, and OOD conditions.
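The thresholding step can be illustrated with Youden's index, J = TPR - FPR, which selects the cut-off on an uncertainty score that best separates two groups. The sketch below is a minimal example under assumptions: the function name, the toy scores, and the convention that label 1 marks predictions that should not be trusted (for example, OOD inputs) are illustrative and not taken from the paper.

```python
import numpy as np


def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = TPR - FPR for the rule 'score >= threshold'."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_threshold, best_j = None, -np.inf
    for threshold in np.unique(scores):
        flagged = scores >= threshold
        tpr = np.mean(flagged[labels])    # flagged share of untrustworthy predictions
        fpr = np.mean(flagged[~labels])   # flagged share of trustworthy predictions
        j = tpr - fpr
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy uncertainty scores; label 1 = prediction should not be trusted (assumed convention).
uncertainty = [0.1, 0.2, 0.3, 0.8, 0.9]
untrusted = [0, 0, 0, 1, 1]
threshold, j_value = youden_threshold(uncertainty, untrusted)
```

Predictions whose uncertainty falls at or above the returned threshold would be routed to the high-uncertainty level, for example for human review.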

Research project funded by the Natural Science Foundation of Guangdong Province-General Program

Uncertainty quantification and spatiotemporal causal discovery for reliable traffic prediction

Research paper accepted by Reliability Engineering & System Safety

Multi-state systems (MSSs) are widely used for modeling the behavior of engineering applications, where the system and its components can have more than two distinct states. Physics-Informed Neural Networks (PINNs) offer a viable solution for characterizing the dynamic state evolution of MSSs. However, existing methods predominantly rely on uniformly sampled collocation points across the problem domain when training PINNs. Although some residual-based active learning methods exist, they are inherently static and local, and often fail to capture a crucial aspect of PINN training: the identification and accurate modeling of the “critical transition regions” within the problem domain. To address this fundamental challenge, we treat the PINN as a dynamic system and introduce a novel active learning method grounded in chaos theory to identify regions within the problem domain that are highly sensitive to initial conditions. Specifically, our method quantifies the degree of chaos at candidate collocation points by introducing small perturbations and using the PINN’s forward propagation to simulate the dynamic evolution of both the original and perturbed collocation points. Collocation points that exhibit pronounced chaotic behavior, where evolutionary trajectories diverge rapidly following perturbation, are identified as the system’s most unstable and valuable regions for PINN training. By prioritizing these dynamically unstable points, our method directs the PINN to focus its learning on accurately delineating the boundaries of state transitions, thereby significantly enhancing the accuracy of reliability analysis. Experimental results on multiple benchmark partial differential equations (PDEs) and several MSSs demonstrate that, compared to other PINN learning schemes, our method shows superior accuracy and computational efficiency in MSS reliability assessment.

Prof. Cheng-Lin Liu gave a talk on “Open-World Learning: Problems and Strategies”

Traditional methods of pattern classification and machine learning usually assume a closed world: the input pattern falls within a fixed set of classes. However, in an open world, the input pattern can belong to a known or an unknown class, or be an outlier. During training, data may emerge incrementally, and a new dataset may contain samples of known or unknown classes, either labeled or unlabeled, as well as outliers. Such an open-world learning scenario involves multiple challenges, including out-of-distribution (OOD) detection, confidence estimation, unlabeled data exploitation, catastrophic forgetting, and novel category discovery. These challenges are addressed by combining techniques such as generative modeling, regularization, knowledge distillation, and hybrid learning. This talk will outline the status of open-world pattern recognition, identify the main challenges and strategies of open-world learning, and present some recent progress achieved in my group: open-set recognition, class-incremental learning, and generalized category discovery.

Welcome one new PhD student to join the group!

The group for risk, reliability, and resilience informatics of intelligent systems warmly welcomes Hang Ji, who joins the team to begin his PhD studies.