Research paper accepted by IEEE Transactions on Automation Science and Engineering

The demand for disruption-free fault diagnosis of mechanical equipment under a constantly changing operating environment poses a great challenge to the deployment of data-driven diagnosis models in practice. Existing continual learning-based diagnosis models require a large number of labeled samples to adapt to new diagnostic tasks and fail to account for the diagnosis of heterogeneous fault types across different machines. In this paper, we use a representative type of mechanical equipment, rotating machinery, as an example and develop an uncertainty-aware continual learning framework (UACLF) to provide a unified interface for fault diagnosis of rotating machinery under various dynamic scenarios: the class continual scenario, the domain continual scenario, and both. The proposed UACLF takes a three-step approach to fault diagnosis of rotating machinery with homogeneous-heterogeneous faults under dynamic environments. In the first step, an inter-class classification loss function and an intra-class discrimination loss function are devised to extract informative feature representations from the raw vibration signal for fault classification. Next, an uncertainty-aware pseudo-labeling mechanism is developed to select unlabeled fault samples to which pseudo labels can be assigned confidently, thus expanding the training samples for faults arising in the new environment. Finally, an adaptive prototypical feedback mechanism is used to enhance the decision boundary of fault classification and reduce the model misclassification rate. Experimental results on three datasets suggest that the proposed UACLF outperforms several alternatives in the literature on fault diagnosis of rotating machinery across various working conditions and different machines.
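The pseudo-labeling step can be illustrated with a minimal sketch that keeps only the unlabeled samples whose predictive entropy is below a threshold. The entropy criterion, the function names, and the threshold value are assumptions made for illustration; the abstract does not state how UACLF measures uncertainty.

```python
import math

def predictive_entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def select_pseudo_labels(prob_rows, max_entropy=0.5):
    """Return (index, pseudo_label) pairs for confidently predicted samples.

    max_entropy is a hypothetical threshold, not a value from the paper.
    """
    selected = []
    for i, probs in enumerate(prob_rows):
        if predictive_entropy(probs) <= max_entropy:
            # The pseudo label is the most probable class.
            label = max(range(len(probs)), key=lambda k: probs[k])
            selected.append((i, label))
    return selected

# Softmax outputs of a classifier on three unlabeled samples (made-up numbers).
unlabeled_probs = [
    [0.97, 0.02, 0.01],  # confident, kept
    [0.40, 0.35, 0.25],  # ambiguous, discarded
    [0.05, 0.05, 0.90],  # confident, kept
]
pseudo = select_pseudo_labels(unlabeled_probs)
```

Only the selected pairs would be added to the training set for the new environment; the ambiguous sample stays unlabeled.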

Research paper accepted by Decision Support Systems

Conventional aggregated performance measures (e.g., mean squared error) computed over the whole dataset do not provide the desired safety and quality assurance for each individual prediction made by a machine learning model in risk-sensitive regression problems. In this paper, we propose an informative indicator $\mathcal{R} \left(\bm{x} \right)$ to quantify model reliability for individual prediction (MRIP) for the purpose of safeguarding the usage of machine learning (ML) models in mission-critical applications. Specifically, we define the reliability of an ML model with respect to its prediction on each individual input $\bm{x}$ as the probability that the observed difference between the model prediction and the actual observation falls within a small interval when the input $\bm{x}$ varies within a small range subject to a preset distance constraint, namely $\mathcal{R}(\bm{x}) = P(|y^* - \hat{y}^*| \leq \varepsilon \mid \bm{x}^* \in B(\bm{x}))$, where $y^*$ denotes the observed target value for the input $\bm{x}^*$, $\hat{y}^*$ denotes the model prediction for the input $\bm{x}^*$, and $\bm{x}^*$ is an input in the neighborhood of $\bm{x}$ subject to the constraint $B\left( \bm{x} \right) = \left\{ \bm{x}^* \mid \left\| \bm{x}^* - \bm{x} \right\| \le \delta \right\}$. The developed MRIP indicator $\mathcal{R} \left(\bm{x} \right)$ provides a direct, objective, quantitative, and general-purpose measure of “reliability”, or the probability of success of the ML model, for each individual prediction by fully exploiting the local information associated with the input $\bm{x}$ and the ML model. Next, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework to directly learn the relationship between $\bm{x}$ and its MRIP $\mathcal{R} \left( \bm{x} \right)$, which makes it possible to provide the reliability estimate $\mathcal{R} \left( \bm{x} \right)$ for any unseen input instantly.
Finally, we propose an information gain-based approach to help determine a threshold value pertaining to $\mathcal{R} \left( \bm{x} \right)$ in support of decision making on when to accept or abstain from relying on the ML model prediction. Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets reveal that the developed ML-based framework for MRIP estimation shows robust performance in improving the reliability estimate of individual predictions, and the MRIP indicator $\mathcal{R} \left( \bm{x} \right)$ thus provides an essential safety net when adopting ML models in risk-sensitive environments.
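To make the definition of $\mathcal{R}(\bm{x})$ concrete, the following minimal sketch computes a brute-force empirical estimate: among the labeled observations that fall inside $B(\bm{x})$, it counts the fraction whose absolute prediction error is at most $\varepsilon$. The Euclidean norm, the function names, and the toy model and data are assumptions for illustration. This is not the two-stage learning framework described in the abstract, which is designed to avoid this kind of per-input computation.

```python
import math

def empirical_mrip(x, data, model, eps, delta):
    """Fraction of observations within distance delta of x whose
    absolute prediction error is at most eps.

    Returns None when no observation falls inside the neighborhood B(x).
    """
    hits = 0
    total = 0
    for x_star, y_star in data:
        if math.dist(x, x_star) <= delta:  # x_star lies in B(x)
            total += 1
            if abs(y_star - model(x_star)) <= eps:
                hits += 1
    return hits / total if total else None

def toy_model(x):
    """Hypothetical regression model: y_hat = 2 * x."""
    return 2.0 * x[0]

# (input, observed target) pairs; made-up numbers.
data = [((0.0,), 0.05), ((0.1,), 0.9), ((0.2,), 0.45), ((1.5,), 3.0)]
r = empirical_mrip((0.1,), data, toy_model, eps=0.1, delta=0.15)
```

Here three observations lie in the neighborhood of the query point and two of them are predicted within the tolerance, so the estimate is 2/3.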

Research paper accepted by Mechanical Systems and Signal Processing

Blind deconvolution (BD) has been demonstrated to be an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD’s appealing adaptability and mathematical interpretability, a significant challenge persists: \textit{How to effectively integrate BD with fault-diagnosing classifiers?} This issue is difficult to tackle because the traditional BD method is designed solely for feature extraction, with its own optimizer and objective function. When BD is combined with a downstream deep learning classifier, the different learning objectives can easily conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault diagnosis. Towards this goal, we first develop a time and frequency neural BD that employs neural networks to implement conventional BD, thereby facilitating seamless integration of BD and the deep learning classifier for co-optimization of model parameters. In the neural BD, we incorporate two filters: i) a time domain quadratic filter that uses quadratic convolutional networks to extract periodic impulses; ii) a frequency domain linear filter composed of a fully-connected neural network to amplify discrete frequency components. Next, we develop a unified framework built upon a deep learning classifier to guide the learning of BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, the $l_2/l_4$ norm, and a cross-entropy loss to jointly optimize the BD filters and the deep learning classifier. In so doing, the fault labels are fully exploited to direct BD to extract class-discriminative features amidst strong noise. To the best of our knowledge, this is the first work of its kind in which BD is successfully applied to bearing fault diagnosis.
Experimental results from three different datasets highlight that ClassBD outperforms other state-of-the-art methods under noisy conditions. The source code for this paper is available at https://github.com/asdvfghg/ClassBD.
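The two signal statistics named in the loss function can be sketched as follows. The definitions used here are the standard ones (kurtosis as the fourth central moment divided by the squared variance, and the $l_2/l_4$ norm as the ratio of the two vector norms). How ClassBD weights or signs these terms in its loss is not stated in the abstract, so the sketch only computes the raw quantities on made-up signals.

```python
def kurtosis(signal):
    """Fourth central moment divided by the squared variance."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((v - mean) ** 2 for v in signal) / n
    m4 = sum((v - mean) ** 4 for v in signal) / n
    return m4 / (m2 ** 2)

def l2_over_l4(signal):
    """Ratio of the l2 norm to the l4 norm of the signal."""
    l2 = sum(v ** 2 for v in signal) ** 0.5
    l4 = sum(v ** 4 for v in signal) ** 0.25
    return l2 / l4

# A single sharp impulse versus a signal with evenly spread energy.
impulsive = [0.0] * 15 + [4.0]
flat = [1.0, -1.0] * 8

k_impulsive, k_flat = kurtosis(impulsive), kurtosis(flat)
r_impulsive, r_flat = l2_over_l4(impulsive), l2_over_l4(flat)
```

On these toy signals the impulsive one has the higher kurtosis and the lower $l_2/l_4$ ratio, which is why both quantities are commonly used as indicators of impulsiveness.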

Research paper accepted by Nature Communications

Industrial enterprises are prominent sources of contaminant discharge on the planet, and regulating their operations is vital for sustainable development. However, accurately tracking contaminant generation at the firm level remains an intractable global issue due to significant heterogeneity among a vast number of enterprises and the absence of a universally applicable estimation method. This study addressed the challenge by focusing on hazardous waste (HW), known for its severe harmful properties and difficulty in automatic monitoring, and developed a data-driven methodology that predicted HW generation from wastewater big data in a uniform and lightweight manner. The idea is grounded in the availability of wastewater big data through the widespread application of automatic sensors, which enables the depiction of heterogeneous enterprises, and in the logical assumption that a correlation exists between wastewater and HW generation. We modeled this relationship by designing a generic framework that jointly used representative variables from diverse sectors, exploited a data-balance algorithm to address the long-tail data distribution, and incorporated causal discovery to screen features and improve computational efficiency. To illustrate our approach, we applied it to 1024 enterprises across 10 sectors in Jiangsu, a highly industrialized province in China. Validation results demonstrated the model’s high fidelity ($R^2 = 0.87$) in predicting HW generation using 4,260,593 daily wastewater records.

Research paper accepted by Risk Analysis

In this paper, we develop a generic framework for systematically encoding causal knowledge, manifested in the form of a hierarchical causality structure and qualitative (or quantitative) causal relationships, into neural networks to facilitate sound risk analytics and decision support via causally-aware intervention reasoning. The proposed methodology for establishing a causality-informed neural network (CINN) follows a four-step procedure. In the first step, we explicate how causal knowledge in the form of a directed acyclic graph (DAG) can be discovered from observational data or elicited from domain experts. Next, we categorize the nodes in the constructed DAG, which represents causal relationships among observed variables, into several groups (e.g., root nodes, intermediate nodes, leaf nodes), and align the architecture of CINN with the causal relationships specified in the DAG while preserving the orientation of each existing causal relationship. In addition to the dedicated architecture design, the causal structure is also reflected in the design of the loss function, where both intermediate and leaf nodes are treated as target outputs to be predicted by CINN. In the third step, we propose to incorporate domain knowledge on stable causal relationships into CINN, and the injected constraints on causal relationships act as guardrails to prevent unexpected behavior of CINN. Finally, the trained CINN is exploited to perform intervention reasoning with emphasis on estimating the effect that policies and actions can have on the system behavior, thus facilitating risk-informed decision making through comprehensive “what-if” analysis. Two case studies are used to demonstrate the substantial benefits enabled by CINN in risk analytics and decision support.
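The node categorization in the second step can be sketched as follows: given the edges of a DAG, nodes without parents are roots, nodes without children are leaves, and nodes with both are intermediate. The function name and the example variables are hypothetical and are not taken from the paper's case studies.

```python
def categorize_nodes(edges):
    """Split DAG nodes into root, intermediate, and leaf groups.

    edges is a list of (parent, child) pairs describing causal links.
    """
    nodes, has_parent, has_child = set(), set(), set()
    for parent, child in edges:
        nodes.update((parent, child))
        has_child.add(parent)
        has_parent.add(child)
    return {
        "root": sorted(nodes - has_parent),             # no incoming edge
        "intermediate": sorted(has_parent & has_child),  # both directions
        "leaf": sorted(nodes - has_child),               # no outgoing edge
    }

# Hypothetical causal graph for a flood-risk example.
edges = [
    ("rainfall", "flood_depth"),
    ("levee_height", "flood_depth"),
    ("flood_depth", "damage"),
]
groups = categorize_nodes(edges)
```

Under the design described above, the root nodes would serve as network inputs, while the intermediate and leaf nodes would both appear as prediction targets in the loss function.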

Research paper accepted by IEEE Transactions on Industrial Informatics

Accurate and reliable prediction of bearing remaining useful life (RUL) is crucial to the prognostics and health management (PHM) of rotating machinery. Despite the rapid progress of data-driven methods, the generalizability of data-driven models remains an open issue. In this paper, we tackle this challenge by resolving the feature misalignment problem that arises in extracting features from raw vibration signals. Towards this goal, we introduce a logarithmic cumulative transformation (LCT) operator consisting of a cumulative, a logarithmic, and another cumulative transformation for feature extraction. In addition, we propose a novel method to estimate the reliability associated with each RUL prediction by integrating a linear regression model and an auxiliary exponential model. The linear regression model rectifies bias in the neural network’s point predictions, while the auxiliary exponential model fits the differential slopes of the linear models and generates the upper and lower bounds for building the reliability indicator. The proposed approach, comprising LCT, an attention GRU-based encoder-decoder network, and reliability evaluation, is validated on the FEMTO-ST dataset. Computational results demonstrate the superior performance of the proposed approach over several other state-of-the-art methods.

Research paper accepted by Reliability Engineering & System Safety

Risk management often involves retrofit optimization to enhance the performance of buildings against extreme events, but it may result in huge upfront mitigation costs. Existing stochastic optimization frameworks can be computationally expensive, may require explicit programming, and often lack the ability to learn from interaction with the environment. Hence, an intelligent risk optimization framework is proposed herein for building structures by developing a deep reinforcement learning-enabled actor-critic neural network model. The proposed framework consists of two parts: (1) a performance-based environment to assess mitigation costs and uncertain future consequences under hazards and (2) a deep reinforcement learning-enabled risk optimization model for performance enhancement. The performance-based environment takes mitigation alternatives as input and provides consequences and retrofit costs as output through several steps, including hazard assessment, damage assessment, and consequence assessment. The risk optimization is performed by integrating the performance-based environment with actor-critic deep neural networks to simultaneously reduce retrofit costs and uncertain future consequences given seismic hazards. For illustration, the proposed framework is implemented on a portfolio with numerous building structures to demonstrate the new paradigm for intelligent risk optimization. The performance of the proposed method is also compared with genetic optimization, deep Q-networks, and proximal policy optimization.

Research paper accepted by Knowledge-Based Systems

Principled quantification of predictive uncertainty in neural networks (NNs) is essential to safeguard their applications in high-stakes decision settings. In this paper, we develop a differentiable mathematical formulation to quantify the uncertainty in NN prediction using prediction intervals (PIs). The formulated optimization problem is differentiable and compatible with the built-in gradient descent optimizers in prevailing deep learning platforms, and two performance metrics, prediction interval coverage probability (PICP) and mean prediction interval width (MPIW), are considered in the construction of PIs. Unlike existing methods, the developed methodology features four salient characteristics. First, we design two differentiable distance-based functions to impose constraints associated with the target coverage in PI construction, where PICP is prioritized explicitly over MPIW in the devised composite loss function. Second, we adopt a shared-bottom NN architecture with intermediate layers to separate the learning of shared and task-specific feature representations in the construction of lower and upper bounds. Third, we leverage the projection of conflicting gradients (PCGrad) to mitigate interference between the gradients associated with the two individual learning tasks so as to increase the convergence stability and solution quality. Finally, we design a customized early stopping mechanism that monitors PICP and MPIW simultaneously in order to select, as the final NN parameters, the set of parameters that not only meets the target coverage but also has a minimal MPIW. A broad range of datasets is used to rigorously examine the performance of the developed methodology. Computational results suggest that the developed method significantly outperforms the classic LUBE method across the nine datasets by reducing the PI width by 31.26% on average.
More importantly, it achieves competitive results compared to the other three state-of-the-art methods by outperforming them on four out of ten datasets. An ablation study is used to explicitly demonstrate the benefit of the shared-bottom NN architecture in the construction of PIs.
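The two metrics that drive the PI construction can be computed as in the minimal sketch below. PICP is the fraction of targets that fall inside their interval, and MPIW is the average interval width. These are the non-differentiable evaluation forms of the metrics, not the differentiable distance-based surrogates used in the paper's loss function, and the numbers are made up.

```python
def picp(y_true, lower, upper):
    """Prediction interval coverage probability: share of targets
    that fall inside [lower, upper]."""
    covered = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return covered / len(y_true)

def mpiw(lower, upper):
    """Mean prediction interval width."""
    return sum(hi - lo for lo, hi in zip(lower, upper)) / len(lower)

# Toy targets and interval bounds.
y = [1.0, 2.0, 3.0, 4.0]
lo = [0.5, 1.5, 3.2, 3.0]
hi = [1.5, 2.5, 4.0, 5.0]

coverage = picp(y, lo, hi)  # the third target lies below its lower bound
width = mpiw(lo, hi)
```

An early stopping rule of the kind described above would keep the parameters with the smallest `width` among those checkpoints whose `coverage` meets the target level.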

Research paper accepted by Reliability Engineering & System Safety

A physics-informed neural network (PINN) is a special type of deep learning model that encodes physical laws in the form of partial differential equations as a regularization term in the loss function of the neural network. In this paper, we develop a principled uncertainty quantification (UQ) approach to characterize the model uncertainty of PINN, and the estimated uncertainty is then exploited as an instructive indicator to identify collocation points where PINN produces a large prediction error. To this end, this paper integrates the spectral-normalized neural Gaussian process (SNGP) into PINN for principled and accurate uncertainty quantification. In the first step, we apply spectral normalization to the weight matrices of the hidden layers in the PINN to make the data transformation from the input space to the latent space distance-preserving. Next, the dense output layer of PINN is replaced with a Gaussian process to make the quantified uncertainty distance-sensitive. Afterwards, to examine the performance of different UQ approaches, we define several performance metrics tailored to PINN for assessing distance awareness in the measured uncertainty and the uncertainty-informed error detection capability. Finally, we employ three representative physical problems to verify the effectiveness of the proposed method in uncertainty quantification of PINN and compare the developed approach with Monte Carlo (MC) dropout using the developed performance metrics. Computational results suggest that the proposed approach exhibits superior performance in improving the prediction accuracy of PINN and that the estimated uncertainty serves as an informative indicator to detect PINN’s prediction failures.
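The spectral normalization step can be sketched in plain Python as follows: the largest singular value of a weight matrix is estimated by power iteration, and the matrix is rescaled only when that value exceeds a chosen bound. The bound of 1.0, the iteration count, and the function names are assumptions for illustration, and the sketch does not reproduce the paper's actual implementation.

```python
def matvec(W, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def norm(v):
    """Euclidean norm of a vector."""
    return sum(a * a for a in v) ** 0.5

def spectral_norm(W, n_iter=50):
    """Estimate the largest singular value of W by power iteration.

    Assumes W is nonzero and that the all-ones start vector is not
    orthogonal to the leading right singular vector.
    """
    Wt = [list(col) for col in zip(*W)]  # transpose of W
    v = [1.0] * len(W[0])
    for _ in range(n_iter):
        u = matvec(W, v)
        u_norm = norm(u)
        u = [a / u_norm for a in u]
        v = matvec(Wt, u)
        v_norm = norm(v)
        v = [a / v_norm for a in v]
    return norm(matvec(W, v))

def spectrally_normalize(W, bound=1.0):
    """Rescale W so that its spectral norm does not exceed bound."""
    sigma = spectral_norm(W)
    scale = bound / sigma if sigma > bound else 1.0
    return [[scale * w for w in row] for row in W]

W = [[3.0, 0.0], [0.0, 1.0]]  # singular values 3 and 1
sigma = spectral_norm(W)
W_sn = spectrally_normalize(W)
```

Bounding the spectral norm of each hidden layer limits how much the layer can stretch distances between inputs, which is the property the distance-preserving argument above relies on.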

Research paper accepted by IEEE Transactions on Instrumentation and Measurement

Deep learning has achieved remarkable success in the field of bearing fault diagnosis. However, this success comes with larger models and more complex computations, which makes such models difficult to deploy in industrial settings that require high speed, strong portability, and low power consumption. In this paper, we propose a lightweight and deployable model for bearing fault diagnosis, referred to as BearingPGA-Net, to address these challenges. First, aided by a well-trained large model, we train BearingPGA-Net via decoupled knowledge distillation. Despite its small size, our model demonstrates excellent fault diagnosis performance compared to other lightweight state-of-the-art methods. Second, we design an FPGA acceleration scheme for BearingPGA-Net using Verilog. This scheme involves customized quantization and the design of programmable logic gates for each layer of BearingPGA-Net on the FPGA, with an emphasis on parallel computing and module reuse to enhance the computational speed. To the best of our knowledge, this is the first instance of deploying a CNN-based bearing fault diagnosis model on an FPGA. Experimental results reveal that our deployment scheme achieves a diagnosis speed over 200 times faster than a CPU, with a performance drop of less than 0.4% in terms of F1, recall, and precision scores on our independently collected bearing dataset. Our code is available at https://github.com/asdvfghg/BearingPGA-Net.