Research paper accepted by Decision Support Systems
A conventional aggregated performance measure (e.g., mean squared error) computed over the whole dataset does not provide the desired safety and quality assurance for each individual prediction made by a machine learning (ML) model in risk-sensitive regression problems. In this paper, we propose an informative indicator $\mathcal{R} \left(\bm{x} \right)$ to quantify model reliability for individual prediction (MRIP) for the purpose of safeguarding the usage of ML models in mission-critical applications. Specifically, we define the reliability of an ML model with respect to its prediction on each individual input $\bm{x}$ as the probability of the observed difference between the prediction of the ML model and the actual observation falling within a small interval when the input $\bm{x}$ varies within a small range subject to a preset distance constraint, namely $\mathcal{R}(\bm{x}) = P(|y^* - \hat{y}^*| \leq \varepsilon \mid \bm{x}^* \in B(\bm{x}))$, where $y^*$ denotes the observed target value for the input $\bm{x}^*$, $\hat{y}^*$ denotes the model prediction for the input $\bm{x}^*$, and $\bm{x}^*$ is an input in the neighborhood of $\bm{x}$ subject to the constraint $B\left( \bm{x} \right) = \left\{ \bm{x}^* \mid \left\| \bm{x}^* - \bm{x} \right\| \le \delta \right\}$. The developed MRIP indicator $\mathcal{R} \left(\bm{x} \right)$ provides a direct, objective, quantitative, and general-purpose measure of “reliability” or the probability of success of the ML model for each individual prediction by fully exploiting the local information associated with the input $\bm{x}$ and the ML model. Next, to mitigate the intensive computational effort involved in MRIP estimation, we develop a two-stage ML-based framework to directly learn the relationship between $\bm{x}$ and its MRIP $\mathcal{R} \left( \bm{x} \right)$, thereby providing an instant reliability estimate $\mathcal{R} \left( \bm{x} \right)$ for any unseen input.
Finally, we propose an information gain-based approach to help determine a threshold value pertaining to $\mathcal{R} \left( \bm{x} \right)$ in support of decision making on when to accept or abstain from relying on the ML model prediction. Comprehensive computational experiments and quantitative comparisons with existing methods on a broad range of real-world datasets reveal that the developed ML-based framework for MRIP estimation performs robustly in improving the reliability estimate of individual predictions, and the MRIP indicator $\mathcal{R} \left( \bm{x} \right)$ thus provides an essential safety net when adopting ML models in risk-sensitive environments.
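The definition of $\mathcal{R}(\bm{x})$ above can be illustrated with a simple frequency count over labelled samples that fall inside $B(\bm{x})$. The following is only a minimal sketch under stated assumptions, not the estimation procedure or the two-stage framework of the paper: the function name `mrip_estimate`, the toy data, and the stand-in model `predict` are all hypothetical, and the norm is assumed to be Euclidean.

```python
import math


def mrip_estimate(x, inputs, targets, predict, eps, delta):
    """Empirical estimate of R(x) = P(|y* - yhat*| <= eps | x* in B(x)).

    B(x) is taken to be the Euclidean ball of radius delta around x
    (an assumption). Returns None when no labelled sample lies in B(x),
    because the frequency estimate is then undefined.
    """
    hits = 0
    total = 0
    for x_star, y_star in zip(inputs, targets):
        if math.dist(x_star, x) <= delta:  # x* lies in B(x)
            total += 1
            if abs(y_star - predict(x_star)) <= eps:  # prediction within eps
                hits += 1
    return hits / total if total else None


# Hypothetical toy data: the stand-in model y = 2 * x fits well near 0
# and poorly near 1.
inputs = [(0.0,), (0.1,), (0.2,), (1.0,), (1.1,)]
targets = [0.0, 0.2, 0.9, 3.0, 3.5]


def predict(x):
    return 2.0 * x[0]


r_near_zero = mrip_estimate((0.1,), inputs, targets, predict, eps=0.1, delta=0.15)
r_near_one = mrip_estimate((1.0,), inputs, targets, predict, eps=0.1, delta=0.15)
r_no_neighbors = mrip_estimate((5.0,), inputs, targets, predict, eps=0.1, delta=0.15)
```

In this toy setting two of the three neighbors of 0.1 are predicted within $\varepsilon$, while neither neighbor of 1.0 is, so the same model receives very different reliability values at the two inputs even though a single aggregate error would not distinguish them.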
Research paper accepted by Mechanical Systems and Signal Processing
Blind deconvolution (BD) has been demonstrated to be an efficacious approach for extracting bearing fault-specific features from vibration signals under strong background noise. Despite BD’s appealing adaptability and mathematical interpretability, a significant challenge persists: \textit{How to effectively integrate BD with fault-diagnosing classifiers?} This issue is difficult to tackle because the traditional BD method is designed solely for feature extraction, with its own optimizer and objective function. When BD is combined with a downstream deep learning classifier, the different learning objectives can easily conflict. To address this problem, this paper introduces classifier-guided BD (ClassBD) for joint learning of BD-based feature extraction and deep learning-based fault diagnosis. Towards this goal, we first develop a time and frequency neural BD that employs neural networks to implement conventional BD, thereby facilitating seamless integration of BD and the deep learning classifier for co-optimization of model parameters. In the neural BD, we incorporate two filters: i) a time domain quadratic filter that uses quadratic convolutional networks to extract periodic impulses; ii) a frequency domain linear filter composed of a fully-connected neural network to amplify discrete frequency components. Next, we develop a unified framework built upon a deep learning classifier to guide the learning of BD filters. In addition, we devise a physics-informed loss function composed of kurtosis, $l_2/l_4$ norm, and a cross-entropy loss to jointly optimize the BD filters and deep learning classifier. In so doing, the fault labels are fully exploited to direct BD to extract features that distinguish classes amidst strong noise. To the best of our knowledge, this is the first work of its kind in which BD is successfully applied to bearing fault diagnosis.
Experimental results from three different datasets show that ClassBD outperforms other state-of-the-art methods under noisy conditions. The source code for this paper is available at https://github.com/asdvfghg/ClassBD.
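The two signal statistics named in the loss, kurtosis and the $l_2/l_4$ norm, can be sketched with their standard textbook definitions to show why they respond to periodic impulses. This is a hedged illustration only: the helper names and test signals are hypothetical, and how ClassBD weights, signs, or combines these terms with the cross-entropy loss is not stated in the abstract and is not reproduced here.

```python
import math


def kurtosis(signal):
    """Fourth standardized moment: E[(s - mean)^4] / E[(s - mean)^2]^2."""
    n = len(signal)
    mean = sum(signal) / n
    m2 = sum((s - mean) ** 2 for s in signal) / n
    m4 = sum((s - mean) ** 4 for s in signal) / n
    return m4 / (m2 ** 2)


def l2_l4_ratio(signal):
    """Ratio of the l2 norm to the l4 norm; smaller values mean sparser signals."""
    l2 = math.sqrt(sum(s ** 2 for s in signal))
    l4 = sum(s ** 4 for s in signal) ** 0.25
    return l2 / l4


# Hypothetical test signals: a smooth sinusoid (5 full cycles) and a sparse
# train of periodic unit impulses, loosely mimicking a fault signature.
n = 200
sine = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
impulses = [1.0 if i % 50 == 0 else 0.0 for i in range(n)]

k_sine, k_imp = kurtosis(sine), kurtosis(impulses)
r_sine, r_imp = l2_l4_ratio(sine), l2_l4_ratio(impulses)
```

The impulse train has a much larger kurtosis and a smaller $l_2/l_4$ ratio than the sinusoid, which is why statistics of this kind are commonly used as sparsity or impulsiveness objectives when learning deconvolution filters.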
Research paper accepted by Nature Communications
Industrial enterprises are prominent sources of contaminant discharge on the planet, and regulating their operations is vital for sustainable development. However, accurately tracking contaminant generation at the firm level remains an intractable global issue due to significant heterogeneity among the vast number of enterprises and the absence of a universally applicable estimation method. This study addressed the challenge by focusing on hazardous waste (HW), known for its severely harmful properties and the difficulty of monitoring it automatically, and developed a data-driven methodology that predicted HW generation from wastewater big data in a uniform and lightweight manner. The idea is grounded in the availability of wastewater big data through the widespread application of automatic sensors, which enables the depiction of heterogeneous enterprises, and in the logical assumption that a correlation exists between wastewater and HW generation. We modeled this relationship by designing a generic framework that jointly used representative variables from diverse sectors, exploited a data-balance algorithm to address the long-tail data distribution, and incorporated causal discovery to screen features and improve computational efficiency. To illustrate our approach, we applied it to 1024 enterprises across 10 sectors in Jiangsu, a highly industrialized province in China. Validation results demonstrated the model’s high fidelity ($R^2 = 0.87$) in predicting HW generation using 4,260,593 daily wastewater records.