Research project funded by the Natural Science Foundation of Shenzhen-General Program
Uncertainty quantification and spatiotemporal causal discovery for reliable traffic prediction
Accurate and reliable prediction has profound implications for a wide range of applications, such as hospital admissions, inventory control, and route planning. In this study, we focus on a representative spatio-temporal learning problem, traffic prediction, to demonstrate an advanced deep learning model developed for making accurate and reliable predictions. Despite the significant progress in traffic prediction, few studies have incorporated both explicit (e.g., road network topology) and implicit (e.g., causality-related traffic phenomena and the impact of exogenous factors) traffic patterns simultaneously to improve prediction performance. Meanwhile, the inherent variability of traffic states necessitates quantifying the uncertainty of model predictions in a statistically principled way; however, extant studies offer no provable guarantee on the statistical validity of confidence intervals, that is, on their actual likelihood of containing the ground truth. In this paper, we propose an end-to-end traffic prediction framework that leverages three primary components to generate accurate and reliable traffic predictions: dynamic causal structure learning for discovering implicit traffic patterns from massive traffic data, a causally-aware spatio-temporal multi-graph convolution network (CASTMGCN) for learning spatio-temporal dependencies, and conformal prediction for uncertainty quantification. In particular, CASTMGCN fuses several graphs that characterize different important aspects of traffic networks (including the physical road structure, time-lagged causal effects, and contemporaneous causal relationships) with an auxiliary graph that captures the effect of exogenous factors on the road network. On this basis, a conformal prediction approach tailored to spatio-temporal data is further developed for quantifying the uncertainty in node-wise traffic predictions over varying prediction horizons.
Experimental results on two real-world traffic datasets of varying scale demonstrate that the proposed method outperforms several state-of-the-art models in prediction accuracy; moreover, it generates more efficient prediction regions than several competing methods while strictly satisfying statistical validity in coverage.
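To make the uncertainty quantification component concrete, here is a minimal split-conformal sketch for a single node and prediction horizon. The absolute-residual score, variable names, and toy numbers are illustrative assumptions, not the paper's exact calibration procedure.

```python
import math

def split_conformal_interval(cal_preds, cal_truths, test_pred, alpha=0.1):
    """Split conformal prediction with absolute-residual scores.

    cal_preds/cal_truths: model predictions and ground truths on a held-out
    calibration set; test_pred: point prediction for a new input.
    Returns an interval that covers the truth with probability >= 1 - alpha
    (marginally, under exchangeability).
    """
    n = len(cal_preds)
    scores = sorted(abs(p - y) for p, y in zip(cal_preds, cal_truths))
    # Finite-sample corrected quantile: the ceil((n+1)(1-alpha))-th score.
    k = math.ceil((n + 1) * (1 - alpha))
    q = scores[min(k, n) - 1]
    return (test_pred - q, test_pred + q)

# Toy usage: calibration residuals are 1..10, so the corrected 90% quantile
# radius is 10, yielding the interval (40, 60) around a prediction of 50.
cal_preds = [float(i) for i in range(10)]
cal_truths = [float(i) + (i + 1) for i in range(10)]  # residuals 1..10
lo, hi = split_conformal_interval(cal_preds, cal_truths, test_pred=50.0, alpha=0.1)
```

A spatio-temporal extension would compute such a quantile per node (or per horizon) from node-wise calibration residuals rather than a single pooled score list.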
We are pleased to welcome Ruohan Li, who recently joined our group as a research assistant. Ruohan Li holds a Bachelor’s degree in Economics and Finance from the University of Toronto and a Master’s degree in Business Analytics from Ivey Business School at Western University.
Understanding causal relationships between traffic states throughout the system is of great significance for enhancing traffic management and optimization in urban traffic networks. Unfortunately, few studies in the literature have systematically analyzed the causal structure characterizing the evolution of traffic states over time or gauged the importance of traffic nodes from a causal perspective, particularly in the context of large-scale traffic networks. Moreover, the dynamic nature of traffic patterns necessitates a robust method to reliably discover causal relationships, a need that is often overlooked in existing studies. To address these issues, we propose a Spatio-Temporal Causal Structure Learning and Analysis (STCSLA) framework for analyzing large-scale urban traffic networks at a mesoscopic level through a causal lens. The proposed framework comprises three main components: decomposition of spatio-temporal traffic data into localized traffic subprocesses; Bayesian Information Criterion-guided spatio-temporal causal structure learning, combined with temporal-dependency-preserving sampling, for deriving a reliable causal graph that uncovers time-lagged and contemporaneous causal effects; and the establishment of several causality-oriented indicators to identify causally critical nodes, mediator nodes, and bottleneck nodes in traffic networks. Experimental results on both a synthetic dataset and a real-world Hong Kong traffic dataset demonstrate that the proposed STCSLA framework accurately uncovers time-varying causal relationships and identifies key nodes that play various causal roles in influencing traffic dynamics. These findings underscore the potential of the proposed framework to improve traffic management and provide a comprehensive causality-driven approach for analyzing urban traffic networks.
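The BIC-guided structure learning step can be illustrated with a greedy search for the time-lagged parents of one target node. This linear-Gaussian sketch on synthetic data is a simplified stand-in for the framework's actual learner; the function names, the forward-selection strategy, and the toy data are all illustrative assumptions.

```python
import numpy as np

def bic_score(X, y):
    """BIC of a Gaussian linear regression y ~ X (columns = candidate parents)."""
    n = len(y)
    if X.shape[1] == 0:
        rss = float(np.sum((y - y.mean()) ** 2))
        k = 1
    else:
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ beta) ** 2))
        k = X.shape[1] + 1
    rss = max(rss, 1e-12)  # guard against log(0) for perfect fits
    return n * np.log(rss / n) + k * np.log(n)

def greedy_parent_search(candidates, y):
    """Forward selection: keep adding the lagged parent that lowers the BIC."""
    selected = []
    best = bic_score(np.empty((len(y), 0)), y)
    improved = True
    while improved:
        improved = False
        for j in range(candidates.shape[1]):
            if j in selected:
                continue
            score = bic_score(candidates[:, selected + [j]], y)
            if score < best - 1e-9:
                best, selected, improved = score, selected + [j], True
                break
    return selected

# Synthetic example: the target truly depends on lagged parents 1 and 3 only.
rng = np.random.default_rng(0)
lagged = rng.normal(size=(500, 4))  # lagged series of 4 candidate parents
target = 2.0 * lagged[:, 1] - 1.5 * lagged[:, 3] + 0.1 * rng.normal(size=500)
parents = greedy_parent_search(lagged, target)
```

The BIC penalty term (one log n per parameter) is what rejects the spurious noise parents here; the full framework additionally handles contemporaneous edges and temporal-dependency-preserving sampling.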
In the field of prognostics and health management, the integration of machine learning has enabled the development of advanced predictive models that ensure the reliable and safe operation of complex assets. However, challenges such as sparse, noisy, and incomplete data necessitate the integration of prior knowledge and inductive bias to improve model generalization, interpretability, and robustness.
Inductive bias, defined as the set of assumptions embedded in machine learning models, plays a crucial role in guiding these models to generalize effectively from limited training data to real-world scenarios. In PHM applications, where physical laws and domain-specific knowledge are fundamental, the use of inductive bias can significantly enhance a model’s ability to predict system behavior under diverse operating conditions. By embedding physical principles into learning algorithms, inductive bias reduces the reliance on large datasets, ensures that model predictions are physically consistent, and enhances both the generalizability and interpretability of the models.
This talk will explore various forms of inductive bias tailored for PHM systems, with a particular focus on heterogeneous-temporal graph neural networks, as well as physics-informed and algorithm-informed graph neural networks. These approaches will be applied to virtual sensing, the modelling of multi-body dynamical systems, and anomaly detection.
As the potential applications of AI continue to expand, a central question remains unresolved: will users trust and adopt AI-powered technologies? Since AI’s promise closely hinges on the perceptions of its trustworthiness, guaranteeing the reliability and trustworthiness of AI plays a fundamental role in fostering its broad adoption in practice. However, the theories, mathematical models, and methods in reliability engineering and risk management have not kept pace with the rapid technological progress in AI. As a result, the lack of essential qualities (e.g., reliability, trustworthiness) in the resultant models has emerged as a major roadblock to regulatory approval and widespread adoption of AI-powered solutions in high-stakes decision environments, such as healthcare, aviation, finance, and nuclear power plants. To fully harness AI’s power for automating decision making in these safety-critical applications, it is essential to manage expectations for what AI can realistically deliver in order to build appropriate levels of trust. In this paper, we focus on the functional reliability of AI systems developed through supervised learning and discuss the unique characteristics of AI systems that necessitate the development of specialized reliability engineering and risk management theories and methods to create functionally reliable AI systems. Next, we thoroughly review five prevalent engineering mechanisms in the existing literature for approaching functionally reliable and trustworthy AI: uncertainty quantification (UQ), composed of model-based UQ and model-agnostic conformal prediction; failure prediction; learning with abstention; formal verification; and knowledge-enabled AI. Furthermore, we outline several research challenges and opportunities related to the development of reliability engineering and trustworthiness assurance methods for AI systems.
Our research aims to deepen the understanding of reliability and trustworthiness issues associated with AI systems and to inspire researchers in the field of risk and reliability engineering, and beyond, to contribute to this emerging area of study.
It is common for multiple firms, such as manufacturers, retailers, and third-party insurers, to coexist and compete in the aftermarket for durable products. In this paper, we study price competition in a partially concentrated aftermarket where one firm offers multiple extended warranty (EW) contracts while the others each offer a single one. The demand for EWs is described by the multinomial logit model. We show that, at equilibrium, such an aftermarket behaves like a combination of monopoly and oligopoly. Building upon this base model, we further investigate sequential pricing games for a durable product and its EWs to accommodate the ancillary nature of after-sales services. We consider two scenarios: one where the manufacturer (as the market leader) sets product and EW prices simultaneously, and another where these decisions are made sequentially. Our analysis demonstrates that offering EWs incentivizes the manufacturer to lower the product price, thereby expanding the market potential for EWs. Simultaneous product-EW pricing leads to a price concession on EWs compared to sequential pricing, effectively reducing the intensity of competition in the aftermarket. Overall, the competitiveness of an EW hinges on its ability to deliver high value to consumers at low marginal cost to its provider. While our focus is on EWs, the proposed game-theoretical pricing models apply broadly to other ancillary after-sales services.
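For readers unfamiliar with the demand model, the multinomial logit choice probabilities can be sketched as follows. Each EW contract's utility is assumed linear in price, with the no-purchase option normalized to zero utility; the quality and price-sensitivity parameters below are hypothetical, not estimates from the paper.

```python
import math

def mnl_shares(utilities):
    """Multinomial logit choice probabilities with an outside (no-purchase)
    option whose utility is normalized to 0."""
    expu = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(expu)
    return [e / denom for e in expu]

def ew_demand(prices, quality, price_sens):
    """Demand shares for EW contracts with utility = quality - price_sens * price."""
    return mnl_shares([q - price_sens * p for q, p in zip(quality, prices)])

# Hypothetical example: two contracts whose net utilities happen to be equal
# (5 - 0.04*100 = 1 and 7 - 0.04*150 = 1), so they split demand evenly,
# with the remainder going to the no-purchase option.
shares = ew_demand(prices=[100.0, 150.0], quality=[5.0, 7.0], price_sens=0.04)
```

In the pricing games, each firm chooses its prices to maximize its margin times these shares; the outside option in the denominator is what keeps total EW demand below the full market potential.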
Landing is generally cited as one of the riskiest phases of a flight, as indicated by its much higher accident rate compared with other flight phases. In this talk, we focus on the hard landing problem, defined as the touchdown vertical speed exceeding a predefined threshold, and build a probabilistic deep learning model to forecast the aircraft’s vertical speed at touchdown using DASHlink data. Previous studies have treated hard landing as a classification problem, in which the vertical speed is represented as a categorical variable based on a predefined threshold. In contrast, we predict the touchdown vertical speed directly and use probabilistic forecasting to quantify the uncertainty in model predictions to support risk-informed decision-making. A Bayesian neural network approach is leveraged to build the predictive model. The overall methodology consists of five steps. First, a clustering method based on the minimum separation between different airports is developed to identify flights in the dataset that landed at the same airport. Second, since identifying the touchdown point itself is not straightforward, it is determined by comparing the vertical speed distributions derived from different candidate touchdown indicators. Third, a forward-backward filtering (filtfilt) approach is used to smooth the data without introducing phase lag. Fourth, a minimal-redundancy-maximal-relevance (mRMR) analysis is used to reduce the dimensionality of the input variables. Finally, a Bayesian recurrent neural network is trained to predict the touchdown vertical speed and quantify the uncertainty in the prediction. The model is validated on several flights in the test dataset, and computational results demonstrate the satisfactory performance of the proposed approach.
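The zero-phase smoothing idea behind the filtfilt step can be sketched with a simple forward-backward pass: filter the signal, reverse it, filter again, and reverse back, so the phase lags of the two passes cancel. In practice one would use scipy.signal.filtfilt with a properly designed digital filter; the exponential smoother below is a minimal pure-Python stand-in for illustration only.

```python
def ema(x, alpha):
    """One-pass exponential moving average (this alone introduces phase lag)."""
    out, s = [], x[0]
    for v in x:
        s = alpha * v + (1 - alpha) * s
        out.append(s)
    return out

def zero_phase_smooth(x, alpha=0.3):
    """Forward-backward ('filtfilt'-style) smoothing: the backward pass
    cancels the phase lag introduced by the forward pass."""
    forward = ema(x, alpha)
    backward = ema(forward[::-1], alpha)
    return backward[::-1]

# A clean step input: the smoothed output rises gradually through the step
# but stays centered on it instead of lagging behind, and keeps the same length.
signal = [0.0] * 5 + [1.0] * 5
smooth = zero_phase_smooth(signal)
```

Zero-phase smoothing matters here because a lagged vertical-speed trace would shift the apparent touchdown instant, corrupting the very label the model is trained to predict.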
We are pleased to welcome Dr. Shuaiqi Yuan, who recently joined our group as a postdoctoral research scholar. Dr. Yuan holds a PhD in Safety and Security Science from Delft University of Technology in the Netherlands.
The demand for disruption-free fault diagnosis of mechanical equipment under a constantly changing operating environment poses a great challenge to the deployment of data-driven diagnosis models in practice. Extant continual learning-based diagnosis models suffer from requiring a large number of labeled samples to adapt to new diagnostic tasks and from failing to account for the diagnosis of heterogeneous fault types across different machines. In this paper, we use rotating machinery, a representative class of mechanical equipment, as an example and develop an uncertainty-aware continual learning framework (UACLF) to provide a unified interface for fault diagnosis of rotating machinery under various dynamic scenarios: the class continual scenario, the domain continual scenario, and both. The proposed UACLF takes three steps to tackle fault diagnosis of rotating machinery with homogeneous and heterogeneous faults under dynamic environments. First, an inter-class classification loss function and an intra-class discrimination loss function are devised to extract informative feature representations from the raw vibration signal for fault classification. Second, an uncertainty-aware pseudo-labeling mechanism is developed to select unlabeled fault samples to which pseudo labels can be assigned with high confidence, thus expanding the training samples for faults arising in the new environment. Third, an adaptive prototypical feedback mechanism is used to enhance the decision boundary of fault classification and diminish the model’s misclassification rate. Experimental results on three datasets suggest that the proposed UACLF outperforms several alternatives in the literature on fault diagnosis of rotating machinery across various working conditions and different machines.
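As a rough sketch of the uncertainty-aware pseudo-labeling step, one common recipe is to retain only those unlabeled samples whose predictive entropy falls below a threshold and assign them their argmax class. The entropy criterion, the threshold value, and the toy softmax outputs below are illustrative assumptions, not the exact UACLF mechanism.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predictive distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_pseudo_labels(softmax_outputs, threshold):
    """Keep unlabeled samples whose predictive entropy is below the threshold,
    pseudo-labeling each with its argmax class. Returns (index, label) pairs."""
    selected = []
    for i, probs in enumerate(softmax_outputs):
        if entropy(probs) < threshold:
            selected.append((i, max(range(len(probs)), key=probs.__getitem__)))
    return selected

preds = [
    [0.97, 0.02, 0.01],  # entropy ~ 0.15 -> confident, pseudo-label class 0
    [0.40, 0.35, 0.25],  # entropy ~ 1.08 -> too uncertain, rejected
    [0.05, 0.05, 0.90],  # entropy ~ 0.39 -> confident, pseudo-label class 2
]
pseudo = select_pseudo_labels(preds, threshold=0.5)
```

Filtering by uncertainty rather than taking every argmax label is what keeps noisy pseudo labels from new operating conditions out of the expanded training set.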
