Research project funded by the Early Career Scheme (ECS) of the Research Grants Council

The past few years have witnessed the rapid development of artificial intelligence (AI) and machine learning (ML) in solving long-standing problems. AI/ML has played an indispensable role in transforming business, transportation, and finance. However, the adoption of AI/ML in risk-sensitive areas is still in its infancy because AI/ML systems exhibit fundamental limits and practical shortcomings: the field lacks a rigorous framework for reasoning about risk, uncertainty, and their potentially catastrophic outcomes, while safety and quality are top priorities in practice across a broad array of high-stakes applications that range from medical diagnosis to aerospace systems. Since the consequences of AI/ML system failures in life- and safety-critical applications can be disastrous, it is becoming abundantly clear that reasoning about uncertainty and risk will become a cornerstone of adopting AI/ML systems in large-scale risk-sensitive applications. ML systems face varied risks arising from the input data (e.g., data bias, dataset shift, etc.) as well as from the ML model (e.g., model bias, model misspecification, etc.). The goal of this project is therefore to develop a systematic risk-aware ML framework consisting of a series of control checkpoints that safeguard ML systems against potential risks and thereby increase the credibility of adopting ML systems in critical applications. Towards this goal, this project consists of three coherent tasks: (1) Develop an integrated approach for data quality monitoring by combining a feature-based anomaly detection technique with an outcome-based uncertainty measure. The integrated approach will produce a composite probabilistic risk indicator that reveals input data quality. (2) Develop a two-stage ML-based framework to estimate model reliability for individual prediction (MRIP). MRIP characterizes the probability that the observed difference between the model prediction and the actual value falls within a tiny interval while the model input varies within a small prescribed range; it thus provides an individualized estimate of model prediction reliability for each input x. (3) Develop an ML model to learn system-level risk. This model will map the data-level and model-level risk indicators derived in the first two tasks to a risk measure at the system level. The proposed effort has profound practical implications: the risk-aware framework will act as an effective safety barrier that prevents ML models from making over-confident predictions on cases that are too noisy, anomalous, outside the domain of the trained model, or associated with low model prediction reliability, thus facilitating safe and reliable adoption of ML systems in critical applications.
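To make the MRIP quantity concrete, the sketch below computes a naive empirical version of it: among held-out observations whose inputs lie within a distance delta of x, it reports the fraction whose absolute prediction error is at most eps. This only illustrates the definition and is not the project's two-stage framework; the function name `empirical_mrip`, the tolerances `eps` and `delta`, the Euclidean distance, and the toy data are all assumptions introduced for illustration.

```python
import math


def empirical_mrip(x, inputs, targets, predict, eps, delta):
    """Naive empirical MRIP at input x (illustrative, not the project's method).

    Returns the fraction of held-out points within Euclidean distance `delta`
    of `x` whose absolute prediction error is at most `eps`, or None when no
    held-out point is that close to `x`.
    """
    hits = 0
    total = 0
    for xi, yi in zip(inputs, targets):
        if math.dist(x, xi) <= delta:          # input varies within the prescribed range
            total += 1
            if abs(predict(xi) - yi) <= eps:   # error falls within the small interval
                hits += 1
    return hits / total if total else None


# Hypothetical toy setup: a 1-D model f(x) = 2x and four held-out observations.
def toy_model(x):
    return 2.0 * x[0]


held_out_inputs = [(0.0,), (0.1,), (0.2,), (5.0,)]
held_out_targets = [0.0, 0.25, 1.0, 10.0]

mrip_near_origin = empirical_mrip((0.1,), held_out_inputs, held_out_targets,
                                  toy_model, eps=0.1, delta=0.5)
mrip_far_point = empirical_mrip((5.0,), held_out_inputs, held_out_targets,
                                toy_model, eps=0.1, delta=0.5)
```

In this toy setup, two of the three neighbours of x = 0.1 are predicted within the tolerance, so the estimate there is lower than at x = 5.0, where the single neighbour is predicted exactly. A counting estimate like this becomes unreliable when few held-out points fall near x, which is one reason a learned estimator may be preferable.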

Research paper accepted by Decision Support Systems

The adoption of artificial intelligence (AI) and machine learning (ML) in risk-sensitive environments is still in its infancy because the field lacks a systematic framework for reasoning about risk, uncertainty, and their potentially catastrophic consequences. In high-impact applications, inference on risk and uncertainty will become decisive in the adoption of AI/ML systems. To this end, there is a pressing need for a consolidated understanding of the varied risks arising from AI/ML systems, and of how these risks and their side effects emerge and unfold in practice. In this paper, we provide a systematic and comprehensive overview of a broad array of inherent risks that can arise in AI/ML systems. These risks are grouped into two categories: data-level risk (e.g., data bias, dataset shift, out-of-domain data, and adversarial attacks) and model-level risk (e.g., model bias, misspecification, and uncertainty). In addition, we highlight the research needs for developing a holistic risk management framework dedicated to AI/ML systems to hedge the corresponding risks. Furthermore, we outline several research-related challenges and opportunities in the development of risk-aware AI/ML systems. Our research has the potential to significantly increase the credibility of deploying AI/ML models in high-stakes decision settings by facilitating safety assurance and protecting systems from unintended consequences.

Research paper accepted by IEEE Transactions on Intelligent Transportation Systems

Landing is generally cited as one of the riskiest phases of a flight, as indicated by its much higher accident rate compared with other flight phases. In this paper, we focus on the hard landing problem (defined as the touchdown vertical speed exceeding a predefined threshold) and build a probabilistic predictive model to forecast the aircraft’s vertical speed at touchdown using DASHlink data. Previous work has treated hard landing as a classification problem, in which the vertical speed is represented as a categorical variable based on a predefined threshold. In contrast, we build a machine learning model that numerically predicts the touchdown vertical speed during aircraft landing. Probabilistic forecasting is used to quantify the uncertainty in model prediction, which in turn supports risk-informed decision-making. A Bayesian neural network approach is leveraged to construct the predictive model. The overall methodology consists of five steps. First, a clustering method based on the minimum separation between different airports is developed to identify flights in the dataset that landed at the same airport. Second, because identifying the touchdown point is not straightforward, it is determined by comparing the vertical speed distributions derived from different candidate touchdown indicators. Third, a forward-backward filtering (filtfilt) approach is used to smooth the data without introducing phase lag. Fourth, a minimal-redundancy-maximal-relevance (mRMR) analysis is used to reduce the dimensionality of the input variables. Finally, a Bayesian recurrent neural network is trained to predict the touchdown vertical speed and quantify the uncertainty in the prediction. The model is validated using several flights in the test dataset, and computational results demonstrate the satisfactory performance of the proposed approach.
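The smoothing step can be illustrated with a minimal forward-backward filter. The sketch below runs a first-order low-pass filter over the signal, then runs it again over the reversed result, so that the delay added by the first pass is cancelled by the second. The abstract does not specify the paper's filter design, so the first-order filter, the smoothing factor `alpha`, and the function names are assumptions; library routines such as `scipy.signal.filtfilt` additionally pad the signal edges, which this sketch does not.

```python
def low_pass(samples, alpha):
    """One causal pass of a first-order low-pass (exponential) filter."""
    state = samples[0]  # start from the first sample to limit the edge transient
    smoothed = []
    for value in samples:
        state = alpha * value + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


def forward_backward(samples, alpha=0.3):
    """Zero-phase smoothing: filter forward, then filter the reversed output."""
    forward = low_pass(samples, alpha)
    backward = low_pass(forward[::-1], alpha)
    return backward[::-1]  # restore the original time order


# Hypothetical test signal: a unit impulse in the middle of a quiet record.
impulse = [0.0] * 50 + [1.0] + [0.0] * 50
one_pass = low_pass(impulse, 0.3)          # response trails the impulse (phase lag)
two_pass = forward_backward(impulse, 0.3)  # response is centred on the impulse
```

The single pass responds only after the impulse, whereas the forward-backward result is essentially symmetric around it. That symmetry is what "without introducing phase lag" means in practice: features such as the touchdown instant are not shifted in time by the smoothing.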