Predicting Procedure Step Performance From Operator and Text Features: A Critical First Step Toward Machine Learning-Driven Procedure Design. Academic Article


  • OBJECTIVE: The goal of this study is to assess machine learning for predicting procedure performance from operator and procedure characteristics. BACKGROUND: Procedures are vital for the performance and safety of high-risk industries. Current procedure design guidelines are insufficient because they rely on subjective assessments and qualitative analyses that struggle to integrate and quantify the diversity of factors that influence procedure performance. METHOD: We used data from a 25-participant study with four procedures, conducted on a high-fidelity oil extraction simulation, to develop logistic regression (LR), random forest (RF), and decision tree (DT) algorithms that predict procedure step performance from operator, step, readability, and natural language processing-based features. Features were filtered using the Boruta approach. The algorithms were trained and optimized with repeated 10-fold cross-validation. After training, inference was performed using variable importance and partial dependence plots. RESULTS: The RF, DT, and LR algorithms with all features had an area under the receiver operating characteristic curve (AUC) of 0.78, 0.77, and 0.75, respectively, and significantly outperformed the LR with only operator features (LROP), which had an AUC of 0.61. The most important features were experience, familiarity, total words, and character-based metrics. The partial dependence plots showed that steps with fewer words, abbreviations, and characters were correlated with correct step performance. CONCLUSION: Machine learning algorithms are a promising approach for predicting step-level procedure performance, with acknowledged limitations on interpolating to nonobserved data, and may help guide procedure design after validation with additional data on further tasks. APPLICATION: After validation, the inferences from these models can be used to generate procedure design alternatives.
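
The evaluation described in the abstract rests on two pieces: the AUC metric and repeated 10-fold cross-validation. The following is a minimal sketch of both in plain Python, not the authors' implementation; the abstract does not name the software used. The function names `roc_auc` and `repeated_kfold`, the number of repeats, and the random seed are all assumptions made for illustration.

```python
import random


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    `labels` holds 0/1 outcomes (e.g. 1 = step performed correctly).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def repeated_kfold(n_samples, k=10, repeats=5, seed=0):
    """Yield (train_idx, test_idx) pairs for `repeats` reshuffled k-fold splits.

    `repeats` and `seed` are illustrative defaults (assumptions), not values
    reported in the abstract.
    """
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Deal the shuffled indices into k roughly equal folds.
        folds = [idx[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test
```

In use, a model would be fit on each `train` index set, scored on the matching `test` set with `roc_auc`, and the resulting AUC values averaged across all folds and repeats. An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.61 to 0.78 values sit.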

altmetric score

  • 5

author list (cited authors)

  • McDonald, A. D., Ade, N., & Peres, S. C.

citation count

  • 0

publication date

  • September 2020