Reporting of Model Performance and Statistical Methods in Studies That Use Machine Learning to Develop Clinical Prediction Models: Protocol for a Systematic Review.

Colin George Wyllie Weaver

; Robert B Basmadjian

; Tyler Williamson

; Kerry McBrien

; Tolu Sajobi

; Devon Boyne

; Mohamed Yusuf

; Paul Everett Ronksley

; (2022) Reporting of Model Performance and Statistical Methods in Studies That Use Machine Learning to Develop Clinical Prediction Models: Protocol for a Systematic Review. JMIR research protocols, 11 (3). e30956-. ISSN 1929-0748 DOI: 10.2196/30956

Copy

BACKGROUND: With the growing excitement of the potential benefits of using machine learning and artificial intelligence in medicine, the number of published clinical prediction models that use these approaches has increased. However, there is evidence (albeit limited) that suggests that the reporting of machine learning-specific aspects in these studies is poor. Further, there are no reviews assessing the reporting quality or broadly accepted reporting guidelines for these aspects. OBJECTIVE: This paper presents the protocol for a systematic review that will assess the reporting quality of machine learning-specific aspects in studies that use machine learning to develop clinical prediction models. METHODS: We will include studies that use a supervised machine learning algorithm to develop a prediction model for use in clinical practice (ie, for diagnosis or prognosis of a condition or identification of candidates for health care interventions). We will search MEDLINE for studies published in 2019, pseudorandomly sort the records, and screen until we obtain 100 studies that meet our inclusion criteria. We will assess reporting quality with a novel checklist developed in parallel with this review, which includes content derived from existing reporting guidelines, textbooks, and consultations with experts. The checklist will cover 4 key areas where the reporting of machine learning studies is unique: modelling steps (order and data used for each step), model performance (eg, reporting the performance of each model compared), statistical methods (eg, describing the tuning approach), and presentation of models (eg, specifying the predictors that contributed to the final model). RESULTS: We completed data analysis in August 2021 and are writing the manuscript. We expect to submit the results to a peer-reviewed journal in early 2022. CONCLUSIONS: This review will contribute to more standardized and complete reporting in the field by identifying areas where reporting is poor and can be improved. TRIAL REGISTRATION: PROSPERO International Prospective Register of Systematic Reviews CRD42020206167; https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=206167. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/30956.