Fairness and generalisability in deep learning of retinopathy of prematurity screening algorithms: a literature review
Author(s)
Nakayama, Luis Filipe; Mitchell, William Greig; Ribeiro, Lucas Zago; Dychiao, Robyn Gayle; Phanphruk, Warachaya; Celi, Leo Anthony; Kalua, Khumbo; Santiago, Alvina Pauline Dy; Regatieri, Caio Vinicius Saito; Moraes, Nilva Simeren Bueno; ...
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
Background
Retinopathy of prematurity (ROP) is a vasoproliferative disease responsible for more than 30 000 blind children worldwide. Its diagnosis and treatment are challenging due to the lack of specialists, divergent diagnostic concordance and variation in classification standards. While artificial intelligence (AI) can address the shortage of professionals and provide more cost-effective management, its development needs fairness, generalisability and bias controls prior to deployment to avoid producing harmful, unpredictable results. This review aims to compare the characteristics, fairness and generalisability efforts of AI and ROP studies.
Methods
Our review yielded 220 articles, of which 18 were included after full-text assessment. The articles were classified into ROP severity grading, plus detection, detection of treatment-requiring ROP, ROP prediction and detection of retinal zones.
Results
All the articles' authors and included patients are from middle-income and high-income countries, with no representation from low-income countries, South America, Australia or Africa.
Code is available in two articles and on request in one, while data are not available in any article. 88.9% of the studies use the same retinal camera. Patients' sex was described in two articles, but none applied a bias control in their models.
Conclusion
The reviewed articles included 180 228 images and reported good metrics, but fairness, generalisability and bias control remained limited. Reproducibility is also a critical limitation, with few articles sharing code and none sharing data. Fair and generalisable ROP and AI studies are needed that include diverse datasets, data and code sharing, collaborative research, and bias control to avoid unpredictable and harmful deployments.
Date issued
2023-08
Department
Harvard--MIT Program in Health Sciences and Technology. Laboratory for Computational Physiology
Publisher
BMJ
Citation
Nakayama LF, Mitchell WG, Ribeiro LZ, et al. Fairness and generalisability in deep learning of retinopathy of prematurity screening algorithms: a literature review. BMJ Open Ophthalmology 2023;8:e001216.
Version: Final published version
ISSN
2397-3269
Keywords
Ophthalmology