Cytologic scoring of equine exercise-induced pulmonary hemorrhage: Performance of human experts and a deep learning-based algorithm.

Authors: Bertram Christof A, Marzahl Christian, Bartel Alexander, Stayt Jason, Bonsembiante Federico, Beeler-Marfisi Janet, Barton Ann K, Brocca Ginevra, Gelain Maria E, Gläsel Agnes, Preez Kelly du, Weiler Kristina, Weissenbacher-Lang Christiane, Breininger Katharina, Aubreville Marc, Maier Andreas, Klopfleisch Robert, Hill Jenny

Journal: Veterinary pathology

DOI: 10.1177/03009858221137582 PubMed: 36384369

Summary

Exercise-induced pulmonary hemorrhage (EIPH) is a significant concern in sport horses. It is typically diagnosed by scoring hemosiderin-laden macrophages in bronchoalveolar lavage fluid (BALF) samples using the total hemosiderin score (THS), a subjective method that clinicians have long suspected of introducing substantial inconsistency into diagnosis. Researchers analyzed cytological specimens from 52 equine cases, asking ten pathologists to independently assign THSs and comparing their performance against both a ground truth dataset (derived from standardized grading criteria) and a machine learning algorithm trained on the same data. Human observers demonstrated poor reproducibility, with significant interobserver variation driven primarily by inconsistent grading of the hemosiderin content of individual macrophages. However, 87.7% of this variance could be eliminated through standardized grading protocols. The deep learning algorithm substantially outperformed the human experts, achieving 92.3% diagnostic accuracy for EIPH classification (THS ≥75 versus <75) compared with 75.7% for the human experts, while maintaining equivalent correlation with direct chemical iron measurements. These findings suggest that incorporating algorithmic scoring into routine EIPH diagnosis would improve reliability for clinical decision-making. The authors recommend treating human-derived THSs between 40 and 110 as diagnostically uncertain pending algorithmic or repeat assessment.
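As a rough illustration of how a THS can be derived from per-cell grades, the sketch below assumes the commonly described scheme in which each macrophage receives a hemosiderin grade from 0 to 4 and the THS is the mean grade multiplied by 100 (range 0 to 400). The summary above does not spell out this formula, so the formula itself and the function name total_hemosiderin_score are assumptions made for illustration only.

```python
from collections import Counter


def total_hemosiderin_score(grades):
    """Return a THS for a list of per-macrophage grades.

    Assumption (not stated in the summary): grades are integers 0-4 and
    THS = mean grade x 100, giving a score between 0 and 400.
    """
    if not grades:
        raise ValueError("at least one graded macrophage is required")
    if any(g not in (0, 1, 2, 3, 4) for g in grades):
        raise ValueError("grades must be integers from 0 to 4")
    counts = Counter(grades)
    # Weight each grade by the number of cells that received it.
    weighted_sum = sum(grade * n for grade, n in counts.items())
    return 100.0 * weighted_sum / len(grades)


# Hypothetical slide: 300 macrophages with made-up grade counts.
example_grades = [0] * 150 + [1] * 90 + [2] * 45 + [3] * 15
print(total_hemosiderin_score(example_grades))  # 75.0
```

Under this assumed formula, a shift of one grade on a modest fraction of cells moves the score by tens of points, which is consistent with the finding that per-cell grading differences drive most of the interobserver variation.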

Practical Takeaways

•BALF cytology for EIPH diagnosis using traditional human scoring is unreliable: human-assigned scores between 40 and 110 lack reproducibility and should be interpreted with caution
•Deep learning-based scoring systems offer significantly better accuracy (92% vs 76%) and consistency for EIPH diagnosis and should be considered for routine clinical use
•Implementing standardized grading criteria and algorithmic support can substantially improve the diagnostic reliability of hemosiderin scoring in respiratory cases

Key Findings

•Human annotators showed significant interobserver variability in total hemosiderin score (THS) with only 75.7% diagnostic accuracy for EIPH detection, primarily due to systematic grading differences between observers
•Deep learning algorithm achieved 92.3% diagnostic accuracy for EIPH diagnosis compared to ground truth, with high consistency in hemosiderin grade assignment
•Standardized grading based on ground truth could reduce measurement variance by 87.7%, and a diagnostic uncertainty interval of 40-110 THS is proposed for human expert assessment
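The decision rule described in these findings can be sketched in a few lines. The cutoff of 75 and the 40 to 110 uncertainty interval come from the findings above. The function name interpret_ths, the scored_by_human flag, and the choice to treat both interval endpoints as inclusive are assumptions made for illustration.

```python
EIPH_CUTOFF = 75                         # THS >= 75 is classified as EIPH
UNCERTAIN_LOW, UNCERTAIN_HIGH = 40, 110  # interval proposed for human scoring


def interpret_ths(ths, scored_by_human=True):
    """Return (diagnosis, uncertain) for a total hemosiderin score.

    The uncertainty flag is applied only to human-assigned scores;
    inclusive interval endpoints are an assumption of this sketch.
    """
    diagnosis = "EIPH" if ths >= EIPH_CUTOFF else "no EIPH"
    uncertain = scored_by_human and UNCERTAIN_LOW <= ths <= UNCERTAIN_HIGH
    return diagnosis, uncertain


print(interpret_ths(80))   # ('EIPH', True)
print(interpret_ths(150))  # ('EIPH', False)
print(interpret_ths(30))   # ('no EIPH', False)
```

A score flagged as uncertain would be a candidate for repeat or algorithmic assessment rather than a final diagnosis.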

Conditions Studied

Exercise-induced pulmonary hemorrhage (EIPH)
