Intra- and interobserver reliability estimates for identification and grading of upper respiratory tract abnormalities recorded in horses at rest and during overground endoscopy.
Authors: McGivney C L, Sweeney J, David F, O'Leary J M, Hill E W, Katz L M
Journal: Equine veterinary journal
Summary
McGivney et al. (2017) investigated how reliably veterinarians can identify and grade upper respiratory tract (URT) abnormalities in horses using both resting endoscopy and overground endoscopy (OGE). Previous reliability studies had focused predominantly on single disorders or on resting examinations alone. Four blinded raters independently evaluated and re-evaluated endoscopic videos from 43 Thoroughbreds with various URT conditions. Findings were analysed using weighted Cohen's kappa and Krippendorff's alpha statistics to determine both intraobserver consistency (same rater, repeated assessment) and interobserver agreement (between different raters). Individual observers showed excellent consistency when grading arytenoid symmetry during exercise and epiglottic entrapment, but only moderate reliability for vocal fold collapse, nasopharyngeal collapse and resting epiglottic grading. Agreement between different observers was substantially weaker for several conditions, particularly vocal fold collapse, epiglottic retroversion and nasopharyngeal collapse, where reliability was only fair to poor. For practitioners, these findings indicate that whilst a single veterinarian will generally grade the same URT abnormality consistently on repeat examination, variation between clinicians is large enough that diagnosis and prognosis should ideally involve standardised reference images or a second opinion, particularly for conditions showing poor interobserver agreement. This is especially relevant when diagnosing performance-limiting conditions or assessing suitability for racing.
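As an illustration of the weighted Cohen's kappa statistic named in the summary, the following minimal Python sketch compares two sets of ordinal grades. It assumes linear disagreement weights by default; the summary does not state which weighting scheme the authors used. The function name `weighted_kappa` and the grades in the usage lines are hypothetical and are not study data.

```python
def weighted_kappa(ratings_a, ratings_b, categories, weights="linear"):
    """Weighted Cohen's kappa for two lists of ordinal grades.

    `categories` lists every possible grade in ascending order.
    `weights` is "linear" or "quadratic" (an assumption; the summary
    does not say which scheme the study used).
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {grade: i for i, grade in enumerate(categories)}
    n = len(ratings_a)
    power = 1 if weights == "linear" else 2

    # Observed joint proportions: rows are the first rating, columns the second.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(ratings_a, ratings_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each set of ratings.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    # Weighted disagreement: observed, and expected by chance from the marginals.
    observed_disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(k):
        for j in range(k):
            w = (abs(i - j) / (k - 1)) ** power
            observed_disagreement += w * observed[i][j]
            expected_disagreement += w * row[i] * col[j]

    if expected_disagreement == 0.0:
        # Both lists use one identical grade throughout: kappa is undefined.
        return float("nan")
    return 1.0 - observed_disagreement / expected_disagreement


# Hypothetical grades (1-4) given by one rater to the same six videos on two occasions.
first_pass = [1, 2, 2, 3, 4, 1]
second_pass = [1, 2, 3, 3, 4, 2]
kappa = weighted_kappa(first_pass, second_pass, categories=[1, 2, 3, 4])
print(f"weighted kappa = {kappa:.2f}")
```

Passing the same rater's two passes gives an intraobserver estimate; passing two different raters' grades for the same videos gives a pairwise interobserver estimate. With more than two raters, a multi-rater statistic such as Krippendorff's alpha is the usual choice.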
Practical Takeaways
- When submitting endoscopic videos for second opinions on equine upper respiratory abnormalities, be aware that different veterinarians may grade conditions like vocal fold collapse and nasopharyngeal collapse differently, so consensus review by multiple specialists may improve diagnostic confidence.
- Video endoscopy findings for conditions with only fair to poor interobserver agreement (e.g., epiglottic retroversion, pharyngeal mucus) should be interpreted cautiously and correlated with clinical signs rather than treated as definitive diagnoses.
- Individual practitioners should establish their own grading consistency for upper respiratory conditions through regular re-evaluation of reference cases, as intraobserver reliability is generally good even when interobserver agreement is limited.
Key Findings
- Intraobserver agreement was perfect to nearly perfect for arytenoid symmetry at exercise, epiglottic entrapment and epiglottic retroversion, but only moderate for vocal fold collapse, nasopharyngeal collapse and epiglottic grade at rest.
- Interobserver agreement was substantial for arytenoid symmetry at exercise and palatal dysfunction but only fair to poor for vocal fold collapse, nasopharyngeal collapse, epiglottic retroversion and epiglottic grade at rest.
- Significant disparity existed between observers for several upper respiratory tract conditions, indicating reliability limitations for interobserver assessment despite good intraobserver consistency.
- Assessment of overground endoscopy videos revealed reliability challenges comparable to resting endoscopic evaluation for multiple upper respiratory tract abnormalities.
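
The descriptors used in the findings above ("poor", "fair", "moderate", "substantial", "nearly perfect") resemble the widely used Landis and Koch (1977) benchmarks for agreement coefficients. This summary does not state which scale the authors applied, so the mapping below is an assumption offered for orientation only, and `agreement_label` is a hypothetical helper name.

```python
def agreement_label(coefficient):
    """Map an agreement coefficient to a Landis and Koch (1977) descriptor.

    Assumption: the study's descriptors follow this scale; the summary
    itself does not confirm it.
    """
    if not -1.0 <= coefficient <= 1.0:
        raise ValueError("coefficient must lie between -1 and 1")
    if coefficient < 0.0:
        return "poor"
    # Upper bound of each band, in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if coefficient <= upper:
            return label


# Illustrative values only, not results from the study.
for value in (0.15, 0.35, 0.55, 0.75, 0.95):
    print(value, agreement_label(value))
```

These cut-offs are conventions rather than statistical thresholds, so a coefficient near a band boundary should not be over-interpreted.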