Agreement between subjective evaluations and a markerless AI-based gait analysis system during lungeing assessment in traditional racehorses.
Authors: Meistro F, Ralletti M V, Rinnovati R, Spadari A
Journal: Journal of equine veterinary science
Summary
Subjective lameness assessment during lungeing remains a cornerstone of pre-race clinical evaluation in racehorses, yet its reliability has long been questioned, particularly for mild or complex gait asymmetries. Meistro and colleagues compared traditional clinician-based scoring against a markerless artificial intelligence gait analysis system (OAI-MS) in 24 traditional racehorses evaluated at routine pre-race inspections. They measured inter-observer agreement among clinicians and retested the AI system after 10 days to assess its repeatability. Inter-observer agreement among experienced clinicians was poor to fair (κ = −0.20 to 0.36), whilst agreement between subjective scores and the OAI-MS ranged from slight to moderate (κ = 0.13–0.47). The AI system showed moderate short-term repeatability (κ = 0.43), and agreement was better for forelimbs than for hindlimbs. These findings support what many practitioners have suspected: human evaluation of lungeing gait varies between observers, particularly for subtle lameness. The OAI-MS offers a practical, objective complement to clinical judgment. It is most useful in borderline cases where clinician opinions diverge, or for documenting mild asymmetries for baseline comparison and monitoring, and it is not a replacement for experienced clinical assessment.
Practical Takeaways
- Subjective lameness assessments during lungeing have limited reliability, especially for mild cases; consider complementary AI gait analysis when clinical agreement is poor or asymmetry is subtle
- AI-based gait analysis (OAI-MS) shows promise as a repeatable, objective tool for routine pre-race inspections and clinical decision-making
- The technology appears more reliable for forelimb evaluation; use additional assessment methods when hindlimb asymmetry is suspected
Key Findings
- Inter-observer agreement for subjective gait evaluation was poor to fair (κ = −0.20 to 0.36)
- Agreement between subjective evaluations and AI-based gait analysis ranged from slight to moderate (κ = 0.13–0.47)
- The OAI-MS demonstrated moderate repeatability at a 10-day interval (κ = 0.43), supporting field usability
- Agreement was higher for forelimbs than hindlimbs, with most discrepancies being of low magnitude
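For readers less familiar with the κ values reported above, the following is a minimal sketch of how an unweighted Cohen's kappa can be computed for two raters and mapped to a verbal label. It is illustrative only: the summary does not state which kappa variant or interpretation scale the authors used, so the unweighted formula and the Landis and Koch bands are assumptions (although the bands are consistent with the labels quoted above). The function names and the example scores are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same subjects."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of subjects")
    n = len(rater_a)
    # Observed agreement: share of subjects given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportions, summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use a single category")
    return (observed - expected) / (1 - expected)


def landis_koch_label(kappa):
    """Verbal label using the Landis and Koch bands (assumed scale, see lead-in)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical scores for 10 horses (0 = sound, 1 = lame); not data from the study.
clinician = [0, 0, 1, 1, 0, 1, 0, 0, 1, 0]
ai_system = [0, 1, 1, 0, 0, 1, 0, 0, 1, 0]

kappa = cohen_kappa(clinician, ai_system)
print(round(kappa, 2), landis_koch_label(kappa))  # prints: 0.58 moderate
```

In this made-up example the two raters agree on 8 of 10 horses (80%), but because 52% agreement would be expected by chance alone, κ is only about 0.58. This is why the κ values above can look low even when raw agreement seems reasonable.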