Repeatability of subjective evaluation of lameness in horses.
Authors: Keegan K G, Dent E V, Wilson D A, Janicek J, Kramer J, Lacarrubba A, Walsh D M, Cassells M W, Esther T M, Schiltz P, Frees K E, Wilhite C L, Clark J M, Pollitt C C, Shaw R, Norris T
Journal: Equine veterinary journal
Summary
# Editorial Summary: Repeatability of Subjective Evaluation of Lameness in Horses

Even experienced equine practitioners struggle to reach consistent conclusions when evaluating lameness subjectively, particularly in cases of mild dysfunction. Keegan and colleagues assessed agreement among clinicians across 131 horses, each evaluated by 2–5 clinicians (averaging 18.7 years' experience each) using the AAEP lameness scale, both during straight-line trotting alone and following comprehensive lameness examination. When evaluators were asked simply whether a limb was lame or sound, regardless of severity grade, they agreed only 76.6% of the time after trot-only observation (kappa = 0.44) and 72.9% after full evaluation (kappa = 0.45). Critically, agreement deteriorated markedly with mild lameness (mean AAEP score ≤1.5): clinicians concurred just 61.9% of the time (kappa = 0.23), compared with 93.1% agreement (kappa = 0.86) for moderate-to-severe cases (score >1.5). Forelimb lameness showed marginally better inter-observer repeatability than hindlimb lameness, and when the task required identifying the worst-affected limb, agreement fell to 51.6% (kappa = 0.37). For practitioners relying on visual assessment in clinical practice, these findings underscore the unreliability of subjective judgment, particularly when detecting early or subtle gait abnormalities. Multimodal assessment incorporating force-plate analysis, inertial measurement units, or diagnostic imaging should be considered essential alongside traditional visual evaluation, especially where mild lameness is suspected but clinical consensus is lacking.
Practical Takeaways
- Subjective lameness evaluation by eye alone is unreliable for detecting mild lameness; consider supplementing clinical assessment with objective diagnostic tools when mild lameness is suspected
- Moderate-to-severe lameness can be identified fairly reliably by visual inspection, but agreement breaks down significantly for subtle gait abnormalities
- Multiple clinicians examining the same horse may reach different conclusions, particularly regarding which limb is affected; this has implications for case discussion, second opinions, and treatment planning
Key Findings
- Clinicians agreed on lameness presence/absence 76.6% of the time using trot-only evaluation (kappa = 0.44) and 72.9% after full evaluation (kappa = 0.45)
- Agreement on forelimb lameness was slightly higher than on hindlimb lameness across all evaluation methods
- For moderate-to-severe lameness (mean AAEP score >1.5), clinician agreement was 93.1% (kappa = 0.86), but for mild lameness (≤1.5) it was only 61.9% (kappa = 0.23)
- When asked to identify the most severely lame limb, clinicians agreed only 51.6% of the time (kappa = 0.37)
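
The findings above report both percent agreement and kappa. The two can diverge because kappa discounts the agreement expected by chance. The sketch below illustrates this with two-rater Cohen's kappa on made-up lame/sound calls. It is an illustration under stated assumptions, not the study's method: the summary does not say which kappa variant was used for 2–5 clinicians, and the function names and data are hypothetical.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of horses on which two raters gave the same call."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Two-rater Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)  # observed agreement
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same category
    # if each rated independently at their own marginal rates.
    categories = set(count_a) | set(count_b)
    p_e = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical data (NOT from the study): 10 horses, two clinicians.
# Balanced sample: half the horses are called lame by each clinician.
balanced_a = ["lame"] * 5 + ["sound"] * 5
balanced_b = ["lame"] * 4 + ["sound", "lame"] + ["sound"] * 4

# Skewed sample: each clinician calls nine of ten horses lame.
skewed_a = ["lame"] * 9 + ["sound"]
skewed_b = ["lame"] * 8 + ["sound", "lame"]

for label, a, b in [("balanced", balanced_a, balanced_b),
                    ("skewed", skewed_a, skewed_b)]:
    print(label,
          round(percent_agreement(a, b), 2),
          round(cohens_kappa(a, b), 2))
```

In both toy samples the clinicians agree on 8 of 10 horses, yet kappa is 0.60 for the balanced sample and about -0.11 for the skewed one, because most of the skewed agreement would be expected by chance. This is why percent agreement and kappa are best read together.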