Thursday, March 29, 2012

Brief Thoughts: A critical examination of "Foot Strike and Injury Rates in Endurance Runners: a retrospective study"

Two very different footstrikes captured on high-speed video

This post is a bit behind the times, but I thought I'd get it out there regardless. It's a short analysis of the latest research from Daniel Lieberman's lab at Harvard University. Lieberman is, of course, famous for his paper on foot strike in habitually shod and unshod runners, which made the cover of Nature and sparked a fierce controversy in the world of biomechanics. This latest paper, which is available online as an "epub" (it has been reviewed and accepted, but has yet to appear in print), turns its attention to foot strike styles and injury rates on the Harvard track and cross country team. Its title is "Foot Strike and Injury Rates in Endurance Runners: a retrospective study," and I'll reproduce the abstract below for your convenience.
Purpose: This retrospective study tests if runners who habitually forefoot strike have different rates of injury than runners who habitually rearfoot strike.
Methods: We measured the strike characteristics of middle and long distance runners from a collegiate cross country team and quantified their history of injury, including the incidence and rate of specific injuries, the severity of each injury, and the rate of mild, moderate and severe injuries per mile run.
Results: Of the 52 runners studied, 36 (69%) primarily used a rearfoot strike and 16 (31%) primarily used a forefoot strike. Approximately 74% of runners experienced a moderate or severe injury each year, but those who habitually rearfoot strike had approximately twice the rate of repetitive stress injuries than individuals who habitually forefoot strike. Traumatic injury rates were not significantly different between the two groups. A generalized linear model showed that strike type, sex, race distance, and average miles per week each correlate significantly (p<0.01) with repetitive injury rates.
Conclusions: Competitive cross country runners on a college team incur high injury rates, but runners who habitually rearfoot strike have significantly higher rates of repetitive stress injury than those who mostly forefoot strike. This study does not test the causal bases for this general difference. One hypothesis, which requires further research, is that the absence of a marked impact peak in the ground reaction force during a forefoot strike compared to a rearfoot strike may contribute to lower rates of injuries in habitual forefoot strikers.
The paper caused quite a stir when it was first published, since its message was very quickly distorted to "heelstriking causes injury," prompting strong rebuke from other biomechanics researchers and doctors. For all the huffing and puffing about the controversy, though, the tone of the study is really quite tame for most of the paper. It's just the media and the minimalist shoe companies that are blowing this way out of proportion.

Anyways, getting back to the paper itself: the introduction actually does quite a nice job of laying out the current state of affairs when it comes to running and injury. A few fairly obvious factors (age, BMI, running experience) are known to correlate with injury rates, but efforts to reduce injury rates globally have not been particularly successful. I am a bit irked at Daoud et al. for being too dismissive of the success of orthotics in reducing injury (e.g. this study), but outside of that I did not see too many issues with the introduction. Daoud et al. bring up the issue of impact forces and rely heavily on the work of Irene Davis (who has shown that runners with a history of specific injuries like plantar fasciitis and tibial stress fractures have high impact loading rates), but they also point out that Benno Nigg has not found impact loading rates to be related to injury risk. They omit the fact that Nigg found loading rates to be inversely correlated with injury risk, however. In a few places, the paper simply says "impact" when it really ought to say "impact loading rates," since it's only these loading rates, not the magnitude of the impact itself, that have been associated with injury risk (again, mostly by Irene Davis' group). They also briefly bring up the issue of joint torques and possible differences between forefoot and rearfoot striking, citing Casey Kerrigan's work in that area, which I posted on a few months ago.

The experimental design is more problematic. To their credit, the researchers collected much more data than I had expected: all runners logged every workout on an online training log, so they were able to track total number of miles run through the duration of the study. As the study says:
Each athlete was required to record daily all running and cross training information including distance run, times, and comments on performance throughout the 9-month athletic season. The total number of running days, total miles run, total minutes run, average miles per week, and average running pace were computed for each subject while on the team
For reasons that are not clear, some of the subjects had their footstrike categorized on a treadmill and others had it done on a track. While it does not seem like there would be any major differences, it's just not that hard to film a runner on a track, so I don't understand why they weren't all filmed doing overground running. In any case, over the duration of the study there were 16 forefoot strikers, 36 rearfoot strikers, and zero midfoot strikers. This alone should cause us to think twice about the sample in this study, since from the two largest studies on footstrike patterns, we know that midfoot striking is three to seventeen times more common than forefoot striking and that true forefoot strikers make up only ~1% of the running population, at least among elite, sub-elite, and recreational road runners (Hasegawa et al., Larson et al.).

To be fair, you have to keep in mind that the general running population is of a very different makeup than the highly competitive college running population.  But last year, I used high-speed video to categorize 18 male runners on my own college cross country team; these were my results:

rear-foot strike: 14 (78%)
mid-foot strike: 3 (17%)
fore-foot strike: 1 (6%)

While this is NOWHERE near enough data to make any "real" conclusions, we certainly did not have 30% forefoot strikers. Even among competitive runners, I'd wager the natural proportion of midfoot PLUS forefoot strikers at normal training paces is probably ~20%. In my own investigation, footstrike patterns were fairly stable from 8:00/mi down to about 5:30/mi...once it gets faster, all bets are off. Steve Magness cites some unpublished studies (not sure if they ever ended up getting published) on footstrike patterns in 1500m and 800m races, which found that, predictably, a much higher proportion of runners forefoot strike, though interestingly, even in fast 800m races, a substantial portion of the field rear-foot strikes. But the point of this was that, even in my puny sample of 18 DIII runners, we had three midfoot strikers, so I'm puzzled as to why there was such a dearth of midfoot strikers (and excess of forefoot strikers) at Harvard.
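To put a rough number on how little a sample of 18 can say, here is a minimal sketch that computes an approximate 95% Wilson score interval for a proportion. The choice of the Wilson interval and the function name `wilson_interval` are my own assumptions for illustration; nothing like this appears in the paper.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width

# Forefoot-striker counts quoted in this post: 1 of 18 on my team, 16 of 52 at Harvard.
for label, k, n in [("my team", 1, 18), ("Daoud et al.", 16, 52)]:
    low, high = wilson_interval(k, n)
    print(f"{label}: {k}/{n} forefoot strikers, interval {low:.1%} to {high:.1%}")
```

Under these assumptions, the interval for 1 of 18 comes out very wide, which is the caveat above in numerical form: a sample this small cannot pin down the true proportion of forefoot strikers.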

Daoud et al. also note that some runners switched striking styles based on their running speed (the three speeds they were evaluated at were "easy," "intermediate," and 5km race pace). Later in the analysis they use some statistical tricks to show that removing these runners from the study does not alter its conclusions. They also constructed a pretty complicated statistical model to evaluate injuries based on their severity and to control for distance covered in training, and formulated a list of "predicted" FFS and RFS injuries. This is also a flaw in my eyes: forming hypotheses is great, but you can't pick out what injuries you think rearfoot strikers will get beforehand and then go looking for them. It is completely arbitrary and artificial, and likely exploits artifacts in the data by narrowing the effective pool of subjects even further. Daoud et al. make a big deal in the conclusion about rearfoot striking being a significant factor associated with "predicted" rearfoot striking injuries, but make no mention of statistically significant or near-significant factors that seem spurious (like duration in the study, my next point).

My third major grievance with this study is that it does not give equal weight to all of its subjects. For example, some runners were in the study all four years of college, while others graduated before the study ended. Daoud et al. claim this is not a bias, since their generalized linear model showed that duration in the study was not associated with the overall rate of injury (P = 0.565). However, when injuries are broken down into the FFS and RFS injury categories, the p-value drops to P = 0.0333 and 0.0377! There is obviously some sort of double standard here: why are significant differences in "predicted RFS injuries" important to talk about, but study design flaws that show up when that same category is used are not?

All injury rates are reported in injuries per 10,000 miles of running. I'm undecided on what I think of this: on one hand, it seems to control for the obvious fact that mileage is associated with injury risk; on the other, it seems like it's the same problem of unequal subject treatment again. Every college team probably has one "indestructible" high mileage guy who runs 100+ miles a week year-round and rarely or never gets injured. What if he happened to be a forefoot striker? Especially with only 16 forefoot strikers in the study, that would totally throw off the data analysis.
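To make the "indestructible guy" worry concrete, here is a small sketch with entirely made-up numbers. It assumes the group rate is computed by pooling injuries and miles across runners, which may not be how Daoud et al. actually aggregated their data; the mileages, injury counts, and the name `rate_per_10k` are all hypothetical.

```python
def rate_per_10k(runners):
    """Pooled injuries per 10,000 miles for a list of (miles, injuries) pairs."""
    total_miles = sum(miles for miles, _ in runners)
    total_injuries = sum(injuries for _, injuries in runners)
    return 10_000 * total_injuries / total_miles

# Hypothetical group: 15 forefoot strikers, each with 1,500 miles and 1 injury.
typical = [(1500, 1)] * 15
# Hypothetical 16th runner: very high mileage, never injured.
outlier = [(4500, 0)]

print(f"without outlier: {rate_per_10k(typical):.2f} injuries per 10,000 miles")
print(f"with outlier:    {rate_per_10k(typical + outlier):.2f} injuries per 10,000 miles")
```

In this toy example, one runner out of sixteen lowers the group's pooled rate by about a sixth without anyone else's injury history changing.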

Daoud et al. also use some advanced statistics to show that (as we would predict) women get more injuries than men and that higher mileage is associated with more injuries. They also find that "predicted RFS injuries" are indeed more common in rearfoot strikers, but that "predicted FFS injuries" are no more common in forefoot strikers. My best guess as to why they grouped injuries this way is that they did not have enough data to analyze specific injuries individually, even though they saw non-significant trends among clusters of injuries. As I mentioned earlier, I think this is a misleading procedure, since they have little previous evidence to suspect that some of the injuries they classify as RFS-associated are in fact more common in rearfoot strikers. The only statistically significant differences among specific injuries were the following:

*Achilles tendinopathy was more common among FFS females
*Plantar fasciitis was more common among RFS females
*Hip pain was more common among pooled (male and female) rearfoot strikers
*Traumatic joint sprain was more common among FFS females
*"Repetitive joint sprain" (not sure what exactly that is) was more common among male and overall RFS
*Sacral stress fractures were more common among RFS males.

I think Lieberman's lab is losing sight of the statistical forest on account of the trees. I suspect many of these 'statistically significant' injury rate differences are statistical artifacts due to extremely small sample sizes, so Daoud et al. instead grouped injuries into larger clusters—otherwise they'd have to account for why forefoot striking causes more joint sprains. A sacral stress fracture, particularly in men, is an extremely rare injury, and I suspect there was only one or perhaps two in the entire study. Despite that, we're supposed to accept that this is a "statistically significant" difference because the p-values say so?

Maybe it's because I haven't taken a formal statistics class since the tenth grade, but I don't buy this statistical glitz and glamor. I don't care WHAT the p-values say; at the end of the day, this study still only has sixteen forefoot strikers in it. There are going to be statistical artifacts when you start parsing that already-too-small group into smaller pieces; that's self-evident. While the results of this study ARE interesting and DO merit further study, I'm seriously troubled by the methodological flaws. I don't think we should throw it out as worthless, though. See the end of my post for some thoughts on an improved investigation.
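One standard way to see why slicing a small data set into many categories produces "significant" results by chance is the family-wise error rate. The sketch below assumes independent tests at a 0.05 threshold, and the numbers of comparisons are hypothetical; I have not counted how many injury-by-sex comparisons the paper actually ran.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Hypothetical numbers of comparisons, for illustration only.
for n_tests in (1, 5, 10, 20, 40):
    chance = familywise_error(0.05, n_tests)
    print(f"{n_tests:>2} tests: {chance:.0%} chance of at least one spurious p < 0.05")
```

With twenty independent comparisons and no real effects at all, the chance of at least one "significant" result is already well over half.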

Getting back to the paper itself, the discussion is, like the introduction, really quite good until the end. Daoud et al. are careful to caution readers not to read too far into the data, and warn that there are a lot of unanswered questions. But unfortunately it seems their thinking-caps fell off while composing the last few paragraphs, as the paper takes quite an alarming, unsubstantiated turn in the final page or two. Read for yourself:
The results presented here suggest that a biomechanically proximate way to lower injury rates is to make runners more aware of the importance of running form, including ways to lessen impact forces. There is no question that there are plenty of shod heel strikers who avoid injury, and we need to find out if these runners generate lower impact forces than those with higher injury rates or are running differently in some other way. However, most FFS runners, shod and unshod, avoid marked impact peaks in terms of vertical GRFs and they generally incur lower moments in the knee and perhaps in other joints. A FFS style of running is also hypothesized to be more natural from an evolutionary perspective because barefoot and minimally shod runners tend to use FFS gaits, most likely since RFS landings are painful without a cushioned elevated heel (21). Because hominins have been running barefoot for millions of years (5), often on very hard and rough substrates, it is reasonable to conclude that FFS styles of running used to be more common. No one knows when shoes were invented, but all athletic footwear until very recently were either sandals or moccasins and thus minimal by today’s standards. Even though modern running shoes make RFS running comfortable, the human body may be less well adapted to repeated RFS than FFS landings.

The hypothesis that FFS running is more natural and less injurious than RFS running requires further testing with a controlled prospective study. In the meantime, what are the implications of this study for runners who are injured, or who want to prevent injury? One point to consider is that many runners who RFS in shoes do not get injured or get injured rarely even when they train at high intensity. We predict that these runners have better form than those who do get injured: they probably land with less overstride and more compliant limbs that generate less severe impact loading and generate less extreme joint moments. They may also have fewer anatomical abnormalities that predispose them to injury than other RFS runners who do get injured. These predictions are supported by several recent studies (9, 25, 26, 30, 31), and they emphasize the hypothesis that running style is probably a more important determinant of injury than footwear (with the caveat that footwear probably influences one’s running style).

Another point to consider is that this study did not test for the effect of transitioning from RFS to FFS running, and it is unclear and unknown if runners who switch from RFS to FFS strikes will have lower injury rates. FFS running requires stronger calf muscles because eccentric or isometric contractions of the triceps surae are necessary to control ankle dorsiflexion at the beginning of stance, and shod FFS runners also generate higher joint moments in the ankle (41). Runners who transition to FFS running may be more likely to suffer from Achilles tendinopathies and calf muscle strains. FFS running also requires stronger foot muscles, so even though impact forces generated by FFS landings are low, runners who transition are perhaps more likely to experience forefoot pain or stress fractures. They may also experience plantar fasciitis if their foot muscles are weak. However, these injuries are treatable, and they may be preventable if runners transition, slowly, gradually, and with good overall form.

In conclusion, there is much research to do, and the above results need to be replicated and more fully explored. Regardless, the last few years have seen an exciting surge of research on the biomechanics of running injuries, partly inspired by interest in barefoot running. All runners are at risk of injury, and there are no magic bullets to prevent injuries, but the results of this study support those of other recent analyses indicating that runners and researchers alike may profit from paying more attention to how people run than what is on their feet.

Hold on, stop, back up.

1) This study had nothing to do with impact forces—does forefoot striking really make a runner "more aware" of them?

2) This study offers zero proof that CHANGES to running form, or indeed even being more "conscious" of it, have any effect on injuries. All of these runners were presumably natural forefoot strikers, not heelstrikers who became forefoot strikers. Even IF we accept that forefoot strikers have a lower incidence of injury, it is by NO means a foregone conclusion that heelstrikers who SWITCH to forefoot striking will have a lower injury risk. Indeed, they may have greater injury rates!

3) The naturalistic fallacy returns, having briefly poked its head up in the introduction but thankfully quickly being forgotten about. Would you ever read a phrase like "Because hominins have been living with cancer for millions of years (5), often to very long lifespans, it is reasonable to conclude that late stage cancer used to be more common [and is thus desirable]" in The Lancet? I would certainly hope not.

4) Footstrike is a very small part of running form as a whole. To suddenly jump to running form as the new "big thing" for minimalist runners is a stretch. I imagine proponents of leg stiffness theory (currently outside my expertise, but I'm working on it!) will have a few words about "more compliant limbs" as well.

Despite these setbacks, this study uncovered some interesting information and merits a lot of future work. I've been trying to think of ways to remedy the flaws in this study; let me know what you think of my proposed reworking.

If I were to replicate this study, I'd do it with high school runners rather than college runners, simply because it's a whole lot easier to find a few thousand of them. I would forget about trying to control for mileage, BMI, training pace, etc., and let large sample sizes take care of that. Either that or build a control group that is matched for those factors. If the true prevalence of natural forefoot strikers (among runners who wear "normal" shoes) is 1-2%, you'd need somewhere in the neighborhood of 5,000 to 10,000 runners. This is daunting, but the upside is that actually collecting footstrike data does not take all that long. With a car, a high-speed camera, and photogates, you could collect data from a few dozen high school teams in a major metropolitan area for a few years and you'd easily get a big enough sample size. You'd have to find an efficient way to document injuries, and it would not be as extensive a system as the one Daoud et al. came up with, but the bigger sample size would easily counteract any issues associated with this.
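The arithmetic behind that 5,000 to 10,000 figure is simple enough to sketch. The prevalence values are the 1-2% guess from above, and the helper name `expected_forefoot_strikers` is just for illustration.

```python
def expected_forefoot_strikers(n_runners, prevalence):
    """Expected number of natural forefoot strikers in a sample."""
    return n_runners * prevalence

# Prevalence guesses (1-2%) and sample sizes taken from the paragraph above.
for n_runners in (5_000, 10_000):
    for prevalence in (0.01, 0.02):
        count = expected_forefoot_strikers(n_runners, prevalence)
        print(f"{n_runners:>6} runners at {prevalence:.0%} prevalence: ~{count:.0f} forefoot strikers")
```

Even at the pessimistic end (1% of 5,000), that would be roughly three times the sixteen forefoot strikers in Daoud et al.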

