Photo: Drew Geraets 
This year, however, many participants were
disappointed with their times. Warmer
than average temperatures and clear, sunny skies caused many runners to finish
well back from their goals. Since
several runners that I coach or advise ran the race, I was curious to see how
much of an effect the temperature had on their finish times. So, as I often do, I started crunching some
data.
Fortunately, I was able to stand on the shoulders
of some Big Running Data giants—a
2012 scientific paper by Nour El Helou and other researchers in France
already laid the groundwork for disentangling the effects of climate on
marathon race times. In their paper, El
Helou et al. analyzed ten years' worth of results from six World Marathon
Majors (London, Berlin, Paris, Boston, Chicago, and New York), resulting in a
data set of 60 marathons. These totaled
almost 1.8 million marathon finishers.
El Helou et al. ran statistical analysis on each year's results, trying
to find the correlation between ambient temperature during the race and the distribution
of the finish times.
El Helou et al.'s methods
Because El Helou et al. (correctly) hypothesized
that temperature would have varying effects on runners of different abilities,
they analyzed several levels of performance: the first, 25th, 50th, and 75th
percentiles of male and female finishers.
So, for example, if the 2010 Chicago Marathon had 21,000 male finishers,
the authors looked at the finish time for 210th place—that's the "one
percentile" time. This marker is
more useful than looking at the winning time or 10th place, because those can
be affected by things like the quality of the elite field, the tactics employed
by the lead pack, and so on. After
extracting the various levels of performance for the 60 marathons in the data
set, El Helou et al. then consulted meteorological records to find the ambient
temperature midway through each of the 60 races.
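As a minimal sketch of the percentile marker described above, the finishing place for a given percentile can be computed as below. The function name and the round-up convention are my own assumptions for illustration, not something taken from the paper.

```python
import math


def place_for_percentile(num_finishers: int, percentile: float) -> int:
    """Finishing place that marks the given percentile of the field.

    Assumption: round up to a whole place and never return less than 1.
    """
    place = math.ceil(num_finishers * percentile / 100)
    return max(1, place)


# The hypothetical 21,000-finisher field from the example above.
for pct in (1, 25, 50, 75):
    print(f"{pct}th percentile -> place {place_for_percentile(21_000, pct)}")
```

For the 21,000-finisher example, the one percentile marker comes out at 210th place, matching the text.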
Doing regression analysis allowed El Helou et al.
to correlate the ambient temperature with the distribution of finish times. The broad trend in the results was not
surprising: marathon times are slower when temperatures are too hot, and they
are also slower when temperatures are too cold.
What was surprising, at least
to me, was the optimal temperature for marathoning. El Helou et al.'s data robustly shows that
the ideal temperature for running a marathon is pretty chilly—39 degrees Fahrenheit
(3.8° C) for a 2:40 marathon! Race
times follow a parabolic curve, slowing significantly on either end of an
optimal temperature.
Notably, the top 1% of women are vast
outliers. Because of this I did not include
them in the data analysis.
I'm fascinated by this—does this represent a true
physiological phenomenon, or is it just a statistical quirk? I contacted El
Helou et al. several months ago to see if they had looked further into this
wild anomaly, but I received no response.
I am currently in the process of conducting a similar data analysis on
another set of marathon results that were not included in El Helou et al.'s
paper (including the history of Grandma's Marathon) to see if I can replicate
this finding. I will publish these
results once I'm finished! It's particularly hard to find large marathons that
are held in cold conditions (45° F /
7.2° C or colder), so if you know of any, drop me a line.
I do have one possible explanation: top female
marathoners tend to be very small in stature, even by distance runner
standards. A male 2:40 marathoner might
easily be 5'9" to 6' tall and weigh 150 or 160 pounds, while a female running
the same time is likely to be substantially lighter and shorter. Because of how volume and surface area scale,
top female runners have a very high skin surface area to body mass ratio,
meaning they radiate heat much more effectively than a taller or heavier
runner. This might cause them to lose too much heat in cold conditions that
would otherwise favor fast times. This
may have implications for top international male runners too.
Indeed, the shape
of the parabolas for a given time for men versus women appears to differ:
men tend to do comparatively worse in hot conditions than women, which is what we
would expect if the surface area to body mass ratio hypothesis is true.
In any case, this outlier aside, it's clear
that faster marathons require colder temperatures. But that's not the only takeaway from El
Helou et al.'s data. Their statistical
analysis also allows us to predict how much slower a marathon will be when
temperatures are substantially hotter or colder than optimal.
How much slower was Grandma's Marathon in 2016?
Since the methods I used to extract a general
formula from El Helou et al. are very tedious, I'll jump right to the
interesting part, which is Grandma's Marathon this year. The table below illustrates my results.
Ideal time in          Optimal marathon     Expected time at
optimal conditions     temperature          Grandma's 2016
2:30:00                38° F                2:33:46
2:45:00                39° F                2:53:04
3:00:00                41° F                3:11:55
3:30:00                42° F                3:48:01
4:00:00                44° F                4:21:41
4:30:00                45° F                4:52:52
The calculations are based on the simple average (70° F) of the
start line temperature at 7:45 am in Two Harbors (65° F) and the finish line temperature in Duluth (75° F) at 10:55 am on race day.
At first glance, it appears that times were a lot slower, and this is certainly
true. I should point out, however, that
Grandma's Marathon is almost never run within
the optimal marathoning temperature range for anyone. At one hour and fifteen minutes into the race
(9 am), the average temperature for the past twelve years has been 61° F.
Only twice has the temperature at 9 am been below 50° F.
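To put the table in relative terms, here is a small sketch that restates each row as a percentage slowdown. The times are copied from the table above. The helper names are mine, and the percentage framing is just a convenient restatement, not part of El Helou et al.'s analysis.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


# (ideal time, expected time at Grandma's 2016), copied from the table above.
TABLE = [
    ("2:30:00", "2:33:46"),
    ("2:45:00", "2:53:04"),
    ("3:00:00", "3:11:55"),
    ("3:30:00", "3:48:01"),
    ("4:00:00", "4:21:41"),
    ("4:30:00", "4:52:52"),
]

# Simple average of the start (65 F) and finish (75 F) readings.
race_temp_f = (65 + 75) / 2


def slowdown_percent(ideal: str, expected: str) -> float:
    """Percentage by which the expected time exceeds the ideal time."""
    ideal_s = to_seconds(ideal)
    return 100 * (to_seconds(expected) - ideal_s) / ideal_s


for ideal, expected in TABLE:
    pct = slowdown_percent(ideal, expected)
    print(f"{ideal} -> {expected}: {pct:.1f}% slower at {race_temp_f:.0f} F")
```

On these figures, the predicted slowdown grows from roughly 2.5% for a 2:30 runner to around 9% for a four-hour runner.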
It's also worth pointing out that some people
handle heat better than others. These
are only averages; some people are likely affected much worse, while others are
affected to a lesser degree. El Helou
et al. did not provide error margins along with their data, so I can't say what the
standard deviation looks like for these values.
Even if they did, I'm sure the error propagation would be massively
challenging!
Conclusion
The bottom line remains, however: Grandma's
Marathon was at least several minutes slower for most participants. The prediction based on El Helou et al.'s
model is supported by the real-world results.
In the past 11 years (2005 to 2015), the top one
percentile of men at Grandma's have had an average finish time of 2:35:10 and the
average temperature has been 60°
F. This year, the top one percentile was
2:38:28, a difference of three minutes and 18 seconds. The generalized formula equates a 2:33:03 in
perfect conditions to 2:35:10 in 60°
F (average Grandma's temperature), and to 2:37:43 in this year's conditions, a
difference of two minutes and 33 seconds.
Not a bad prediction!
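The arithmetic behind that comparison can be checked with a few lines. This is only a sketch using the three times quoted above; the variable and helper names are made up for illustration.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_min_sec(seconds: int) -> str:
    """Format a number of seconds as 'M:SS'."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


average_year = to_seconds("2:35:10")    # 2005 to 2015 average, at 60 F
actual_2016 = to_seconds("2:38:28")     # this year's top one percentile
predicted_2016 = to_seconds("2:37:43")  # formula's figure for this year

actual_slowdown = actual_2016 - average_year
predicted_slowdown = predicted_2016 - average_year
shortfall = actual_2016 - predicted_2016

print("actual slowdown:   ", to_min_sec(actual_slowdown))
print("predicted slowdown:", to_min_sec(predicted_slowdown))
print("model shortfall:   ", to_min_sec(shortfall))
```

The actual slowdown is 3:18 against a predicted 2:33, so the formula undershoots this year's top one percentile by 45 seconds.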
Another example: The 50th percentile in last
year's men's race was 3:55:54. The
temperature last year was 54° F. This year's 50th percentile was 4:25:49; the
formula predicts (converting back to ideal conditions, then to this year's
conditions of 70° F) a
4:22:46. Pretty good. In fact, you could even argue for a
higher temperature because the four-hour marathoner is out in the heat longer,
so the weighted average of temperature should be higher, more like 72 or 73
degrees. That gets you even closer to
the real-world result.
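One rough way to estimate that weighted average is sketched below. It assumes the temperature rose linearly between the two readings quoted earlier (65° F at 7:45 am, 75° F at 10:55 am) and kept rising at the same rate afterwards. That linear ramp is my simplifying assumption, not measured data.

```python
START_TEMP_F = 65.0             # 7:45 am reading, Two Harbors
FINISH_TEMP_F = 75.0            # 10:55 am reading, Duluth
MINUTES_BETWEEN_READINGS = 190  # 7:45 am to 10:55 am


def mean_temperature_f(finish_minutes: float) -> float:
    """Time-averaged temperature over a race lasting finish_minutes.

    Assumption: linear warming, extrapolated past 10:55 am if needed.
    The mean of a linear ramp equals its value at the halfway point.
    """
    rate = (FINISH_TEMP_F - START_TEMP_F) / MINUTES_BETWEEN_READINGS
    return START_TEMP_F + rate * finish_minutes / 2


median_runner = 4 * 60 + 25 + 49 / 60  # 4:25:49, this year's 50th percentile
top_runner = 2 * 60 + 38 + 28 / 60     # 2:38:28, this year's top one percentile

print(f"4:25:49 runner: {mean_temperature_f(median_runner):.1f} F")
print(f"2:38:28 runner: {mean_temperature_f(top_runner):.1f} F")
```

Under that assumption the 4:25:49 runner averages about 72° F over the race, while the 2:38:28 runner averages about 69° F.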
The rest of this article is long, boring, and
appeals only to nerds like me. Here is
the gist of the next 600 words:
· Women may handle intense heat slightly better than men.
· The data in the table above are extracted from men's performances only, but women's performances should be pretty close in most cases.
· This model falls apart pretty quickly for marathon times under 2:30:00.
· Someday I might turn this into a cool web app that you can play with, but I've got a lot on my plate right now so don't get your hopes up.
Methods: Extracting the generalized formula from El Helou et al.
The supplemental data in El Helou et al. provides
data points for their parabolic model of temperature and marathon time. By extracting these and putting them back
into a parabolic fit curve, it's possible to develop a general formula for
predicting how much slower you'll run under suboptimal conditions.
Any parabola has the form y = ax^2 + bx + c, including the best-fit parabolas for
each discrete marathon time provided in the El Helou et al. paper (derived from
the expected time in optimal conditions for the 1st, 25th, 50th, and 75th percentiles
of male finishers).
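For reference, completing the square locates the optimum of such a parabola directly. This is standard textbook algebra rather than anything specific to the paper, and I am assuming here that x stands for temperature and y for the fitted performance measure.

```latex
y = ax^{2} + bx + c
  = a\left(x + \frac{b}{2a}\right)^{2} + c - \frac{b^{2}}{4a}
\qquad\Longrightarrow\qquad
x_{\text{opt}} = -\frac{b}{2a},
\quad
y_{\text{opt}} = c - \frac{b^{2}}{4a}
```

So once a, b, and c are known for a given level of performance, the optimal temperature is simply -b/(2a).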
It is reasonable to expect that the shape of this
parabola changes as a smooth and predictable function of the speed at
which you run. In other words, though it
may be the case that deviations from optimal temperature affect faster runners
more or less severely than slower runners, these changes probably occur
smoothly on a continuum. Because of
this, we can develop functions that describe how the constants of the parabola (a, b, and c) change as a function of your running speed
under optimal marathon conditions.
As noted above, it is not reasonable to expect
that men and women change the same way, so the "shape trends" of each
parabola must be treated separately by sex.
Because I don't have valid data for the top 1% of women (given how big
of an outlier that group was), I can only develop a general formula for
men. That being said, the results should
still be pretty good for most female
runners in most conditions; they line up pretty well at the 25th and 50th
percentiles.
In brief, here is a look at the regression
analysis I used to determine the general formula describing the shape of the
speed deviation parabola for male marathoners:
Note that changes in a, the first constant in
the formula, appear to be proportional to the square of your running
speed. b and c
appear to follow linear trends.
I was trained as a chemist. In chemistry, we have a rule: "If it
doesn't look linear, don't do a linear fit!" As such, I used a quadratic
fit for changes in the constant a.
This area of math is interesting and perplexing to me—is this model of
smooth continuous parabola shape change really accurate? You'd need a lot more
data to validate this. It would be a
cool project, but would be extremely time-intensive.
In any case *waving hands*, let's just assume it
is.
This allows for a general formula that gives you the
equation of the race time parabola for a particular goal marathon pace. From the linear graph earlier, we already
know the optimal temperature for any given marathon pace, so it's easy to
double-check our general formula by plugging in a known marathon time and
seeing where the vertex of the parabola ends up.
There are, of course, some small differences because of rounding errors
and the fact that this data is empirically derived.
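Structurally, the general formula and the double-check described above might look like the sketch below. The trend coefficients are placeholders invented purely so the code runs. They are not fitted to anything and are not the values from my spreadsheet or from El Helou et al.; the function names and the speed units are likewise hypothetical.

```python
def poly(coeffs, s):
    """Evaluate a polynomial in s; coeffs are ordered lowest power first."""
    return sum(k * s**i for i, k in enumerate(coeffs))


def parabola_for_speed(s, a_trend, b_trend, c_trend):
    """Return (a, b, c) of the temperature parabola for ideal speed s."""
    return poly(a_trend, s), poly(b_trend, s), poly(c_trend, s)


def optimal_temperature(a, b):
    """Vertex of y = a*x^2 + b*x + c, i.e. x = -b / (2a)."""
    return -b / (2 * a)


# PLACEHOLDER trends, not real fitted values (see the note above).
A_TREND = (0.0, 0.0, -0.001)  # a varies with the square of speed
B_TREND = (0.0, 0.02)         # b varies linearly with speed
C_TREND = (0.0, 1.0)          # c varies linearly with speed

s = 4.0  # hypothetical ideal speed, arbitrary units
a, b, c = parabola_for_speed(s, A_TREND, B_TREND, C_TREND)
print("vertex at x =", optimal_temperature(a, b))
```

With real fitted trends in place of the placeholders, the printed vertex is what you would compare against the known optimal temperature for that marathon time.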
Finally, we can create a master formula that can
predict your actual marathon time based on the temperature during the race and
your optimal marathon performance in ideal conditions. This allows you to work backwards too,
answering questions like "If I ran 3:06:00 in 66 degree weather, what
could I run in 55 degree weather?" (answer: 3:00:27).
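The generalized formula itself is not reproduced here, but the working-backwards idea can be illustrated with a much cruder stand-in. The sketch below assumes race time scales as ideal time multiplied by (1 + K(T - T_opt)^2), with T_opt and K frozen at the 3:00:00 row of the table above (41° F optimum, 3:11:55 at 70° F). That fixed-curvature model and all names in the code are my simplifying assumptions.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' string to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def to_hms(seconds: float) -> str:
    """Format seconds as 'H:MM:SS', rounded to the nearest second."""
    total = round(seconds)
    h, rem = divmod(total, 3600)
    m, s = divmod(rem, 60)
    return f"{h}:{m:02d}:{s:02d}"


# Values taken from the 3:00:00 row of the table above.
IDEAL_S = to_seconds("3:00:00")
AT_70F_S = to_seconds("3:11:55")
T_OPT_F = 41.0
RACE_TEMP_F = 70.0

# Assumed model: time = ideal * (1 + K * (T - T_OPT_F)^2).
K = (AT_70F_S / IDEAL_S - 1) / (RACE_TEMP_F - T_OPT_F) ** 2


def slowdown_factor(temp_f: float) -> float:
    """Multiplier on ideal time at a given temperature (assumed model)."""
    return 1 + K * (temp_f - T_OPT_F) ** 2


def convert(time_s: float, from_temp_f: float, to_temp_f: float) -> float:
    """Back out the ideal time, then re-apply the slowdown at a new temperature."""
    ideal = time_s / slowdown_factor(from_temp_f)
    return ideal * slowdown_factor(to_temp_f)


estimate = convert(to_seconds("3:06:00"), 66, 55)
print(to_hms(estimate))
```

This crude version lands within about half a minute of the 3:00:27 quoted above. The gap is expected, since it holds the optimal temperature and curvature fixed rather than letting them vary with ability.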
There are some major limitations to the model,
though. It's clearly not valid for times
under about 2:30:00, since the predictions start to fall apart. As you approach elite men's times, the model
starts to predict faster times for hotter temperatures. This is probably a result of the assumptions
made and the limited data available when deriving the general formula to predict
the trend in the shape of the parabola describing the relationship between race
time and temperature: El Helou et al.'s data only extends down to 2:41
marathoners. Extrapolating 30 or 40 minutes
faster than that lands you in uncharted and inaccurate territory. More data
analysis is needed to refine the model so it can be used for elite marathon
times, too. I'm also not comfortable
using the model for women running under about three hours, because of the
aforementioned issue with the top one percentile of female finishers. I'll have a better idea of what's going on
there once I finish my own analysis of top female times at other races and
confirm or refute my surface area/body mass hypothesis.
Despite this, the formula proves immensely useful
and quite accurate for the 2:30 to 4:00 marathon crowd. Unfortunately for you, the generalized
version of the formula is only currently workable in a very messy Excel chart
on my computer. Someday, I hope to
develop a simple web app that I can embed in this article that allows you to
play with the numbers in an easy, interactive way. But that would require dusting off my JavaScript
programming skills, which seems like a low priority right now. If you really want to sink your teeth into
this, email me and I'll send you the raw data.
Wow, that was dead on. I was shooting for a 3:30 and got a 3:48 at Grandma's
Same. Was going for 3:25-3:30 and finished 3:45.
Aupa Jonh!!!
As always, very interesting article!!!!
Unfortunately I don't have any data from a cold marathon....
Last year at the Porto Marathon, in a very hot and sunny race, I ran 2:45 when I was supposed to be under 2:40.... I wonder how it would have gone in optimal conditions.
Five months later I ran 2:39 at the Hamburg Marathon, and I was not as well trained as I was for Porto.
In October I'll go to Chicago. I hope weather conditions will be nice for running.
As a chemist, I'm very interested in this stuff, so I'll contact you by mail.