A Study of Market Inefficiencies at the Olympic Decathlon
By Brian Baker
For this case, the group set out to find market inefficiencies in the world of sports. The goal of this case study was to identify inefficiencies in the market and then to determine how those weaknesses could be corrected by people who noticed these trends.
After watching a video of decathlon training featuring 2004 Olympic Silver Medalist and 2008 Olympic Gold Medalist Bryan Clay, I was fascinated to see what sort of analytics could be applied to this sport, which even Clay admits does not get much respect. I decided to do my best to rectify that by doing my own analysis of the skills of decathletes.
For those who are not familiar, the decathlon is a two-day competition which features ten events. On the first day, athletes compete in the 100-meter dash, the long jump, the shot put, the high jump and the 400-meter dash. The second day consists of the 110-meter hurdles, the discus throw, the pole vault, the javelin throw and the 1500-meter run.
Each performance in these events is converted into a numerical point value, making the scoring standards-based. For instance, athletes competing at the collegiate level use the same scoring tables as athletes competing in the Olympics. The scoring is also intended to be equal across events, such that no single event can be exploited over another; likewise, it is designed so that an athlete who specializes in one event cannot dominate the scoring and win a gold medal simply by being elite in one or two events. Most of the market inefficiencies would lie in the former claim: by assuming that all events are created equal, we can use statistical tests to estimate the probability that the scores from these events are drawn from the same distribution. This will be covered in more detail later in the article.
As with any analysis of this sort, the first step is to collect the data. I used Sports Reference to gather my data; you can see, for instance, the data collected from the 2008 100m dash here. I then manually entered the data, concerning myself only with the number of points in each event. Using the raw results would prevent me from comparing across events; the points are meant to put all of the events on a level playing field.
The data that was easily available to me went back to 1996. The modern scoring system was introduced in 1984, but I could not find data reaching back that far. I chose to enter only the top twenty competitors, as I wanted to focus on what the top athletes were doing (I assumed that top athletes would be the ones exploiting any major inefficiencies). Anyone outside the top twenty is unlikely to have a strategy for winning the Olympics that would be worth our study.
Once I had these 80 individuals and their 800 unique data points entered into the computer, I began with some measures of central tendency and general distribution. For those who are not familiar, measures of central tendency (mean, median and mode) are ways in which statisticians attempt to describe what an average member of the data set would look like. The table below shows what the measures found:
This first measure was useful, as it showed that running events, such as the 100m dash and the 110m hurdles, had far greater means than many of the other events (seen in red). Possibly even more important, and more relevant to our purposes, is that events we could classify as technical events (seen in purple; these are events such as the javelin throw and the pole vault which require an athlete to develop a specific skill set) had far greater standard errors. The standard error scales with the standard deviation of the scores, so it reflects how spread out the results in an event are. This is particularly important, as it may suggest that these highly technical events, such as the javelin throw and discus throw, are an opportunity for the top athletes to differentiate themselves from the competition.
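The central-tendency step can be sketched in a few lines of Python. The event points below are invented for illustration, not taken from the actual 1996-2008 data set:

```python
# Sketch of the summary-statistics step on a small, invented sample of
# decathlon event points (hypothetical numbers, not the real data).
import statistics

# Hypothetical long jump points for six athletes
long_jump_points = [1010, 985, 942, 968, 1001, 955]

mean = statistics.mean(long_jump_points)
median = statistics.median(long_jump_points)
stdev = statistics.stdev(long_jump_points)        # sample standard deviation
std_error = stdev / len(long_jump_points) ** 0.5  # standard error of the mean

print(f"mean={mean:.1f} median={median:.1f} SE={std_error:.2f}")
```

Comparing the standard error computed this way across all ten events is what surfaces the spread in the technical events.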
It will be important to remember these technical events as we move on to our next measures. To see what overall impact each event had on the final result, I ran several regression models. First, I took the points for each event and ran a regression with overall points as the dependent variable. The results are seen to the right and below.
The important column for us to look at is the p-value. A p-value below 0.05 means that the event is a statistically significant predictor of the total number of points. With that guideline in mind, it is clear that all of these events are significantly related to the number of points scored. The technical events that we looked at before have large coefficients, suggesting that they could be worthy of further research. Overall, the fact that all of the variables are significant is not a surprising result, but it was worth checking to see if any events stuck out.
The next regression model that I ran produced more interesting results. This time, the finish place of each athlete was the dependent variable, with the event points once again serving as the independent variables. Here, the goal was to determine whether winners tend to do well in certain events while lower-quality athletes tend to do better in others (in which case an event would not be statistically significant). The results of the regression are below:
These results were more telling. First off, we can eliminate the 100m dash as being indicative of finish position: how well someone does in the 100m dash has no significant bearing on how well they finish overall.
From there, we see that all of the other events meet the sub-0.05 p-value threshold, and thus are significant to finish position. The highly technical events that we looked at earlier show up again; they have the lowest p-values, suggesting the strongest relationship with finish position. These events have now stood out in all three of the tests we have run: first because of their high standard errors, and in these last two regressions because of their large coefficients and low p-values. That offers some fairly substantive evidence for their significance over other events.
To put the initial hypothesis that all events are created equal firmly to rest, I ran a total of three two-sample t-tests. These tests estimate the probability that two samples were drawn from the same population. For instance, if I were to take a group of 100m dash results and a group of 200m dash results (measured in seconds, as we are looking at raw results), a two-sample t-test would show almost no chance that those two groups of data came from the same population. The three tests that I ran produced these results:
All three tests produced statistically significant results, suggesting that the scores from these events come from very different distributions. In other words, the scoring patterns for these events are far from identical. For instance, the 400m scores were, on average, far higher than the 100m scores, which means the two are in no way equal.
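The t-test step can be made transparent by computing Welch's t statistic by hand. The two event samples below are invented; the article's actual tests were run on the real event scores:

```python
# Sketch of the two-sample t-test comparison, computing Welch's t statistic
# by hand on invented samples of event points (hypothetical numbers).
import statistics


def welch_t(a, b):
    """Welch's two-sample t statistic: the difference in sample means
    divided by the combined standard error of the two samples."""
    va, vb = statistics.variance(a), statistics.variance(b)
    se = (va / len(a) + vb / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


# Hypothetical points from two events; a large |t| suggests the two score
# distributions are unlikely to be draws from the same population.
event_a = [940, 955, 930, 948, 960, 938]
event_b = [870, 885, 860, 878, 892, 866]

print(f"t = {welch_t(event_a, event_b):.2f}")
```

In a real analysis you would also compare the statistic to its degrees of freedom to obtain the p-value; `scipy.stats.ttest_ind` with `equal_var=False` performs the whole test in one call.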
This data suggests that the events are not created equal. Based on the data examined here, the most reliable path for athletes trying to succeed in the decathlon is to focus on the technical events. Many of the weaker athletes in the Olympics seem to struggle there, and the point spread suggests that athletes who are strong in those events put themselves in a very good position to win the competition. The data was also interesting because it suggested that such athletes are not as common. You could take a highly skilled technical athlete and work on the sprinting aspect, but you may not be able to take a sprinter and turn him into an effective decathlete.
For those looking for further study, it would be worthwhile to examine the event at the collegiate level, as well as in other professional competitions, to see what trends emerge there. It would also be interesting to see whether data could be found that would allow this study to go further back in time: are these trends that have always been present in the data, or have they emerged more recently?