PLoS One. 2016 Jun 23;11(6):e0157257.
doi: 10.1371/journal.pone.0157257. eCollection 2016.

Prediction and Quantification of Individual Athletic Performance of Runners

Duncan A J Blythe et al. PLoS One.

Abstract

We present a novel, quantitative view on the human athletic performance of individual runners. We obtain a predictor for running performance, a parsimonious model and a training state summary consisting of three numbers by applying modern validation techniques and recent advances in machine learning to the thepowerof10 database of British runners' performances (164,746 individuals, 1,417,432 performances). Our predictor achieves an average out-of-sample prediction error of, for example, 3.6 min on elite Marathon performances and 0.3 seconds on 100 metres performances, and a lower error than the state-of-the-art in performance prediction (30% improvement, RMSE) over a range of distances. We are also the first to report on a systematic comparison of predictors for running performance. Our model has three parameters per runner, and three components which are the same for all runners. The first component of the model corresponds to a power law with a runner-dependent exponent, which achieves a better goodness-of-fit than known power laws in the study of running. Many documented phenomena in quantitative sports science, such as the form of scoring tables, the success of existing prediction methods including Riegel's formula, the Purdy points scheme, the power law for world record performances and the broken power law for world record speeds, may be explained on the basis of our findings in a unified way. We provide strong evidence that the three parameters per runner are related to physiological and behavioural parameters, such as training state, event specialization and age, which allows us to derive novel physiological hypotheses relating to athletic performance. We conjecture on this basis that our findings will be vital in exercise physiology, race planning, the study of aging and training regime design.
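
To make the power-law idea in the abstract concrete, the following minimal sketch shows a Riegel-style prediction and a per-runner exponent fit. It is an illustration only, not the authors' predictor: the function names are hypothetical, the default exponent 1.06 is the value commonly quoted for Riegel's formula, and the fit is a plain least-squares slope in log-log coordinates.

```python
import math

# Commonly quoted exponent for Riegel's formula (an assumption here,
# not a value taken from the abstract).
RIEGEL_EXPONENT = 1.06


def power_law_predict(t_known, d_known, d_target, exponent=RIEGEL_EXPONENT):
    """Predict the time at d_target via t2 = t1 * (d2 / d1) ** exponent."""
    return t_known * (d_target / d_known) ** exponent


def fit_individual_exponent(distances, times):
    """Least-squares slope of log(time) against log(distance) for one runner."""
    xs = [math.log(d) for d in distances]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical runner: 20:00 over 5 km, predicted over 10 km.
predicted_10k_seconds = power_law_predict(1200.0, 5000.0, 10000.0)
```

Passing a runner-specific value from `fit_individual_exponent` as `exponent` gives a power law whose exponent depends on the runner, which is the form the abstract attributes to the first model component.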

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

Fig 1. Central phenomenon: non-linear deviation from the power law in individuals.
Top left: performances of world record holders and a selection of random runners. Curves labelled by runners are their known best performances (y-axis) at that event (x-axis). Black crosses are world record performances. Individual performances deviate non-linearly from the world record power law. Top right: an illustrative example of why a good model should take specialization into account. Hypothetical performance curves of three runners (green, red and blue) are shown; the task is to predict green's 1500m performance from all other performances. Dotted green lines are predictions. State-of-the-art methods such as Riegel or Purdy predict green's 1500m performance to be close to those of blue and red. A realistic predictor of green's 1500m performance, such as LMC, will predict that green is outperformed by red and blue on 1500m, because blue and red being worse on 400m indicates that, of the three runners, green specializes most in shorter distances. Bottom: local matrix completion as a mathematical prediction principle, filling in an entry in a (3 × 3) sub-pattern. Schematic illustration of the algorithm.
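
The idea of filling in one entry of a (3 × 3) sub-pattern can be sketched as follows. This is a generic rank-2 completion, assumed for illustration and not taken from the authors' implementation: if a 3 × 3 block of a runners-by-events matrix has rank at most 2, its determinant vanishes, and because the determinant is linear in any single entry, the missing entry can be solved for directly. The function name `complete_3x3_rank2` is hypothetical.

```python
import numpy as np


def complete_3x3_rank2(block, row, col):
    """Return the value of block[row, col] that makes the 3x3 determinant zero.

    Cofactor expansion gives det(M) = x * C + det(M with x set to 0),
    where C is the cofactor of the missing entry, so x = -det0 / C.
    """
    m = np.array(block, dtype=float)
    m[row, col] = 0.0  # overwrite the missing entry (may be NaN) with zero
    det_with_zero = np.linalg.det(m)
    minor = np.delete(np.delete(m, row, axis=0), col, axis=1)
    cofactor = (-1) ** (row + col) * np.linalg.det(minor)
    if abs(cofactor) < 1e-12:
        raise ValueError("entry is not determined by the other eight entries")
    return -det_with_zero / cofactor


# Hypothetical rank-2 block (rows: runners, columns: events), one entry missing.
block = [[1.0, 1.0, 2.0],
         [4.0, 2.0, 5.0],
         [1.0, 3.0, np.nan]]
filled = complete_3x3_rank2(block, 2, 2)
```

In practice a predictor of this kind would presumably combine many such sub-patterns rather than rely on a single one, since an individual determinant equation is sensitive to noise when the cofactor is small.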
Fig 2. The three components of the low-rank model, and explanation of the world record data.
Left: the components displayed (unit norm, log-time vs log-distance). Tubes around the components are one standard deviation, estimated by the bootstrap. The first component is an exact power law (straight line in log-log coordinates); the last two components are non-linear, describing transitions at around 800m and 10km. Middle: Comparison of first component and world record to the exact power law (log-speed vs log-distance). Right: Least-squares fit of rank 1-3 models to the world record data (log-speed vs log-distance).
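
Read together with the abstract (three parameters per runner, three components shared by all runners), the caption suggests a model of roughly the following shape. The symbols are assumptions introduced here for illustration: $t_i(d)$ is the time of runner $i$ over distance $d$, $f_1, f_2, f_3$ are the shared components, and $\lambda_{i,1}, \lambda_{i,2}, \lambda_{i,3}$ are the runner's three numbers.

```latex
\log t_i(d) \;\approx\; \lambda_{i,1}\, f_1(d) + \lambda_{i,2}\, f_2(d) + \lambda_{i,3}\, f_3(d),
\qquad f_1(d) \text{ linear in } \log d .
```

With $\lambda_{i,2} = \lambda_{i,3} = 0$ this reduces to a straight line in log-log coordinates, that is, a pure power law whose exponent scales with $\lambda_{i,1}$.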
Fig 3. Matrix scatter plot of the three-number-summary vs performance.
For each of the scores in the three-number-summary (rows) and each event distance (columns), the plot matrix shows: a scatter plot of performances (time) vs the coefficient score of the top 25% (on the best event) runners who have attempted at least 4 events. Each scatter plot in the matrix is colored on a continuous color scale according to the absolute value of the scatter sample’s Spearman rank correlation (red = 0, green = 1).
Fig 4. Scatter plots exploring the three number summary.
Top left and right: 3D scatter plot of three-number-summaries of runners in the data set, colored by preferred distance and shown from two angles. A negative value for the second score indicates that the runner is a sprinter, a positive value an endurance runner. In the top right panel, the summaries of the elite runners Usain Bolt (world record holder, 100m, 200m), Mo Farah (world beater over distances between 1500m and 10km), Haile Gebrselassie (former world record holder from 5km to Marathon) and Takahiro Sunada (100km world record holder) are shown; summaries are estimated from their personal bests. For comparison we also display the hypothetical data of a runner who holds all world records. Bottom left: preferred distance vs individual exponents, colored by percentile on preferred distance. Bottom right: age vs exponent, colored by preferred distance.
