PLoS One. 2016 Jun 23;11(6):e0157257.

doi: 10.1371/journal.pone.0157257. eCollection 2016.

Prediction and Quantification of Individual Athletic Performance of Runners

Duncan A J Blythe¹,², Franz J Király³

Affiliations

¹ African Institute for Mathematical Sciences, Bagamoyo, Tanzania.
² Bernstein Centre for Computational Neuroscience, Berlin, Germany.
³ Department of Statistical Science, University College London, London, United Kingdom.

PMID: 27336162
PMCID: PMC4919094
DOI: 10.1371/journal.pone.0157257

Duncan A J Blythe et al. PLoS One. 2016.

Abstract

We present a novel, quantitative view on the human athletic performance of individual runners. We obtain a predictor for running performance, a parsimonious model and a training state summary consisting of three numbers by application of modern validation techniques and recent advances in machine learning to the thepowerof10 database of British runners' performances (164,746 individuals, 1,417,432 performances). Our predictor achieves an average prediction error (out-of-sample) of e.g. 3.6 min on elite Marathon performances and 0.3 seconds on 100 metres performances, and a lower error than the state-of-the-art in performance prediction (30% improvement, RMSE) over a range of distances. We are also the first to report on a systematic comparison of predictors for running performance. Our model has three parameters per runner, and three components which are the same for all runners. The first component of the model corresponds to a power law with exponent dependent on the runner which achieves a better goodness-of-fit than known power laws in the study of running. Many documented phenomena in quantitative sports science, such as the form of scoring tables, the success of existing prediction methods including Riegel's formula, the Purdy points scheme, the power law for world records performances and the broken power law for world record speeds may be explained on the basis of our findings in a unified way. We provide strong evidence that the three parameters per runner are related to physiological and behavioural parameters, such as training state, event specialization and age, which allows us to derive novel physiological hypotheses relating to athletic performance. We conjecture on this basis that our findings will be vital in exercise physiology, race planning, the study of aging and training regime design.
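The abstract mentions Riegel's formula and a power law whose exponent depends on the runner. As a rough illustration only, the sketch below assumes the commonly quoted form of Riegel's formula, t2 = t1 · (d2/d1)^1.06, and contrasts it with a runner-specific exponent fitted by ordinary least squares in log-log coordinates. The function names and the sample personal bests are hypothetical, and this simple fit is not the predictor evaluated in the paper.

```python
import math

def riegel_predict(t1, d1, d2, exponent=1.06):
    """Predict the time at distance d2 from a known time t1 at distance d1.

    Assumes the commonly quoted Riegel form t2 = t1 * (d2 / d1) ** 1.06,
    i.e. one fixed exponent shared by all runners.
    """
    return t1 * (d2 / d1) ** exponent

def fit_individual_power_law(distances, times):
    """Fit t = c * d ** alpha for one runner by least squares on logs.

    Returns (c, alpha). Needs at least two distinct distances.
    """
    xs = [math.log(d) for d in distances]
    ys = [math.log(t) for t in times]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct distances")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx
    c = math.exp(mean_y - alpha * mean_x)
    return c, alpha

# Hypothetical personal bests (metres, seconds), for illustration only.
distances = [1500, 5000, 10000]
times = [270.0, 1000.0, 2100.0]

c, alpha = fit_individual_power_law(distances, times)
marathon_fixed = riegel_predict(times[-1], distances[-1], 42195)
marathon_individual = c * 42195 ** alpha
print(f"fitted exponent: {alpha:.3f}")
print(f"fixed-exponent prediction: {marathon_fixed / 60:.1f} min")
print(f"individual-exponent prediction: {marathon_individual / 60:.1f} min")
```

The fixed-exponent rule extrapolates from a single performance, while the fitted exponent uses all of a runner's known performances; the abstract's claim is that allowing the exponent to vary per runner fits better than a single shared power law.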

Conflict of interest statement

Competing Interests: The authors have declared that no competing interests exist.

Figures

**Fig 1. Central phenomenon: non-linear deviation from the power law in individuals.**
Top left: performances of world record holders and a selection of random runners. Each curve labelled with a runner shows that runner's known best performances (y-axis) at each event (x-axis). Black crosses are world record performances. Individual performances deviate non-linearly from the world record power law. Top right: an example illustrating why a good model should take specialization into account. Hypothetical performance curves of three runners (green, red and blue) are shown; the task is to predict green's 1500m performance from all other performances. Dotted green lines are predictions. State-of-the-art methods such as Riegel or Purdy predict green's 1500m performance to be close to that of blue and red. A realistic predictor, such as local matrix completion (LMC), will predict that green is outperformed by red and blue on 1500m, because blue and red being worse on 400m indicates that, of the three runners, green specializes most in shorter distances. Bottom: schematic illustration of the algorithm, which uses local matrix completion as a mathematical prediction principle by filling in an entry in a (3 × 3) sub-pattern.
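The idea of filling in one entry of a (3 × 3) sub-pattern can be sketched as follows. The sketch assumes that the complete sub-pattern (rows as runners, columns as events, entries as log-times) has rank at most 2, so its determinant is zero; because the determinant is an affine function of any single entry, the missing entry can be solved for directly. The helper names and the numbers are hypothetical, and the full method described in the paper presumably combines many such sub-patterns rather than relying on one.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def complete_rank2_entry(m, row, col):
    """Return the value for m[row][col] that makes det(m) zero.

    The determinant is affine in a single entry x:
        det(x) = det(x=0) + x * cofactor,
    so the rank-2 constraint det = 0 gives x = -det(x=0) / cofactor.
    The current content of m[row][col] is ignored (it may be None).
    """
    work = [list(r) for r in m]
    work[row][col] = 0.0
    d0 = det3(work)
    work[row][col] = 1.0
    cofactor = det3(work) - d0
    if abs(cofactor) < 1e-12:
        raise ValueError("sub-pattern does not determine the missing entry")
    return -d0 / cofactor

# Hypothetical log-times: 3 runners (rows) x 3 events (columns).
# The third runner's middle event is unknown.
pattern = [
    [4.0, 5.0, 6.0],
    [4.2, 5.4, 6.6],
    [3.9, None, 5.7],
]
predicted = complete_rank2_entry(pattern, 2, 1)
print(f"predicted log-time: {predicted:.3f}")
```

A zero cofactor means the observed entries of this particular sub-pattern carry no information about the missing one, which is one reason a practical method would draw on several sub-patterns.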

**Fig 2. The three components of the low-rank model, and explanation of the world record data.**
Left: the components displayed (unit norm, log-time vs log-distance). Tubes around the components are one standard deviation, estimated by the bootstrap. The first component is an exact power law (straight line in log-log coordinates); the last two components are non-linear, describing transitions at around 800m and 10km. Middle: Comparison of first component and world record to the exact power law (log-speed vs log-distance). Right: Least-squares fit of rank 1-3 models to the world record data (log-speed vs log-distance).
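In symbols, the model described in the abstract and in this caption can be written as below. The notation (t for time, s for distance, λ for the per-runner coefficients, f for the shared components) is introduced here for illustration. The rank-1 special case additionally assumes that the first component is proportional to log s with no intercept, which is consistent with, but not stated by, the "straight line in log-log coordinates" description.

```latex
% Three coefficients per runner, three components shared by all runners
\log t_{\text{runner}}(s) \;=\; \lambda_1 f_1(s) + \lambda_2 f_2(s) + \lambda_3 f_3(s)

% Rank-1 case, assuming f_1(s) = \log s:
\log t(s) = \lambda_1 \log s
\quad\Longrightarrow\quad
t(s) = s^{\lambda_1}

% Equivalent statement for speed v = s / t:
\log v(s) = \log s - \log t(s) = (1 - \lambda_1)\,\log s
```

Under that assumption a power law in time is also a power law in speed, with slope 1 − λ₁, so a straight line in log-time vs log-distance remains a straight line in the log-speed vs log-distance coordinates used in the middle and right panels.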

**Fig 3. Matrix scatter plot of the three-number-summary vs performance.**
For each of the scores in the three-number-summary (rows) and each event distance (columns), the plot matrix shows a scatter plot of performances (time) vs the coefficient score, for runners in the top 25% on their best event who have attempted at least 4 events. Each scatter plot in the matrix is colored on a continuous color scale according to the absolute value of the scatter sample's Spearman rank correlation (red = 0, green = 1).
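The coloring statistic in this caption, the absolute Spearman rank correlation, can be computed as the Pearson correlation of the ranks. The sketch below is a plain-Python illustration with tied values given their average rank; the function names and sample values are hypothetical and are not taken from the paper's data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

# Hypothetical coefficient scores and times (seconds) for five runners.
scores = [0.91, 0.95, 0.97, 1.02, 1.05]
times = [225.0, 231.0, 229.0, 240.0, 252.0]
color_value = abs(spearman(scores, times))
print(f"|Spearman| = {color_value:.2f}")
```

Because it depends only on ranks, the statistic measures any monotone relationship between a score and performance, not just a linear one.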

**Fig 4. Scatter plots exploring the three number summary.**
Top left and right: 3D scatter plot of three-number-summaries of runners in the data set, colored by preferred distance and shown from two angles. A negative value for the second score indicates that the runner is a sprinter, a positive value an endurance runner. In the top right panel, the summaries of the elite runners Usain Bolt (world record holder, 100m, 200m), Mo Farah (world beater over distances between 1500m and 10km), Haile Gebrselassie (former world record holder from 5km to Marathon) and Takahiro Sunada (100km world record holder) are shown; summaries are estimated from their personal bests. For comparison we also display the hypothetical data of a runner who holds all world records. Bottom left: preferred distance vs individual exponents, colored by percentile on preferred distance. Bottom right: age vs. exponent, colored by preferred distance.

Cited by

Modelling human endurance: power laws vs critical power.
Drake JP, Finke A, Ferguson RA. Eur J Appl Physiol. 2024 Feb;124(2):507-526. doi: 10.1007/s00421-023-05274-5. Epub 2023 Aug 10. PMID: 37563307. Free PMC article.
Construction of Women's All-Around Speed Skating Event Performance Prediction Model and Competition Strategy Analysis Based on Machine Learning Algorithms.
Liu M, Chen Y, Guo Z, Zhou K, Zhou L, Liu H, Bao D, Zhou J. Front Psychol. 2022 Jul 12;13:915108. doi: 10.3389/fpsyg.2022.915108. eCollection 2022. PMID: 35910999. Free PMC article.
Special endurance coefficients enable the evaluation of running performance.
Blödorn W, Döring F. Sci Rep. 2025 Jun 20;15(1):20184. doi: 10.1038/s41598-025-06009-6. PMID: 40542038. Free PMC article.
Modelling 5-km Running Performance on Level and Hilly Terrains in Recreational Runners.
Melo OUM, Tartaruga MP, de Borba EF, Boullosa D, da Silva ES, Bernardo RT, Coimbra R, Oliveira HB, da Rosa RG, Peyré-Tartaruga LA. Biology (Basel). 2022 May 22;11(5):789. doi: 10.3390/biology11050789. PMID: 35625517. Free PMC article.
Win Your Race Goal: A Generalized Approach to Prediction of Running Performance.
Dash S. Sports Med Int Open. 2024 Oct 9;8:a24016234. doi: 10.1055/a-2401-6234. eCollection 2024. PMID: 39439845. Free PMC article.

References

1. Lietzke MH. An analytical study of world and olympic racing records. Science. 1954;119(3089):333–336. doi: 10.1126/science.119.3089.333
2. Henry FM. Prediction of world records in running sixty yards to twenty-six miles. Research Quarterly American Association for Health, Physical Education and Recreation. 1955;26(2):147–158.
3. Riegel PS. Athletic records and human endurance. American Scientist. 1980;69(3):285–290.
4. Katz L, Katz JS. Fractal (power-law) analysis of athletic performance. Research in Sports Medicine: An International Journal. 1994;5(2):95–105.
5. Savaglio S, Carbone V. Human performance: scaling in athletic world records. Nature. 2000;404(6775):244–244. doi: 10.1038/35005165
