Editorial
Knee Surg Sports Traumatol Arthrosc. 2024 Nov;32(11):3039-3042.
doi: 10.1002/ksa.12389. Epub 2024 Jul 31.

The artificial intelligence advantage: Supercharging exploratory data analysis

Felix C Oettl et al. Knee Surg Sports Traumatol Arthrosc. 2024 Nov.

Abstract

Exploratory data analysis (EDA) is a critical step in scientific projects, aiming to uncover valuable insights and patterns within data. Traditionally, EDA involves manual inspection, visualization, and various statistical methods. The advent of artificial intelligence (AI) and machine learning (ML) has the potential to improve EDA, offering more sophisticated approaches that enhance its efficacy. This review explores how AI and ML algorithms can improve feature engineering and selection during EDA, leading to more robust predictive models and data-driven decisions. Tree-based models, regularized regression, and clustering algorithms were identified as key techniques. These methods automate feature importance ranking, handle complex interactions, perform feature selection, reveal hidden groupings, and detect anomalies. Real-world applications include risk prediction in total hip arthroplasty and subgroup identification in scoliosis patients. Recent advances in explainable AI and EDA automation show potential for further improvement. The integration of AI and ML into EDA accelerates tasks and uncovers sophisticated insights. However, effective use requires a deep understanding of the algorithms, their assumptions, and limitations, along with domain knowledge for proper interpretation. As data continues to grow, AI will play an increasingly pivotal role in EDA when combined with human expertise, driving more informed, data-driven decision-making across various scientific domains. Level of Evidence: Level V - Expert opinion.

Keywords: artificial intelligence; exploratory data analysis; feature engineering; machine learning; orthopedic research.
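
The abstract notes that clustering algorithms can reveal hidden groupings during EDA. As an illustration only, and not the authors' own method, the following minimal sketch runs k-means (Lloyd's alternating assignment and update steps) on a small made-up data set. The sample values, the function names, and the deterministic farthest-point seeding are all assumptions chosen to keep the example reproducible; a real analysis would more likely use an established library and repeated random initializations.

```python
from math import dist


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption for reproducibility): start at the
    first point, then repeatedly add the point farthest from the centroids
    chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd-style k-means: alternate assignment and centroid update
    until the labels stop changing or max_iter is reached."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical, made-up measurements (two features per sample).
samples = [
    (1.0, 1.2), (0.8, 0.9), (1.1, 1.0), (0.9, 1.3),
    (5.0, 5.2), (5.3, 4.9), (4.8, 5.1), (5.1, 5.0),
]

labels, centroids = kmeans(samples, k=2)
print(labels)
print(centroids)
```

On this toy input the two well-separated groups fall into separate clusters. As the abstract cautions, the result still needs domain knowledge to interpret, and k-means assumes roughly compact, similarly scaled clusters.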


