Bioinformatics. 2022 Jun 24;38(Suppl 1):i10-i18.
doi: 10.1093/bioinformatics/btac233.

An approachable, flexible and practical machine learning workshop for biologists

Chris S Magnano et al. Bioinformatics.

Abstract

Summary: The increasing prevalence and importance of machine learning in biological research have created a need for machine learning training resources tailored towards biological researchers. However, existing resources are often inaccessible, infeasible or inappropriate for biologists because they require significant computational and mathematical knowledge, demand an unrealistic time investment or teach skills primarily for computational researchers. We created the Machine Learning for Biologists (ML4Bio) workshop, a short, intensive workshop that empowers biological researchers to comprehend machine learning applications and pursue machine learning collaborations in their own research. The ML4Bio workshop focuses on classification and was designed around three principles: (i) emphasizing preparedness over fluency or expertise, (ii) necessitating minimal coding and mathematical background and (iii) requiring low time investment. It incorporates active learning methods and custom open-source software that allows participants to explore machine learning workflows. After multiple sessions to improve workshop design, we performed a study on three workshop sessions. Despite some confusion around identifying subtle methodological flaws in machine learning workflows, participants generally reported that the workshop met their goals, provided them with valuable skills and knowledge and greatly increased their belief that they could engage in research that uses machine learning. ML4Bio is an educational tool for biological researchers, and its creation and evaluation provide valuable insight into tailoring educational resources for active researchers in different domains.

Availability and implementation: Workshop materials are available at https://github.com/carpentries-incubator/ml4bio-workshop and the ml4bio software is available at https://github.com/gitter-lab/ml4bio.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

Fig. 1. Timeline of the ML4Bio workshop. Each activity is shown with the non-affective learning goals (LGs) it addresses, as defined in Section 2.1.
Fig. 2. Different configurations of the left half of the software interface throughout an ML workflow.
Fig. 3. The right half of the ml4bio software interface. The top shows a summary of all classifiers created during model selection, and the bottom shows detailed information on the performance of the selected classifier. Note that multiple classifiers can only be viewed during model selection. The user must select a single model and can no longer see the performance of other models once the test set is examined.
Fig. 4. Sankey diagram of participants’ responses pertaining to comfort with ML before and after the workshop across all three sessions. Note that a proportion of those who completed a pre-survey and not a post-survey did not attend the workshop at all. 47 completed the pre-survey, 6 completed only the pre-survey and assessment, and 26 completed all 3 instruments.
Fig. 5. Participant responses to self-reported knowledge, confidence and interest in ML before and after the workshop. Note that these questions used a retrospective design, meaning that participants were asked about both before and after the workshop in the post-survey.
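
The caption of Fig. 3 describes a workflow in which several classifiers are compared during model selection, and the test set is examined only after a single model has been chosen. The following is a minimal sketch of that idea in plain Python. It is an illustration only and does not use or reproduce the ml4bio software; the synthetic data, the choice of a one-dimensional k-nearest-neighbors classifier, the candidate values of k and all function names are assumptions made for this example.

```python
import random


def make_data(n, seed):
    """Hypothetical synthetic data: one feature, class 1 shifted upward by 2."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        feature = rng.gauss(2.0 * label, 1.0)
        data.append((feature, label))
    return data


def knn_classifier(train, k):
    """Return a predictor that takes a majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
        votes_for_one = sum(label for _, label in nearest)
        return 1 if 2 * votes_for_one > k else 0
    return predict


def accuracy(classifier, data):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(classifier(x) == y for x, y in data) / len(data)


# Split once, up front: training, validation (model selection) and held-out test.
data = make_data(300, seed=0)
train, validation, test = data[:180], data[180:240], data[240:]

# Model selection: every candidate is compared on the validation set only.
candidate_ks = [1, 5, 15]  # odd values avoid tied votes
validation_scores = {
    k: accuracy(knn_classifier(train, k), validation) for k in candidate_ks
}
best_k = max(validation_scores, key=validation_scores.get)

# Only the single selected model is evaluated on the test set, and only once.
final_model = knn_classifier(train, best_k)
test_accuracy = accuracy(final_model, test)

print("validation accuracy per k:", validation_scores)
print("selected k:", best_k)
print("test accuracy of selected model:", round(test_accuracy, 3))
```

Keeping the test set out of the comparison step means the reported test accuracy is not biased by the choice among candidates, which is the reason a workflow of this kind hides the other models once the test set is examined.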
