Bioinformatics. 2022 Jun 24;38(Suppl 1):i10-i18.

doi: 10.1093/bioinformatics/btac233.

An approachable, flexible and practical machine learning workshop for biologists

Chris S Magnano¹,², Fangzhou Mu², Rosemary S Russ³, Milica Cvetkovic⁴, Debora Treu¹, Anthony Gitter¹,²,⁵

Affiliations

¹ Morgridge Institute for Research, Madison, WI 53715, USA.
² Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA.
³ Department of Curriculum and Instruction, University of Wisconsin-Madison, Madison, WI 53715, USA.
⁴ Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA.
⁵ Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53706, USA.

PMID: 35758797
PMCID: PMC9236579
DOI: 10.1093/bioinformatics/btac233

Chris S Magnano et al. Bioinformatics. 2022.

Abstract

Summary: The increasing prevalence and importance of machine learning in biological research have created a need for machine learning training resources tailored towards biological researchers. However, existing resources are often inaccessible, infeasible or inappropriate for biologists because they require significant computational and mathematical knowledge, demand an unrealistic time investment or teach skills primarily for computational researchers. We created the Machine Learning for Biologists (ML4Bio) workshop, a short, intensive workshop that empowers biological researchers to comprehend machine learning applications and pursue machine learning collaborations in their own research. The ML4Bio workshop focuses on classification and was designed around three principles: (i) emphasizing preparedness over fluency or expertise, (ii) necessitating minimal coding and mathematical background and (iii) requiring low time investment. It incorporates active learning methods and custom open-source software that allows participants to explore machine learning workflows. After multiple sessions to improve workshop design, we performed a study on three workshop sessions. Despite some confusion around identifying subtle methodological flaws in machine learning workflows, participants generally reported that the workshop met their goals, provided them with valuable skills and knowledge and greatly increased their beliefs that they could engage in research that uses machine learning. ML4Bio is an educational tool for biological researchers, and its creation and evaluation provide valuable insight into tailoring educational resources for active researchers in different domains.

Availability and implementation: Workshop materials are available at https://github.com/carpentries-incubator/ml4bio-workshop and the ml4bio software is available at https://github.com/gitter-lab/ml4bio.

Supplementary information: Supplementary data are available at Bioinformatics online.

Figures

**Fig. 1.**
Timeline of the ML4Bio workshop. Each activity is shown alongside the non-affective learning goals (LGs) it addresses, as defined in Section 2.1

**Fig. 2.**
Different configurations of the left half of the software interface throughout a ML workflow

**Fig. 3.**
The right half of the ml4bio software interface. The top shows a summary of all classifiers created during model selection, and the bottom shows detailed information on the performance of the selected classifier. Note that multiple classifiers can only be viewed during model selection. The user must select a single model and can no longer see the performance of other models once the test set is examined

**Fig. 4.**
Sankey diagram of participants’ responses pertaining to comfort with ML before and after the workshop across all three sessions. Note that a proportion of those who completed a pre-survey and not a post-survey did not attend the workshop at all. In total, 47 participants completed the pre-survey, 6 completed only the pre-survey and assessment, and 26 completed all 3 instruments

**Fig. 5.**
Participant responses to self-reported knowledge, confidence and interest in ML before and after the workshop. Note that these questions used a retrospective design, meaning that participants were asked about both before and after the workshop in the post-survey
