Sci Rep. 2020 Jul 3;10(1):11009.
doi: 10.1038/s41598-020-67658-3.

Data-derived metrics describing the behaviour of field-based citizen scientists provide insights for project design and modelling bias

Tom August et al. Sci Rep. 2020.

Abstract

Around the world, volunteers and non-professionals collect data as part of environmental citizen science projects, including wildlife observations, measures of water quality and much more. However, where projects allow flexibility in how, where, and when data are collected, there will be variation in the behaviour of participants, which results in biases in the datasets collected. We develop a method to quantify this behavioural variation, describing the key drivers and providing a tool to account for biases in models that use these data. We used a suite of metrics to describe the temporal and spatial behaviour of participants, as well as variation in the data they collected. These were applied to 5,268 users of the iRecord Butterflies mobile phone app, a multi-species environmental citizen science project. In contrast to previous studies, after removing transient participants (those active on few days and who contribute few records), we do not find evidence of clustering of participants; instead, participants fall along four continuous axes that describe variation in participants' behaviour: recording intensity, spatial extent, recording potential and rarity recording. Our results support a move away from labelling participants as belonging to one behavioural group or another, in favour of placing them along axes of participant behaviour that better represent the continuous variation between individuals. Understanding participant behaviour could support better use of the data by accounting for biases in the data collection process.

Conflict of interest statement

The authors declare no competing interests.

Figures

Figure 1
Illustration of spatial metrics. The three spatial metrics are independent: the change from the top row to the bottom row represents a change in the given metric while the other two metrics are held constant. The circles are illustrative; kernel density polygons can take any shape.
Figure 2
A histogram showing the distribution of records contributed per participant. The x-axis is truncated at 100. The top 14% of participants contribute 80% of the data.
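A concentration figure such as "the top 14% of participants contribute 80% of the data" can be derived from a list of per-participant record counts. The sketch below is illustrative only: the function name `top_share_fraction` and the example counts are hypothetical and are not taken from the paper or its dataset.

```python
def top_share_fraction(records_per_participant, share=0.8):
    """Smallest fraction of participants (most prolific first) whose
    records add up to at least `share` of all records.

    Hypothetical helper for illustration; not the authors' code.
    """
    if not records_per_participant:
        raise ValueError("need at least one participant")
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")

    # Rank participants from most to fewest records.
    counts = sorted(records_per_participant, reverse=True)
    target = share * sum(counts)

    running = 0
    for rank, count in enumerate(counts, start=1):
        running += count
        if running >= target:
            return rank / len(counts)
    return 1.0


if __name__ == "__main__":
    # Made-up counts: one heavy contributor and four light ones.
    example_counts = [90, 4, 3, 2, 1]
    print(top_share_fraction(example_counts, share=0.8))
```

Applied to real per-participant counts, a result of roughly 0.14 at `share=0.8` would correspond to the statement in the caption.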
Figure 3
Principal components analysis undertaken using the three sets of participant metrics: temporal, spatial and data content. Symbols represent the clusters identified via k-means clustering, which have low support across all three sets of metrics.
Figure 4
The distribution of participants and records along the four axes of participant behaviour. Upper (white) plots show the distribution of participants, while lower (grey) plots show the distribution of records (i.e. the sum of records contributed by participants in each white column). Number of records per participant is significantly positively correlated with the axes of participant behaviour in (a–c), and negatively correlated with (d) (see Fig. 5).
Figure 5
Correlations among the four identified axes of participant behaviour, and between each axis and the log10 number of records per participant. There were significant associations (shown with ‘*’ where p < 0.05) between all variables except for rarity recording and recording potential. Numbers above the diagonal are Pearson's correlation coefficients. Below the diagonal the relationship is illustrated with a lowess line.
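As a minimal sketch of the kind of calculation summarised in this caption, the following computes Pearson's correlation coefficient between a behaviour-axis score and the log10 number of records per participant. The function names and example values are hypothetical, and the significance testing and lowess smoothing mentioned in the caption are not reproduced here.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def correlation_with_log_records(axis_scores, record_counts):
    """Correlate an axis score with log10(records per participant).

    Hypothetical helper; assumes every participant has at least one record.
    """
    log_counts = [math.log10(count) for count in record_counts]
    return pearson(axis_scores, log_counts)


if __name__ == "__main__":
    # Made-up scores and counts for four participants.
    scores = [0.0, 1.0, 2.0, 3.0]
    counts = [1, 10, 100, 1000]
    print(correlation_with_log_records(scores, counts))
```

Taking the logarithm before correlating reduces the influence of the few very prolific participants, which matters when the record counts are as skewed as Figure 2 indicates.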
