doi: 10.1145/3441852.3471208.

Sharing Practices for Datasets Related to Accessibility and Aging

Rie Kamikubo et al. ASSETS. 2021.

Abstract

Datasets sourced from people with disabilities and older adults play an important role in innovation, benchmarking, and mitigating bias for both assistive and inclusive AI-infused applications. However, they are scarce. We conduct a systematic review of 137 accessibility datasets manually located across different disciplines over the last 35 years. Our analysis highlights how researchers navigate tensions between benefits and risks in data collection and sharing. We uncover patterns in data collection purpose, terminology, sample size, data types, and data sharing practices across communities of focus. We conclude by critically reflecting on challenges and opportunities related to locating and sharing accessibility datasets, calling for technical, legal, and institutional privacy frameworks that are more attuned to concerns from these communities.

Keywords: Accessibility; Human-centered computing → Human computer interaction (HCI); Security and privacy → Human and societal aspects of security and privacy; dataset; disability; machine learning; repository; sharing practices.

Figures

Figure 1: Examples of accessibility datasets such as photos taken by blind users [106], assistive app logs of users with visual impairments [92], sign language videos [85], gloss annotations [130], motion captured signs [80], depth data from older adults’ activities [108], stroke gestures by people with motor impairments [166], eye-tracking data from autistic children [54], voice recordings of people with speech impairments [36], and a speech corpus of people with intellectual disabilities [142].
Figure 2: Distribution of dataset count across communities.
Figure 3: Dataset count over the years across communities.
Figure 4: Sample size across communities and data types: all contributors (a&c) vs. those within the communities of focus only (b&d).
Figure 5: Distribution of data types across communities.
Figure 6: Dataset count over the years across data types.
Figure 7: Distribution of sharing strategies employed across communities.

References

    1. 2021. ACL: Association for Computational Linguistics. ACL Data and Code Repository. https://aclweb.org/aclwiki/ACL_Data_and_Code_Repository.
    2. 2021. ACM: Association for Computing Machinery. https://www.acm.org/.
    3. 2021. Amazon. Registry of Open Data on AWS. https://registry.opendata.aws/.
    4. 2021. CVF: Computer Vision Foundation. https://www.thecvf.com/.
    5. 2021. Google Search. https://www.google.com/search/howsearchworks/.