Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov:2021:2413-2418.
doi: 10.1109/EMBC46164.2021.9630199.

An Integrated Toolkit for Extensible and Reproducible Neuroscience

Jordan K Matelsky et al. Annu Int Conf IEEE Eng Med Biol Soc. 2021 Nov.

Abstract

As neuroimagery datasets continue to grow in size, the complexity of data analyses can require a detailed understanding and implementation of systems computer science for storage, access, processing, and sharing. Currently, several general data standards (e.g., Zarr, HDF5, precomputed) and purpose-built ecosystems (e.g., BossDB, CloudVolume, DVID, and Knossos) exist. Each of these systems has advantages and limitations, and each is best suited to different use cases. Working with datasets that do not fit into RAM in this heterogeneous environment is challenging, and significant barriers stand in the way of leveraging the underlying research investments. In this manuscript, we outline our perspective on how to approach this challenge through the use of community-provided, standardized interfaces that unify various computational backends and abstract computer science challenges away from the scientist. We introduce desirable design patterns and share our reference implementation, called intern.

Figures

Fig. 1.
The intern Python library acts as a shock absorber to provide a consistent API to researchers, tool developers, and other users. A community-facing data-access tool should operate on all major data-storage systems (including CloudVolume, DVID, and BossDB), and remain flexible enough to enable common use cases (such as visualization, data upload/download, and data proofreading) without sacrificing performance.
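The idea of one consistent API over several storage systems can be sketched as follows. This is a minimal illustration of the design pattern only, not intern's actual API: the names `VolumeBackend`, `InMemoryBackend`, `SparseBackend`, `get_cutout`, and `mean_intensity` are hypothetical, and the two toy backends stand in for real remotes such as BossDB, CloudVolume, or DVID.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Tuple

Bounds = Tuple[int, int]        # half-open [start, stop) range along one axis
Volume = List[List[List[int]]]  # nested lists indexed as [z][y][x]


class VolumeBackend(ABC):
    """Hypothetical common interface that every storage backend implements."""

    @abstractmethod
    def get_cutout(self, z: Bounds, y: Bounds, x: Bounds) -> Volume:
        """Return the voxels inside the requested half-open bounds."""


class InMemoryBackend(VolumeBackend):
    """Toy backend that holds a dense volume in memory."""

    def __init__(self, data: Volume) -> None:
        self._data = data

    def get_cutout(self, z: Bounds, y: Bounds, x: Bounds) -> Volume:
        return [
            [row[x[0]:x[1]] for row in plane[y[0]:y[1]]]
            for plane in self._data[z[0]:z[1]]
        ]


class SparseBackend(VolumeBackend):
    """Toy backend that stores only nonzero voxels, keyed by (z, y, x)."""

    def __init__(self, voxels: Dict[Tuple[int, int, int], int]) -> None:
        self._voxels = voxels

    def get_cutout(self, z: Bounds, y: Bounds, x: Bounds) -> Volume:
        return [
            [[self._voxels.get((k, j, i), 0) for i in range(*x)]
             for j in range(*y)]
            for k in range(*z)
        ]


def mean_intensity(backend: VolumeBackend, z: Bounds, y: Bounds, x: Bounds) -> float:
    """Analysis code written once against the interface, not against a backend."""
    cutout = backend.get_cutout(z, y, x)
    values = [v for plane in cutout for row in plane for v in row]
    return sum(values) / len(values)
```

Because `mean_intensity` depends only on the shared interface, swapping one backend for another requires no change to the analysis code, which is the property the caption describes.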
Fig. 2.
Many factors affect data download rate. As an illustration, we tuned the chunk-size parameters for parallel and non-parallel downloads from the BossDB remote. A. Performance was affected by client-side compute speed (for data decompression) as well as network throughput, illustrating possible avenues for further abstraction of other remotes. B. Chunked data stores benefit from data requests that are aligned to the cuboid subdivisions in the server backend. This effect is more pronounced in filesystem-based data stores such as CloudVolume or Zarr, because the cuboid periphery must be downloaded and cropped on local compute resources. In contrast, data stores with cloud-side compute (such as DVID or BossDB) can perform this cropping operation before the data are downloaded, so the additional egress burden is not incurred.
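The alignment effect in panel B can be made concrete with a little arithmetic. The sketch below is an illustration under stated assumptions, not code from intern: `align_to_chunks` and `transfer_overhead` are hypothetical helpers, and the chunk shape used in the usage note is an example value. It models a filesystem-style store that must transfer every whole chunk a request touches.

```python
from typing import Sequence, Tuple

Bounds = Tuple[int, int]  # half-open [start, stop) range along one axis


def align_to_chunks(bounds: Bounds, chunk: int) -> Bounds:
    """Expand a half-open range outward to the nearest chunk boundaries."""
    start, stop = bounds
    aligned_start = (start // chunk) * chunk
    aligned_stop = -(-stop // chunk) * chunk  # ceiling division
    return aligned_start, aligned_stop


def transfer_overhead(request: Sequence[Bounds], chunk_shape: Sequence[int]) -> float:
    """Ratio of voxels transferred to voxels requested.

    Assumes the store can only serve whole chunks, so the client downloads
    the chunk-aligned bounding box and crops it locally. A value of 1.0
    means the request was already aligned and nothing extra was fetched.
    """
    requested = 1
    fetched = 1
    for bounds, chunk in zip(request, chunk_shape):
        aligned_start, aligned_stop = align_to_chunks(bounds, chunk)
        requested *= bounds[1] - bounds[0]
        fetched *= aligned_stop - aligned_start
    return fetched / requested
```

With an assumed chunk shape of (512, 512, 16) in (x, y, z), a request for x and y in [0, 512) and z in [0, 16) is aligned and has an overhead of 1.0. Shifting the same-sized request to x and y in [256, 768) touches four chunks and gives an overhead of 4.0. A store with server-side cropping would avoid that extra egress.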
Fig. 3.
Using the intern library, we downloaded nanometer-resolution 3D imagery and pixel segmentation from the public Witvliet et al. 2020 dataset on BossDB [25]. We then used the meshing service to produce 3D mesh files for visualization in 3D software such as Blender or Neuroglancer. Pictured here is Dataset 2, a C. elegans nematode brain imaged during the L1 larval stage.

References

    1. Kleissas D, Hider R et al., “The block object storage service (bossDB): A cloud-native approach for petascale neuroscience discovery,” bioRxiv, p. 217745, 2017.
    2. Katz WT and Plaza SM, “DVID: Distributed Versioned Image-Oriented Dataservice,” 2019.
    3. Helmstaedter M, Briggman KL, and Denk W, “High-accuracy neurite reconstruction for high-throughput neuroanatomy,” Nature Neuroscience, vol. 14, no. 8, pp. 1081–1088, Jul 2011.
    4. Saalfeld S, Cardona A, Hartenstein V, and Tomancak P, “CATMAID: collaborative annotation toolkit for massive amounts of image data,” Bioinformatics, vol. 25, no. 15, pp. 1984–1986, Apr 2009.
    5. Clements J, Dolafi T et al., “neuPrint: Analysis tools for EM connectomics,” bioRxiv, Jan 2020.
