The Open Connectome Project Data Cluster: Scalable Analysis and Vision for High-Throughput Neuroscience

Affiliations

¹ Department of Computer Science and the Institute for Data Intensive Engineering and Science, Johns Hopkins University.
² Johns Hopkins University Applied Physics Laboratory.
³ Department of Statistical Science and Mathematics and the Institute for Brain Science, Duke University.
⁴ Janelia Farm Research Campus, Howard Hughes Medical Institute.
⁵ Department of Molecular and Cellular Biology, Harvard University.
⁶ Department of Computational Neuroscience, Massachusetts Institute of Technology.
⁷ Allen Institute for Brain Science.
⁸ Department of Physics and Astronomy and the Institute for Data Intensive Engineering and Science, Johns Hopkins University.
⁹ Department of Bioengineering, Stanford University.
¹⁰ Department of Molecular and Cellular Physiology, Stanford University.

PMID: 24401992
PMCID: PMC3881956
DOI: 10.1145/2484838.2484870

Randal Burns et al. Sci Stat Database Manag. 2013.

Abstract

We describe a scalable database cluster for the spatial analysis and annotation of high-throughput brain imaging data, initially for 3-d electron microscopy image stacks, but for time-series and multi-channel data as well. The system was designed primarily for workloads that build connectomes (neural connectivity maps of the brain) using the parallel execution of computer vision algorithms on high-performance compute clusters. These services and open-science data sets are publicly available at openconnecto.me. The system design inherits much from NoSQL scale-out and data-intensive computing architectures. We distribute data to cluster nodes by partitioning a spatial index. We direct I/O to different systems (reads to parallel disk arrays and writes to solid-state storage) to avoid I/O interference and maximize throughput. All programming interfaces are RESTful Web services, which are simple and stateless, improving scalability and usability. We include a performance evaluation of the production system, highlighting the effectiveness of spatial data organization.

Keywords: Connectomics; Data-intensive computing.

Figures

**Figure 1**
Visualization of the spatial distribution of synapses detected in the mouse visual cortex of Bock et al. [3].

**Figure 2**
Electron microscopy imaging of a mouse somatosensory cortex [16] overlaid by manual annotations describing neural objects, including axons, dendrites, and synapses. These images were cut out from two spatially registered databases and displayed in the CATMAID Web viewer [34].

**Figure 3**
Visualization of six channels of array tomography data, courtesy of Nick Weiler and Stephen Smith [28, 22]. Data were drawn from a 17-channel database and rendered by the OCP cutout service.

**Figure 4**
Partitioning the Morton (z-order) space-filling curve. For clarity, the figure shows 16 cuboids in two dimensions mapping to four nodes. The z-order curve is recursively defined and scales in dimensions and data size.
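
The partitioning in Figure 4 can be illustrated with a minimal sketch. It assumes a 4 x 4 grid of cuboids, four nodes, and equal contiguous ranges of the curve per node; the function names and the range-based assignment are illustrative assumptions, not the OCP implementation.

```python
def morton_encode_2d(x: int, y: int, bits: int = 16) -> int:
    """Interleave the bits of x and y; x occupies the even bit positions."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)
        code |= ((y >> i) & 1) << (2 * i + 1)
    return code


def node_for_cuboid(x: int, y: int, grid_side: int = 4, num_nodes: int = 4) -> int:
    """Assign a cuboid to a node by cutting the z-order curve into equal ranges.

    Assumption: grid_side is a power of two and the cuboid count divides
    evenly among the nodes (hypothetical layout for illustration only).
    """
    cuboids_per_node = (grid_side * grid_side) // num_nodes
    return morton_encode_2d(x, y) // cuboids_per_node


# Print the node that owns each cuboid of the 4 x 4 grid, one row per y value.
for y in range(4):
    print([node_for_cuboid(x, y) for x in range(4)])
```

Under these assumptions each 2 x 2 quadrant of the grid lands on a single node, because consecutive positions on the z-order curve are spatially close.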

**Figure 5**
The resolution hierarchy scales the X,Y dimensions of cuboids, but not Z, so that cuboids contain roughly equal lengths in all dimensions.

**Figure 6**
Original (left) and color corrected (right) images across multiple serial sections [16].

**Figure 7**
OCP Data Cluster and clients as configured to run and visualize the parallel computer vision workflow for synapse detection (Section 2).

**Figure 8**
A cutout of an annotation database (left) and the dense read of a single annotation (right).

**Figure 9**
The sparse index for an object (green) is a list of the Morton-order locations of the cuboids that contain voxels for that annotation. The index describes the disk blocks that contain voxels for that object, which can be read in a single pass.
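
A sparse index of the kind described in Figure 9 might be built as in the sketch below. The cuboid shape of 128 x 128 x 16 voxels, the function names, and the in-memory voxel list are hypothetical choices for illustration; the source does not specify them.

```python
def morton_encode_3d(x: int, y: int, z: int, bits: int = 21) -> int:
    """Interleave the bits of x, y, and z into a single z-order key."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (3 * i)
        code |= ((y >> i) & 1) << (3 * i + 1)
        code |= ((z >> i) & 1) << (3 * i + 2)
    return code


def build_sparse_index(voxels, cuboid_shape=(128, 128, 16)):
    """Return the sorted Morton keys of the cuboids that contain the voxels.

    voxels: iterable of (x, y, z) voxel coordinates for one annotation.
    cuboid_shape: assumed cuboid dimensions in voxels (hypothetical values).
    """
    cx, cy, cz = cuboid_shape
    keys = {morton_encode_3d(x // cx, y // cy, z // cz) for x, y, z in voxels}
    return sorted(keys)


# Four voxels that fall into three distinct cuboids.
index = build_sparse_index([(0, 0, 0), (5, 5, 5), (130, 0, 0), (0, 0, 17)])
print(index)
```

Keeping the keys sorted means the cuboids can be fetched in increasing curve order, which matches the caption's point that the object can be read in a single pass.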

**Figure 10**
The performance of the cutout Web-service that extracts three-dimensional subvolumes from the kasthuri11 image database.

**Figure 11**
Throughput of 256 MB cutout requests to kasthuri11 as a function of the number of concurrent requests.

**Figure 12**
Throughput of writing annotations as a function of the size of the annotated region.

**Figure 13**
Performance comparison of Database nodes and SSD nodes when writing synapses (small random writes).

References

1. Abadi DJ, Madden SR, Ferreira M. Integrating compression and execution in column-oriented database systems. SIGMOD. 2006.
2. Baumann P, Dehmel A, Furtado P, Ritsch R, Widmann B. The multidimensional database system RasDaMan. SIGMOD. 1998.
3. Bock DD, Lee WCA, Kerlin AM, Andermann ML, Hood G, Wetzel AW, Yurgenson S, Soucy ER, Kim HS, Reid RC. Network anatomy and in vivo physiology of visual cortical neurons. Nature. 2011;471(7337).
4. Cardona A, Saalfeld S, Schindelin J, Arganda-Carreras I, Preibisch S, Longair M, Tomancak P, Hartenstein V, Douglas RJ. TrakEM2 software for neural circuit reconstruction. PLoS ONE. 2012;7(6).
5. Chang C, Acharya A, Sussman A, Saltz J. T2: a customizable parallel database for multidimensional data. SIGMOD Record. 1998;27(1).
