J Neurosci Methods. 2009 May 30;180(1):185-92.
doi: 10.1016/j.jneumeth.2009.03.022. Epub 2009 Apr 1.

Large-scale electrophysiology: acquisition, compression, encryption, and storage of big data


Benjamin H Brinkmann et al. J Neurosci Methods. 2009.

Abstract

Large-scale electrophysiology yields brain recordings with high spatiotemporal resolution (>100 channels) that can probe the range of neural activity from local field potential oscillations to single-neuron action potentials, but it presents new challenges for data acquisition, storage, and analysis. Our group is currently performing continuous, long-term electrophysiological recordings in human subjects undergoing evaluation for epilepsy surgery, using hybrid intracranial electrodes composed of up to 320 micro- and clinical macroelectrode arrays. DC-capable amplifiers sampling at 32 kHz per channel with 18 bits of A/D resolution can resolve extracellular voltages spanning single-neuron action potentials, high-frequency oscillations, and high-amplitude ultra-slow activity, but this approach generates 3 terabytes of data per day (at 4 bytes per sample) using current data formats. Data compression can provide several practical benefits, but only if data can be compressed and appended to files in real time, in a format that allows random access to data segments of varying size. Here we describe a state-of-the-art, scalable electrophysiology platform designed for acquisition, compression, encryption, and storage of large-scale data. Data are stored in a file format that incorporates lossless data compression using range-encoded differences, a 32-bit cyclic redundancy check (CRC) to verify data integrity, and 128-bit encryption for protection of patient information.
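The daily data volume quoted above can be checked with quick arithmetic. The sketch below assumes all 320 channels are recorded at a nominal 32,000 samples per second for a full 86,400-second day. The function name is hypothetical, and the tightly packed 18-bit case is included only for comparison; it is not a format described in the abstract.

```python
def daily_bytes(channels: int, sample_rate_hz: float, bytes_per_sample: float) -> float:
    """Raw storage needed for one day of continuous recording."""
    seconds_per_day = 24 * 60 * 60
    return channels * sample_rate_hz * bytes_per_sample * seconds_per_day


# 4 bytes per sample, as stated in the abstract.
raw = daily_bytes(320, 32_000, 4)
print(f"{raw / 1e12:.2f} TB/day")     # 3.54 TB/day (decimal terabytes)
print(f"{raw / 2**40:.2f} TiB/day")   # 3.22 TiB/day (binary terabytes)

# Hypothetical comparison: samples packed at exactly 18 bits (2.25 bytes).
packed = daily_bytes(320, 32_000, 18 / 8)
print(f"{packed / 1e12:.2f} TB/day")  # 1.99 TB/day
```

Either way of counting a terabyte gives a figure a little over 3, consistent with the "3 terabytes of data per day" in the abstract.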


Figures

Figure 1
Left) Photographic montage of hybrid subdural grid containing 16 clinical macroelectrodes (4 mm) and 112 microelectrodes. Right) Schematic of hybrid subdural grid and depth electrodes. MRI of hippocampal hybrid depth implant (below).
Figure 2
The large-scale human electrophysiology acquisition system streams data from the patient's room to the acquisition node via a dedicated dual-Gigabit Ethernet link. Data are stored on a 70-terabyte storage pool and accessed via a Fibre Channel Storage Area Network. Large-scale analysis is performed on a dedicated computational cluster.
Figure 3
Data flow schematic. Data acquisition creates files stored in a range of data-type-specific formats. Continuously sampled data normally constitute the largest component of the dataset, so compressing them reduces overall storage requirements significantly. Permanent storage of events and metadata in a relational database provides a flexible and reliable mechanism that allows subsequent integration of analysis information.
Figure 4
Long-duration, high-frequency, DC-coupled EEG recordings capture all physiologically relevant time scales. A. 10 hours of continuous data from a macroelectrode show a clear DC drift. B. 10-minute expanded view from A shows a spontaneous seizure approximately 16 min into the recording session. C. 10-second expanded view from B from a microelectrode (bandpass filtered, 600-6,000 Hz) shows action potentials from single neurons. Blue dots show 18 action potentials associated with a single neuron. Da. Expanded view of color-coded action potentials from C showing the similarity of the recorded waveforms. Db. Mean and standard deviation of the 18 action potentials identified in C. Note the dynamic range in both voltage (mV to μV) and time (hours to msec).
Figure 5
Theoretical compression ratios for macro- and microwire 32 kHz channel recordings, based on 18 bits of information per sample, are plotted against the log of the compressed block length in seconds. Compression ratios tend to improve with longer block lengths, that is, with more samples per block. However, gains beyond 1 sec (32,556 samples) are modest and may be outweighed by the more direct access to individual time points that smaller blocks provide.
Figure 6
The range-encoded difference (RED) algorithm improves its compression ratio as high-frequency information is removed from the recorded data. Compression ratio calculations are based on 18 bits of information in each sample. Reported data represent 2,255,061,204 samples (69,267 seconds) from a macroelectrode (white circles) and a microelectrode (black diamonds). The relatively low impedance of the macroelectrode compared to the microelectrode yields lower thermal noise and better overall compression.
Figure 7
The RED compression algorithm reduces the size of the MEF (Multiscale Electrophysiology Format) file as the number of data bits stored is decreased. Percent compression is reported as the ratio of the MEF file size to the theoretical size of the data for each bit rate. Data are reported for 2,255,061,204 samples from a macroelectrode and a microelectrode.
Figure 8
Reading and decompressing the MEF data from disk is faster than reading raw 32-bit integer data from disk. Data are reported as the percentage of the raw 32-bit integer read time required to read and decode the corresponding MEF file for the given number of samples. Raw and MEF read times were measured using one processor thread on an Apple Macintosh with a 3.2 GHz Intel processor and 32 GB of RAM.
Figure 9
Multithreading the RED decompression on a multi-processor computer provides a significant speed increase. 325,560,000 samples were read on an 8-processor system with 32 GB of RAM. Values are expressed as a percentage of the time required to read an identical number of 32-bit samples from an uncompressed raw data file.
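The ideas described in the abstract and in the Figure 5 caption (difference coding in fixed-length blocks, a 32-bit CRC per block, and random access to data segments) can be illustrated with a minimal sketch. It is not the MEF format or the authors' RED coder. In particular, zlib's DEFLATE stands in for the range encoder, the 12-byte block header is invented for this example, encryption is omitted, and all function names are hypothetical.

```python
import struct
import zlib

import numpy as np

BLOCK_LEN = 32_556  # samples per block, about 1 s per the Figure 5 caption
HEADER = struct.Struct("<III")  # illustrative header: n_samples, payload size, CRC-32


def encode_block(samples: np.ndarray) -> bytes:
    """Store the first sample plus successive differences, then compress."""
    samples = samples.astype(np.int32)
    diffs = np.empty_like(samples)
    diffs[0] = samples[0]
    diffs[1:] = samples[1:] - samples[:-1]
    # Stand-in entropy coder (assumption): DEFLATE instead of range encoding.
    payload = zlib.compress(diffs.astype("<i4").tobytes())
    return HEADER.pack(len(samples), len(payload), zlib.crc32(payload)) + payload


def decode_block(blob: bytes) -> np.ndarray:
    """Verify the CRC, decompress, and undo the differencing."""
    n, size, crc = HEADER.unpack_from(blob, 0)
    payload = blob[HEADER.size:HEADER.size + size]
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch: block is corrupted")
    diffs = np.frombuffer(zlib.decompress(payload), dtype="<i4")
    if len(diffs) != n:
        raise ValueError("unexpected sample count in block")
    return np.cumsum(diffs, dtype=np.int32)


def encode_stream(samples: np.ndarray, block_len: int = BLOCK_LEN):
    """Encode block by block; the index maps each block to its byte offset."""
    blocks, index, offset = [], [], 0
    for start in range(0, len(samples), block_len):
        blob = encode_block(samples[start:start + block_len])
        index.append((start, offset))
        blocks.append(blob)
        offset += len(blob)
    return b"".join(blocks), index


def read_range(data, index, first: int, last: int, block_len: int = BLOCK_LEN):
    """Return samples[first:last], decoding only the blocks that overlap it."""
    parts = []
    for b in range(first // block_len, (last - 1) // block_len + 1):
        start, offset = index[b]
        _, size, _ = HEADER.unpack_from(data, offset)
        block = decode_block(data[offset:offset + HEADER.size + size])
        parts.append(block[max(first - start, 0):last - start])
    return np.concatenate(parts)


# Synthetic random-walk signal, used only to exercise the round trip.
rng = np.random.default_rng(0)
signal = np.cumsum(rng.integers(-50, 51, size=100_000)).astype(np.int32)
data, index = encode_stream(signal)
segment = read_range(data, index, 40_000, 70_000)
```

Because each block is self-contained, a reader only decodes the blocks that overlap the requested range, which is the trade-off the Figure 5 caption describes: longer blocks compress somewhat better, shorter blocks give finer-grained access. Independent blocks are also what would allow decoding to be spread across threads, as in the Figure 9 caption.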

