[Preprint]. 2024 Aug 22:arXiv:2406.05248v3.

Processing, evaluating and understanding FMRI data with afni_proc.py

Richard C Reynolds et al. ArXiv.

Abstract

FMRI data are noisy, complicated to acquire, and typically go through many steps of processing before they are used in a study or clinical practice. Being able to visualize and understand the data from the start through the completion of processing, while being confident that each intermediate step was successful, is challenging. AFNI's afni_proc.py is a tool to create and run a processing pipeline for FMRI data. With its flexible features, afni_proc.py allows users to both control and evaluate their processing at a detailed level. It has been designed to keep users informed about all processing steps: it does not just process the data, but first outputs a fully commented processing script that the users can read, query, interpret and refer back to. Having this full provenance is important for being able to understand each step of processing; it also promotes transparency and reproducibility by keeping the record of individual-level processing and modeling specifics in a single, shareable place. Additionally, afni_proc.py creates pipelines that contain several automatic self-checks for potential problems during runtime. The output directory contains a dictionary of relevant quantities that can be programmatically queried for potential issues and a systematic, interactive quality control (QC) HTML. All of these features help users evaluate and understand their data and processing in detail. We describe these and other aspects of afni_proc.py here using a set of task-based and resting state FMRI example commands.

Figures

Figure 1.
Schematic features of afni_proc.py. A) Primary data inputs and descriptors are highlighted in green. The processing is managed hierarchically: first the user selects and orders the desired blocks (or major stages), and then for each can specify zero, one or more options. The array of hot colors highlights which options are associated with which block, by matching them: the “tshift” block label with the “-tshift_opts_ts” option, etc. Note that the start of the option name typically matches the block, as well. B) The afni_proc.py command creates a fully commented processing pipeline (“proc script”), so that the user has detailed understanding and provenance of all the steps of the analysis. C) An example workflow that uses afni_proc.py for a single subject analysis, utilizing some preliminary programs beforehand and incorporating automatically-generated data checks and quality control features at the end. This can simply be looped over all subjects in a data collection.
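The hierarchical block/option structure described in panel A can be sketched as a small data structure: an ordered list of blocks, plus a mapping from each block to its options. This is purely illustrative Python (the block and option names below are taken from the figure; the assembled string mimics, but is not, a validated afni_proc.py call):

```python
# Illustrative sketch of Fig. 1A: an ordered list of processing blocks, and a
# mapping from block name to its (option, argument) pairs. The naming pattern
# mirrors the figure: the "tshift" block pairs with "-tshift_opts_ts", etc.
blocks = ["tshift", "align", "tlrc", "volreg", "blur", "mask", "scale", "regress"]

block_opts = {
    "tshift": [("-tshift_opts_ts", "-tpattern alt+z2")],
    "blur":   [("-blur_size", "4")],
}

def assemble_command(blocks, block_opts):
    """Build an afni_proc.py-style command string from blocks and options."""
    parts = ["afni_proc.py", "-blocks", *blocks]
    for b in blocks:
        for opt, arg in block_opts.get(b, []):
            parts.extend([opt, arg])
    return " ".join(parts)

cmd = assemble_command(blocks, block_opts)
```

The point of the sketch is the hierarchy itself: blocks are chosen and ordered first, and each option attaches to exactly one block.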
Figure 2.
While an EPI brain mask is estimated during afni_proc.py processing, it is not applied to the data, so that results throughout the whole FOV can be viewed. This facilitates better understanding of the data, as well as improving QC evaluation. Two examples of this are shown from data processed within the FMRI Open QC project, showing TSNR in the final MNI space after regression modeling. In A, one can see strong ghosting outside the brain (cyan arrows), which helps explain some of the unexpected correlation patterns that are observed within the brain. In B, one can see from the TSNR pattern that part of the final EPI data is not well aligned in standard space; this is due to initially imperfect skullstripping, which could then be fixed. In both cases, masking would have hidden the reality of what was happening and contributed to potentially biased results.
Figure 3.
Pseudocode for running single subject processing at a group level, looping over a list of subject IDs and one or more sessions for each. At the heart of the second loop is the action to do the subject processing: here, to run a theoretical shell script (“do_ap_cmd.tcsh”) that contains an afni_proc.py command and just needs the subject and session ID values provided as arguments. This runs easily on a BIDS-formatted data collection (though some BIDS trees do not contain a session-level ID or directory structure, and so the second loop would be omitted).
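The double loop in the pseudocode can be rendered as a minimal Python driver. The script name "do_ap_cmd.tcsh" comes from the caption; the subject and session IDs below are made up, and a dry-run flag is added so the sketch is runnable without AFNI installed:

```python
# Minimal rendering of the Fig. 3 pseudocode: loop over subjects and sessions,
# invoking a per-subject processing script with the IDs as arguments.
import subprocess

subjects = ["sub-001", "sub-002"]
sessions = ["ses-01", "ses-02"]   # omit this loop for BIDS trees without sessions

def run_all(subjects, sessions, dry_run=True):
    """Collect (and optionally execute) one processing command per subj/ses."""
    cmds = []
    for subj in subjects:
        for ses in sessions:
            cmd = ["tcsh", "do_ap_cmd.tcsh", subj, ses]
            cmds.append(cmd)
            if not dry_run:
                subprocess.run(cmd, check=True)
    return cmds

cmds = run_all(subjects, sessions)   # dry run: just build the command list
```

For data collections without a session level, the inner loop collapses and the script takes only the subject ID.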
Figure 4.
QC images generated by AFNI’s sswarper2, as it both skullstrips an anatomical volume (panel A) and calculates its nonlinear warp to template space (panel B). Both brainmasking and alignment with the template appear to be generally strong throughout the brain. The outputs of this program (or analogous ones, such as AFNI’s @animal_warper) can be used directly in afni_proc.py. Here and in axial/coronal images below, image left is subject left.
Figure 5.
The afni_proc.py command for Ex. 1 (warping-only, single echo FMRI). The options and any arguments are vertically spaced for readability. Here and throughout, items starting with “$” are variable names, which are typically file names or control options. ${sub} = the subject ID; ${anat_cp} = the input anatomical dataset (here, that has been skullstripped by sswarper2); ${anat_skull} = a version of the input anatomical dataset that still has its skull, for reference during processing; ${dset_epi} = the input EPI dataset (which is a single echo, here the second one from the ME-FMRI acquisition); ${epi_forward} = an EPI volume with phase encoding in the same direction as the main input FMRI datasets, to be used in alignment-based B0-inhomogeneity correction; ${epi_reverse} = an EPI volume with phase encoding in the opposite direction as ${epi_forward}, for B0-inhomogeneity correction; ${template} = name of reference volume for final space (here, the MNI template). Running this command produces a commented script of >450 lines, encoding the detailed provenance of all processing.
Figure 6.
Schematics of the various alignment steps within each example’s afni_proc.py command. Details are shown for the first time a particular step is presented. Alignment is calculated separately for each step, but then concatenated within the afni_proc.py script before applying to the EPI data. This tends to minimize extra blurring that would be incurred by multiple regridding and interpolation processes, if the stages were applied separately. In C (Ex. 4), after the concatenated warp is applied, the EPI data are projected onto a standardized surface mesh with 3dVol2Surf. Case D displays a variation of how to handle motion estimation when multiple runs are input, particularly if one might expect more differences between runs.
Figure 7.
A selection of QC images generated by afni_proc.py for Ex. 1, which focuses on alignment-related steps of preprocessing. Panel A shows one EPI volume in original view (specifically, the one used as a reference for motion correction and EPI-anatomical alignment) to check coverage, tissue contrast, etc. Panel B shows the underlaid EPI and overlaid edges of the anatomical volumes after affine alignment. Here, after blip up/down correction, the EPI shows greatly reduced B0 inhomogeneity distortion along the AP axis (cf. the sagittal views; some of the bright regions are CSF), and the general matching of the sulcal and gyral features and other tissue boundaries is strong. Panel C shows the anatomical (underlaid) and reference template (overlaid, edges) volumes after nonlinear alignment. Some local structural differences are expected (particularly where the numbers of sulci and gyri differ), but again the general matching of structural features is quite high.
Figure 8.
The afni_proc.py command for Ex. 2 (task-based, single echo FMRI, full processing). Options with a gray background have already been described in Ex. 1, and the variables are described in the caption of Fig. 5. ${blur_size} is the FWHM size of the applied blur, in mm; ${sdir_timing} is the directory containing stimulus timing files. Running this command produces a commented script of >640 lines, encoding the detailed provenance of all processing.
Figure 9.
QC images generated by afni_proc.py for Ex. 2, showing: A) the raw EPI volume in native space; and B) the unmasked TSNR after the volreg processing block, prior to regression modeling. The unmasked TSNR image shows evidence for ghosting artifact overlapping into the brain (cyan arrows); as described in Fig. 2, this shows the benefits of not masking data during processing to understand it better and more reliably evaluate it. The fact that the EPI has been acquired with such a tight FOV (see panel A) likely contributes to the presence of ghosting. The TSNR map also shows the presence of EPI distortion (the anterior TSNR pattern extends beyond the anatomical boundaries, even though structural alignment is good).
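TSNR as shown in these QC images is the voxelwise temporal mean divided by the temporal standard deviation. A minimal NumPy sketch on synthetic data (the array sizes and noise level below are arbitrary, chosen only so the expected TSNR is easy to predict):

```python
# TSNR sketch: mean over time / standard deviation over time, per voxel.
import numpy as np

rng = np.random.default_rng(0)
# Tiny synthetic EPI: 4x4x4 voxels, 100 time points, baseline 1000, noise sd 10.
data = 1000 + 10 * rng.standard_normal((4, 4, 4, 100))

def tsnr(ts, axis=-1):
    """Temporal SNR: temporal mean divided by temporal standard deviation."""
    return ts.mean(axis=axis) / ts.std(axis=axis)

tsnr_map = tsnr(data)   # values should cluster near 1000/10 = 100
```

Because the map is left unmasked, out-of-brain structure (like the ghosting in the figure) remains visible for evaluation.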
Figure 10.
QC images generated by afni_proc.py for Ex. 2, focused on the motion and regression model setup when processing task-based FMRI. Panel A shows the Enorm and outlier fraction plots across time, which are used for time point censoring. The dashed lines show the thresholds for each quantity, and the red bands highlight the location of any volumes to be censored (here, only 3 volumes are censored). The “BC” and “AC” boxplots show distributions of each plotted parameter before and after censoring, respectively. The lower two panels show the “ideal” stimulus response based on the timing and chosen hemodynamic response function (HRF): B shows the sum of responses, and C shows each individual stimulus class. The red band of censoring is also displayed here, to reveal any cases of stimulus-correlated motion (which is also checked automatically in the “warns” section of the APQC HTML).
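The Enorm-based censoring in panel A can be sketched as: difference the six rigid-body motion parameters across time, take the Euclidean norm of each difference, and flag volumes exceeding a chosen limit. This is a simplified illustration (the 0.3 threshold and the toy motion trace are made up, and AFNI's actual censoring has further options, e.g. also censoring neighboring volumes):

```python
# Sketch of Enorm computation and threshold-based censoring.
import numpy as np

def enorm(motion):
    """motion: (T, 6) array of parameters; (T,) norm of backward differences."""
    d = np.diff(motion, axis=0)
    e = np.sqrt((d ** 2).sum(axis=1))
    return np.concatenate([[0.0], e])   # first volume has no prior difference

def censor_mask(motion, limit=0.3):
    """1 = keep, 0 = censor, in the style of an AFNI censor file."""
    return (enorm(motion) <= limit).astype(int)

motion = np.zeros((10, 6))
motion[5] = [0.5, 0, 0, 0, 0, 0]        # a sudden 0.5 mm jump at volume 5
mask = censor_mask(motion)
```

Note that a single jump censors two volumes here, because both the difference into and out of the displaced volume exceed the limit.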
Figure 11.
QC images generated by afni_proc.py for Ex. 2, focused on evaluating the task-based regression modeling results. In each panel, the statistic value is used for thresholding in a translucent fashion: suprathreshold locations are opaque and outlined, and subthreshold locations are increasingly translucent. The overlay color is the accompanying effect estimate coefficient where available (panels B and C). Panel A exhibits the full F-stat, which shows the relative quality of model fit. Panels B and C show the two contrasts specified in the afni_proc.py command. In all cases, modeling results outside the brain are shown, for more complete evaluation and understanding of the processing results (Taylor et al., 2023b).
Figure 12.
The afni_proc.py command for Ex. 3 (resting state, single echo FMRI, full processing). Options with a gray background have already been described in earlier examples, and the variables are described in the captions of Figs. 5 and 8. Running this command produces a commented script of >740 lines, encoding the detailed provenance of all processing. Two additional atlases are imported here, for extracting ROIs for checking TSNR and shape properties: “BrodPijn” is the Brodmann atlas (1909) digitized by Pijnenburg et al. (2021); and “SchYeo7N” is the refined version of the 7-network, 400 parcellation Schaefer-Yeo atlas (Schaefer et al., 2018; Glen et al., 2021).
Figure 13.
Aspects of processing related to having respiratory and cardiac time series data included in FMRI processing in Ex. 3. The AFNI program physio_calc.py was run on these physiological time series, for which peak and/or trough detection is a first key step, shown in panel A for the respiratory data. The QC image shows the estimated peak and trough locations with triangles (which can be edited in the program’s interactive mode, if necessary); the blue and red color bands reflect the relative intervals between pairs of each, which can help highlight potential algorithm problems. Panel B shows the final RETROICOR regressors estimated by physio_calc.py. These are included in a slice-wise manner within early afni_proc.py processing, along with 5 RVT regressors from the same program (not shown). Finally, panel C shows a map of the fractional variance explained using the 13 physiological regressors, with highest values around the subcortical and inferior regions.
Figure 14.
QC images generated by afni_proc.py related to motion effects and regression modeling in Ex. 3 processing. Panel A shows the primary quantities that are used to assess subject motion and its effects: Enorm (Euclidean norm), which is approximately the amount of subject motion between time points, in mm; and outlier fraction. Users typically set thresholds for these quantities (horizontal blue lines) to determine which time points should be censored (highlighted in red). Panel B shows the degree of freedom bookkeeping for the regression model, organized by category of regressor. During modeling, data analysts must balance the removal of motion and other non-neuronal effects with the reduction of the statistical DF count. This example did not include bandpassing in processing, but Panel C shows the DF count if it were (see supplementary Ex. 5, in Appendix C). Note that bandpassing itself reduces the DF count by 60% of the original amount. Bandpassing can be problematic, particularly in cases of more subject motion.
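The caption's point that bandpassing consumes roughly 60% of the degrees of freedom can be checked with simple frequency counting: each frequency bin outside the passband costs degrees of freedom. A sketch under common resting state assumptions (TR = 2 s, a 0.01-0.1 Hz passband, 300 volumes; these specific values are illustrative, not taken from the example):

```python
# Fraction of temporal degrees of freedom removed by an ideal bandpass filter.
import numpy as np

def bandpass_df_lost_frac(n_vols, tr, f_lo=0.01, f_hi=0.1):
    """Fraction of frequency bins (DFs) outside the [f_lo, f_hi] passband."""
    freqs = np.fft.rfftfreq(n_vols, d=tr)      # 0 .. Nyquist = 1/(2*TR)
    kept = (freqs >= f_lo) & (freqs <= f_hi)
    return 1.0 - kept.sum() / freqs.size

frac = bandpass_df_lost_frac(n_vols=300, tr=2.0)   # roughly 0.6 for TR = 2 s
```

The retained band (0.09 Hz wide) is a small slice of the 0-0.25 Hz range available at TR = 2 s, which is why the DF cost is so steep, and why censored volumes plus nuisance regressors can then exhaust the remaining DFs.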
Figure 15.
QC images of statistical output for resting state time series, for which the residuals are the time series of interest (Ex. 3). Panels A-C show axial views of the three seed-based correlation maps shown in the APQC HTML when the final space is a known template: for the default mode network (DMN), the visual network and the auditory network. These allow checks for artifacts and other potential problems from processing. Panel D displays the TSNR for this data, which can help distinguish regions of strong signal coverage from those with dropout or artifact.
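A seed-based correlation map like those in panels A-C is just the Pearson correlation of a seed time series with every voxel's time series. A minimal NumPy sketch on synthetic data (array sizes and the single-voxel seed location are arbitrary; in practice the seed is usually an averaged ROI time series):

```python
# Seed-based correlation sketch: correlate one time series against all voxels.
import numpy as np

rng = np.random.default_rng(1)
data = rng.standard_normal((8, 8, 8, 120))   # x, y, z, time
seed_ts = data[4, 4, 4]                      # single-voxel "seed" time series

def seed_correlation(data, seed_ts):
    """Pearson correlation of seed_ts with each voxel's time series."""
    d = data - data.mean(axis=-1, keepdims=True)
    s = seed_ts - seed_ts.mean()
    num = (d * s).sum(axis=-1)
    den = np.sqrt((d ** 2).sum(axis=-1) * (s ** 2).sum())
    return num / den

corr_map = seed_correlation(data, seed_ts)   # seed voxel itself correlates at 1
```

Artifacts like the ghosting in Fig. 2 would show up here as structured correlation outside plausible network regions, which is what these QC panels are screening for.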
Figure 16.
QC tables of ROI shape and TSNR properties of user-defined regions of interest (Ex. 3). The regions in panel A are defined in the Brodmann Atlas digitized by Pijnenburg et al. (2021), and those in panel B are from the refined version of the 7-network, 400 parcellation Schaefer-Yeo atlas (Schaefer et al., 2018; Glen et al., 2021). See Taylor et al. (2024) for details on the columns and warning levels, such as for narrow ROIs and low/unstable TSNR. Briefly: ROI = integer value of the region in the dataset; Nvox = total number of voxels in the ROI; Nzer = number of zero-valued voxels in the region (e.g., due to masking or limited FOV); Dvox = maximum depth, counted in voxels; Tmin, T25%, Tmed, T75%, Tmax = the minimum, lower quartile, median, upper quartile and maximum TSNR values in the ROI; X,Y,Z = RAI coordinates of maximum-depth location; ROI_name = string label of ROI.
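Several of the table columns (Nvox, Nzer and the TSNR quartiles) can be computed directly from a TSNR map plus an integer-labeled atlas volume. A sketch on synthetic data (the atlas labels, sizes and masked-out voxels below are invented for illustration; the depth column Dvox and warning levels are omitted, as they require the morphological machinery described in Taylor et al. 2024):

```python
# Per-ROI bookkeeping sketch: voxel counts, zero counts, and TSNR quartiles.
import numpy as np

rng = np.random.default_rng(2)
tsnr_map = rng.uniform(0.0, 200.0, size=(6, 6, 6))
tsnr_map[0, 0, :] = 0.0                 # pretend some voxels were masked out
atlas = np.zeros((6, 6, 6), dtype=int)
atlas[:3] = 1                           # ROI 1 occupies half the volume

def roi_stats(tsnr_map, atlas, roi):
    """Nvox, Nzer and TSNR quartiles for one integer-labeled region."""
    vals = tsnr_map[atlas == roi]
    nzer = int((vals == 0).sum())
    tmin, t25, tmed, t75, tmax = np.percentile(vals[vals > 0],
                                               [0, 25, 50, 75, 100])
    return {"ROI": roi, "Nvox": int(vals.size), "Nzer": nzer,
            "Tmin": tmin, "T25%": t25, "Tmed": tmed, "T75%": t75, "Tmax": tmax}

stats = roi_stats(tsnr_map, atlas, roi=1)
```

Tracking Nzer separately from the quartiles matters: zeros from masking or limited FOV would otherwise silently drag down the apparent TSNR of an ROI.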
Figure 17.
The afni_proc.py command for Ex. 4 (resting state, multi-echo FMRI with surface analysis, full processing). Options with a gray background have already been described in earlier examples, and the variables are described in the captions of Figs. 5, 8 and 12. ${sv_suma} is the surface volume dataset; ${suma_specs} are the surface specification files in the SUMA directory; ${dsets_epi_me} is the set of EPI datasets for a single run, acquired with different echo times. This example’s “-radial_correlate_blocks …” option does not include “regress”, because that stage of processing occurs on the surface and radial correlation QC has not yet been implemented there (but it will be added in the future). Running this afni_proc.py command produces a commented script of >650 lines, encoding the detailed provenance of all processing.
Figure 18.
Some results generated by afni_proc.py for Ex. 4, which uses surface-based processing for resting state ME-FMRI data. Images are displayed using SUMA (Saad et al., 2004). Panels A-C show seed-based correlation maps for the same seed locations used in the standard APQC HTML reports when purely volumetric processing is used (cf. Fig. 15A–C, showing QC images for Ex. 3). Panel D shows the TSNR across the cortical surface, for ME-FMRI data which has been processed using MEICA-estimated regressors. Some empty patches in the TSNR maps reflect the fact that the utilized MEICA requires brainmasking and occurs before surface projection.
Figure 19.
Examples of “simple” afni_proc.py commands, using wrapper programs in AFNI for both single- and multi-echo EPI input. Each performs a quick, volumetric analysis of the provided input data, treating the input like resting state FMRI with essentially no detailed options required. This convenient processing still produces useful outputs for informative QC evaluations of data. These commands are general enough to be applied as part of a standard data acquisition, so APQC HTMLs could be created and checked automatically and even while a subject is still present. Some simple processing options that might be useful are: “-nt_rm …”, to provide the number of initial time points to remove; or “-template …”, to specify a reference template for quick, approximate (affine) alignment.

