Front Neuroinform. 2021 Nov 17;15:622951.
doi: 10.3389/fninf.2021.622951. eCollection 2021.

Magnetic Resonance Imaging Sequence Identification Using a Metadata Learning Approach

Shuai Liang et al. Front Neuroinform. 2021.

Abstract

Despite the wide application of the magnetic resonance imaging (MRI) technique, there are no widely used standards for naming and describing MRI sequences. The absence of consistent naming conventions presents a major challenge in automating image processing, since most MRI software requires a priori knowledge of the type of the MRI sequences to be processed. This issue becomes increasingly critical with the current efforts toward open sharing of MRI data in the neuroscience community. This manuscript reports an MRI sequence detection method using imaging metadata and a supervised machine learning technique. Three datasets from the Brain Center for Ontario Data Exploration (Brain-CODE) data platform, each involving MRI data from multiple research institutes, are used to build and test our model. The preliminary results show that a random forest model can be trained to accurately identify MRI sequence types and to recognize MRI scans that do not belong to any of the known sequence types. Therefore, the proposed approach can be used to automate processing of MRI data that involves a large number of variations in sequence names, and to help standardize sequence naming in ongoing data collections. This study highlights the potential of machine learning approaches to help manage health data.
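To make the idea of learning from imaging metadata concrete, the sketch below turns a scan's header fields into a fixed-length numeric vector that a classifier such as a random forest could consume. It is only an illustration under stated assumptions: the attribute names (`RepetitionTime`, `EchoTime`, `InversionTime`, `FlipAngle`) are standard DICOM keywords chosen here for the example and are not claimed to be the feature set used in the study, and the helper `to_feature_vector` and the missing-value sentinel are hypothetical.

```python
from typing import List, Mapping, Sequence

# Assumed feature set: standard DICOM keywords picked for illustration only.
FEATURES: Sequence[str] = ("RepetitionTime", "EchoTime", "InversionTime", "FlipAngle")
MISSING = -1.0  # hypothetical sentinel for absent or unparsable values


def to_feature_vector(meta: Mapping[str, object]) -> List[float]:
    """Map a header dictionary to a fixed-length list of floats."""
    vector = []
    for name in FEATURES:
        try:
            vector.append(float(meta[name]))
        except (KeyError, TypeError, ValueError):
            # Field is absent, None, or not numeric: use the sentinel.
            vector.append(MISSING)
    return vector


# Toy header values (made up for the example, not taken from the datasets).
t1_like = {"RepetitionTime": "2300", "EchoTime": 2.98, "InversionTime": 900, "FlipAngle": 9}
epi_like = {"RepetitionTime": 2000, "EchoTime": 30, "FlipAngle": 90}

print(to_feature_vector(t1_like))
print(to_feature_vector(epi_like))
```

The free-text sequence name is deliberately left out of this sketch, since inconsistent naming is the problem the abstract describes.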

Keywords: AI-assisted data management; MRI sequence naming standardization; data share and exchange; health data; machine learning; metadata learning.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Figures

FIGURE 1
Results of a grid search over the number of trees and the tree depth of the random forest model, illustrating the hyperparameter tuning process. The number of bins was fixed at 32 in the calculations. Prediction accuracy is indicated by the color scale on the right. The dashed line (drawn manually as a guide to the eye) marks the approximate point where the prediction accuracy plateaued.
FIGURE 2
The mean prediction accuracy (blue line) and standard deviation (shaded area around the blue line) of the random forest model built from different sizes of training datasets. The prediction accuracy is defined as the fraction of the testing scans that were classified correctly. The X-axis (i.e., Size of MRI Sequences) represents the number of scans from each of the sequence types 1–7 in Table 2 that are used in the training of the random forest model. The standard deviations are calculated from 20 independent computations. When the size reaches approximately 800, the standard deviation is smaller than the width of the line in the figure, and the prediction accuracy is consistently larger than 0.999.
FIGURE 3
Feature importance score. The feature importance score was extracted from a model built from the training data with 1,200 scans from each of the sequence types 1–7 in Table 2. The features are ranked by the value of the importance score. The corresponding DICOM tags of these features can be seen in Table 3.
FIGURE 4
The percent distribution of the prediction confidence of the random forest model built from different sizes of the training datasets. In panels (A–D), the training datasets consisted of 20, 100, 600, and 1,200 scans, respectively, from each of the sequence types 1–7 listed in Table 2. The data are collected from 20 separate computations.
FIGURE 5
The percent distribution of the classification confidence on predicting two unknown classes, (A) Arterial Spin Labeling (ASL), and (B) Field Map scans. The random forest models are built from 1,200 scans from each of the sequence types 1–7 listed in Table 2. The data are collected from 20 separate computations.
FIGURE 6
The percent distribution of the classification confidence on predicting two unknown classes, (A) fMRI, and (B) DTI scans. (A) The random forest models are built from 1,200 scans from each of the sequence types, 1–5, and 7, listed in Table 2, i.e., without the fMRI scans. (B) The random forest models are built from 1,200 scans from each of the sequence types, 1–4, 6, and 7, i.e., without DTI scans. The data are collected from 20 separate computations.
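The confidence distributions described in the captions above suggest one simple way to flag scans of unknown type: accept the top-ranked class only when its confidence is high enough. The sketch below is an assumption-laden illustration, not the authors' procedure. It takes confidence to be the fraction of trees voting for the winning class and uses a hypothetical threshold of 0.8; the function name `classify_with_rejection`, the threshold value, and the toy labels (`T1`, `T2`, `FLAIR`) are all introduced here for the example.

```python
from collections import Counter
from typing import List, Tuple

UNKNOWN = "unknown"


def classify_with_rejection(tree_votes: List[str], threshold: float = 0.8) -> Tuple[str, float]:
    """Return (label, confidence), rejecting low-confidence predictions.

    Confidence is taken to be the share of trees voting for the most
    common label (an assumption of this sketch).
    """
    if not tree_votes:
        raise ValueError("tree_votes must not be empty")
    label, count = Counter(tree_votes).most_common(1)[0]
    confidence = count / len(tree_votes)
    if confidence < threshold:
        return UNKNOWN, confidence
    return label, confidence


# Toy votes from a 10-tree forest (made-up values for illustration).
confident = ["T1"] * 9 + ["T2"]
ambiguous = ["T1"] * 4 + ["T2"] * 3 + ["FLAIR"] * 3

print(classify_with_rejection(confident))   # ('T1', 0.9)
print(classify_with_rejection(ambiguous))   # ('unknown', 0.4)
```

In practice the threshold would have to be chosen from confidence distributions like those in the figures, trading missed unknown scans against known scans that are wrongly rejected.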
