Med Phys. 2023 Jul;50(7):4151-4172.

doi: 10.1002/mp.16412. Epub 2023 Apr 14.

Discrimination tasks in simulated low-dose CT noise

Craig K Abbey¹, Frank W Samuelson², Rongping Zeng², John M Boone³, Kyle J Myers⁴, Miguel P Eckstein¹

Affiliations

¹ Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA.
² Division of Imaging, Diagnostics and Software Reliability, US Food and Drug Administration, Silver Spring, Maryland, USA.
³ Departments of Radiology and Biomedical Engineering, University of California, Davis, California, USA.
⁴ Puente Solutions, LLC, Phoenix, Arizona, USA.

PMID: 37057360
PMCID: PMC11181787
DOI: 10.1002/mp.16412

Craig K Abbey et al. Med Phys. 2023 Jul.

Abstract

Background: This study reports the results of a set of discrimination experiments using simulated images that represent the appearance of subtle lesions in low-dose computed tomography (CT) of the lungs. Noise in these images has a characteristic ramp-spectrum before apodization by noise control filters. We consider three specific diagnostic features that determine whether a lesion is considered malignant or benign, two system-resolution levels, and four apodization levels for a total of 24 experimental conditions.

Purpose: The goal of the investigation is to better understand how well human observers perform subtle discrimination tasks like these, and the mechanisms of that performance. We use a forced-choice psychophysical paradigm to estimate observer efficiency and classification images. These measures quantify how effectively subjects can read the images, and how they use images to perform discrimination tasks across the different imaging conditions.
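
As a rough illustration of how forced-choice performance is commonly converted into an efficiency, the sketch below assumes a two-alternative forced-choice (2AFC) design, the standard relation $d' = \sqrt{2}\,\Phi^{-1}(\mathrm{PC})$, and efficiency defined as the squared ratio of human to Ideal Observer detectability at matched signal energy. The abstract does not state these formulas, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def dprime_2afc(pc):
    """Detectability index implied by a 2AFC proportion correct (assumed design)."""
    return sqrt(2.0) * NormalDist().inv_cdf(pc)


def efficiency(d_human, d_ideal):
    """Observer efficiency as a squared d' ratio at matched signal energy."""
    return (d_human / d_ideal) ** 2


# The 80%-correct training target corresponds to d' of roughly 1.19.
d_target = dprime_2afc(0.80)
# Hypothetical example: an Ideal Observer with twice the human d' implies 25% efficiency.
print(round(d_target, 3), efficiency(d_target, 2.0 * d_target))
```

When d' scales with the square root of signal energy, the same quantity can be written as the ratio of Ideal Observer to human threshold energies at equal proportion correct.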

Materials and methods: The simulated CT images used as stimuli in the psychophysical experiments are generated from high-resolution objects passed through a modulation transfer function (MTF) before down-sampling to the image-pixel grid. Acquisition noise is then added with a ramp noise-power spectrum (NPS), with subsequent smoothing through apodization filters. The features considered are lesion size, indistinct lesion boundary, and a nonuniform lesion interior. System resolution is implemented by an MTF with resolution (10% max.) of 0.47 or 0.58 cyc/mm. Apodization is implemented by a Shepp-Logan filter (Sinc profile) with various cutoffs. Six medically naïve subjects participated in the psychophysical studies, entailing training and testing components for each condition. Training consisted of staircase procedures to find the 80% correct threshold for each subject, and testing involved 2000 psychophysical trials at the threshold value for each subject. Human-observer performance is compared to the Ideal Observer to generate estimates of task efficiency. The significance of imaging factors is assessed using ANOVA. Classification images are used to estimate the linear template weights used by subjects to perform these tasks. Classification-image spectra are used to analyze subject weights in the spatial-frequency domain.
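
The noise-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' simulation code: it assumes the NPS is proportional to $f\,A(f)^{2}$, where $A(f) = \mathrm{sinc}(f / 2f_c)$ below a cutoff $f_c$ and zero above it, and it assumes a 512-pixel matrix over the 350 mm field of view (the matrix size is not given in the abstract). The cutoff values are hypothetical and the noise amplitude is in arbitrary units.

```python
import numpy as np


def apodized_ramp_noise(n=128, pixel_mm=350.0 / 512, cutoff=0.5, seed=0):
    """Gaussian noise with an apodized ramp NPS (illustrative assumptions only).

    cutoff is the assumed apodization cutoff in cyc/mm.
    """
    rng = np.random.default_rng(seed)
    fx = np.fft.fftfreq(n, d=pixel_mm)                    # cyc/mm along one axis
    f = np.hypot(*np.meshgrid(fx, fx, indexing="ij"))     # radial frequency
    # Sinc-profile apodization; np.sinc(x) is sin(pi x) / (pi x).
    apod = np.where(f <= cutoff, np.sinc(f / (2.0 * cutoff)), 0.0)
    nps = f * apod**2                                     # ramp times squared apodization
    white = rng.standard_normal((n, n))
    # Shape white noise in the frequency domain by the square root of the NPS.
    noise = np.fft.ifft2(np.fft.fft2(white) * np.sqrt(nps)).real
    return noise, nps


mild, _ = apodized_ramp_noise(cutoff=0.7, seed=1)    # weaker apodization
strong, _ = apodized_ramp_noise(cutoff=0.3, seed=1)  # stronger apodization
print(mild.var() > strong.var())  # stronger apodization lowers the noise variance
```

Because the ramp is zero at zero frequency, each noise patch has zero mean, and lowering the cutoff removes high-frequency power, which is the smoother texture described for the higher apodization levels.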

Results: Overall, average observer efficiency in these experiments is low (10%-40%) compared with detection and localization studies reported previously. We find significant effects of feature type and apodization level on observer efficiency. Somewhat surprisingly, system resolution is not a significant factor. Efficiency effects of the different features appear to be well explained by the profile of the linear templates in the classification images. Increasingly strong apodization is found to increase both the classification-image weights and the mean frequency of the classification-image spectra. A secondary analysis of "Unapodized" classification images shows that this is largely due to observers undoing (inverting) the effects of apodization filters.

Conclusions: These studies demonstrate that human observers can be relatively inefficient at feature-discrimination tasks in ramp-spectrum noise. Observers appear to be adapting to frequency suppression implemented in apodization filters, but there are residual effects that are not explained by spatial weighting patterns. The studies also suggest that the mechanisms for improving performance through the application of noise-control filters may require further investigation.

Keywords: discrimination tasks; observer performance; ramp-spectrum noise.

Conflict of interest statement

Author CKA acts as a consultant for Canon Medical Systems Corporation and Izotropic Corporation, where he also has stock options. JMB has funding from Canon Medical Systems Corporation, and is a shareholder and serves on the board of directors for Izotropic Corporation. In JMB’s role as editor-in-chief for Medical Physics, he was blinded to the review process and had no role in decisions pertaining to this manuscript.

Figures

**FIGURE 1**
Task profiles. Radial plots of the “Malignant” and “Benign” profiles (a–c) are shown for each of the three tasks considered (T1–T3). In Task 1, the feature of interest is the lesion size. In Task 2, the feature of interest is an indistinct or unsharp boundary. In Task 3, the feature of interest is a nonuniform lesion interior. The spectral plots (real part of the Fourier Transform) for each task (d–f) show that the spectrum of the features falls off more slowly than that of the base lesion used for the task. Thus the feature discrimination tasks tend to place more weight on higher spatial frequencies. Note that the lesion profiles have been scaled to match the integrated spectral power of the features in each task. The legend on the left applies to each row of plots.

**FIGURE 2**
System properties. The simulated modulation transfer functions (a and b) for the low-resolution system (S1) and the high-resolution system (S2) are shown, along with plots of the noise-power spectra (c and d) at each of the 4 apodization levels (A1–A4). The legend on the left applies to all plots.

**FIGURE 3**
Noise textures. The different levels of apodization lead to different noise amplitude and texture in the simulated imaging systems (S1: Low-Resolution System; S2: High-Resolution System). Higher levels of apodization result in smoother and less grainy texture.

**FIGURE 4**
Stimulus profiles and images. The (noiseless) malignant and benign profiles for each of the three tasks are shown (left side, rows 1–3), along with the difference signal and sample image patches from each class (target and alternative). All patches are derived from the low-resolution system at apodization level 3. Task parameters have been exaggerated for the purpose of display in this figure. The image patches are cropped from a simulation ROI of 87.5 mm in an assumed 350 mm field of view. All the images have a window of 1500 HU and level of −650 HU except for the difference images (central column), which are scaled to the maximum difference value.
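
For readers less familiar with CT display settings, the window and level quoted in this caption translate into a displayed HU range as sketched below. The helper names are hypothetical, and the linear mapping to a [0, 1] gray scale is the usual convention rather than something stated in the source.

```python
def window_to_range(window_hu, level_hu):
    """Return the (low, high) HU limits of a display window centered on level_hu."""
    half = window_hu / 2.0
    return level_hu - half, level_hu + half


def to_display(hu, window_hu=1500.0, level_hu=-650.0):
    """Map an HU value linearly to a [0, 1] gray level, clipping outside the window."""
    lo, hi = window_to_range(window_hu, level_hu)
    return min(1.0, max(0.0, (hu - lo) / (hi - lo)))


# A 1500 HU window at a level of -650 HU spans -1400 HU to +100 HU.
print(window_to_range(1500.0, -650.0))
```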

**FIGURE 5**
Characterization of performance. The plots show performance for the three tasks, both imaging systems (S1, S2), and the four levels of apodization (A1–A4). The PC plot (A) shows some deviation from the target PC of 80%. The corrected threshold energy (B) and efficiency (C) plots show variability between the tasks and evidence of better performance with increasing apodization (see text). Each estimate is the average performance across the 6 subjects in the studies, with error bars representing ± 1.96 standard errors of the mean.
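
The error bars described in this caption (mean ± 1.96 standard errors across subjects) can be computed as in the following sketch. The six efficiency values are made up for illustration and are not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_error_bar(values, z=1.96):
    """Return the mean and the half-width z * SEM of its error bar."""
    m = mean(values)
    sem = stdev(values) / sqrt(len(values))  # sample standard deviation / sqrt(n)
    return m, z * sem


# Hypothetical per-subject efficiencies for one condition (six subjects).
subject_eff = [0.10, 0.20, 0.30, 0.40, 0.20, 0.30]
m, half_width = mean_with_error_bar(subject_eff)
print(f"{m:.3f} +/- {half_width:.3f}")
```

With only six subjects, the 1.96 multiplier is a normal approximation; an interval based on the t distribution would be wider.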

**FIGURE 6**
Classification Images. The average classification image is shown for each task, system (S1: Low-Resolution, S2: High-Resolution), and apodization level (A1–A4). These patches are cropped for display purposes and have been spatially windowed to a radius of 7.5 mm (HWHM), and frequency windowed to 0.4 cyc/mm. Within each Task (a–c), the display range is held fixed. At the higher levels of apodization, instability in the estimation process is evident.
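
To make the idea of a classification image concrete, the sketch below simulates a linear observer in a 2AFC task and recovers its template from the trial-by-trial noise fields. It deliberately simplifies the setting: the noise is white rather than ramp-spectrum (so no noise-covariance inversion is needed), the "image" is a 16-pixel vector, and the template and signal are hypothetical. It is not the estimator used in the study.

```python
import numpy as np


def simulate_classification_image(template, signal, n_trials=4000, seed=0):
    """Estimate a linear observer's template from simulated 2AFC trials in white noise."""
    rng = np.random.default_rng(seed)
    n_pix = template.size
    noise_t = rng.standard_normal((n_trials, n_pix))   # noise in the target interval
    noise_a = rng.standard_normal((n_trials, n_pix))   # noise in the alternative interval
    resp_t = (signal + noise_t) @ template             # observer response to target
    resp_a = noise_a @ template                        # observer response to alternative
    correct = resp_t > resp_a
    delta = noise_t - noise_a                          # noise difference per trial
    # Mean noise difference on correct trials minus that on incorrect trials.
    estimate = delta[correct].mean(axis=0) - delta[~correct].mean(axis=0)
    return estimate, correct.mean()


x = np.arange(16) - 7.5
w = np.exp(-x**2 / 8.0) - 0.5 * np.exp(-x**2 / 32.0)   # hypothetical template
s = 1.19 * w / np.linalg.norm(w)                       # signal scaled for about 80% correct
est, pc = simulate_classification_image(w, s)
print(round(float(pc), 3), round(float(np.corrcoef(est, w)[0, 1]), 3))
```

In white noise the estimate is proportional to the observer's template up to sampling error. With correlated noise the estimate must additionally be multiplied by the inverse noise covariance, which is one place where estimation instability can enter.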

**FIGURE 7**
Classification image spectra. Average spatial-frequency weights of the classification images are plotted as a function of radial frequency for all apodization levels of each task and system resolution (S1: low-resolution; S2: high-resolution); the legend in the upper left applies to all plots. Each plot shows the radial average of the classification-image spectrum (averaged across subjects), which shows how subjects adapt to the different apodization conditions. In the highest levels of apodization (A3 and A4), the plots show some evidence of instability at high frequencies (>0.35 cyc/mm) from inverting the noise covariance matrix.
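
A radial average of an image spectrum, of the kind plotted in this figure, can be sketched as below. The sketch assumes a square image centered on the middle pixel, takes the real part of the 2D Fourier transform, and averages it in equal-width radial-frequency bins up to the Nyquist frequency. The bin count and function name are illustrative choices, not details from the source.

```python
import numpy as np


def radial_average(image, pixel_mm, n_bins=32):
    """Radially average the real part of the 2D spectrum of a centered square image."""
    n = image.shape[0]
    # Move the image center to the origin so a symmetric image has a real spectrum.
    spec = np.fft.fft2(np.fft.ifftshift(image)).real
    fx = np.fft.fftfreq(n, d=pixel_mm)
    f = np.hypot(*np.meshgrid(fx, fx, indexing="ij")).ravel()
    edges = np.linspace(0.0, 0.5 / pixel_mm, n_bins + 1)   # bins from 0 to Nyquist
    idx = np.digitize(f, edges) - 1
    keep = idx < n_bins                                    # drop corners at or beyond Nyquist
    sums = np.bincount(idx[keep], weights=spec.ravel()[keep], minlength=n_bins)
    counts = np.bincount(idx[keep], minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, sums / np.maximum(counts, 1)


# Example: a centered Gaussian blob has a smoothly decaying radial spectrum.
yy, xx = np.mgrid[0:64, 0:64] - 32
blob = np.exp(-(xx**2 + yy**2) / (2.0 * 2.0**2))
freqs, profile = radial_average(blob, pixel_mm=1.0, n_bins=16)
print(profile[0] > profile[3] > profile[8])
```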

**FIGURE 8**
Classification-image feature values. The integrated power (a) and mean-frequency (b) features are plotted as a function of the apodization level (A1–A4) for each task (T1–T3) and imaging system (S1,S2). Each estimate is the average of the feature values, with error bars representing a 95% confidence interval on the mean across the six subjects in the studies. The legend applies to both plots.

**FIGURE 9**
Sampling efficiency from classification images. Each symbol in the plot represents one of the 24 experimental conditions. The sampling efficiency of the classification images is associated with subject efficiency ($R^{2} = 69.6\%$), and shows a clear distinction between the three different tasks (labels on plot).

**FIGURE 10**
Differential-sampling-efficiency spectra. Radial averages of the differential sampling efficiency are plotted for the four apodization levels (legend in upper left applies to all plots) within each task and system. The differential-sampling-efficiency spectra are derived from the classification-image spectra shown in Figure 7, and they show where the spectra are under-weighted (positive values) or over-weighted (negative values), as described in the text.

**FIGURE 11**
Unapodized classification image spectra. Average spatial-frequency weights of the unapodized classification images are plotted as a function of radial frequency. Each plot shows the average radial frequency weights for the classification image (averaged across subjects) using responses from each of the four apodization conditions (A1–A4), but constructed from unapodized noise fields (Legend in upper left applies to all plots). These plots show the impact of apodization on the spectral weights used by human observers. For comparison, we also plot the profile of the signal spectrum and the pre-whitened matched filter (PWMF), which represents optimal spatial weighting.
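
For reference, the pre-whitened matched filter mentioned in this caption has a simple frequency-domain form when the noise is stationary and Gaussian. The expression below is the textbook form, given under that assumption; the symbols are introduced here for illustration and are not taken from the source. $\Delta S(f)$ denotes the Fourier transform of the difference between the mean malignant and benign profiles as they appear in the image.

$$
W_{\mathrm{PWMF}}(f) = \frac{\Delta S(f)}{\mathrm{NPS}(f)}, \qquad
d'^{\,2}_{\mathrm{PWMF}} = \int \frac{\lvert \Delta S(f) \rvert^{2}}{\mathrm{NPS}(f)} \, df
$$

Because the filter divides by the NPS, frequencies where the noise power is low receive more weight relative to the signal spectrum alone, which is why the PWMF and signal-spectrum profiles differ in the plots.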

**FIGURE 12**
Unapodized classification image spectral features. The plots show the integrated power (a) and mean-frequency (b) features plotted as a function of apodization level for the unapodized classification-image spectra. The features are plotted on the same scale as Figure 8 for comparison. While trends are similar to the feature plots for the apodized classification images, the apodization effect is substantially reduced.

Grants and funding

R01 EB025829/EB/NIBIB NIH HHS/United States
