J Am Stat Assoc. 2014;109(508):1339-1349.
doi: 10.1080/01621459.2013.836969.

Sparse Semiparametric Nonlinear Model with Application to Chromatographic Fingerprints

Michael R Wierzbicki et al. J Am Stat Assoc. 2014.

Abstract

Traditional Chinese herbal medications (TCHMs) are composed of a multitude of compounds, and identifying their active composition is an important area of research. Chromatography provides a visual representation of a TCHM sample's composition by outputting a curve characterized by spikes corresponding to compounds in the sample. Across different experimental conditions, the locations of the spikes can shift, which prevents direct comparison of curves and restricts compound identification to within a single experiment. In this article, we propose a sparse semiparametric nonlinear modeling framework for establishing a standardized chromatographic fingerprint. A data-driven basis expansion models the common shape of the curves, while a parametric time warping function registers the individual curves. Penalized weighted least squares with the adaptive lasso penalty provides a unified criterion for registration, model selection, and estimation. Furthermore, the adaptive lasso estimators possess attractive sampling properties. A back-fitting algorithm is proposed for estimation. Performance is assessed through simulation, and we apply the model to chromatographic data of rhubarb collected under different experimental conditions to establish a standardized fingerprint as a first step in TCHM research.

Keywords: Adaptive lasso; Chromatography; Curve registration; Herbal medicine; Variable selection.
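
As a rough illustration of the penalized weighted least squares criterion with the adaptive lasso penalty mentioned in the abstract, the following Python sketch fits a linear model by coordinate descent. It is a simplification under stated assumptions, not the authors' back-fitting algorithm: the design matrix is treated as fixed (there are no warping parameters), the penalty weights come from an initial weighted least squares fit, and the names `adaptive_lasso_wls`, `lam`, and `gamma` are hypothetical.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return np.sign(z) * max(abs(z) - t, 0.0)

def adaptive_lasso_wls(X, y, w, lam, gamma=1.0, n_iter=200, eps=1e-8):
    """Minimise 0.5 * sum_i w_i (y_i - x_i' b)^2 + lam * sum_j |b_j| / |b_init_j|^gamma.

    Illustrative linear version only (assumption): X is a fixed design matrix.
    """
    n, p = X.shape
    sqrt_w = np.sqrt(w)
    # Initial weighted least squares fit supplies the adaptive penalty weights.
    b_init = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)[0]
    pen = 1.0 / (np.abs(b_init) ** gamma + eps)
    b = b_init.copy()
    for _ in range(n_iter):
        for j in range(p):
            # Partial residual with coordinate j removed from the fit.
            r_j = y - X @ b + X[:, j] * b[j]
            rho = np.sum(w * X[:, j] * r_j)
            denom = np.sum(w * X[:, j] ** 2)
            b[j] = soft_threshold(rho, lam * pen[j]) / denom
    return b
```

Coefficients whose initial estimates are close to zero receive a large penalty weight and are set exactly to zero, while large coefficients are shrunk only slightly. This is the mechanism that lets a single criterion handle both selection and estimation.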

Figures

Figure 1
Top portion of each panel: chromatograms for one of the eight experimental conditions. The three samples within each condition are displayed with an arbitrary vertical shift. The known compounds are denoted as: (1) gallic acid, (2) catechin, (3) aloe-emodin, (4) rhein, (5) emodin, (6) chrysophanol, and (7) physcion. Bottom portion of each panel: the chromatogram from the known-compound sample under Condition 1, transformed to the time scale of the given condition using the corresponding estimated warping function.
Figure 2
Population averaged fingerprint (solid) from the estimates along with 95% confidence bands (dashed). Identified compounds are denoted: (1) gallic acid, (2) catechin, (3) aloe-emodin, (4) rhein, (5) emodin, (6) chrysophanol, and (7) physcion.
Figure 3
The estimated warping functions (solid) for each condition, along with the 45-degree line denoting no warping (dashed).
Figure 4
Simulated warped fingerprints with 12 peaks and additive t-distributed noise. The fingerprint was generated using the Laplace distribution function and the warping functions were generated by logistic and inverse-logistic functions.
Figure 5
Boxplots of the mean squared error of the estimated fingerprints for the proposed procedure using 3, 4, and 5 knots and for the two-step procedure using dynamic time warping and wavelet thresholding.
Figure 6
True (solid) and mean of 100 estimated warping functions (dashed) using 4 uniform knots.
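
The simulation setup described in the Figure 4 caption (a 12-peak fingerprint, logistic and inverse-logistic warping, additive t-distributed noise) could be sketched roughly as follows. Every specific here is an assumption for illustration, not taken from the paper: peaks are built from Laplace densities, and the peak locations, heights, scale, warp steepness `k`, noise level `sigma`, degrees of freedom `df`, and all function names are hypothetical.

```python
import numpy as np

def laplace_pdf(t, loc, scale):
    """Laplace density, used here as a sharp peak shape (assumed)."""
    return np.exp(-np.abs(t - loc) / scale) / (2.0 * scale)

def fingerprint(t, locs, heights, scale=0.01):
    """Common curve: a sum of Laplace-shaped peaks on [0, 1]."""
    return sum(h * laplace_pdf(t, m, scale) for m, h in zip(locs, heights))

def logistic_warp(t, k):
    """Monotone map of [0, 1] onto [0, 1] from a logistic curve with steepness k > 0."""
    g = lambda u: 1.0 / (1.0 + np.exp(-k * (u - 0.5)))
    return (g(t) - g(0.0)) / (g(1.0) - g(0.0))

def inverse_logistic_warp(t, k):
    """Inverse of logistic_warp, also a monotone map of [0, 1] onto [0, 1]."""
    g0 = 1.0 / (1.0 + np.exp(k / 2.0))
    g1 = 1.0 / (1.0 + np.exp(-k / 2.0))
    v = g0 + t * (g1 - g0)
    return 0.5 + np.log(v / (1.0 - v)) / k

def simulate_curve(warp, k, n=500, sigma=0.5, df=5, seed=0):
    """One warped, noisy curve: fingerprint(warp(t)) plus scaled t noise."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, n)
    locs = np.linspace(0.08, 0.92, 12)      # 12 peak locations (assumed)
    heights = np.linspace(1.0, 2.0, 12)     # peak heights (assumed)
    y = fingerprint(warp(t, k), locs, heights) + sigma * rng.standard_t(df, size=n)
    return t, y

t, y_logistic = simulate_curve(logistic_warp, k=3.0, seed=1)
t, y_inverse = simulate_curve(inverse_logistic_warp, k=3.0, seed=2)
```

The two warps are exact inverses of each other, so one stretches the middle of the time axis where the other compresses it. That gives curves whose peaks are shifted in opposite directions relative to the common fingerprint.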
