[Preprint]. 2023 Apr 27:2023.04.24.538128.
doi: 10.1101/2023.04.24.538128.

Identification of allele-specific KIV-2 repeats and impact on Lp(a) measurements for cardiovascular disease risk

S Behera et al. bioRxiv.

Abstract

The abundance of Lp(a) protein has significant implications for the risk of cardiovascular disease (CVD) and is directly affected by the copy number (CN) of KIV-2, a 5.5 kbp sub-region. KIV-2 is highly polymorphic in the population, and accurate analysis is challenging. In this study, we present the DRAGEN KIV-2 CN caller, which uses short reads. Data across 166 WGS samples show that the caller is highly accurate compared with optical mapping and can additionally phase ~50% of the samples. We compared KIV-2 CN to 24 previously postulated KIV-2-relevant SNVs, revealing that many are ineffective predictors of KIV-2 copy number. Population studies, including USA-based cohorts, showed distinct KIV-2 CN distributions for European-, African-, and Hispanic-American populations and further underscored the limitations of SNV predictors. We demonstrate that the CN estimates correlate significantly with the available Lp(a) protein levels and that phasing is highly important.

Keywords: Illumina; LPA; Next Generation sequencing; cardiovascular disease; copy number variation.

Conflict of interest statement

Ethics declarations. Competing interests: FJS receives research support from Illumina, PacBio and ONT. LP is funded by Genentech. JB, EN, MR, ER, JH and VO are employees of Illumina. XC and ME are employees of PacBio. VM is now employed at Genentech.

Figures

Figure 1: Overview of KIV-2 LPA performance.
A) Genome-wide alignment of short Illumina reads and long PacBio HiFi and ONT reads to the LPA gene. We observed many non-unique mappings (i.e. mapping quality MQ=0, shown in white) in KIV-2. The GRCh38 representation of KIV-2 contains six copies of the 5.5 kbp repeat, which is challenging even for longer reads. This is shown by many white (MQ=0) PacBio reads and, to a lesser extent, Oxford Nanopore Technologies (ONT) reads. The fraction of non-uniquely mapped reads increased with shorter reads, which often hindered the direct assessment of KIV-2 repeat copy number. B) Total KIV-2 copy number compared between calls made by the DRAGEN LPA caller and by optical mapping technology, for cases where optical mapping spanned both genomic alleles. Dashed lines indicate error margins of 5% from optical mapping copy number. C) Allelic KIV-2 copy number compared where the per-allele copy number was available from the DRAGEN LPA caller and from optical mapping assemblies. Dashed lines indicate error margins of 5% from optical mapping copy number. D) For 60 trios where both offspring and parent KIV-2 allele lengths are reported, the allele combination that minimizes the total difference between each of the two offspring alleles and one allele from each parent is chosen as the most likely allele origin. Each pair, consisting of one offspring allele and one associated parent allele, is shown for a total of 120 allele pairs. Dashed lines indicate error margins of 5% from expected. E) For 153 duos where both offspring KIV-2 allele lengths and those from one parent are reported, a most likely allele combination for one offspring allele and one parental allele is chosen as in D. Dashed lines indicate error margins of 5% from expected.
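The allele matching described for panel D can be sketched as a small brute-force search. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `assign_trio_alleles` and `within_margin`, the use of absolute differences, and the example copy numbers are all hypothetical.

```python
from itertools import permutations, product


def assign_trio_alleles(child, mother, father):
    """Pick the child/parent allele pairing with the smallest total difference.

    Each argument is a pair of KIV-2 allele copy numbers. One child allele is
    paired with one maternal allele and the other with one paternal allele.
    Returns (total_difference, [(child_allele, parent_allele), ...]).
    """
    best = None
    # Which child allele came from the mother and which from the father.
    for from_mother, from_father in permutations(child, 2):
        # Which allele each parent transmitted.
        for m, f in product(mother, father):
            total = abs(from_mother - m) + abs(from_father - f)
            if best is None or total < best[0]:
                best = (total, [(from_mother, m), (from_father, f)])
    return best


def within_margin(observed, expected, margin=0.05):
    """True if observed lies within +/- margin (5% by default) of expected."""
    return abs(observed - expected) <= margin * expected


# Made-up copy numbers, for illustration only.
total, pairs = assign_trio_alleles(
    child=(20.0, 12.0), mother=(20.5, 30.0), father=(12.0, 25.0)
)
```

Each returned pair can then be checked against the 5% margin with `within_margin(child_allele, parent_allele)`, which mirrors the dashed lines in the panel.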
Figure 2: DRAGEN LPA caller analysis across 1KGP cohort.
A) Distribution of LPA KIV-2 repeats among samples from 1KGP. B) Histogram of absolute value of allele length difference between Allele1 and Allele2 in 1KGP samples. The DRAGEN LPA caller could resolve haplotype copy number in about 50% of samples. C) The distribution of Allele1 (longest allele in sample) copy numbers among haplotype-resolved 1KGP samples. D) The distribution of Allele2 copy numbers (shortest allele in sample). E-L) Comparison of SNV markers with CNV states of KIV-2. Samples from 1KGP were genotyped for 12 SNVs which have been reported as associated with different Lp(a) levels or copy number states. KIV-2 copy number was compared between groups of samples homozygous for the major (more common) or minor (less common) alleles of these SNVs, to test for association between SNV status and KIV-2 copy number. P-values were obtained by Kolmogorov-Smirnov tests and corrected for multiple testing by the Bonferroni method.
Figure 3: Overview of ARIC and SOL cohorts across the USA.
A) Total KIV-2 copy number distributions in African-American, European-American and Hispanic-American populations. B) Haplotype difference between Allele1 CN and Allele2 CN among the populations. C) and D) Distributions of Allele1 and Allele2 copy numbers. In each diploid genome with haplotype-resolved KIV-2 calls, Allele1 is the longer allele and Allele2 is the shorter.
Figure 4: Comparison of SNV markers with CN states of KIV-2.
Samples from ARIC and SOL cohorts were genotyped for 9 SNVs which have been reported as associated with different Lp(a) levels or copy number states. KIV-2 copy number was compared between groups of samples homozygous for the major (more common) or minor (less common) alleles of these SNVs, to test for association between SNV status and KIV-2 copy number. p-values were obtained by Kolmogorov-Smirnov tests and corrected for multiple testing by the Bonferroni method.
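The test procedure named in this caption (a two-sample Kolmogorov-Smirnov test per SNV, followed by Bonferroni correction) can be sketched in plain Python. This is an illustrative sketch, not the authors' pipeline: the p-value uses a standard asymptotic approximation rather than an exact computation, and the function names and the small-lambda shortcut are assumptions made here.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic (approximate) p-value for a two-sample KS statistic d."""
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.2:
        # Assumed shortcut: the tail probability is ~1 here and the series
        # below converges slowly for very small lam.
        return 1.0
    s = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * s))


def bonferroni(pvalues):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]
```

In use, one would compute `ks_pvalue(ks_statistic(major, minor), len(major), len(minor))` for the copy numbers of the major-allele and minor-allele homozygous groups at each SNV, then pass the list of p-values to `bonferroni`.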
Figure 5: Association between KIV-2 CN estimates and Lp(a) measurement (in nmol/L).
A) The mean Lp(a) level vs CN quartiles for African-American (p-value = 4.03E-47) and European-American (p-value = 2.15E-46) samples from the ARIC cohort, and Hispanic-American samples from the HCHS/SOL cohort (p-value = 2.64E-42). B) The mean LPA KIV-2 CN vs Lp(a) level for three groups defined by diagnostic cutoff values of the lipid level, i.e. <75 nmol/L as low, >125 nmol/L as high, and between 75 nmol/L and 125 nmol/L as medium. C) Scatterplot of allele length comparisons for KIV-2 CN and their impact on diagnostic cutoff values of lipid levels per population. Allele 2 (Y-axis) is always reported as the smaller allele.
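The grouping used in panel B can be written as a short sketch: classify each sample by the stated cutoffs, then average KIV-2 CN per group. The function names and sample values are hypothetical, and treating exactly 75 or 125 nmol/L as "medium" is an assumption, since the caption does not say how boundary values are handled.

```python
from collections import defaultdict
from statistics import mean


def lpa_category(lpa_nmol_per_l):
    """Classify an Lp(a) level using the cutoffs given in the caption."""
    if lpa_nmol_per_l < 75:
        return "low"
    if lpa_nmol_per_l > 125:
        return "high"
    # Assumption: the boundary values 75 and 125 count as medium.
    return "medium"


def mean_cn_by_category(samples):
    """samples: iterable of (kiv2_cn, lpa_nmol_per_l) pairs.

    Returns a dict mapping category name to mean KIV-2 copy number.
    """
    groups = defaultdict(list)
    for cn, lpa in samples:
        groups[lpa_category(lpa)].append(cn)
    return {name: mean(values) for name, values in groups.items()}


# Made-up (copy number, Lp(a) nmol/L) pairs, for illustration only.
example = [(45, 20), (40, 60), (30, 100), (20, 200)]
summary = mean_cn_by_category(example)
```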
