NAR Genom Bioinform. 2025 May 30;7(2):lqaf070.
doi: 10.1093/nargab/lqaf070. eCollection 2025 Jun.

KILDA: identifying KIV-2 repeats from kmers

Corentin Molitor et al. NAR Genom Bioinform.

Abstract

High concentration of lipoprotein(a) [Lp(a)], a lipoprotein with proatherogenic properties, is an important risk factor for cardiovascular disease. This concentration is mostly genetically determined by a complex interplay between the number of kringle IV type 2 repeats and Lp(a)-affecting variants. Besides Lp(a) plasma concentration, there is an unmet need to identify individuals most at risk based on their LPA genotype. We developed KILDA (KIv2 Length Determined from a kmer Analysis), a Nextflow pipeline, to identify the number of kringle IV type 2 repeats and Lp(a)-affecting variants directly from kmers generated from FASTQ files. The pipeline was tested on the 1000 Genomes Project (n = 2459) and results were equivalent to DRAGEN-LPA (R² = 0.92). In silico datasets proved the robustness of KILDA's predictions under different scenarios of sequencing coverage and quality. In brief, KILDA is a robust, open-source, and free-to-use pipeline that can identify the number of kringle IV type 2 repeats and Lp(a)-associated variants even when inputting low-coverage libraries.
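
To make the idea of estimating a repeat count from kmers concrete, here is a minimal Python sketch. It is not KILDA's implementation: the abstract does not describe the exact calculation, so the approach shown (dividing the mean count of repeat-specific kmers by the mean count of kmers from a single-copy normalization region) is an assumption, and the function names `count_kmers` and `estimate_repeat_copies` are hypothetical.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a kmer and its reverse complement."""
    revcomp = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def count_kmers(reads, k):
    """Count canonical kmers of length k across reads, skipping kmers that contain N."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[canonical(kmer)] += 1
    return counts


def mean_count(counts, kmers):
    """Mean observed count over a reference kmer list (absent kmers count as 0)."""
    kmers = [canonical(km) for km in kmers]
    if not kmers:
        raise ValueError("reference kmer list is empty")
    return sum(counts.get(km, 0) for km in kmers) / len(kmers)


def estimate_repeat_copies(counts, repeat_kmers, norm_kmers):
    """Assumed estimator: repeat kmer depth relative to single-copy region depth.

    The ratio is the average number of repeat copies per allele; doubling it
    gives the total over both alleles of a diploid sample.
    """
    norm_depth = mean_count(counts, norm_kmers)
    if norm_depth == 0:
        raise ValueError("normalization kmers were not observed")
    return mean_count(counts, repeat_kmers) / norm_depth


# Toy counts (invented for illustration, not real data).
toy_counts = Counter({"AAA": 40, "AAC": 40, "ACC": 10, "AGG": 10})
per_allele = estimate_repeat_copies(toy_counts, ["AAA", "AAC"], ["ACC", "AGG"])
print(per_allele, 2 * per_allele)
```

Because the estimate is a ratio of depths, it is in principle independent of overall sequencing coverage, although lower coverage makes both means noisier. The abstract reports that predictions were made with 31-mers; the toy example uses 3-mers only to stay readable.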

Conflict of interest statement

None declared.

Figures

Figure 1.
KILDA predictions against DRAGEN-LPA predictions for the number of KIV-2 copies in samples from the 1000 Genomes Project. For KILDA, 31-mers were used to make the predictions. Samples are colored by the presence of variants as detected by KILDA.
