High-throughput complement component 4 genomic sequence analysis with C4Investigator
- PMID: 37899688
- PMCID: PMC11099535
- DOI: 10.1111/tan.15273
Abstract
The complement component 4 locus, composed of the C4A and C4B genes and located on chromosome 6, encodes complement component 4 (C4) proteins, key intermediates in the classical and lectin pathways of the complement system. The complement system is an important modulator of immune system activity and is also involved in the clearance of immune complexes and cellular debris. The C4A and C4B loci exhibit copy number variation, with each gene varying between 0 and 5 copies per haplotype. C4A and C4B genes also vary in size depending on the presence of a human endogenous retrovirus (HERV) insertion in intron 9, denoted C4(L) for the long form and C4(S) for the short form; the insertion affects expression and is found in both C4A and C4B. Additionally, the human blood group antigens Rodgers and Chido are located on the C4 protein, with the Rodgers epitope generally found on C4A protein and the Chido epitope generally found on C4B protein. C4A and C4B copy number variation has been implicated in numerous autoimmune and pathogenic diseases. Despite the central role of C4 in immune function and regulation, high-throughput genomic sequence analysis of C4A and C4B variants has been impeded by the high degree of sequence similarity and complex genetic variation exhibited by these genes. To investigate C4 variation using genomic sequencing data, we have developed C4Investigator, a novel bioinformatic pipeline for comprehensive, high-throughput characterization of human C4A and C4B sequences from short-read sequencing data. Using paired-end targeted or whole genome sequence data as input, C4Investigator determines the overall C4 gene copy number, as well as the copy numbers of C4A, C4B, C4(Rodger), C4(Ch), C4(L), and C4(S). Additionally, C4Investigator reports the full aligned C4A and C4B sequences, enabling nucleotide-level analysis. To demonstrate the utility of this workflow, we have analyzed C4A and C4B variation in the 1000 Genomes Project data set, showing that these genes are highly poly-allelic, with many variants that have the potential to impact C4 protein function.
Keywords: C4; bioinformatics pipeline; complement component; copy number; genotyping; immunogenetics.
© 2023 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Conflict of interest statement
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Update of
- High-throughput complement component 4 genomic sequence analysis with C4Investigator. bioRxiv [Preprint]. 2023 Jul 19:2023.07.18.549551. doi: 10.1101/2023.07.18.549551. Update in: HLA. 2024 Jan;103(1):e15273. doi: 10.1111/tan.15273. PMID: 37503256. Free PMC article. Preprint.