STAR Protoc. 2025 Mar 21;6(1):103580.
doi: 10.1016/j.xpro.2024.103580. Epub 2025 Jan 20.

Protocol for reconstructing ancestral genomes from present-day samples by applying local ancestry inference

Xiaoxi Zhang et al. STAR Protoc.

Abstract

The genome of the most recent common ancestor is generally not available, but it can greatly facilitate the inference of demographic history and the detection of local adaptation. Here, we present a protocol for reconstructing ancestral genomes by applying local ancestry inference to present-day samples. We describe the analytic steps for estimating haplotypes, inferring local ancestry, and assembling ancestral haplotypes, using example data from the Miao and She target populations. For complete details on the use and execution of this protocol, please refer to Gao et al. [1].

Keywords: Bioinformatics; Evolutionary biology; Genetics; Genomics.

Conflict of interest statement

Declaration of interests: The authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Flowchart for reconstructing ancestral genomes. The flowchart shows the input data, the output results, the intermediate scripts, and the tools used at each step.
Figure 2
Format of the local ancestry inference result files. A screenshot of the output file (161.cp.chr21.samples.out.gz) of local ancestry inferred by ChromopainterV2 is shown. The first row in the figure identifies the second haplotype of sample HGDP01193. The following ten rows give the results of 10 local ancestry inference iterations: the first column is the iteration number, and the numbers in the subsequent columns give the ID of the donor haplotype from which each allele on the recipient haplotype is derived.
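The layout described in the Figure 2 caption can be read with a few lines of Python. The following is a minimal sketch, not the protocol's own script: it assumes a non-numeric header line names each recipient haplotype and that every following line holds an iteration number and then one donor haplotype ID per site. The header text `HAP 2 HGDP01193`, the function names, and the per-site majority vote are illustrative assumptions; the protocol's actual assembly rule is described in Gao et al. [1].

```python
from collections import Counter


def parse_samples_out(lines):
    """Group sampled copying paths by recipient haplotype.

    Assumed layout (taken from the figure caption, not from the tool's manual):
    a header line whose first token is not a number names the recipient
    haplotype; each following line is "<iteration> <donor ID per site ...>".
    """
    blocks = {}
    current = None
    for raw in lines:
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if not fields[0].isdigit():
            current = " ".join(fields)  # start a new recipient haplotype block
            blocks[current] = []
        elif current is not None:
            # drop the iteration number, keep the donor IDs
            blocks[current].append([int(x) for x in fields[1:]])
    return blocks


def majority_donor(paths):
    """Return, for each site, the donor ID sampled most often across iterations."""
    return [Counter(site).most_common(1)[0][0] for site in zip(*paths)]


# Toy input with 3 iterations and 4 sites (hypothetical values).
example = [
    "HAP 2 HGDP01193",
    "1 5 5 7 7",
    "2 5 5 5 7",
    "3 5 6 7 7",
]
blocks = parse_samples_out(example)
consensus = majority_donor(blocks["HAP 2 HGDP01193"])
print(consensus)
```

For a real file, the lines could be supplied by `gzip.open(path, "rt")`; any extra header lines the tool writes would need to be skipped first.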
Figure 3
Format of the reconstructed ancestral genomes. A screenshot of the output VCF file (504.assemble.chr21.0.2.6000.5.vcf.gz) of reconstructed ancestral genomes is shown. The first row in the figure is the header row. Each subsequent row corresponds to a genomic locus: the first column gives the chromosome number, the second column the physical position, the fourth column the reference allele, and the fifth column the alternative allele; from the tenth column onward, the genotype of each sample is shown. The other columns can be disregarded.
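The column positions named in the Figure 3 caption map directly onto a small reader. This is a sketch under stated assumptions rather than part of the protocol: the sample IDs `anc_1` and `anc_2`, the toy records, and the function names are hypothetical, and the genotype is taken as the first colon-separated subfield of each sample column, where the VCF specification places `GT` when it is present.

```python
import gzip


def parse_vcf_record(line, sample_ids):
    """Extract the columns the caption names from one tab-separated VCF data line."""
    fields = line.rstrip("\n").split("\t")
    genotypes = {
        sample: column.split(":")[0]  # GT is the first subfield when present
        for sample, column in zip(sample_ids, fields[9:])
    }
    return {
        "chrom": fields[0],    # column 1: chromosome
        "pos": int(fields[1]),  # column 2: physical position
        "ref": fields[3],      # column 4: reference allele
        "alt": fields[4],      # column 5: alternative allele
        "genotypes": genotypes,  # column 10 onward: one genotype per sample
    }


def iter_vcf(lines):
    """Yield parsed records, taking sample IDs from the #CHROM header row."""
    sample_ids = []
    for line in lines:
        if line.startswith("##"):
            continue  # meta-information lines
        if line.startswith("#CHROM"):
            sample_ids = line.rstrip("\n").split("\t")[9:]
            continue
        if line.strip():
            yield parse_vcf_record(line, sample_ids)


def read_vcf_gz(path):
    """Stream records from a gzip-compressed VCF file."""
    with gzip.open(path, "rt") as handle:
        yield from iter_vcf(handle)


# Toy two-sample VCF (hypothetical values).
toy = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tanc_1\tanc_2\n",
    "21\t1000\t.\tA\tG\t.\t.\t.\tGT\t0|1\t1|1\n",
    "21\t2000\t.\tC\tT\t.\t.\t.\tGT\t0|0\t.|.\n",
]
records = list(iter_vcf(toy))
print(records[0]["pos"], records[0]["genotypes"])
```

Calling `read_vcf_gz` on the file named in the caption would stream the same kind of records without loading the whole file into memory.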
Figure 4
Screenshot of the output files of PCA and ADMIXTURE. (A) 'eigenvec' file of the PCA results. The first and second columns contain the sample IDs. The third and fourth columns give the PC1 and PC2 values of the samples, respectively. (B) 'eigenval' file of the PCA results, which stores the eigenvalues of PC1 to PC10. (C) ADMIXTURE Q file with K = 3. Each row represents a sample, and each column gives the proportion of one ancestry component for that sample. All results were computed using variants on chromosome 21.
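The three files described in the Figure 4 caption are whitespace-separated text, so they can be summarized without extra libraries. The sketch below is illustrative only: the sample IDs and numbers are made up, the function names are assumptions, and the variance share is computed relative to the listed eigenvalues only, not to the total genetic variance.

```python
def parse_eigenvec(lines):
    """Map each sample ID (second column) to its (PC1, PC2) values (columns 3 and 4)."""
    coords = {}
    for line in lines:
        fields = line.split()
        if not fields or fields[0].startswith("#") or fields[0] == "FID":
            continue  # skip blank lines and an optional header row
        coords[fields[1]] = (float(fields[2]), float(fields[3]))
    return coords


def variance_share(eigenvalues):
    """Share of each PC among the reported eigenvalues only."""
    total = sum(eigenvalues)
    return [value / total for value in eigenvalues]


def parse_q(lines):
    """Read an ADMIXTURE Q file: one row per sample, one column per ancestry."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


def dominant_component(row):
    """Zero-based index of the largest ancestry proportion in one Q row."""
    return max(range(len(row)), key=row.__getitem__)


# Toy inputs (hypothetical values).
eigenvec_lines = ["S1 S1 0.12 -0.03", "S2 S2 -0.08 0.05"]
eigenval_lines = ["6.0", "3.0", "1.0"]
q_lines = ["0.70 0.20 0.10", "0.05 0.15 0.80"]

coords = parse_eigenvec(eigenvec_lines)
shares = variance_share([float(x) for x in eigenval_lines])
q_rows = parse_q(q_lines)
dominant = [dominant_component(row) for row in q_rows]
print(coords, shares, dominant)
```

Checking that each Q row sums to approximately 1 is a cheap sanity test before plotting ancestry proportions.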
Figure 5
Plot of the principal component analysis (PCA) and ancestry components. (A) PCA; (B) proportions of ancestry components. The reconstructed ancestral genomes are genetically similar to present-day Hmong-Mien speakers and more distant from present-day Han Chinese, as expected.

References

    1. Gao Y., Zhang X., Chen H., Lu Y., Ma S., Yang Y., Zhang M., Xu S. Reconstructing the ancestral gene pool to uncover the origins and genetic links of Hmong-Mien speakers. BMC Biol. 2024;22:59. doi: 10.1186/s12915-024-01838-9.
    2. Price A.L., Tandon A., Patterson N., Barnes K.C., Rafaels N., Ruczinski I., Beaty T.H., Mathias R., Reich D., Myers S. Sensitive detection of chromosomal segments of distinct ancestry in admixed populations. PLoS Genet. 2009;5 doi: 10.1371/journal.pgen.1000519.
    3. Lawson D.J., Hellenthal G., Myers S., Falush D. Inference of population structure using dense haplotype data. PLoS Genet. 2012;8 doi: 10.1371/journal.pgen.1002453.
    4. Maples B.K., Gravel S., Kenny E.E., Bustamante C.D. RFMix: a discriminative modeling approach for rapid and robust local-ancestry inference. Am. J. Hum. Genet. 2013;93:278–288. doi: 10.1016/j.ajhg.2013.06.020.
    5. HUGO Pan-Asian SNP Consortium, Abdulla M.A., Ahmed I., Assawamakin A., Bhak J., Brahmachari S.K., Calacal G.C., Chaurasia A., Chen C.H., Chen J., et al. Mapping human genetic diversity in Asia. Science. 2009;326:1541–1545. doi: 10.1126/science.1177074.
