Cell. 2021 May 13;184(10):2595-2604.e13.
doi: 10.1016/j.cell.2021.03.061. Epub 2021 Apr 3.

Early introductions and transmission of SARS-CoV-2 variant B.1.1.7 in the United States

Tara Alpert et al. Cell. 2021.

Abstract

The emergence and spread of SARS-CoV-2 lineage B.1.1.7, first detected in the United Kingdom, has become a global public health concern because of its increased transmissibility. Over 2,500 COVID-19 cases associated with this variant have been detected in the United States (US) since December 2020, but the extent of establishment is relatively unknown. Using travel, genomic, and diagnostic data, we highlight that the primary ports of entry for B.1.1.7 in the US were in New York, California, and Florida. Furthermore, we found evidence for many independent B.1.1.7 establishments starting in early December 2020, followed by interstate spread by the end of the month. Finally, we project that B.1.1.7 will be the dominant lineage in many states by mid- to late March. Thus, genomic surveillance for B.1.1.7 and other variants urgently needs to be enhanced to better inform the public health response.

Keywords: B.1.1.7; SARS-CoV-2; community transmission; epidemiology; flight volumes; genomic surveillance; introductions; lineage; phylogenetics; variant.

Conflict of interest statement

Declarations of interests: M.J.M., G.K., J.M., J.T.D., M.N., N.B., and C.E.M. work for Tempus Labs. K.S.G. receives research support from Thermo Fisher for the development of assays for the detection and characterization of viruses. The remaining authors declare no competing interests.

Figures

Graphical abstract
Figure 1
Identification of regions in the United States at risk for importation of B.1.1.7 (A) County-level risk assessment of B.1.1.7 introductions from air passenger travelers entering US airports from the UK during December 2020. Labeled are the top 15 airports in the US for passenger volumes from the UK (shown in C). The county-level heatmap represents the probability of where passengers travel to after arriving at each airport (i.e., the airport catchment area, estimated using the Huff model multiplied by the total number of travelers entering each airport; see STAR Methods). (B) An expanded view of the counties in New York, New Jersey, and Connecticut is shown to highlight the catchment of the large numbers of UK travelers entering the New York JFK and Newark Liberty airports. The legend in (A) also applies to (B). (C) The total number of passengers entering the top 15 US airports from the UK during December 2020. See also Table S1.
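The catchment calculation named in this caption can be illustrated with a minimal sketch of a basic Huff model. The exact formulation used by the authors is in STAR Methods and is not reproduced here; the distance-decay exponent, the attractiveness values, and the function names below are all hypothetical.

```python
def huff_probabilities(attractiveness, distances_km, decay=2.0):
    """Basic Huff model: probability that a county is served by each airport.

    attractiveness: dict airport -> attractiveness weight (hypothetical input)
    distances_km:   dict airport -> distance from the county (must be > 0)
    decay:          distance-decay exponent (assumed value, not from the paper)
    """
    weights = {
        airport: attractiveness[airport] / (distances_km[airport] ** decay)
        for airport in attractiveness
    }
    total = sum(weights.values())
    return {airport: weight / total for airport, weight in weights.items()}


def county_risk_score(attractiveness, distances_km, passengers, decay=2.0):
    """Weight each airport's Huff probability by its incoming UK passengers."""
    probabilities = huff_probabilities(attractiveness, distances_km, decay)
    return sum(probabilities[a] * passengers[a] for a in probabilities)


# Illustrative inputs only; these are not values from the study.
example_score = county_risk_score(
    attractiveness={"JFK": 1.0, "EWR": 1.0},
    distances_km={"JFK": 10.0, "EWR": 20.0},
    passengers={"JFK": 1000, "EWR": 500},
)
```

With equal attractiveness, the nearer airport receives the larger share of the county's probability, so its passenger volume dominates the score.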
Figure 2
Identification of genomic surveillance gaps and regions that may be disproportionately underreporting B.1.1.7 (A) Bar plot represents the percentage of cases in each state from December 2020 to February 2021 (bottom x axis; sourced from https://covidtracking.com/data/) that have sequences uploaded to https://www.gisaid.org/ (accessed March 4, 2021). Bars are colored according to region (legend, top right). The number of B.1.1.7 sequences for each state (top x axis; black dots) was determined by the Pangolin lineage assignment in the https://www.gisaid.org/ metadata. (B) Total number of passengers arriving from the UK in Dec 2020 to each state in the continental US (data from Huff model in Figure 1) is plotted against the percent of sequenced COVID-19 cases. The horizontal dashed line represents the US average (0.43%) for sequenced cases. States sequencing below the US average with more than 2,000 passengers (vertical dashed line) are at risk for underreporting B.1.1.7 (gray box). (C) Number of B.1.1.7 SARS-CoV-2 sequences available on https://www.gisaid.org/ for each state. Points are colored according to region (legend from A). The data used to create this figure are listed in Data S1. See also Figure S1.
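The gray-box criterion in panel (B) can be sketched as a simple filter. The two thresholds (0.43% of cases sequenced and 2,000 passengers) come from the caption; the function names and the example counts are hypothetical.

```python
US_AVERAGE_PERCENT_SEQUENCED = 0.43  # horizontal dashed line in panel (B)
PASSENGER_THRESHOLD = 2000           # vertical dashed line in panel (B)


def percent_sequenced(sequences, cases):
    """Percentage of reported COVID-19 cases that have an uploaded sequence."""
    return 100.0 * sequences / cases


def at_risk_of_underreporting(sequences, cases, uk_passengers):
    """True when a state sequences below the US average yet receives many UK travelers."""
    below_average = percent_sequenced(sequences, cases) < US_AVERAGE_PERCENT_SEQUENCED
    many_travelers = uk_passengers > PASSENGER_THRESHOLD
    return below_average and many_travelers


# Illustrative counts only; these are not values from Data S1.
flagged = at_risk_of_underreporting(sequences=300, cases=100_000, uk_passengers=5000)
```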
Figure S1
The percentage of total COVID-19 cases that were sequenced in December 2020 (Dec), January 2021 (Jan), and February 2021 (Feb) in each state of the continental US, related to Figure 2. Color legend is the same as in Figure 2A.
Figure 3
Multiple introductions, domestic spread, and community transmission of B.1.1.7 SARS-CoV-2 in the US (A) Maximum likelihood phylogeny of B.1.1.7, including 1,908 representative genomes from the US, Europe, and other global locations. Tree topology and bootstrap values obtained using IQ-Tree 1.6.12, with timescale inferred by TreeTime 0.8.0, discrete state reconstruction inferred using BEAST v1.10, and data integration and visualization using baltic 0.1.5. The tree was rooted using a P.1 genome (Brazil/AM-20842882CA/2020) as an outgroup (not shown in this plot). (B) Exploded tree layout, highlighting clades with three or more taxa, bootstrap values (UFBoot) >70 (small circles), and US ancestral state probability at MRCA > 0.7 (values at the root), representing independent international introductions of B.1.1.7 into distinct regions of the US, based on the same phylogenetic tree shown in (A). A list of international transitions to the US can be found in Data S1. (C–H) Time-informed maximum likelihood phylogeny of distinct B.1.1.7 clades showing instances of intra-region (C–E, and G) and inter-region (D and H) domestic spread. The list of SARS-CoV-2 sequences used in this study and author acknowledgments can be found in Data S2. Supporting phylogenetic analysis can be found in Figures S2, S3, S4, and S5. For comparison, an interactive phylogenetic tree, inferred using IQ-Tree and TreeTime only, can be accessed from our custom Nextstrain build: https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper5. See also Figures S3, S4, and S5.
Figure S2
Maximum likelihood phylogeny of B.1.1.7, including 8,829 representative genomes from the US, Europe, and other global locations, related to Figure 3. Phylogenetic inference was performed using IQ-Tree 1.6.12, with timescale and discrete state reconstruction inferred using TreeTime 0.8.0, and data visualization using baltic 0.1.5. US B.1.1.7 genomes are highlighted with circles at the tips, while international genomes are only represented as branches. The tree was rooted using a P.1 genome (Brazil/AM-20842882CA/2020) as an outgroup (not shown in this plot). This larger dataset was used to further subsample the genomes, removing redundant B.1.1.7 clades containing only genomes of international origin. From this phylogeny we created a succinct dataset containing the 1,908 genomes shown in Figure 3.
Figure 4
Increasing frequency of weekly spike gene target failure (SGTF) results across four US states (A) The weekly positivity rate of SARS-CoV-2 testing for four states (legend, B) since the first week of December 2020, calculated as the number of positive test results (including SGTF) divided by total tests. (B) The percentage of weekly positive test results that have SGTF are shown for the same time period and states from A (legend, top left). (C) The weekly percentage of SGTF data from (B) fit to a logistic regression model (see STAR Methods) to project the week in which we estimate SGTF results, and by proxy B.1.1.7, will cross the 50% and 75% thresholds for each state population. The color schemes shown in (A)–(C) match the color schemes used in Figures 2 and 3. The data used to create this figure are listed in Data S1.
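The projection in panel (C) can be illustrated with a minimal sketch. It assumes an ordinary least-squares line fitted to logit-transformed weekly SGTF fractions, which is a simplification and not necessarily the fitting procedure described in STAR Methods; the function names and the synthetic data are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a proportion; p must lie strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def fit_logit_line(weeks, fractions):
    """Least-squares fit of logit(fraction) = intercept + slope * week."""
    ys = [logit(p) for p in fractions]
    n = len(weeks)
    mean_x = sum(weeks) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in weeks)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def week_crossing(intercept, slope, threshold):
    """Week at which the fitted logistic curve reaches the given fraction."""
    return (logit(threshold) - intercept) / slope


# Synthetic weekly SGTF fractions generated from a known logistic curve.
weeks = [0, 1, 2, 3, 4, 5]
fractions = [1.0 / (1.0 + math.exp(-(-3.0 + 0.5 * w))) for w in weeks]
intercept, slope = fit_logit_line(weeks, fractions)
week_50 = week_crossing(intercept, slope, 0.50)
week_75 = week_crossing(intercept, slope, 0.75)
```

Because logit(0.5) is zero, the 50% crossing is simply the negative intercept divided by the slope; the 75% crossing follows later by log(3) divided by the slope.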
Figure S3
Root-to-tip analysis of 1,908 B.1.1.7 genomes used to obtain the phylogenetic results shown in Figure 3 (A) Correlation between genetic divergence (subs/site) and time. Samples generated in this study are highlighted with colors, while background international genomes are shown in gray. (B) Distribution of genetic divergence residuals of genomes shown in (A). Any outliers with residuals beyond ±0.0002 subs/site were removed from downstream analyses.
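The outlier filter described in panel (B) can be sketched as a linear regression of divergence on sampling time followed by a residual cutoff. The ±0.0002 subs/site cutoff comes from the caption; the function name, the sample labels, and the synthetic data are hypothetical, and the tools used in the study may compute residuals differently.

```python
RESIDUAL_CUTOFF = 0.0002  # subs/site, from the caption


def filter_root_to_tip_outliers(samples, cutoff=RESIDUAL_CUTOFF):
    """Keep samples whose root-to-tip residual lies within +/- cutoff.

    samples: list of (name, days_since_reference, divergence_subs_per_site)
    Returns the names of retained samples, in input order.
    """
    xs = [days for _, days, _ in samples]
    ys = [divergence for _, _, divergence in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kept = []
    for name, days, divergence in samples:
        residual = divergence - (intercept + slope * days)
        if abs(residual) <= cutoff:
            kept.append(name)
    return kept


# Synthetic data: five genomes on a clock-like line plus one divergent genome.
samples = [(f"genome_{i}", d, 0.001 + 2e-6 * d) for i, d in enumerate([0, 18, 36, 54, 72])]
samples.append(("divergent_genome", 36, 0.001 + 2e-6 * 36 + 0.001))
retained = filter_root_to_tip_outliers(samples)
```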
Figure S4
Re-analysis of phylogenetic results, related to Figure 3. (A) Tree topology and bootstrap values (UFBoot > 70 represented by small black circles at the nodes) obtained using IQ-Tree 1.6.12, with timescale and discrete state reconstruction inferred by TreeTime 0.8.0, and data integration and visualization using baltic 0.1.5. Like the original analysis, the tree was rooted using the genome Wuhan/Hu-1/2019 as an outgroup (not shown in this plot). (B–E) Four clades of US B.1.1.7 genomes selected for comparison of timescales. (F–I) Comparison of median and confidence intervals of tMRCAs obtained in the original study by Washington et al. (2021) (at the top) and in our analysis using TreeTime (at the bottom of each panel).
Figure S5
Results obtained using TreeTime only, plotted using the same approach used for Figure 3, showing multiple introductions, domestic spread, and community transmission of B.1.1.7 SARS-CoV-2 in the US, related to Figure 3. (A) Maximum likelihood phylogeny of B.1.1.7, including 1,908 representative genomes from the US, Europe, and other global locations. Tree topology and bootstrap values obtained using IQ-Tree 1.6.12, with timescale and discrete state reconstruction inferred by TreeTime 0.8.0, and data integration and visualization using baltic 0.1.5. The tree was rooted using a P.1 genome (Brazil/AM-20842882CA/2020) as an outgroup (not shown in this plot). (B) Exploded tree layout, highlighting clades with 3 or more taxa, UFBoot > 70 (small circles), and US ancestral state probability at MRCA > 0.7 (values at the root), representing independent international introductions of B.1.1.7 into distinct regions of the US, based on the same phylogenetic tree shown in (A). A list of international transitions to the US can be found in Data S1. (C–H) Time-informed maximum likelihood phylogeny of distinct B.1.1.7 clades showing instances of intra-region (C, D, E, G) and inter-region (D, H) domestic spread, and/or community transmission within New York (C), Connecticut (C), Michigan (C, D), and Illinois (E). The list of SARS-CoV-2 sequences used in this study and author acknowledgments can be found in Data S2. Supporting phylogenetic analysis can be found in Figures S2, S3, S4, and S5. For comparison, an interactive phylogenetic tree, inferred using IQ-Tree and TreeTime only, can be accessed from our custom Nextstrain build: https://nextstrain.org/community/grubaughlab/CT-SARS-CoV-2/paper5

Update of

  • Early introductions and community transmission of SARS-CoV-2 variant B.1.1.7 in the United States.
    Alpert T, Brito AF, Lasek-Nesselquist E, Rothman J, Valesano AL, MacKay MJ, Petrone ME, Breban MI, Watkins AE, Vogels CBF, Kalinich CC, Dellicour S, Russell A, Kelly JP, Shudt M, Plitnick J, Schneider E, Fitzsimmons WJ, Khullar G, Metti J, Dudley JT, Nash M, Beaubier N, Wang J, Liu C, Hui P, Muyombwe A, Downing R, Razeq J, Bart SM, Grills A, Morrison SM, Murphy S, Neal C, Laszlo E, Rennert H, Cushing M, Westblade L, Velu P, Craney A, Fauntleroy KA, Peaper DR, Landry ML, Cook PW, Fauver JR, Mason CE, Lauring AS, George KS, MacCannell DR, Grubaugh ND. medRxiv [Preprint]. 2021 Mar 11:2021.02.10.21251540. doi: 10.1101/2021.02.10.21251540. Update in: Cell. 2021 May 13;184(10):2595-2604.e13. doi: 10.1016/j.cell.2021.03.061. PMID: 33594373. Free PMC article. Preprint.
