Volume 92, Issue 6 p. 602-611
RESEARCH ARTICLE

Evolutionary history, potential intermediate animal host, and cross-species analyses of SARS-CoV-2

Xingguang Li

Corresponding Author

Hubei Engineering Research Center of Viral Vector, Wuhan University of Bioengineering, Wuhan, China

Correspondence: Dr Xingguang Li and Prof Yi Li, Hubei Engineering Research Center of Viral Vector, Wuhan University of Bioengineering, Wuhan, 430415, China. Email: [email protected] (X. L.) and [email protected] (Y. L.)

Prof Brian T. Foley, HIV Databases, Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87544. Email: [email protected]

Dr Antoine Chaillon, Department of Medicine, University of California San Diego, La Jolla, CA 92093-0679. Email: [email protected]

Junjie Zai

Immunology Innovation Team, School of Medicine, Ningbo University, Ningbo, China

Qiang Zhao

Precision Cancer Center Airport Center, Tianjin Cancer Hospital Airport Hospital, Tianjin, China

Qing Nie

Department of Microbiology, Weifang Center for Disease Control and Prevention, Weifang, China

Yi Li

Corresponding Author

Hubei Engineering Research Center of Viral Vector, Wuhan University of Bioengineering, Wuhan, China

Brian T. Foley

Corresponding Author

HIV Databases, Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico

Antoine Chaillon

Corresponding Author

Department of Medicine, University of California San Diego, La Jolla, California

First published: 27 February 2020
Citations: 270

Xingguang Li, Junjie Zai, and Qiang Zhao contributed equally to this study.

Abstract

To investigate the evolutionary history of the recent outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in China, we analyzed a total of 70 genomes of virus strains from China and elsewhere, with sampling dates between 24 December 2019 and 3 February 2020. To explore the potential intermediate animal host of SARS-CoV-2, we reanalyzed virome data sets from pangolins and representative SARS-related coronavirus isolates from bats, paying particular attention to the spike glycoprotein gene. We performed phylogenetic, split network, transmission network, likelihood-mapping, and comparative analyses of the genomes. Based on Bayesian time-scaled phylogenetic analysis using the tip-dating method, we estimated the time to the most recent common ancestor and the evolutionary rate of SARS-CoV-2, which ranged from 22 to 24 November 2019 and from 1.19 × 10⁻³ to 1.31 × 10⁻³ substitutions per site per year, respectively. Our results also revealed that the BetaCoV/bat/Yunnan/RaTG13/2013 virus was more similar to SARS-CoV-2 than was the coronavirus obtained from the two pangolin samples (SRR10168377 and SRR10168378). We also identified a unique peptide (PRRA) insertion in human SARS-CoV-2, which may be involved in the proteolytic cleavage of the spike protein by cellular proteases and thus could impact host range and transmissibility. Interestingly, the coronavirus carried by pangolins did not have the RRAR motif. We therefore concluded that human SARS-CoV-2, which is responsible for the recent outbreak of COVID-19, did not come directly from pangolins.
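The presence or absence of a short motif such as RRAR can be checked with a plain substring scan over a translated spike sequence. The following Python snippet is a minimal sketch of that idea and is not the authors' pipeline: the function name `find_motif` and the labels `with_insertion` and `without_insertion` are hypothetical, and the two peptide fragments are short illustrative strings assumed for this sketch, not data taken from the study.

```python
from typing import List


def find_motif(peptide: str, motif: str = "RRAR") -> List[int]:
    """Return the 0-based start position of every motif hit, overlaps included."""
    peptide = peptide.upper()
    hits = []
    start = peptide.find(motif)
    while start != -1:
        hits.append(start)
        # Advance by one residue so overlapping occurrences are also reported.
        start = peptide.find(motif, start + 1)
    return hits


# Illustrative fragments (assumed for this sketch, not study data):
# one carries a PRRA insertion ahead of an arginine, the other does not.
fragments = {
    "with_insertion": "QTQTNSPRRARSVASQSI",
    "without_insertion": "QTQTNSRSVASQSI",
}

for name, seq in fragments.items():
    hits = find_motif(seq)
    status = "present" if hits else "absent"
    print(f"{name}: RRAR {status} at positions {hits}")
```

An exact substring match is enough for a fixed four-residue motif. A real comparison across hosts would more likely start from a multiple sequence alignment, so that the insertion can be located relative to homologous positions.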

Highlights

  • We identified a unique peptide (PRRA) insertion in the human SARS-CoV-2 virus, which may be involved in the proteolytic cleavage of the spike protein by cellular proteases, and thus could impact host range and transmissibility.

CONFLICT OF INTERESTS

The authors declare that there is no conflict of interest.
