The virus that causes COVID-19 still holds some mysteries. Scientists remain in the dark about how it fuses with and enters the host cell, how it assembles itself, and how it buds off the host cell.
Computational modeling combined with experimental data provides insights into these behaviors. But modeling of the pandemic-causing SARS-CoV-2 virus over meaningful timescales has so far been limited to individual pieces, such as the spike protein, a target for the current round of vaccines.
Researchers using supercomputers have now developed the first multiscale coarse-grained model of the complete SARS-CoV-2 virion, including its core genetic material and virion shell. The model offers scientists the potential for new ways to exploit the virus's vulnerabilities.
"We wanted to understand how SARS-CoV-2 works holistically as a whole particle," said Gregory Voth, the Haig P. Papazian Distinguished Service Professor at the University of Chicago. Voth is the corresponding author of the study that developed the first whole virus model, published November 2020 in the Biophysical Journal.
"We developed a bottom-up coarse-grained model," said Voth, "where we took information from atomistic-level molecular dynamics simulations and from experiments." He explained that a coarse-grained model resolves only groups of atoms, versus all-atom simulations, where every single atomic interaction is resolved. "If you do that well, which is always a challenge, you maintain the physics in the model."
The early results of the study show how the spike proteins on the surface of the virus move cooperatively.
"They don't move independently like a bunch of random, uncorrelated motions," Voth said. "They work together."
This cooperative motion of the spike proteins is informative of how the coronavirus explores and detects the ACE2 receptors of a potential host cell.
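As a rough illustration of what "moving cooperatively" means, one common way to check whether two motions are coupled is to compute a correlation coefficient between their displacement time series. The sketch below is not the analysis used in the study; the function name `pearson` and the toy displacement values for `spike_a`, `spike_b`, and `spike_c` are hypothetical and chosen only to show the idea.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long displacement series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical displacements (arbitrary units) of three spikes over five frames.
spike_a = [0.0, 1.0, 2.0, 1.0, 0.0]
spike_b = [0.5, 1.5, 2.5, 1.5, 0.5]  # rises and falls together with spike_a
spike_c = [2.0, 1.0, 0.0, 1.0, 2.0]  # moves opposite to spike_a

corr_ab = pearson(spike_a, spike_b)
corr_ac = pearson(spike_a, spike_c)
```

A value near +1 or -1 indicates strongly coupled motion, while a value near 0 is what independent, uncorrelated motions would tend to give.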
"The paper we published shows the beginnings of how the modes of motion in the spike proteins are correlated," Voth said. He added that the spikes are coupled to each other. When one protein moves, another one also moves in response.
"The ultimate goal of the model would be, as a first step, to study the initial virion attractions and interactions with ACE2 receptors on cells and to understand the origins of that attraction and how those proteins work together to go on to the virus fusion process," Voth said.
Voth and his group have been developing coarse-grained modeling methods on viruses such as HIV and influenza for more than 20 years. They 'coarsen' the data to make it simpler and more computationally tractable, while staying true to the dynamics of the system.
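To make the idea of "coarsening" concrete, here is a minimal sketch of the simplest kind of mapping, in which each group of atoms is replaced by a single bead at the group's center of mass. This is only an illustration under stated assumptions: the function `coarse_grain`, the toy coordinates, and the grouping are hypothetical, and the study's bottom-up approach also derives the interactions between beads from atomistic simulations and experiments, which this sketch does not attempt.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def coarse_grain(
    positions: Sequence[Vec3],
    masses: Sequence[float],
    groups: Sequence[Sequence[int]],
) -> List[Vec3]:
    """Replace each group of atoms with one bead at its center of mass."""
    beads: List[Vec3] = []
    for group in groups:
        total_mass = sum(masses[i] for i in group)
        # Mass-weighted average of the atom coordinates along x, y, and z.
        bead = tuple(
            sum(masses[i] * positions[i][axis] for i in group) / total_mass
            for axis in range(3)
        )
        beads.append(bead)
    return beads


# Toy system (hypothetical values): four atoms mapped onto two beads.
positions = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 4.0, 2.0)]
masses = [1.0, 1.0, 3.0, 1.0]
groups = [[0, 1], [2, 3]]

beads = coarse_grain(positions, masses, groups)
```

Here four atomic positions collapse into two bead positions, which is the source of the computational savings: fewer particles means fewer interactions to evaluate at each simulation step.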
"The benefit of the coarse-grained model is that it can be hundreds to thousands of times more computationally efficient than the all-atom model," Voth explained. The computational savings allowed the team to build a much larger model of the coronavirus than ever before and to simulate it over longer timescales than has been possible with all-atom models.
"What you're left with are the much slower, collective motions. The effects of the higher frequency, all-atom motions are folded into those interactions if you do it well. That's the idea of systematic coarse-graining."
The holistic model developed by Voth started with atomic models of the four main structural elements of the SARS-CoV-2 virion: the spike, membrane, nucleocapsid, and envelope proteins. These atomic models were then simulated and simplified to generate the complete coarse-grained model.
The all-atom molecular dynamics simulations of the spike protein component of the virion system, about 1.7 million atoms, were generated by study co-author Rommie Amaro, a professor of chemistry and biochemistry at the University of California, San Diego.
"Their model basically ingests our data, and it can learn from the data that we have at these more detailed scales and then go beyond where we went," Amaro said. "This method that Voth has developed will allow us and others to simulate over the longer time scales that are needed to actually simulate the virus infecting a cell."
Amaro elaborated on the behavior observed from the coarse-grained simulations of the spike proteins.
"What he saw very clearly was the beginning of the dissociation of the S1 subunit of the spike. The whole top part of the spike peels off during fusion," Amaro said.
This dissociation, which takes place as the virus binds to the ACE2 receptor of the host cell, is one of the first steps of viral fusion.
"The larger S1 opening movements that they saw with this coarse-grained model was something we hadn't seen yet in the all-atom molecular dynamics, and in fact it would be very difficult for us to see," Amaro said. "It's a critical part of the function of this protein and the infection process with the host cell. That was an interesting finding."
Voth and his team used the all-atom dynamical information on the open and closed states of the spike protein generated by the Amaro Lab on the Frontera supercomputer, as well as other data. The National Science Foundation (NSF)-funded Frontera system is operated by the Texas Advanced Computing Center (TACC) at The University of Texas at Austin.
"Frontera has shown how important it is for these studies of the virus, at multiple scales. It was critical at the atomic level to understand the underlying dynamics of the spike with all of its atoms. There's still a lot to learn there. But now this information can be used a second time to develop new methods that allow us to go out longer and farther, like the coarse-graining method," Amaro said.
"Frontera has been especially useful in providing the molecular dynamics data at the atomistic level for feeding into this model. It's very valuable," Voth said.
The Voth Group initially used the Midway2 computing cluster at the University of Chicago Research Computing Center to develop the coarse-grained model.
The membrane and envelope protein all-atom simulations were generated on the Anton 2 system. Operated by the Pittsburgh Supercomputing Center (PSC) with support from the National Institutes of Health, Anton 2 is a special-purpose supercomputer for molecular dynamics simulations developed and provided without cost by D. E. Shaw Research.
"Frontera and Anton 2 provided the key molecular level input data into this model," Voth said.
"A really fantastic thing about Frontera and these types of methods is that we can give people much more accurate views of how these viruses are moving and carrying about their work," Amaro said.
"There are parts of the virus that are invisible even to experiment," she continued. "And through these types of methods that we use on Frontera, we can give scientists the first and important views into what these systems really look like with all of their complexity and how they're interacting with antibodies or drugs or with parts of the host cell."
The type of information that Frontera is giving researchers helps them understand the basic mechanisms of viral infection. It is also useful for designing safer and better medicines to treat and prevent the disease, Amaro added.
Said Voth: "One thing that we're concerned about right now are the UK and the South African SARS-CoV-2 variants. Presumably, with a computational platform like we have developed here, we can rapidly assess those variants, which are changes of the amino acids. We can hopefully rather quickly understand the changes these mutations cause to the virus and then hopefully help in the design of new modified vaccines going forward."
###
The study, "A multiscale coarse-grained model of the SARS-CoV-2 virion," was published on November 27, 2020 in the Biophysical Journal. The study co-authors are Alvin Yu, Alexander J. Pak, Peng He, Viviana Monje-Galvan, and Gregory A. Voth of the University of Chicago; and Lorenzo Casalino, Zied Gaieb, Abigail C. Dommer, and Rommie E. Amaro of the University of California, San Diego. Funding was provided by the NSF through NSF RAPID grant CHE-2029092, NSF RAPID MCB-2032054, the National Institute of General Medical Sciences of the National Institutes of Health through grant R01 GM063796, National Institutes of Health GM132826, and a UC San Diego Moores Cancer Center 2020 SARS-CoV-2 seed grant. Computational resources were provided by the Research Computing Center at the University of Chicago, Frontera at the Texas Advanced Computing Center funded by the NSF grant (OAC-1818253), and the Pittsburgh Supercomputing Center (PSC) through the Anton 2 machine. Anton 2 computer time was allocated by the COVID-19 HPC Consortium and provided by the PSC through Grant R01GM116961 from the National Institutes of Health. The Anton 2 machine at PSC was generously made available by D. E. Shaw Research.