By Alan Cheng
Unprecedented. It describes the current pandemic and its terrible human and economic impact. It also describes the speed and pace of scientific work unraveling the novel virus and enabling drug discovery efforts. That speed and pace is driven in no small part by recent bioinformatics advances.
The first global alert of a novel virus causing severe cases of pneumonia came in late December 2019. Just a couple weeks later, the first genome sequence of this novel virus was completed, enabling scientists to use bioinformatics to identify the novel virus as a beta-coronavirus, and a relative of the SARS and MERS viruses. A phylogenetic analysis found the novel virus most closely resembles the SARS-CoV coronavirus, leading to the official naming of the virus as SARS-CoV-2.
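As a rough illustration of the idea behind "most closely resembles," the sketch below compares a query sequence against a few references and reports the nearest one by p-distance (the fraction of aligned positions that differ). It is a toy under stated assumptions, not the analysis used for SARS-CoV-2: the sequences are made-up placeholders rather than real viral genomes, they are assumed to be already aligned to equal length, and the function names are hypothetical. Real phylogenetic work uses multiple sequence alignment and tree-building methods on full genomes.

```python
def p_distance(a: str, b: str) -> float:
    """Fraction of comparable aligned positions at which two sequences differ.

    Assumes the sequences are already aligned (equal length, '-' for gaps).
    Positions where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return mismatches / compared


def closest_reference(query: str, references: dict) -> tuple:
    """Return (name of the nearest reference, all distances by name)."""
    distances = {name: p_distance(query, seq) for name, seq in references.items()}
    best = min(distances, key=distances.get)
    return best, distances


# Made-up placeholder sequences for illustration only (NOT real viral data).
TOY_REFERENCES = {
    "ref_A": "ATGGCTAGCTAGGATCCA",
    "ref_B": "ATGCCTAGGTAGCATGCA",
    "ref_C": "TTGACTCGCTTGGAACCT",
}
TOY_QUERY = "ATGGCTAGCTAGGATGCA"

if __name__ == "__main__":
    best, distances = closest_reference(TOY_QUERY, TOY_REFERENCES)
    for name, d in sorted(distances.items(), key=lambda item: item[1]):
        print(f"{name}: p-distance = {d:.3f}")
    print("closest reference:", best)
```

With real data, the references would be aligned genomes of known coronaviruses, and a pairwise distance like this would only be a first step before building a tree.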
Naming the virus is one thing. Using bioinformatics approaches again, the proteins produced by the virus were identified, and subsequently made and characterized using experimental molecular biology techniques. The close homology of the SARS-CoV-2 proteins to those from SARS-CoV enabled us to transfer what has been learned about SARS over the last 15+ years to help us understand SARS-CoV-2 and accelerate vaccine and drug discovery efforts around the world. Homology modeling allowed us to rapidly build approximate 3D structures of the proteins and helped suggest existing experimental drugs that were quickly put into clinical trials for combatting SARS-CoV-2 viral entry and replication. Experimental molecular biology and protein structure work, while more time- and resource-intensive, has led to more accurate atomic-resolution 3D structures of key SARS-CoV-2 proteins, including the spike protein, RNA polymerase, and main protease. This molecular understanding allows scientists at biopharmaceutical and academic institutions to efficiently begin to identify molecules that are not only efficacious in stopping viral expansion but also selective enough to be safe, without being so selective that the virus can easily mutate away from drug binding. All of this happened within two months of the genome sequence, with many efforts, especially in drug discovery, ongoing.
There is still a lot unknown about medical aspects of the disease itself, and how SARS-CoV-2 interacts with and affects human host biology. Using proteomics approaches, scientists are identifying human host interactions with the viral proteins. Using genomics and statistical genetics approaches, scientists are analyzing how each of our unique genetic compositions and health situations affects disease progression, which will impact how we can most effectively treat patients. Longer term, the unprecedented speed enabled in no small part by bioinformatics will be important in preventing and treating future epidemics as well.
Developing your bioinformatics skillsets
The importance of bioinformatics in improving human health is growing. For current and prospective Brandeis students, here is how your coursework relates to the approaches being discussed.
- Foundational bioinformatics analysis
RBIF 101: Bioinformatics Scripting and Databases with Python
RBIF 111: Biomedical Statistics with R
- Genomics analysis of viruses and host response
RBIF 109: Biological Sequence Analysis
- Protein homology modeling and structural bioinformatics
RBIF 101: Structural Bioinformatics
- Understanding disease biology
RBIF 102: Molecular Biology, Genetics, and Disease
- Proteomics and expression profiling
RBIF 114: Molecular Profiling and Biomarker Discovery
RBIF 112: Mathematical Modeling for Bioinformatics
- How individual human genetics affects disease progression
RBIF 108: Computational Systems Biology
RBIF 115: Statistical Genetics
RBIF 290: Special Topics: Functional Genomics
- Drug discovery for treatments
RBIF 106: Drug Discovery and Development
RBIF 110: Cheminformatics
Bioinformatics resources
- SARS-CoV-2 strains:
https://www.ncbi.nlm.nih.gov/genbank/sars-cov-2-seqs/
https://www.gisaid.org/
- Phylogenetic tree and global epidemiology:
https://www.gisaid.org/epiflu-applications/next-hcov-19-app/
- Cryo-EM and crystal structures of SARS-CoV-2 proteins:
www.rcsb.org (COVID links in the middle of the page)
- Virus-host proteomic interactions:
https://thebiogrid.org/ (COVID links in the middle of the page)
https://www.ebi.ac.uk/intact/ (COVID links at top of page)
Dr. Alan Cheng is chair of the MS in Bioinformatics program at Brandeis Graduate Professional Studies. In his day job, Alan is a Director at Merck & Co., where he leads a group applying computational and structure-based approaches towards discovery of new therapeutics. He received a PhD from the University of California, San Francisco, and undergraduate degrees from the University of California at Berkeley. All opinions presented here are his own.
Brandeis Graduate Professional Studies is committed to creating programs and courses that keep today’s professionals at the forefront of their industries. To learn more about the MS in Bioinformatics, visit www.brandeis.edu/gps.