The exact, atomic-scale structures of many of the main proteins that play a key role in the Zika virus (ZIKV) lifecycle have yet to be determined by X-ray crystallography or NMR experiments. So far, the cryo-EM structure of the complete virion has been elucidated [1], which provides structures of the glycoproteins E and M.

Recently, the structures of the NS1 [2] and E [3] proteins were determined by X-ray crystallography. Until structures are available for all the proteins of the Zika virus, we will use approximate structures derived from a process called homology modeling.

NS1 protein structure determined by X-ray crystallography

E glycoprotein structure determined by X-ray crystallography

Homology modeling is a computational technique that derives 3D coordinates for a protein from the structure of a homologous protein when no experimental data are available. The process comprises several steps: (1) identification of a template; (2) single or multiple primary sequence alignments; (3) model building for the target based on the 3D structure of the template; (4) model refinement, analysis of alignments, gap deletions and additions; and (5) model validation [4]. In the case of ZIKV proteins, the process involves taking the genetic information for the Zika proteins (the targets) and looking for very similar template proteins from other organisms, such as the dengue virus, West Nile virus, Murray Valley encephalitis virus and Japanese encephalitis virus, for which some of the protein structures are known. These known structures are then used as the basis to build models that likely resemble the Zika proteins. We have recently described the development of multiple ZIKV homology models for the proteins NS5, FtsJ, HELICc, DEXDc, Peptidase S7, NS1, E Stem, Glycoprotein M, propeptide, capsid and glycoprotein E [5].
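As a rough illustration of the template identification step, candidate templates can be ranked by their percent sequence identity to the target once pairwise alignments are available. The sketch below is only an assumption about how such a ranking could be done, not the procedure used to build the models described here. The function names, the template labels and the short sequences are hypothetical toy data (not real viral sequences), and identity is counted over positions where neither sequence has a gap, which is one of several conventions in use.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over aligned positions where neither sequence has a gap."""
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_target, aligned_template):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def rank_templates(alignments):
    """Sort candidate templates from highest to lowest identity to the target."""
    scores = {name: percent_identity(t, s) for name, (t, s) in alignments.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pairwise alignments: name -> (aligned target, aligned template)
toy_alignments = {
    "template_A": ("ACDEFGHIKL", "ACDEFGHIKV"),
    "template_B": ("ACDEFG-HIKL", "ACDQFGWHIRL"),
}

for name, identity in rank_templates(toy_alignments):
    print(f"{name}: {identity:.1f}% identity")
```

In practice the alignments would come from a dedicated alignment tool, and sequence coverage and structure quality would be weighed alongside identity when choosing a template.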

It is important to highlight that our homology models for the NS1 and E structures (available online since March 3rd, 2016) showed good overlap with the NS1 [2] and E [3] structures experimentally determined by X-ray crystallography (published on April 18th and May 2nd, 2016, respectively).

When we aligned the two NS1 structures (alpha carbons, chain A), the average RMSD (root mean square deviation) was 0.818 Å. The alignment between the two E glycoprotein structures (alpha carbons, chain A) was also good, with an average RMSD of 1.860 Å. The overlaps of the protein structures are shown in the figures below.

NS1 crystal structure (purple, alpha carbons, Chain A) aligned with NS1 structure homology model (orange, alpha carbons, Chain A)

E crystal structure (blue, alpha carbons, Chain A) [3] aligned with E structure homology model (magenta, alpha carbons, Chain A)
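For readers unfamiliar with RMSD, the value is the square root of the mean squared distance between matched atoms, here the alpha carbons of chain A. The sketch below assumes that the two structures have already been superposed and that the alpha carbons are paired one to one. The superposition step itself (usually done by a structure alignment program) is not shown, and the function name and coordinates are hypothetical.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the two structures are already superposed and that the
    i-th atom in coords_a corresponds to the i-th atom in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical alpha-carbon coordinates in Å: every model atom is
# displaced by exactly 1 Å from its crystal-structure partner.
crystal = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

print(round(ca_rmsd(crystal, model), 3))  # 1.0
```

Lower values indicate closer agreement between the model and the experimental structure.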

The images below show the selected ZIKV homology model structures (minimized proteins) that had good sequence coverage with template proteins, together with the Ramachandran plot for each protein, showing the Phi and Psi dihedral angles for all amino acids in the protein structure [5]. In the Ramachandran plots, red regions represent the most favored combinations of Phi-Psi values; yellow regions represent additional allowed combinations; beige regions represent generously allowed combinations; and white regions are disallowed combinations.

ZIKV NS5 protein homology model (left) and Ramachandran plots for ZIKV NS5 (right)

ZIKV FtsJ protein homology model (left) and Ramachandran plots for ZIKV FtsJ (right)

ZIKV HELICc protein homology model (left) and Ramachandran plots for ZIKV HELICc (right)

ZIKV DEXDc protein homology model (left) and Ramachandran plots for ZIKV DEXDc (right)

ZIKV Peptidase S7 protein homology model (left) and Ramachandran plots for ZIKV Peptidase S7 (right)

ZIKV NS1 protein homology model (left) and Ramachandran plots for ZIKV NS1 (right)

ZIKV protein E Stem homology model (left) and Ramachandran plots for ZIKV E Stem (right)

ZIKV Glycoprotein M homology model (left) and Ramachandran plots for ZIKV Glycoprotein M (right)

ZIKV Propeptide homology model (left) and Ramachandran plots for ZIKV Propeptide (right)

ZIKV Capsid homology model (left) and Ramachandran plots for ZIKV Capsid (right)

ZIKV Glycoprotein E homology model (minimized proteins) (left) and Ramachandran plots for ZIKV Glycoprotein E (right).
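The Phi and Psi values plotted in the Ramachandran diagrams above are backbone dihedral angles: Phi is defined by the atoms C(i-1), N(i), CA(i), C(i) and Psi by N(i), CA(i), C(i), N(i+1). The sketch below shows one common way to compute a dihedral angle from four points. It is an illustrative assumption, not the software used to produce the plots, and the function names and test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees (-180 to 180) defined by four points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)  # unit vector along the central bond
    # Project b0 and b2 onto the plane perpendicular to the central bond
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(prev_c, n, ca, c, next_n):
    """Phi and Psi for one residue from its backbone atoms and its neighbours."""
    return dihedral(prev_c, n, ca, c), dihedral(n, ca, c, next_n)


# Simple geometric check with hypothetical points: a quarter turn about the z axis
print(round(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)), 1))  # 90.0
```

Each residue with both neighbours present contributes one (Phi, Psi) point to the plot; the first and last residues of a chain lack one of the two angles.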

These models will be used to prioritize compounds to be tested in in vitro experiments. In previous studies, homology models have been successfully used to identify new compounds that were proven to be effective against pathogenic protein targets. World Community Grid projects such as Genome Comparison and Uncovering Genome Mysteries have been helpful in identifying similar proteins.

1. Sirohi, D.; Chen, Z.; Sun, L.; Klose, T.; Pierson, T. C.; Rossmann, M. G.; Kuhn, R. J. The 3.8 Å resolution cryo-EM structure of Zika virus, Science 2016; 352(6284): 467-470. 10.1126/science.aaf5316

2. Song, H.; Qi, J.; Haywood, J.; Shi, Y.; Gao, G. F. Zika virus NS1 structure reveals diversity of electrostatic surfaces among flaviviruses, Nat. Struct. Mol. Biol. 2016; 23(5): 456-458. 10.1038/nsmb.3213

3. Dai, Lianpan et al. Structures of the Zika Virus Envelope Protein and Its Complex with a Flavivirus Broadly Protective Antibody, Cell Host & Microbe 2016; 19(5): 696–704. http://dx.doi.org/10.1016/j.chom.2016.04.013

4. Vyas, V. K.; Ukawala, R. D.; Ghate, M.; Chintha, C. Homology Modeling a Fast Tool for Drug Discovery: Current Perspectives, Indian J Pharm Sci. 2012; 74(1): 1–17. 10.4103/0250-474X.102537

5. Ekins, S.; Liebler, J.; Neves, B. J.; Lewis, W. G.; Coffee, M.; Bienstock, R.; Southan, C.; Andrade, C. H. Illustrating and Homology Modeling the Proteins of the Zika Virus. F1000Research 2016; 5 (0), 275. 10.12688/f1000research.8213.1