Noroviruses constitute a major genus in the family Caliciviridae, which contains icosahedral viruses with a positive-sense single-stranded RNA genome. In humans, these constantly evolving viruses are the cause of sporadic and epidemic gastroenteritis. Despite the lack of a reproducible cell culture system or a small animal model, remarkable progress has been made in our understanding of the molecular biology, immunology, structural biology, and evolution of human noroviruses. This understanding is further enhanced by studies of nonhuman noroviruses and animal caliciviruses that are cultivatable. The main focus of this chapter is to review our current understanding of the structural biology of noroviruses in particular and of caliciviruses in general, with an emphasis on the unique modular organization of the capsid that allows for strain-dependent variations in glycan recognition and antigenicity to facilitate sustained virus evolution. Finally, we review the structures of the proteins that are critical for virus replication and that can be targeted in the design of small molecule drugs for use as effective antivirals.
Structural Biology of Noroviruses
Abstract
Chapter 3.1
S. Shanker*
Z. Muhaxhiri*
J.-M. Choi*
R.L. Atmar**
norovirus
calicivirus
T=3 capsid assembly
capsid-receptor interactions
histo-blood group antigens
viral protease
viral polymerase
VPg
Noroviruses (NoVs) constitute one of the five genera in the Caliciviridae family (Ramani et al., 2014; Green et al., 2000). They are the leading cause of epidemic acute gastroenteritis (Ahmed et al., 2014). It is estimated that these viruses are responsible for ∼20 million total illnesses with a disease burden of ∼2 billion dollars in the United States alone each year, and ∼200,000 annual deaths of children under the age of 5 years worldwide (Patel et al., 2008). NoVs exhibit considerable genetic diversity. Based on phylogenetic analyses, NoVs are classified into six genogroups (GI-GVI), and they are further subdivided into genotypes (designated with an Arabic numeral) within each genogroup. While the genogroups GI, GII, and GIV predominantly contain human strains, the other genogroups only contain animal strains (Zheng et al., 2006). Epidemiological studies indicate that the NoVs belonging to genogroup II, genotype 4 (GII.4) are the most prevalent and account for up to 70-80% of the outbreaks worldwide (Ramani et al., 2014; Kroneman et al., 2008). These GII.4 NoVs undergo epochal evolution, similar to A/H3N2 influenza virus strains, with the emergence of new variants every 2 years coinciding with a new epidemic peak (Siebenga et al., 2007; Donaldson et al., 2008; Lindesmith et al., 2012). Recent epidemiological studies also show a considerable increase in the prevalence of GI outbreaks worldwide, with different genotypes, such as GI.4, GI.6, GI.3, and GI.7 predominating in different geographical regions (Vega et al., 2014; Grytdal et al., 2015). Several studies have demonstrated that susceptibility to many NoVs is determined by genetically controlled expression of histo-blood group antigens (HBGAs), which are also critical for NoV attachment to host cells (Ruvoen-Clouet et al., 2013) (see Chapter 3.3). 
Consistent with their high genetic diversity, these viruses exhibit extensive strain-dependent variation in the recognition of HBGAs, which, together with antigenic variation, allows for their sustained evolution. The preponderance of global NoV outbreaks, together with the recognition of new genogroups and the rapid emergence of new variants within each genogroup, signifies a major health concern, particularly considering the current lack of effective antiviral strategies, whether vaccines or small molecule drugs.
Members of the Caliciviridae, including NoVs, are nonenveloped, icosahedral viruses typically 380–400 Å in diameter. The genome consists of a linear, positive-sense, single-stranded RNA of 7.4 to 8.3 kb in size with a covalently linked VPg at the 5′ end and a polyadenylated tail at the 3′ end (Green, 2007; Thorne and Goodfellow, 2014). Caliciviruses exhibit two distinct types of genome organization. In the members of the Norovirus and Vesivirus genera, the genome is organized into three open reading frames (ORFs), whereas in the Sapovirus, Lagovirus, and Nebovirus genera, the genome is organized into two ORFs (Thorne and Goodfellow, 2014; Smiley et al., 2002). In all cases, however, the calicivirus RNA encodes a large polyprotein, the major capsid protein VP1 (55–70 kDa), and a basic minor structural protein VP2 (Bertolotti-Ciarlet et al., 2003; Sosnovtsev et al., 2005). In the Norovirus and the Vesivirus genera, the large polyprotein, VP1 and VP2 are encoded separately by ORF1, ORF2, and ORF3, respectively. In contrast, in the Sapovirus, Lagovirus, and Nebovirus genera, the polyprotein and the major capsid protein VP1 are contiguously encoded by ORF1, and VP2 is encoded by the ORF2. In all caliciviruses, the polyprotein is posttranslationally processed by the viral protease, which itself is a component of the polyprotein, into nonstructural proteins (NSPs) that are essential for virus replication. In NoVs, these NSPs include p48, p41 (NTPase), p22, VPg, protease, and RNA-dependent RNA polymerase (RdRp) (Thorne and Goodfellow, 2014).
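The two genome layouts described above can be summarized as a small lookup table. The following Python sketch is only an illustration of that summary; the names `ORF_LAYOUT` and `orf_encoding` are hypothetical and chosen for this example.

```python
# Illustrative lookup of calicivirus ORF layouts, as described in the text.
# ORF_LAYOUT and orf_encoding are hypothetical names used only in this sketch.
THREE_ORF = {"ORF1": ("polyprotein",), "ORF2": ("VP1",), "ORF3": ("VP2",)}
TWO_ORF = {"ORF1": ("polyprotein", "VP1"), "ORF2": ("VP2",)}

ORF_LAYOUT = {
    "Norovirus": THREE_ORF,
    "Vesivirus": THREE_ORF,
    "Sapovirus": TWO_ORF,
    "Lagovirus": TWO_ORF,
    "Nebovirus": TWO_ORF,
}


def orf_encoding(genus: str, product: str) -> str:
    """Return the ORF that encodes `product` in the given genus."""
    for orf, products in ORF_LAYOUT[genus].items():
        if product in products:
            return orf
    raise KeyError(f"{product!r} is not listed for {genus}")


print(orf_encoding("Norovirus", "VP2"))  # ORF3
print(orf_encoding("Sapovirus", "VP1"))  # ORF1
```

The table makes the key difference easy to see: VP1 has its own ORF only in the three-ORF genera, whereas in the two-ORF genera it is encoded contiguously with the polyprotein.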
The capsid organization of NoVs and several other caliciviruses has been studied either by cryo-EM or by X-ray crystallographic techniques (Chen et al., 2004; Prasad et al., 1994a,b; Kumar et al., 2007; Katpally et al., 2008; Wang et al., 2013). The structures of recombinant NoV (rNoV) particles from different genogroups, murine NoV (MNV), and three animal caliciviruses are known. Since the human NoVs are so far resistant to growth in cell culture, recombinant virus-like particles (VLPs) have been produced by the coexpression of VP1 and VP2, preserving the morphological and antigenic features of the authentic virions, for use in structural studies. The first crystallographic structure of a calicivirus capsid was that of recombinant Norwalk virus (rNV), which is a GI.1 NoV (Prasad et al., 1999) (Fig. 3.1.1A). Since then, crystallographic structures of San Miguel Sealion virus (SMSV) (Chen et al., 2006) (Fig. 3.1.1B) and feline calicivirus (FCV) (Ossiboff et al., 2010) (Vesivirus genus), derived from authentic virions, and a GII.4 (HOV strain) recombinant NoV (rHOV) capsid (manuscript in preparation) have been determined. All these structural studies have consistently shown that calicivirus capsids, irrespective of the genera, have a similar capsid architecture with T=3 icosahedral symmetry (Fig. 3.1.1C), formed by 90 dimers of VP1 (Fig. 3.1.1D).
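The figure of 90 VP1 dimers follows from standard icosahedral bookkeeping. The short Python sketch below assumes the Caspar-Klug relation T = h² + hk + k² and 60T subunits per capsid, neither of which is spelled out in the text above; the function names are hypothetical.

```python
def triangulation_number(h: int, k: int) -> int:
    """Caspar-Klug triangulation number T = h^2 + h*k + k^2 (assumed relation)."""
    return h * h + h * k + k * k


def capsid_counts(t: int) -> dict:
    """Subunit and dimer counts for an icosahedral capsid with 60*T subunits."""
    subunits = 60 * t
    return {"T": t, "subunits": subunits, "dimers": subunits // 2}


# (h, k) = (1, 1) gives T = 3, the lattice reported for calicivirus capsids.
counts = capsid_counts(triangulation_number(1, 1))
print(counts)  # {'T': 3, 'subunits': 180, 'dimers': 90}
```

With T=3 the capsid contains 180 copies of VP1, which pair into the 90 dimers stated above.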
The capsid protein has a modular domain organization with an N-terminal arm (NTA) that is important for directing capsid assembly, followed by a shell (S) domain that is important for stabilizing the icosahedral scaffold (Bertolotti-Ciarlet et al., 2002), and a protruding (P) domain emanating from the icosahedral shell that is further divided into P1 and P2 subdomains (Figs. 3.1.1D–F). The S and P domains are linked by a flexible hinge. The P1 subdomain is formed by two noncontiguous segments within the P domain, whereas the P2 subdomain facing the exterior is formed by the intervening segment. The polypeptide fold in each of these domains is also essentially conserved among calicivirus structures. The S domain exhibits an 8-stranded antiparallel β-barrel motif that is typically observed in T=3 icosahedral viruses (Prasad and Schmid, 2012). The fold of the P1 subdomain, consisting of three β-strands in the N-terminal portion, a twisted antiparallel β-sheet formed by four strands in the C-terminal portion, and a well-defined α-helix, is novel and only seen in calicivirus structures (Fig. 3.1.2A). The fold of the P2 subdomain is a β-barrel of six antiparallel strands connected by loops of various lengths. Despite the similar structural characteristics among the members of Caliciviridae, there are significant variations within the capsid protein structure providing insight into how the unique modular organization of the capsid protein is conducive to the wide diversity and host specificity associated with this family of viruses. Comparisons of the calicivirus capsid protein sequences indicate that the S domain is highly conserved and the P1 subdomain is moderately conserved, whereas the distally located P2 subdomain is highly variable.
In the T=3 icosahedral lattice, the capsid protein is located in three quasi-equivalent positions, conventionally designated A, B, and C, which constitute the icosahedral asymmetric unit (Fig. 3.1.1C). The A subunits surround the icosahedral fivefold axis, whereas B and C subunits alternate around the icosahedral threefold axis giving rise to quasi sixfold symmetry. In the calicivirus capsid, as in other T=3 icosahedral capsids, A, B, and C subunits form two types of quasi-equivalent dimers, A/B dimer related by the quasi twofold axis (A/B2) and C/C dimer related by the strict icosahedral twofold axis (C/C2) (Fig. 3.1.2C). A common feature is that the A/B dimer has a “bent” conformation, whereas the C/C dimer has a “flat” conformation. Such a dual conformation imparts the necessary curvature for the formation of the T=3 icosahedral capsid. In many of the structurally characterized T=3 capsids, the NTA is implicated in providing a switch by undergoing an order to disorder transition to facilitate the bent A/B and the flat C/C conformations during T=3 capsid assembly (Harrison, 2001; Rossmann and Johnson, 1989). In these T=3 virus capsids, including rNV and rHOV, only one of the three NTAs of the quasi-equivalent subunits is ordered. In the case of rNoV structures, while the NTA of the B subunit is ordered to a larger extent, the equivalent regions in the A and C subunits are disordered (Fig. 3.1.1F). The ordered NTA portion of the B subunit interacts with the base of the S domain of the neighboring C subunit to stabilize the flat conformation of the C/C dimer (Fig. 3.1.1F). The equivalent interactions involving the NTA are not observed in the A/B dimer which adopts a bent conformation (Prasad et al., 1999). In contrast, although serving the same purpose, an ordered NTA of the C subunit provides a switch in the T=3 plant tombus- and sobemoviruses instead of the NTA of the B subunit as observed in rNoV capsids (Harrison, 2001; Rossmann and Johnson, 1989). 
In nodaviruses, which also exhibit a T=3 capsid organization, in addition to an ordered NTA of the C subunit, a piece of genomic RNA keeps the “flat” conformation of the C/C dimers (Fisher and Johnson, 1993). Interestingly, the SMSV and FCV capsid structures exhibit a novel and distinct variation from any of these viruses. In these structures, the NTAs of all three subunits are equally ordered, essentially maintaining the T=3 symmetry at this level. Instead of an order-to-disorder transition of the NTA, a distinct conformational change involving a Pro residue in the B subunit that leads to the formation of a ring-like structure around the fivefold axis appears to provide a switch (Chen et al., 2006; Ossiboff et al., 2010). Whether these unique NTA interactions found only in SMSV and FCV are influenced by the genome or the proteolytic processing of the capsid protein, a common feature in the members of Vesivirus genus, remains a question.
Like capsid proteins in other T=3 icosahedral viruses with distinct S and P domains, such as plant tombusviruses (Harrison, 2001), calicivirus VP1 has a flexible hinge between these two domains (Fig. 3.1.1D, denoted by h and an arrow). This hinge, which facilitates the interactions between the P1 subdomain and the upper portion of the S domain, is likely important for locking the A/B and C/C dimers in their appropriate conformations, as this interaction is seen only in the A/B dimers and not in the C/C dimers (Prasad et al., 1999; Chen et al., 2006; Ossiboff et al., 2010). Although such an interaction between the P1 and S domains is conserved in rNoV, SMSV, and FCV crystal structures as a structural requirement, the relative orientations between these two domains, likely because of the sequence changes, are noticeably altered. In the animal calicivirus structures, including SMSV, FCV, and the recently determined high-resolution cryo-EM structure of rabbit hemorrhagic disease virus (RHDV) (Wang et al., 2013), this change in S–P1 orientation together with a compensatory change in the P1–P2 orientation causes only the distal P2 subdomain to participate in the dimeric interactions. This is in contrast to the rNoV structures, in which both P1 and P2 subdomains participate in the dimeric interactions. A conserved glycine located at the junction of P1 and P2 is suggested to allow this compensatory change in the P1–P2 orientation. Thus, in addition to the flexibility between the S and P1 domains, there is an additional point of flexibility between the P1 and P2 subdomains that was not immediately apparent from the rNoV structures alone. The multiple points of flexibility could be an important factor in enhancing calicivirus diversity through structural variations within the context of a similar domain organization, somewhat akin to the interdomain flexibility seen in antibody structures with a hinge and an elbow.
In addition to structures of the calicivirus capsids, in recent years, structures of the recombinant P domain of different NoV genogroups and genotypes and of RHDV have been determined (Cao et al., 2007; Choi et al., 2008; Hansman et al., 2011; Kubota et al., 2012; Shanker et al., 2011; 2014; Taube et al., 2010). All these P domain structures show a similar dimeric conformation to that observed in the capsid structures (Choi et al., 2008). While all of the NoV P domain dimer structures determined to date consistently show both P1–P1 and P2–P2 dimeric interactions as observed in rNoV capsid (Prasad et al., 1999), the RHDV P domain dimer (Wang et al., 2013), as in SMSV (Chen et al., 2006) and the FCV capsid (Ossiboff et al., 2010) structures, only exhibits P2–P2 interactions. Thus, dimer-related P2–P2 interactions are a common feature in all of the caliciviruses for which structures have been determined. The underlying functional significance of this structural conservation across the caliciviruses is unclear.
Comparison of the calicivirus sequences clearly indicates that the region that forms the P2 subdomain exhibits the most variability, consistent with its role in host specificity, antigenicity, and receptor interactions. Accordingly, comparison of the animal calicivirus and norovirus capsid and P domain structures shows that the P2 subdomain exhibits the most structural variability (Fig. 3.1.2B–D). Despite significant variation in the sequences, the basic polypeptide fold in the P2 subdomain, with six antiparallel β-strands forming a β-barrel structure, is conserved in all the calicivirus capsid structures so far determined (Fig. 3.1.2A). The main differences are in the orientations and the extension of the surface-exposed loops that connect these β-strands. These loops extend sideward from the dimeric surface with the conserved β-barrel structure at the dimeric interface. With shorter loops, the NV P2 subdomain exhibits the most compact structure, whereas the P2 subdomains of SMSV and FCV exhibit the most elaborate loop structures. The loop regions in the P2 subdomain of GII.4 are also more elaborate compared to GI.1 NV, exemplifying the intergenogroup differences in NoVs that play a role in differential glycan recognition as described in the next section. FCV is the only calicivirus for which a protein receptor has been identified (Makino et al., 2006). The cryo-EM structure of the FCV–fJAM-1 complex clearly shows that the receptor fJAM-1 (feline junctional adhesion molecule 1) binds to the P2 subdomain (Bhella et al., 2008). Known neutralization epitopes in the FCV capsid map to the loops in the P2 subdomain (Chen et al., 2006). Thus, crystallographic analyses of caliciviruses of different genera and genotypes demonstrate how the P2 subdomain in these viruses accommodates significant sequence alterations within the context of a conserved polypeptide fold to achieve antigenic diversity and strain-dependent receptor recognition.
Based on the observation that dimeric interactions of the capsid protein particularly involving the P domain are a common feature in all the caliciviruses, a reasonable assumption is that the VP1 dimer is the building block for the assembly. The VP1 dimer may exist in a dynamic equilibrium between “bent” and “flat” conformations in solution prior to assembly, and during the assembly process may switch to appropriate conformations directed by the NTA arms. The role of the NTA arms and also that of dimeric interactions in the capsid assembly is substantiated by a systematic structure-directed mutational analysis of the NV VP1 (Bertolotti-Ciarlet et al., 2002). Based on the calicivirus capsid structures (Chen et al., 2006) and mass spectrometric analysis of rNV capsid and its pH-induced dissociation/association intermediates (Baclayon et al., 2011; Shoemaker et al., 2010), it is plausible that capsid assembly proceeds through the formation of trimers of dimers that are then brought together into an icosahedral structure.
In the context of the capsid assembly and perhaps more particularly in the genome encapsidation, another important factor to be considered is the minor protein VP2. All calicivirus genomes encode this protein (Green et al., 2000), which is highly basic in nature. The association of VP2 with the infectious virus particles has been demonstrated in NV (Glass et al., 2000) as well as FCV (Sosnovtsev and Green, 2000) and is suspected to be present in all caliciviruses. The role of VP2, particularly in enhancing the capsid stability and size homogeneity, is clearly evident from the dynamic light scattering experiments of the NoV VLPs produced by the expression of VP1 alone and in comparison with those obtained from the coexpression of both VP1 and VP2 (Bertolotti-Ciarlet et al., 2003). Studies on FCV also suggest VP2 stabilizes the icosahedral capsid and is required for the production of infectious virus (Sosnovtsev et al., 2005). However, VP2 is not visualized in any of the calicivirus capsid structures, including the crystal structures of the rNoV particles produced by the coexpression of VP1 and VP2. This is likely because of the substoichiometric proportion of VP2 (suspected to be ∼2–8 molecules per virion) which does not allow VP2 to interact with VP1 conforming to the icosahedral symmetry and causes it to be transparent in the structures in which the icosahedral symmetry is explicitly used for structure determination.
Recently, cotransfection studies of NoV VP1 and VP2 in mammalian cells accompanied by systematic mutational analysis demonstrated that VP2 directly interacts with the Ile52 residue of a highly conserved IPPWI motif in the NTA (Vongpunsawad et al., 2013). Although direct interaction with VP1 involves only a small region, this region inside the capsid is surrounded by a stretch of negatively charged residues, suggesting that highly basic VP2 may nonspecifically interact with the interior of the VP1 capsid spanning a larger area to influence both the stability and the size homogeneity of the capsid. Given that the calicivirus capsid lacks an abundance of basic residues in the interior surface, the bound VP2 may counteract the electrostatic repulsion between the RNA and capsid and help stabilize the encapsidated genome in infectious virions. By directly interacting with VP1, VP2 may also play a critical role in encapsidating the genome during capsid assembly, providing a rationale for the observation that VP2 is required for the production of infectious particles in the studies on FCV (Sosnovtsev et al., 2005).
Susceptibility to some NoVs is associated with the expression of genetically determined histo-blood group antigens (HBGAs) (Ramani et al., 2014; Ruvoen-Clouet et al., 2013; Imbert-Marcille et al., 2014). These glycoconjugates, found in mucosal secretions and on epithelial cells (Hakomori, 1999), appear to function as initial receptors or coreceptors for human NoVs (Marionneau et al., 2002; Lindesmith et al., 2003; Hutson et al., 2004; Tan and Jiang, 2005). HBGAs are oligosaccharide epitopes with varying carbohydrate compositions and linkages between them (Marionneau et al., 2001). It has been proposed that human NoVs exploit the polymorphic nature of HBGAs in the host population to counter herd immunity during their evolution. (See also Chapter 3.3).
HBGAs are glycans that include the determinants of secretor-status and blood type of an individual (Ruvoen-Clouet et al., 2013). They are synthesized by the sequential addition of a monosaccharide to a terminal precursor disaccharide by genetically controlled expression of certain glycosyl-transferases. Depending upon the composition of the disaccharide backbone and the linkage, they are grouped into four types: type 1 (Galβ1-3GlcNAcβ), type 2 (Galβ1-4GlcNAcβ), type 3 (Galβ1-3GalNAcα), and type 4 (Galβ1-3GalNAcβ). Although several studies have suggested that the secretor-positive status is a susceptibility factor for the majority of NoVs (Lindesmith et al., 2003; Hutson et al., 2005), recent epidemiological studies indicate that nonsecretors are susceptible to some GI and GII NoVs (Currier et al., 2015). The secretor-positive status of an individual is determined by expression of a functional FUT2 enzyme, a fucosyl-transferase that catalyzes the addition of an α fucose (SeFuc) to the β galactose (β Gal) of the disaccharide precursor to form the secretor epitope or the H-type HBGA. The H-type HBGA can be further modified by enzymes A or B, which add N-acetyl galactosamine (GalNAc) or Gal, respectively, to the precursor β Gal to form A- or B-type HBGA (Ruvoen-Clouet et al., 2013). Similarly, the Lewis-positive status is determined by the activity of the fucosyl transferase 3 (FUT3) enzyme, which adds an α fucose (LeFuc) to the N-acetylglucosamine (GlcNAc) of the precursor disaccharide to form the Lewis epitope. Thus, the sequential addition of carbohydrate moieties by FUT2 and FUT3, along with enzymes A and B, gives rise to the secretor/nonsecretor Lewis and ABH families of HBGAs (Imbert-Marcille et al., 2014; Marionneau et al., 2005).
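The enzyme-by-enzyme logic above can be sketched as a small function. This is a deliberately simplified model, not a description of glycan biosynthesis in full: it assumes, following the text, that enzymes A and B act only on the H-type product of FUT2, and that FUT3 acts independently on the GlcNAc of the precursor. The function name and its parameters are hypothetical.

```python
from typing import Iterable, List


def hbga_modifications(fut2_active: bool, fut3_active: bool,
                       abo_enzymes: Iterable[str] = ()) -> List[str]:
    """List the modifications made to a precursor disaccharide.

    Simplified model (an assumption of this sketch): enzymes A and B
    require the H-type epitope made by FUT2; FUT3 acts independently.
    """
    enzymes = set(abo_enzymes)
    unknown = enzymes - {"A", "B"}
    if unknown:
        raise ValueError(f"unknown enzyme(s): {sorted(unknown)}")

    mods = []
    if fut2_active:  # secretor-positive
        mods.append("H-type: SeFuc on beta-Gal (FUT2)")
        if "A" in enzymes:
            mods.append("A-type: GalNAc on beta-Gal (enzyme A)")
        if "B" in enzymes:
            mods.append("B-type: Gal on beta-Gal (enzyme B)")
    if fut3_active:  # Lewis-positive
        mods.append("Lewis: LeFuc on GlcNAc (FUT3)")
    return mods


# Secretor-positive, Lewis-positive individual expressing enzyme A
for mod in hbga_modifications(True, True, ["A"]):
    print(mod)

# Nonsecretor, Lewis-positive: only the Lewis fucose is added
print(hbga_modifications(False, True, ["A"]))
```

In this model a nonsecretor never gains A- or B-type epitopes regardless of which A or B enzymes are expressed, because the H-type precursor is never formed.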
Several studies using NoV VLPs with saliva, red blood cells, and synthetic carbohydrates demonstrate direct interaction between NoV VLPs and HBGAs (Tan and Jiang, 2005; Hutson et al., 2003; Huang et al., 2005; Shirato et al., 2008) and show that the specificity to HBGAs varies not only within a particular genogroup but also between the genogroups.
A typical strategy that is used to understand the structural basis of the HBGA interactions in NoVs is to determine the X-ray structure of the recombinantly expressed P domain of the NoV in complex with the HBGA (Cao et al., 2007; Choi et al., 2008; Tan et al., 2004). Following this strategy, in recent years there has been an explosion of crystallographic structures of the P domain of various NoVs in complex with a variety of HBGAs (Cao et al., 2007; Choi et al., 2008; Hansman et al., 2011; Kubota et al., 2012; Shanker et al., 2011, 2014; Prasad et al., 2014; Jin et al., 2015; Atmar et al., 2015). In addition to revealing that the HBGA interaction exclusively involves distally exposed regions of the hypervariable P2 subdomain, these studies have shown how the HBGA binding sites between the genogroups differ and how the sequence variations in the P2 subdomain within each genogroup affect the HBGA specificity.
A striking observation from the crystallographic studies is that the binding sites in GI and GII NoVs are distinctly different in their locations, structural characteristics, and in the way the carbohydrate residues of the HBGA interact (Fig. 3.1.3A). In GI NoVs, while the majority of interactions with HBGA are localized within each subunit of the P domain dimer, in GII NoVs, they are shared between the opposing subunits of the P dimer (Fig. 3.1.3B). Another distinguishing feature is that in GI, the majority of the interactions with HBGA primarily involve a Gal moiety (Fig. 3.1.3C), whereas in GII, the interactions are centered on the Fuc residue (Fig. 3.1.3D). An exceptionally well-conserved feature in GI viruses is the hydrophobic interaction between the SeFuc moiety (as in the H-type) or the N-acetyl arm of N-acetylgalactosamine (as in the A-type) with a conserved Trp residue in the P2 subdomain (Fig. 3.1.3C). This combinatorial requirement of Gal and hydrophobic interactions appears to restrict the types of HBGAs that GI NoVs can bind. Several studies consistently show that most GI NoVs do not bind B-type HBGAs. A likely explanation is as follows: although B-type HBGA has a terminal Gal residue, it lacks an additional group, such as SeFuc present in the H-type or an N-acetyl arm present in the A-type, that could engage in the hydrophobic interactions, resulting in lower affinity. HBGA binding in GII NoVs does not involve such a combinatorial requirement, allowing them to bind all ABH HBGAs. This is likely one of the reasons why GII NoVs, particularly GII.4, are globally more prevalent.
Sequence alterations in the P domain within the genogroup members also contribute to the variation of HBGA binding profiles (Prasad et al., 1999; Cao et al., 2007; Choi et al., 2008; Hansman et al., 2011; Kubota et al., 2012; Shanker et al., 2011; 2014; Bu et al., 2008; Chen et al., 2011). A generalizable concept from the crystallographic studies of the GI and GII P domains in complex with various HBGAs is that HBGA binding in human NoVs involves two sites. The first site is formed by the highly conserved residues in the less flexible regions of the P2 subdomain, which preserves the Gal and Fuc dominant nature of interactions in GI and GII, respectively. Minor variations in this site could result in differences in the HBGA binding affinity. The second site is formed by the less conserved residues that are typically from the loop regions surrounding the first site. This site allows for differential binding to Lewis HBGAs in both GI and GII as discussed later.
In GI NoVs, sequence changes differentially alter their ability to bind nonsecretor monofucosyl Lewis HBGA (Lea/x) as observed in GI.4, GI.6, GI.3, GI.2, and GI.7 NoVs (Lindesmith et al., 2012; Vega et al., 2014; Grytdal et al., 2015; Ruvoen-Clouet et al., 2013; Green, 2007). The GI.1, in contrast, does not bind Lea/x. Crystallographic structures of GI.7 and GI.2 P domain with various HBGAs including Lea have been determined (Shanker et al., 2014; Kubota et al., 2012). Comparative analysis of these crystal structures with GI.1 shows that, while the Gal binding site remains invariant, genotypic sequence variations profoundly alter the loop structures to allow differential HBGA specificity and possibly antigenicity. Based on such comparative analyses, it is suggested that the threshold length and structure of one of the loops, the P loop, is the critical determinant for Lea binding (Fig. 3.1.3E). The comparative analysis further showed significant differences in loops A and B. These two loops in GI.7 are significantly more separated in a distinctly “open” conformation in contrast to a “closed” conformation in GI.1 and GI.2 P domains. Interestingly, in the GI.1 NV, the B loop contains a residue critical for binding of HBGA blocking antibodies (Chen et al., 2013), and the corresponding loop in the P domains of murine NoV (genogroup V) (Katpally et al., 2008; Taube et al., 2010) and rabbit hemorrhagic disease virus (animal calicivirus) (Wang et al., 2013) contains the neutralization antigenic sites. Thus, this region is potentially a major site for differential antigenic presentations contributing to serotypic differences among the GI variants.
Similarly, in GII NoVs, including GII.4 variants that are suggested to undergo epochal evolution, while the first site involved in the interactions with Fuc is well conserved, the second site is susceptible to genotypic or temporal alterations and allows for differential binding of difucosyl Lewis HBGAs. A fascinating example is provided by the comparative analysis of the P domain structures of four GII.4 variants in which structural changes in the T-loop of the P2 subdomain, due to temporal sequence changes, modulate the binding strength of the difucosyl Lewis HBGAs between the variants (Shanker et al., 2014; Bu et al., 2008; Singh et al., 2015) (Fig. 3.1.3F). Interestingly, these crystallographic studies have also revealed a novel variation. In contrast to the three GII.4 variants (1996, 2004, and 2006) in which the first site interacts with the SeFuc, in the 2012 variant, the first site is involved in anchoring the LeFuc residue, which is also observed in the GII.9 NoV. These studies suggest that the epochal evolution of GII.4 is driven by differentially targeting secretor-positive, Lewis-positive individuals (de Rougemont et al., 2011).
Another important observation from the crystallographic studies of the P domain of GII.4 NoVs is that the temporal sequence changes contribute to distinct differences in the electrostatic landscape of the P2 subdomain, likely reflecting antigenic variations (Shanker et al., 2011). Some of these changes are in close proximity to the HBGA binding sites suggesting a coordinated interplay between antigenicity and HBGA binding in epochal evolution (Shanker et al., 2011). Despite the lack of a cell culture system and an efficient small animal model system for human NoVs, surrogate HBGA blockade assays with human antibodies, in lieu of neutralization assays, have shown how the variations within GII.4 variants affect antigenic profiles (Lindesmith et al., 2008, 2011, 2012). Circulating antibodies that block HBGA binding correlate with protection in chimpanzees (Bok et al., 2011). The importance of such surrogate neutralizing antibodies is further underscored by recent studies showing that circulating antibodies that block HBGA binding correlate with protection from NoV-associated illness (Atmar et al., 2011; Reeck et al., 2010). Although the effect of sequence changes on HBGA binding has been structurally well characterized, currently no structural studies have been reported on how HBGA-blocking antibodies interact with NoV strains.
The ORF1 in NoVs encodes a polyprotein that is proteolytically processed by the virus-encoded protease into at least six NSPs (Thorne and Goodfellow, 2014). These NSPs (from N- to C-terminus of the polyprotein) include p48 of unknown function; p41, an NTPase with AAA+ sequence motifs similar to the picornavirus enzyme 2C; p22, which shares sequence similarities with picornavirus 3A, with a possible function as an antagonist of Golgi-dependent cellular protein secretion (Sharp et al., 2012); VPg, which is covalently linked to the viral RNA; a protease that is analogous to the picornavirus 3C enzyme; and an RNA-dependent RNA polymerase (RdRp) orthologous to picornavirus 3Dpol (Pfister and Wimmer, 2001; Sosnovtsev et al., 2002; Hardy, 2005; Clarke and Lambden, 2000). In recent years, substantial progress has been made in understanding the structure and function of three of these proteins, VPg, protease, and RdRp, as discussed later. (See also Chapter 3.2).
Similar to picornaviruses and as demonstrated in animal caliciviruses, VPg (by inference in NoVs) is covalently linked to the genomic RNA (Thorne and Goodfellow, 2014). However, unlike picornavirus VPg, which is only ∼2 kDa in size, VPg in caliciviruses is significantly larger (12–15 kDa). The calicivirus genomes do not have a capped 5′-end, like cellular RNAs, or an internal ribosome entry site (IRES) as in picornaviral RNA, for RNA translation. Studies on animal caliciviruses and NoVs have attributed a dual role to this protein. First, VPg acts as a “cap substitute” and mediates translation initiation of viral RNA, based on the observations that m7-GTP cap can substitute for VPg to confer infectivity in vitro to synthesized FCV RNA (Sosnovtsev and Green, 1995), and that it can bind directly to initiation factor eIF4E (Daughenbaugh et al., 2003; 2006; Goodfellow et al., 2005). Second, analogous to picornavirus VPg, the NoV VPg has a priming function during RNA replication, based on the observation that VPg is uridylylated at a conserved Tyr residue by the viral RdRp followed by elongation in the presence of RNA (Belliot et al., 2008; Han et al., 2010; Mitra et al., 2004; Royall et al., 2015; Chung et al., 2014).
Currently, there is no structural information on full-length VPg for any calicivirus. However, NMR structures of the central core of VPg, consisting of ∼55 residues, from FCV (Leen et al., 2013), porcine sapovirus (PSV) (Hwang et al., 2015), and murine NoV (MNV) (Leen et al., 2013) have been determined. The N-terminal and the C-terminal regions flanking the central core are considered mostly disordered. The VPg core structures of FCV and PSV are very similar, with a well-defined three-helical bundle, whereas that of MNV consists of only two of these helical segments (Fig. 3.1.4A). The Tyr residue within a conserved DDEYDEW motif is suggested to function as a nucleotide acceptor for viral replication and translation (Belliot et al., 2008; Han et al., 2010; Mitra et al., 2004). In all three structures, the location of this residue, which is fully solvent exposed, is conserved. Although there are currently no structural studies on a calicivirus VPg-RdRp complex, crystallographic structures of the VPg-RdRp complex of foot-and-mouth disease virus (a picornavirus), both in the presence and absence of oligoadenylate substrate, have been determined and provide mechanistic details of how VPg interacts with RdRp in carrying out its priming function (Ferrer-Orta et al., 2006). Given that calicivirus VPg is significantly larger than picornavirus VPg, it remains to be seen how many of the structural details of the VPg-RdRp interactions are shared in caliciviruses.
The role of the NoV protease in processing the polyprotein, the primary cleavage sites, and the order in which it cleaves the polyprotein have been firmly established. The proteolytic processing of the polyprotein occurs in a sequential manner as a mechanism to regulate viral genome expression and replication (Belliot et al., 2003; Hardy et al., 2002). The protease cleaves the polyprotein at five sites with three different cleavage junctions (Gln/Gly, Glu/Gly, and Glu/Ala), first cleaving the two Gln/Gly junctions between p48/p41 and p41/p22, followed by the Glu/Gly (VPg/Pro) and Glu/Ala (Pro/RdRp) junctions (Sosnovtsev and Green, 2000; Hardy et al., 2002; Sosnovtsev et al., 2006). These sites exhibit significant variations in the amino acid composition on both the N- (P5–P2) and C-terminal (P2′–P5′) sides flanking the scissile bond (P1/P1′) (Fig. 3.1.4B). Mutational analysis has shown that residues surrounding the cleavage sites contribute to proteolytic efficiency (Hardy et al., 2002). An interesting question is how the protease recognizes such nonhomologous sites within the polyprotein with differential affinities.
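To make the P1/P1′ rule concrete, the following minimal Python sketch lists positions in a sequence where adjacent residues match one of the three junction types named above. The function name `candidate_junctions` and the toy sequence are hypothetical and introduced only for illustration. A matching dipeptide is at best a necessary condition: as the text notes, flanking residues (P5–P2 and P2′–P5′) modulate cleavage efficiency, so most matches in a real polyprotein would not be authentic cleavage sites.

```python
# Illustrative sketch only: flags dipeptides that match the three P1/P1'
# junction types named in the text (Gln/Gly, Glu/Gly, Glu/Ala).
# It does NOT predict real cleavage; flanking residues are ignored.
JUNCTIONS = {("Q", "G"), ("E", "G"), ("E", "A")}  # one-letter codes


def candidate_junctions(sequence):
    """Return (0-based index of P1, 'P1/P1\'' label) for each matching dipeptide."""
    seq = sequence.upper()
    hits = []
    for i in range(len(seq) - 1):
        pair = (seq[i], seq[i + 1])
        if pair in JUNCTIONS:
            hits.append((i, f"{pair[0]}/{pair[1]}"))
    return hits


if __name__ == "__main__":
    # Made-up toy peptide, not a real norovirus polyprotein sequence.
    toy = "MKLQGSTTEGPLLEAVV"
    for index, label in candidate_junctions(toy):
        print(index, label)
```

A scan of this kind over-predicts by design; it is useful only as a first filter before considering the S1–S4 pocket interactions discussed in the following paragraphs.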
Crystallographic structures of proteases from different NoVs show that NoV protease, similar to the picornavirus 3CPro, is a cysteine protease with a chymotrypsin-like fold comprised of two domains separated by a groove where the active site is located (Hussey et al., 2011; Leen et al., 2012; Nakamura et al., 2005; Zeitler et al., 2006; Muhaxhiri et al., 2013) (Fig. 3.1.4C). The active site consists of a catalytic triad with Cys as a nucleophile, His as the general base catalyst, and Glu as the anion to orient the imidazole ring of His, similar to the Ser-His-Asp triad in the trypsin-like serine proteases (Bazan and Fletterick, 1988). More recently, crystal structures of the NoV protease in complex with substrates bearing P1–P4 residues or substrate-mimics have provided novel insights into the structural basis of substrate recognition and how NoV protease accommodates varying residue compositions of the substrate (Fig. 3.1.4D) (Leen et al., 2013; Hussey et al., 2011; Muhaxhiri et al., 2013). These studies show that the substrate adopts an extended β-strand conformation to pair with a β-strand in the active site cleft of the protease, and the side chains P1–P4 optimally interact with the S1–S4 pockets of the protease. The P1 positions in the NoV polyprotein show limited variation with only Glu or Gln. Interestingly, Gln in P1 position is a common occurrence in the picornavirus and coronavirus substrates that are cleaved by their respective proteases, which are structurally similar to the NoV protease. The S1 pocket in the NoV protease is ideally suited for optimal hydrophobic and hydrogen bond interactions with either Glu or Gln and remains unaltered with variations in P2–P4 positions. 
A novel observation from these protease-substrate structures is the conformational change induced by the substrate binding in the main chain amide group of a conserved Gly adjacent to the catalytic Cys to form the oxyanion hole required for stabilizing the tetrahedral intermediate during peptide hydrolysis (Muhaxhiri et al., 2013).
Another particularly striking observation is the coordinated conformational alterations that the S2 and S4 pockets undergo in response to variations in the residue composition at the P2 and P4 positions of the substrate (Muhaxhiri et al., 2013). The S2 pocket undergoes a transition from an “open” state, as observed in the apo-structure, to a progressively closed state depending upon the bulkiness of the sidechain in the P2 position, whereas the S4 pocket shows the reverse trend, from a closed state (apo-structure) to an open state in response to P4 sidechain interactions (Fig. 3.1.4D). In contrast to the P1, P2, and P4 positions, which show extensive interactions with well-defined S1, S2, and S4 pockets, respectively, the interactions between the P3 residue and the S3 pocket are minimal, suggesting that this position can tolerate variations.
Similar to proteases of other RNA viruses, the NoV protease, particularly that of the GI.1 NV, has been targeted for structure-assisted design and development of small molecule inhibitors (Muhaxhiri et al., 2013; Deng et al., 2013; Prior et al., 2013). Currently, these studies have focused on the characterization of substrate-based peptido-mimetics containing aldehydes, ketones, esters, or bisulfite adducts as electrophilic warheads attached to the P1 residue that form a covalent bond with the sulfhydryl group of the catalytic cysteine. Structures of the NV protease with three of the inhibitors (Muhaxhiri et al., 2013), each containing a terminal aldehyde and selected based on potency of inhibition, have been determined. These studies show that the mechanism of inhibition, in addition to the formation of the covalent adducts, involves prevention of the conformational change necessary for the formation of the oxyanion hole. They further suggest that the potency of peptido-mimetic inhibitors depends directly on a suitable warhead, a Glu-like chemical entity at P1 for optimal interactions with S1, and an appropriate combination of hydrophobic residues at P2 and P4 that maximizes the interactions with S2 and S4. Further studies directed at optimizing the design strategies and improving the pharmacokinetic properties of such inhibitors as antivirals, using structural analysis and cell-based assays, can be anticipated (Qu et al., 2014; Chang et al., 2006).
In caliciviruses, the RdRp, an obligatory enzyme similar to the picornavirus 3Dpol, is responsible for synthesizing both negative-sense RNA and newly made positive-sense genomic RNA. X-ray structures of RdRps from several caliciviruses including RHDV (Ng et al., 2002), GII NoV (Ng et al., 2004; Zamyatkin et al., 2008), sapovirus (Fullerton et al., 2007), and MNV (Lee et al., 2011; Hogbom et al., 2009) have been determined. Calicivirus RdRp exhibits a typical “right hand” configuration of palm, fingers, and thumb domains as observed in all RNA/DNA polymerases (Ng et al., 2008). In addition to these domains, as in other RdRps, it has a distinct N-terminal domain bridging the fingers and the thumb domains. The active site of the RdRp is located in the palm domain and consists of three conserved Asp residues critical for mediating catalysis through a two metal-ion mechanism (Fig. 3.1.5C), together with other key residues such as Arg, Asn, and Ser required for substrate binding and catalysis.
Comparative analysis of the various calicivirus RdRps shows that, while the conformations of the individual domains are highly conserved despite only marginal sequence similarities, the conformations of the loop regions, N-terminal domain, C-terminal region, and interdomain orientations are susceptible to significant variations. Norovirus RdRp can exist in two principal conformations: an “open” active site conformation that represents the inactive state of the RdRp, as observed generally in the apo RdRp structures (Ng et al., 2002; Ng et al., 2004; Fullerton et al., 2007) (Fig. 3.1.5A); and a “closed” active site conformation that is primed for catalyzing the nucleotidyl transfer reaction, as observed in the RdRp structure in complex with divalent metal ions, the primer-template RNA duplex, and the NTP that is required for template elongation (Zamyatkin et al., 2008) (Fig. 3.1.5B). The “closed” conformation allows optimal positioning of the NTP, RNA, and the metal ions for the catalytic reaction to occur (Fig. 3.1.5C). Transition from the “open” to the “closed” conformation involves displacement of the C-terminal tail, which triggers the central helix in the palm domain to rotate by about 20 degrees (Fig. 3.1.5D). The C-terminal tail appears to function as a lid that regulates access to the active site. In the inactive “open” conformation, it is located in the active site, restricting access for the RNA, whereas in the active “closed” conformation, it moves away and essentially becomes unstructured. This region, which is relatively well conserved in all NoVs, is suggested to play a role not only in regulating the initiation of RNA synthesis but also in mediating interactions with accessory proteins during replication.
In addition to its known interaction with VPg during VPg-primed RNA synthesis, recent studies of human and murine NoVs have shown that the major capsid protein VP1 can interact with RdRp through its S domain in a species-specific and concentration-dependent manner to modulate the rate and kinetics of RdRp activity (Subba-Reddy et al., 2012). Such an interaction is suggested to play a significant role in the temporal regulation of RdRp activity during genome replication, capsid assembly, and genome encapsidation. Previous to these studies, the possibility of VP1–RdRp interaction in replication complexes was also suggested in the case of FCV based on a yeast two-hybrid assay (Kaiser et al., 2006). Further studies are required to understand the mechanistic basis of how the S domain of VP1 influences the RdRp activity and whether VPg and VP1 are the only interacting partners of RdRp or whether other viral proteins, such as p41 with its NTPase activity or protease with its recently discovered property to bind RNA (Viswanathan et al., 2013), can also exert influence on the RdRp activity. This is feasible, considering their likely close proximity in the membranous replication complexes.
Structural studies on calicivirus RdRps have provided a rational basis to embark on structure-based in silico screening for small molecule inhibitors (Mastrangelo et al., 2012; Croci et al., 2014). These studies identified two molecules, suramin and its analogue NF023, each containing a naphthalene-trisulfonic acid moiety, that inhibit NoV RdRp with IC50s of 24.6 and 71.5 nM, respectively. Crystallographic analysis shows that both inhibitors interact with the RdRp at a similar region along the access pathway of the NTPs between the fingers and thumb domains. Future studies using similar structure-based techniques as well as high throughput screening (Eltahla et al., 2014) are likely to provide more potent small molecule RdRp inhibitors with desirable pharmacokinetic properties.
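As a rough guide to what these IC50 values imply, the sketch below computes the expected fractional inhibition at a given inhibitor concentration. It assumes a simple sigmoidal dose-response with a Hill coefficient of 1, which is an illustrative assumption and not something reported in the cited studies; the function name `fraction_inhibited` is likewise hypothetical. The only property guaranteed by the definition of IC50 is that inhibition is 50% when the concentration equals the IC50.

```python
# Illustrative sketch only. Assumes a simple Hill-type dose-response curve;
# the Hill coefficient of 1.0 is an assumption, not a value from the source.
def fraction_inhibited(conc_nM, ic50_nM, hill=1.0):
    """Fraction of enzyme activity inhibited (0..1) at inhibitor concentration conc_nM."""
    if conc_nM < 0 or ic50_nM <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    if conc_nM == 0:
        return 0.0
    return 1.0 / (1.0 + (ic50_nM / conc_nM) ** hill)


if __name__ == "__main__":
    # IC50 values quoted in the text, in nM.
    ic50 = {"suramin": 24.6, "NF023": 71.5}
    for name, value in ic50.items():
        # Compare both compounds at the same (arbitrary) 50 nM concentration.
        print(name, round(fraction_inhibited(50.0, value), 3))
```

Under this model, a lower IC50 always gives higher inhibition at the same concentration, which is the sense in which suramin is the more potent of the two compounds in these assays.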
In the last two decades, starting from cryo-EM and X-ray crystallographic analyses of the recombinant NV capsid, there have been a significant number of structural studies that have led to a better understanding of the structure and function of NoVs and their encoded proteins. These studies have provided insights into how the elements required for capsid assembly, strain diversity, and immunogenicity are integrated into a single capsid protein through an elegant modular domain organization and how the distally located P2 subdomain with a unique fold provides an efficient platform for genotype-dependent variations in glycan recognition and antigenicity to facilitate virus evolution. Structural studies on nonstructural proteins, such as protease and RdRp have uncovered fascinating novel mechanistic details that underlie their enzymatic functions, and allowed design and development of small molecule inhibitors. However, there are still several significant questions that merit further studies.
For the capsid proteins, these questions include: (1) What is the structural basis of how ‘neutralizing’ antibodies block HBGA binding in human NoVs? The results may inform the design of immunotherapeutic agents. (2) What is the structure and function of the minor protein VP2? Such data may provide insights into capsid assembly and genome encapsidation. For the nonstructural proteins, our understanding of proteins such as p48, p41, and p22, in terms of both structure and function, is very limited. Sequence analysis and available experimental data suggest that these proteins are likely membrane-associated and have a role in initiating and structuring vesicular replication compartments. The 2C-like p41, with its NTPase activity and distinct AAA+ motifs, is particularly enigmatic, raising the question of how the NTPase activity is used during replication. VPg is largely disordered except for the central core. As it has to interact with multiple partners during replication, such a flexible state prior to interacting with its partners could be a necessity. Further structural studies may provide insight into how this protein interacts with eIF4E and RdRp. The recent discovery that the protease can interact with RNA opens the way to further structural and functional studies to understand how this interaction interferes with the enzymatic activity, and whether it has a role in replication. Many of these proteins, including RdRp, are likely to have temporal and transient interactions with each other within the confines of the replication compartments to regulate and coordinate various stages of virus replication. Further structural studies of virus-infected cells, including electron tomographic approaches, may provide mechanistic insights into these processes. Considering the recent advances in producing human NoVs in cultured cells (Katayama et al., 2014; Jones et al., 2014), exciting progress in furthering our understanding of NoV structure-function is to be expected.
We acknowledge support from NIH grant PO1 AI057788 (MKE, RLA, BVVP), and a grant (Q1279) from the Robert Welch Foundation (BVVP).