As of December 2021, over 8 billion doses of coronavirus disease 2019 (COVID-19) vaccines have been administered, including around 217 million “booster” shots. The main target for these vaccines has been the so-called “spike,” or “S,” protein, an essential viral protein that plays a key role in allowing the virus to invade host cells.
While vaccines are critical, the development of COVID-19 therapeutics has revealed that intrinsically disordered proteins may play an important pathological role. Historically, biologists believed that the amino acid sequence of each protein determines its three-dimensional structure, which, in turn, determines its function. However, a large group of proteins and protein regions lack a fixed or ordered 3D structure yet still exhibit essential biological activities: the so-called intrinsically disordered proteins and regions (Figure 1).
This protein disorder is encoded in amino acid sequences and is abundant in all living organisms and viruses. A deeper understanding of these regions within SARS-CoV-2 proteins could enable faster progress in therapeutic development for COVID-19.
Examples of intrinsically disordered proteins
Intrinsically disordered proteins (IDPs) and intrinsically disordered protein regions (IDPRs) are found in all three domains of life. They are linked to important processes such as enzyme catalysis, allosteric regulation, cell signaling, and transcription, among others.
However, they also play a role in disease, including neurodegeneration, diabetes, cardiovascular disease, amyloidosis, genetic diseases, and cancer. Furthermore, viral proteins often contain such regions, which have been correlated with virulence because they give viral proteins the ability to bind easily and promiscuously to host proteins.
Interest in IDPs/IDPRs in protein science has been rapidly increasing since the year 2000, as demonstrated by a search in the CAS Content Collection™ (Figure 2), and their roles in drug design, including in COVID-19, are beginning to be explored.
Intrinsically disordered proteins in SARS-CoV-2
SARS-CoV-2 forms a virion including its genomic RNA bundled in a particle comprising: the S protein, important for entry into host cells; the membrane (M) protein facilitating viral assembly; the ion channel small envelope (E) protein; and the nucleocapsid (N) protein, which assembles with viral RNA to form the nucleocapsid (Figure 3).
IDPs/IDPRs are not common in the SARS-CoV-2 proteome. In fact, the SARS-CoV-2 proteome exhibits significant levels of structural order: except for the nucleocapsid (N) protein, SARS-CoV-2 proteins are highly ordered and contain only a few intrinsically disordered protein regions. Notably, however, the existing disordered regions contribute significantly to the functioning and virulence of the virus and are thus promising targets for antiviral drug discovery; such an approach has already proven valuable in identifying new drug candidates.
The nucleocapsid (N) protein
The RNA-binding N protein stabilizes the genomic RNA inside the virus particle and regulates the viral genome transcription, replication, and packaging. The N protein is highly disordered—its average percentage of predicted intrinsic disorder is around 65%. These disordered regions seem to be important in maintaining the nucleocapsid and so could serve as targets for drug design. Disordered regions within the N protein also appear to be important in enabling the protein to aggregate via a process termed ‘liquid-liquid phase separation’, potentially as a way of disrupting the natural formation of stress granules, important in host cell immunity. Thus, disruption of the N protein liquid-liquid phase separation process holds promise for antiviral intervention and offers new targets and strategies for the development of drugs to combat COVID-19.
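To make the "percentage of predicted intrinsic disorder" figure concrete, here is a minimal sketch of how such a number can be derived from per-residue disorder scores. It assumes a predictor that outputs one score between 0 and 1 per residue and that a residue counts as disordered at a score of 0.5 or above; the function name, the cutoff, and the toy scores are illustrative assumptions, not the method or data behind the 65% figure.

```python
def percent_disorder(scores, threshold=0.5):
    """Return the percentage of residues whose disorder score is >= threshold.

    `scores` is a list of per-residue values in [0, 1]; the 0.5 cutoff is an
    assumed convention, not a value taken from the article.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s >= threshold)
    return 100.0 * disordered / len(scores)


# Made-up scores for a 10-residue toy sequence (NOT real N protein data).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.65, 0.2]
print(f"{percent_disorder(toy_scores):.1f}% predicted disorder")  # prints 70.0% predicted disorder
```

Different predictors and cutoffs give different percentages for the same sequence, which is one reason reported values are often averages.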
The spike (S) protein
The S protein ornaments the viral surface like a crown. It is critical for viral entry into the host (Figure 4) and, as such, has been a commonly utilized target in the development of COVID-19 vaccines. Receptor binding and membrane fusion, the initial steps in infection, are both related to regions of substantial intrinsic disorder.
Analysis of the S protein indicates that both S subunit cleavage sites associated with S maturation, and the S fusion peptide, are associated with IDPRs. Considering that proteolytic digestion is considerably faster in unstructured relative to structured protein regions, this structural specificity of the SARS-CoV-2 S protein might be of high functional importance.
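The idea of a cleavage site "being associated with" an IDPR can be illustrated with a small sketch: group consecutive high-scoring residues into disordered regions, then check whether a site position falls inside one. Everything here is hypothetical, including the function names, the 0.5 cutoff, the minimum region length of 3, the toy scores, and the site positions; none of it is real S protein data or the analysis used in the work described.

```python
def disordered_regions(scores, threshold=0.5, min_length=3):
    """Return 1-based inclusive (start, end) ranges of consecutive residues
    scoring >= threshold, keeping only runs of at least min_length residues."""
    regions = []
    start = None
    for i, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_length:
                regions.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


def site_in_disorder(position, regions):
    """True if the 1-based residue position lies inside any region."""
    return any(start <= position <= end for start, end in regions)


# Made-up scores for a 13-residue toy sequence (NOT real S protein data).
toy_scores = [0.2, 0.3, 0.7, 0.8, 0.9, 0.6, 0.1, 0.2, 0.6, 0.3, 0.7, 0.8, 0.75]
regions = disordered_regions(toy_scores)
print(regions)                       # [(3, 6), (11, 13)]
print(site_in_disorder(5, regions))  # True: hypothetical site inside a region
print(site_in_disorder(9, regions))  # False: isolated residue, run too short
```

The minimum-length filter is a design choice that keeps isolated high-scoring residues from being reported as regions; the appropriate length depends on the analysis.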
During SARS-CoV-2 virus infection, IDPRs can be detected at the interface of the spike protein and ACE2 receptor, the receptor found in human tissues to which the virus binds. The key residues of the spike protein have a strong binding affinity to ACE2, one likely reason for the high transmissibility of SARS-CoV-2.
Thus, receptor binding and membrane fusion, the initial and important steps in the coronavirus infection, are both related to regions of substantial intrinsic disorder in the S protein. They are primary targets for inhibiting SARS-CoV-2 infection.
The membrane (M) protein
The M protein is a major transmembrane protein that is found in large numbers in the virion. SARS-CoV-2 has one of the hardest protective outer shells among coronaviruses. This is potentially related to the low intrinsic disorder of the M protein (6%) and may be responsible for the high resilience and transmissibility of the virus. Indeed, a correlation has been shown between the virulence of various viruses and the percentage of intrinsic disorder of their M proteins, with less disordered M proteins being associated with more contagious viruses.
Future outlook: the frontiers of drug design
The appearance of novel viruses and the associated epidemics around the globe is currently a major concern. Knowledge of the structures and functions of viral proteins is thus highly significant for identifying novel therapeutic targets for the prevention and treatment of disease.
In our peer-reviewed publication in ACS Infectious Diseases, we summarize the information available on the SARS-CoV-2 proteome with regard to the occurrence of intrinsic disorder in its proteins. The SARS-CoV-2 proteome exhibits substantial levels of structural order, and only the N protein is highly disordered. Although other SARS-CoV-2 proteins are characterized by lower degrees of disorder, their existing IDPRs contribute significantly to the functioning and virulence of the virus and are promising targets for antiviral drug design.
IDPs are widespread and have numerous crucial biological functions that complement the functionality of ordered proteins. However, when misfunction occurs (e.g., misexpression, misprocessing, or misregulation), IDPs/IDPRs tend to engage in undesirable interactions and become involved in the development of various pathological states. In fact, many proteins associated with neurodegeneration, diabetes, cardiovascular disease, amyloidosis, and genetic diseases, as well as most of the human cancer-related proteins, are either IDPs or contain long IDPRs.
Although structural biology techniques can be utilized in drug development, the practice of rational drug design has traditionally overlooked intrinsic disorder in target proteins. A better understanding of these regions in SARS-CoV-2 and other pathogenic proteomes would be of great benefit for drug development in COVID-19 and beyond, continuing to push the boundaries of drug design.