Molecular interaction and molecular structure for metastability
Consistent with their metastable free energy state, IDPs, as the scaffold proteins of MLOs, often harbor transient and weak molecular interactions 34. The low-complexity domains (LCDs), which mediate the LLPS of IDPs, are largely enriched in charged, polar and aromatic residues and commonly depleted of hydrophobic residues 22. Weak, multivalent and non-specific interactions, including electrostatic, pi–pi, cation–pi and dipole–dipole interactions (between polar amino acids), are prevalent among residues in LCDs (Figure 2A) 36. The long-range electrostatic interactions among charged blocks may facilitate the initiation of LLPS, while short-range interactions, including pi–pi, cation–pi and dipole–dipole interactions, may mediate the multivalent contacts among weakly interacting motifs 34. Among these molecular interactions, cation–pi interactions are considered the strongest, with a free energy of binding (\(\Delta G_{\text{bind}}\)) of around –3.6 kcal/mol 57. This is far smaller in magnitude than the average \(\Delta G_{\text{bind}}\) per residue implicated in the formation of Aβ17–42 amyloid-β protofibrils (–19.3 kcal/mol) 58, suggesting that the molecular interactions driving the formation of MLOs are much weaker than those stabilizing amyloid plaques. Compared with static amyloids 34, LCDs harbor transient, highly dynamic interactions among residues. These dynamics can be quantified by fluorescence recovery after photobleaching (FRAP), which commonly shows a half time of recovery (\(t_{1/2}\)) on the order of seconds (normalizing the diameter of the bleaching spot to 1 μm) 34. Efforts have been made to explain liquid phase formation at the molecular level, including theories of amyloid-β fibril formation 59, a multivalent domain interaction network model 60 and polymer physics theory 22.
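To put the two binding free energies quoted above on a common thermal scale, the following minimal sketch expresses them in units of RT. The temperature of 298.15 K is an assumption made here for illustration (the cited values are not tied to a temperature in this text), and the function name `dg_in_rt` is hypothetical.

```python
# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def dg_in_rt(dg_kcal_per_mol, temperature_k=298.15):
    """Express a free energy (kcal/mol) in units of RT.

    The default temperature (298.15 K) is an illustrative assumption.
    """
    return dg_kcal_per_mol / (R_KCAL * temperature_k)


# Values quoted in the text above.
cation_pi = dg_in_rt(-3.6)    # cation-pi interaction
amyloid = dg_in_rt(-19.3)     # per-residue value for Abeta17-42 protofibrils

print(f"cation-pi: {cation_pi:.1f} RT")
print(f"amyloid:   {amyloid:.1f} RT")
print(f"ratio:     {amyloid / cation_pi:.1f}")
```

Under this assumption the cation–pi value is only about six times the thermal energy, whereas the per-residue amyloid value is more than thirty times the thermal energy, which is in line with the transient versus static behavior described above.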
IDPs lack a stable and well-defined 3D molecular structure 61–63. They are devoid of stable tertiary structures under physiological conditions, although collapsed IDPs can harbor some stable secondary structure elements 64. The lack of stable structure can be considered a common and crucial feature that enables IDPs to form metastable MLOs 65. The unstable conformation confers flexibility on IDPs as major scaffold constituents, which may contribute to the physical fluidity of MLOs 65. It also allows the formation of weak and multivalent interactions, a common hallmark of the interactions that contribute to LLPS 3. IDPs harbor ‘stickers-and-spacers’ structural features, wherein modules that provide attractive interactions are considered ‘stickers’, and flexible linkers that provide no significant attractive interactions are considered ‘spacers’. The unstable structure with ‘stickers-and-spacers’ features allows the multivalent presence of PTM sites 66, and PTMs can efficiently alter the stability of MLOs 53.
Modular interaction domains connected by disordered linkers can mediate multivalent interactions that drive LLPS. Rosen et al. reported the LLPS of multivalent signaling proteins. Neural Wiskott–Aldrich syndrome protein (N-WASP), an actin-regulatory protein, interacts with its established biological partners NCK and phosphorylated nephrin1 to undergo LLPS, wherein NCK contains three SH3 domains that can bind to the six proline-rich motif (PRM) ligands of N-WASP 67. Similar multivalent systems were also reported in the T cell receptor signaling pathway 68, in nucleophosmin (NPM1) interacting with proteins comprising arginine-rich linear motifs and with ribosomal RNA 69, and in a pair of polySUMO–polySIM interacting multivalent scaffold proteins 70.
Weakly interacting motifs are prevalent in LCDs to mediate LLPS. It is widely known that the tightly self-complementing ‘steric zipper’ structure forms solid-like amyloid-β plaques with hydrophobic interfaces and high stability 71–73. By contrast, IDPs largely host motifs that can form thermodynamically metastable ‘kinked β sheets’ 74–77, e.g., the archetypical [G/S]Y[G/S] motifs of the FUS protein. These motifs can form close interactions, as quantified by the structural complementarity (\(S_{c}\)) (Table 1). However, the kinks prevent side chains from interdigitating across the β-sheet interface. The motifs therefore harbor a smaller buried solvent-accessible surface area (\(A_{b}\)) and more hydrophilic interfaces, and thus exhibit much lower stability 74 (Figure 2B and Table 1). This is exemplified by metastable interaction motifs in the LCDs of FUS 55,78–82, Tau 83, TDP-43 84 and hnRNP 17,76,85 proteins. Specifically, short associative peptide motifs within LCDs can form metastable fibrils in vitro that melt in response to mild heating, which distinguishes them from stable amyloid fibrils (Table 2). Besides kinked β sheets, other interaction motifs that can mediate multivalent interactions were also reported, including the repeated [F/R]G and G[F/R] pair motifs of Ddx4 proteins 53, the α-helix-forming 321AMMAAAQAAL330 motif of TDP-43 proteins 86, the VPGXG motifs (X is a guest residue other than proline that can modulate phase behavior) of elastin-like proteins 87 and the GHGLY motif of histidine-rich squid beak proteins 52. In addition, specific motifs may hinder LLPS, e.g., FGDF can bind to G3BPs to block the formation of stress granules 88.
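The short motifs named above can be located in a protein sequence with simple pattern matching. The sketch below is illustrative only: the regular expressions are a direct transcription of the motif notation used in this section, the function names `find_motifs` and `motif_counts` are hypothetical, and the test sequence is a made-up toy string, not a real LCD sequence.

```python
import re

# Motif notation from the text, transcribed as regular expressions.
MOTIFS = {
    "[G/S]Y[G/S]": r"[GS]Y[GS]",   # kinked beta-sheet motif (FUS)
    "[F/R]G": r"[FR]G",            # Ddx4 pair motif
    "G[F/R]": r"G[FR]",            # Ddx4 pair motif
    "VPGXG": r"VPG[^P]G",          # elastin-like motif, X is not proline
}


def find_motifs(sequence, pattern):
    """Return 0-based start positions of all matches, including overlapping ones."""
    # A lookahead matches at every position without consuming characters,
    # so motifs that share a residue are both reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence)]


def motif_counts(sequence):
    """Count occurrences of each motif in a one-letter amino acid sequence."""
    return {name: len(find_motifs(sequence, pat)) for name, pat in MOTIFS.items()}


# Toy sequence invented for illustration; it is not taken from any protein.
toy = "GYGSYSAAGYSFGRGVPGAGVPGPG"
print(find_motifs(toy, MOTIFS["[G/S]Y[G/S]"]))
print(motif_counts(toy))
```

Counting motifs this way gives only a rough proxy for sticker valency; it says nothing about whether a given motif actually adopts a kinked β-sheet conformation.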
The metastable molecular structures and phase behavior of IDPs can be drastically altered simply by mutations 17,55,76,80,84,85,89,90 or PTMs 55,82,84 on a single residue. For example, phosphorylation of the FUS protein at the Ser42 site drastically altered the molecular interactions of its LCD, halting both the formation of metastable fibrils and LLPS 55. This prominent alteration of phase behavior can be attributed to disruption of the metastable kinked structure. The Ser42 site is the primary site of phosphorylation by DNA-dependent protein kinase (DNA-PK) 91. Phosphorylation at Ser42 can significantly disrupt the hydrogen bonds between Ser42 and Tyr38, interfere with the interaction between mating sheets and destabilize the RAC1 interacting motif, thereby modulating the ability of FUS to undergo LLPS. Additionally, because it mimics serine phosphorylation, the mutation of Ser42 to Asp (S42D) can also markedly suppress the LLPS of the FUS LCD, decreasing the critical temperature of LLPS by 5 °C 55.
There are two major types of phase behavior for a biological LLPS system of interest, namely the entropy-driven LCST phase behavior and the enthalpy-driven UCST phase behavior 24. How is the type of phase behavior encoded in the motifs of protein sequences? Chilkoti et al. synthesized artificial IDP-like polymers harboring several tens of repeats of short peptide motifs 87. They found that motifs with low charge content and high hydrophobicity tend to yield IDP-like polymers with LCST behavior, which is reminiscent of tropoelastins. By comparison, motifs with high charge content and low hydrophobicity tend to yield IDP-like polymers with UCST behavior, which is reminiscent of resilin and its dual UCST and LCST behavior at the extremes of temperature. Furthermore, Chilkoti et al. found that hysteresis behavior can also be encoded and tuned at the motif level by the precise position of an amino acid within a motif, as well as at the macromolecule level by chain length 92.
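The two motif-level descriptors mentioned above, charge content and hydrophobicity, can be computed directly from a motif sequence. The sketch below is one possible way to do so and is not the procedure used in the cited work: the choice of the Kyte–Doolittle hydropathy scale, the decision to count only D, E, K and R as charged, and the function names are all assumptions made here. No classification threshold is implied; the descriptors are only meant for comparing motifs with each other.

```python
# Kyte-Doolittle hydropathy index (assumed scale; other scales could be used).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: histidine is treated as uncharged.
CHARGED = set("DEKR")


def fraction_charged(motif):
    """Fraction of residues in the motif that are D, E, K or R."""
    return sum(res in CHARGED for res in motif) / len(motif)


def mean_hydropathy(motif):
    """Mean Kyte-Doolittle hydropathy over the residues of the motif."""
    return sum(KYTE_DOOLITTLE[res] for res in motif) / len(motif)


# VPGVG is the VPGXG elastin-like motif with X = V.
# GRGDS is a hypothetical charged motif chosen only for contrast.
for motif in ("VPGVG", "GRGDS"):
    print(motif, fraction_charged(motif), round(mean_hydropathy(motif), 2))
```

In this toy comparison the elastin-like motif has zero charge content and a positive mean hydropathy, while the charged motif has a higher charge fraction and a negative mean hydropathy, which is the direction of contrast described in the paragraph above.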
These distinctive molecular interactions and molecular structures endow metastable MLOs with unique properties, including liquidity, high dynamics and environmental responsiveness. Learning from nature, the responsible domains and motifs of IDPs have also been exploited as building blocks to design bio-inspired materials 93–95, which have been reviewed elsewhere 96.