31 January 2005

Biol 6312

Helix capping

This section covers the hydrogen bonding patterns of residues at the ends of helices. Remember that the first 4 residues at the N-terminus of an α-helix have unsatisfied H-bond donors (>NH), and that the last 4 residues at the C-terminus have unsatisfied H-bond acceptors (>CO). Helix capping refers to H-bonding to these groups, primarily by nearby side chains. Hydrophobic interactions can also be involved. Capping helps to "seal" the helix and prevent easy "fraying".

Two important references:

Helix capping
Rajeev Aurora, George D. Rose
Protein Science (1998) 7: 21-38

Periodicity in α-Helix Lengths and C-capping Preferences
Simon Penel, R. Gwilym Morrison, Russell J. Mortishire-Smith, Andrew J. Doig
J. Mol. Biol. (1999) 293: 1211-1219

First define the positions: the N-cap (or C-cap) is the first (or last) residue that is outside, or partially outside, the helix. N1 is the first residue in the helix, N2 is the second, ...

N' is the first residue outside the helix beyond the Ncap (it precedes the Ncap in the sequence).
N'' is the next residue outward, ...

Likewise for C1, C2, ... and C', C'', ...
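The naming scheme above can be sketched in Python. This is only an illustration: the function name `cap_labels`, its parameters, and the use of integer sequence indices are assumptions made for the example, not something taken from the Aurora and Rose paper.

```python
def cap_labels(ncap, ccap, n_inside=3, n_outside=2):
    """Return {sequence index: label} for the residues around one helix.

    ncap, ccap : sequence indices of the Ncap and Ccap residues (hypothetical input)
    n_inside   : how many helical positions to label from each end (N1.., C1..)
    n_outside  : how many flanking positions to label (N', N'', C', C'')
    """
    helix_len = ccap - ncap - 1          # residues strictly between the caps
    if helix_len < 1:
        raise ValueError("Ccap must come at least two residues after Ncap")

    labels = {ncap: "Ncap", ccap: "Ccap"}

    # Outside the helix: primes count outward from each cap.
    for k in range(1, n_outside + 1):
        labels[ncap - k] = "N" + "'" * k   # precedes the Ncap in sequence
        labels[ccap + k] = "C" + "'" * k   # follows the Ccap in sequence

    # Inside the helix: N1, N2, ... count inward from the Ncap,
    # C1, C2, ... count inward from the Ccap.
    for k in range(1, min(n_inside, helix_len) + 1):
        labels[ncap + k] = f"N{k}"
    for k in range(1, min(n_inside, helix_len) + 1):
        # In a short helix a residue could carry both names; keep the N label.
        labels.setdefault(ccap - k, f"C{k}")
    return labels


labels = cap_labels(ncap=10, ccap=20)
for index in sorted(labels):
    print(index, labels[index])
```

With the Ncap at index 10 and the Ccap at index 20, indices 8 and 9 come out as N'' and N', indices 11 to 13 as N1 to N3, indices 17 to 19 as C3 to C1, and indices 21 and 22 as C' and C''.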

In the paper by Aurora and Rose, the propensities of the individual amino acids at all positions within ±5 of the N-cap and C-cap sites are calculated, using a database of protein structures.

How to calculate propensities: they are normalized frequencies, f_ij.

For example, the propensity of Ala at the Ncap would be
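One common way to define such a normalized frequency (an assumption here; the paper's exact formula may differ in detail) is the fraction of residues at position j that are amino acid i, divided by the fraction of all residues in the database that are amino acid i. A minimal Python sketch, with made-up counts and a hypothetical function name:

```python
def propensity(n_aa_at_pos, n_all_at_pos, n_aa_in_db, n_all_in_db):
    """Normalized frequency of one amino acid at one position.

    > 1 means the amino acid is over-represented at that position,
    < 1 means it is under-represented.
    """
    freq_at_pos = n_aa_at_pos / n_all_at_pos   # e.g. fraction of Ncap residues that are Ala
    freq_in_db = n_aa_in_db / n_all_in_db      # fraction of all residues that are Ala
    return freq_at_pos / freq_in_db


# Made-up counts, for illustration only (not from any paper):
# 50 Ala among 1000 Ncap residues; 8000 Ala among 100000 residues overall.
ala_ncap = propensity(50, 1000, 8000, 100000)
print(f"Ala at Ncap: {ala_ncap:.3f}")
```

A value below 1, as with these invented counts, would mean the amino acid is found at that position less often than its overall abundance predicts.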

Charts of these are linked below:

Gly Ala Ser,Thr Phe, Tyr Trp Met
Pro Val,Ile,Leu Asn,Gln His Lys, Arg Glu, Lys
Gln,Glu Cys Asp,Glu Asn,Asp

They found 7 distinct capping motifs: 3 at the N-terminus and 4 at the C-terminus.

At the N-terminus, amide hydrogens are H-bonded predominantly to side chains of nearby residues.

At the C-terminus, carbonyl oxygens are H-bonded predominantly to backbone amides in the following turn.

In both cases hydrophobic residues at the termini of the helix make hydrophobic contacts with turn residues. (Also see reference below)

Ermolenko DN, Thomas ST, Aurora R, Gronenborn AM, Makhatadze GI.
Hydrophobic interactions at the Ccap position of the C-capping motif of alpha-helices.
J Mol Biol. 2002 Sep 6;322(1):123-35.

In a related study, Penel et al. found a preference among α-helices to have an integral number of turns, rather than a half-integral number. Presumably, this allows the turns at both ends to be on the same side, as would be necessary for a surface helix.

Helices with a half-integral number of turns also exist, and the propensities at the C-terminus were found to differ depending on whether the helix has a half-integral or an integral number of turns.
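To make the integral versus half-integral distinction concrete, here is a rough Python sketch. It assumes the ideal α-helix value of 3.6 residues per turn; the function names, the 0.25-turn cutoff, and the way helix length is counted (here, simply a residue count) are choices made for this example and may not match the procedure of Penel et al.

```python
RESIDUES_PER_TURN = 3.6  # ideal alpha-helix


def helix_turns(n_residues):
    """Number of turns in a helix of n_residues residues."""
    return n_residues / RESIDUES_PER_TURN


def turn_class(n_residues):
    """'integral' if the turn count is nearer a whole number, else 'half-integral'."""
    turns = helix_turns(n_residues)
    offset = abs(turns - round(turns))   # 0.0 = whole number of turns, 0.5 = exactly half
    return "integral" if offset < 0.25 else "half-integral"


for n in (9, 11, 14, 18):
    print(n, round(helix_turns(n), 2), turn_class(n))
```

In this sketch a 9-residue helix (2.5 turns) is classed as half-integral, while 11, 14 and 18 residues all fall close to a whole number of turns.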

Properties and Tendencies of the Individual Amino Acids

The actual torsional angles that each type of amino acid exhibits in a large database have been tabulated and provided by Gerard Kleywegt of Uppsala University. He also hosts HIC-Up, a web site about the small molecules, so-called "hetero-compounds", found in protein crystal structures.

They are linked below:

Gly Pro Ala Val Ile
Leu Met Cys Ser Thr
Phe Tyr Trp Lys Arg
His Asn Gln Asp Glu

3 others: all a.a., all (-gly), and all (-gly,pro)

Glycine, Gly, G Gly

Smallest amino acid, no side chain, no beta-carbon, so greatest range of torsional angles
Positive phi values (+70, +5), the left-handed alpha-helix region, are common at the C-terminus of a helix; about 1/3 of all alpha-helices end in a Gly.

Gly is also common at the interface when helices pack tightly (see Jmol).

Most Gly residues are found outside of repeating secondary structure, e.g. in turns.

In beta-sheets, Gly modulates the twist; it can allow large, flat anti-parallel sheets, as in silk.

If Glycine allows so many different torsional angles, why is it not more commonly found in proteins? Why not glycine everywhere?

1. Gly loses the most entropy upon protein folding. In the unfolded state it has many possible conformations, but the folded state is (nearly) a single conformation.

2. Gly contributes little to the packing of the interior of proteins, and it contributes nothing to the hydrophobic effect.

Therefore Gly contributes little in general to the stability of folded proteins. It tends to be found just where it is needed, in a tight space, or with unusual torsional angles.

Proline, Pro, P Pro

Phi angle is constrained to about -60 degrees, because the side chain forms a ring with the amino N.
This angle is closer to alpha-helix than beta-sheet, but Pro lacks an amino hydrogen for H-bonding in the helix, so within helices it is most often found near the N-terminus (where the NH is not needed). If found in the helix interior, it often kinks the helix by about 30˚. H-bonding can be disrupted, essentially creating 2 separate helices. (see Jmol)

Pro is common in turns and loops, very rare in beta-sheets.

Summary: helices (26%), turns (23%), loops, connections (38%), beta-sheets, usually edges (13%)

Pro is commonly found in the alpha-helices of membrane proteins. The reason is not clear.

Entropically, Pro is stabilizing to proteins, just as Gly is destabilizing. It loses the least entropy upon protein folding.
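The entropy argument for Gly and Pro can be put in rough numbers with the Boltzmann relation ΔS = R ln(W_folded / W_unfolded), where W counts the accessible backbone conformations. The sketch below is only illustrative: the conformation counts are invented to reproduce the ordering described in these notes (Gly most flexible, Pro least), and the function name is hypothetical.

```python
import math

R = 8.314  # gas constant, J/(mol K)


def conformational_entropy_change(n_unfolded, n_folded=1):
    """Delta S = R ln(W_folded / W_unfolded), in J/(mol K)."""
    return R * math.log(n_folded / n_unfolded)


# HYPOTHETICAL numbers of accessible backbone conformations in the unfolded
# state, chosen only to show the ordering Gly > typical residue > Pro.
unfolded_states = {"Gly": 12, "Ala": 6, "Pro": 2}

T = 298.0  # K
for aa, w in unfolded_states.items():
    dS = conformational_entropy_change(w)
    print(f"{aa}: dS = {dS:6.1f} J/(mol K), -T dS = {-T * dS / 1000:5.2f} kJ/mol")
```

Whatever the real counts are, the more conformations a residue gives up on folding, the larger the unfavorable -TΔS term, which is why Gly costs the most and Pro the least.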

The cis peptide bond is relatively common (about 6% of all X-Pro peptide bonds). The barrier to rotation is somewhat lower than for other amino acids. This isomerization is now known to be catalyzed by enzymes, the peptidyl prolyl isomerases (PPIases).

Influence of Proline Residues on Protein Conformation (Abstract only)
Malcolm W. MacArthur, Janet M. Thornton
J. Mol. Biol. (1991) 218: 397-412

Cysteine, Cys, C Cys

3 distinct forms: sulfhydryl -SH, sulfide -S-, disulfide (cystine) -S-S-

1. -SH. These groups must be shielded, or else they could form unintended disulfides. They are found in membrane proteins, or at any interior position, but are not common in helices.

2. -S- often a ligand to a metal: Fe, Cu, Zn
often 2-4 Cys per metal
Fe-S proteins are often involved in redox chemistry
Zn-S proteins are often involved in DNA-binding
amino acid sequence patterns can often be recognized because of the conserved geometry of the metal binding (see Jmol)

3. Disulfides are common in exported proteins. The disulfide can form in a more oxidizing environment (e.g. the endoplasmic reticulum in eukaryotes, the periplasm in bacteria). Examples include hormones and digestive enzymes. Disulfide bond formation is catalyzed by a family of enzymes called protein disulfide isomerases (PDI).

Geometry is important. (see Jmol)

The S-S bond is about 2.05 Å.

The C-S bonds are about 1.8 Å.

The C-S-S angle is about 103˚
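These three numbers are enough for a crude geometric test of whether two Cys side chains could be disulfide bonded. The Python sketch below is an illustration under stated assumptions: it takes the "C" of the C-S bond to be the beta-carbon, the tolerances are arbitrary, the coordinates are synthetic, and the function names are hypothetical.

```python
import math

SS_BOND = 2.05      # Angstrom, from the values above
CS_BOND = 1.8       # Angstrom
CSS_ANGLE = 103.0   # degrees


def angle_deg(a, b, c):
    """Angle at b (in degrees) formed by the points a-b-c."""
    v1 = [x - y for x, y in zip(a, b)]
    v2 = [x - y for x, y in zip(c, b)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_t = dot / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))


def looks_like_disulfide(cb1, sg1, sg2, cb2, tol_len=0.15, tol_angle=10.0):
    """Compare two Cys side chains with the reference disulfide geometry.

    The tolerances are arbitrary choices for this sketch.
    """
    checks = [
        abs(math.dist(sg1, sg2) - SS_BOND) <= tol_len,
        abs(math.dist(cb1, sg1) - CS_BOND) <= tol_len,
        abs(math.dist(cb2, sg2) - CS_BOND) <= tol_len,
        abs(angle_deg(cb1, sg1, sg2) - CSS_ANGLE) <= tol_angle,
        abs(angle_deg(sg1, sg2, cb2) - CSS_ANGLE) <= tol_angle,
    ]
    return all(checks)


# Synthetic coordinates built to match the reference values (not from a real structure).
t = math.radians(CSS_ANGLE)
sg1 = (0.0, 0.0, 0.0)
sg2 = (SS_BOND, 0.0, 0.0)
cb1 = (CS_BOND * math.cos(t), CS_BOND * math.sin(t), 0.0)
cb2 = (SS_BOND - CS_BOND * math.cos(t), 0.0, CS_BOND * math.sin(t))
print(looks_like_disulfide(cb1, sg1, sg2, cb2))
```

With coordinates read from a structure file, the same check could be run over every pair of Cys residues; two sulfurs several Ångstroms apart would fail the first test.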

Cys is often compared to Ser, but they are quite different:

Cys                       Ser
weak H-bonds              strong H-bonds
-SH is reactive           -OH group is often modified, but by enzymes
buried inside proteins    found in turns and loops on the surfaces of proteins

Alanine, Ala, A Ala

Small size, a single methyl group. No conformational preferences; found inside proteins and on the surfaces. Very common in helices. In beta-sheets, its small size is a compromise between the flexibility of Gly and the packing surface provided by a larger side chain.

Valine, Val, V Val; Isoleucine, Ile, I Ile; Leucine, Leu, L Leu; Methionine, Met, M Met

These are the variously shaped aliphatic side chains. They provide the tight fit of the packed interior of proteins. Met is very flexible, and tends to be partially exposed. It costs a lot of entropy to bury it. The other 3 are all branched, but Val and Ile are branched at the beta-carbon, nearer the backbone, while Leu is branched farther away at the gamma carbon.

This correlates with the greater propensities of Leu and Met to be found in alpha-helices, and Val and Ile to be found in beta-strands.

Short, wide side chains pack well in the extended beta-sheet structures (see Jmol). Longer side chains can extend from one helix to another helix (see Jmol).

Serine, Ser, S Ser and Threonine, Thr, T Thr

These are similar residues, each with a side chain -OH. They tend to be surface residues, and are not favored in alpha-helices.
They are both H-bond donors and acceptors. Ser is more common in turns and loops, whereas Thr is more common in beta-strands, because it branches at the beta-carbon (see above). Due to its extra methyl group, Thr is somewhat more hydrophobic.


Asparagine, Asn, N Asn and Glutamine, Gln, Q Gln


Similar side chains, but conformationally Asn is special. Gln has an extra methylene group.

1. The side chain can mimic the backbone by its H-bonding capacity, and so it is often found in certain types of turns.

An asparagine residue is shown with side chain marked as Y. The chain continues to the right and the left.
If the asparagine residue rotates around the C-C bond, it can bring the side chain CO into the position formerly occupied by the main chain CO. The chain now continues down instead of to the left. A turn has been created, and the Asn side chain is uniquely able to contribute a CO to replace the main chain CO.

2. Asn is common at the Ncap position because its side chain C=O can H-bond to the NH of residue 4 (Ser or Asp can also do this)

3. Asn is more likely to have positive phi angles than any residue other than Gly
Example: Gly-101, Asn-37, Asp-13, Ser-9, Gln-6

4. Best substitutions for Asn: Ser (if Ncap), Asp (if turn), Gly (if left-handed, positive phi)

5. Gln has an extra methylene group in its side chain, and it places the amide group more remotely from the backbone. It tends to be found in alpha-helices. The longer side chain allows it to reach out and make H-bonds with distant elements in the protein.

Aspartate, Asp, D Asp and Glutamate, Glu, E Glu

Similar to the difference between Asn and Gln, but the negative charge dominates, and therefore the differences are smaller.

Both tend to be found towards the N-terminus (positive dipole) of the alpha-helix.
Glu, like Gln, tends more to be found throughout the helix. Both are involved in metal binding, e.g. Ca++, and in ion pairs with Arg and Lys.

Lysine, Lys, K Lys and Arginine, Arg, R Arg

Each has a positive charge, but they differ in H-bonding and flexibility.
Because of the helix dipole, they tend to be found at the C-terminus of the alpha-helix.

The pKa's differ: Arg 12-13, Lys 9-10. Arg is therefore the more likely to be charged, whereas Lys might be unprotonated, for example in a non-aqueous environment.
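The consequence of the pKa difference can be estimated with the Henderson-Hasselbalch relation. The Python sketch below uses the midpoints of the ranges quoted above as assumed pKa values; real values shift with the local environment, and the function name is just for this example.

```python
def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a group in its protonated form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Midpoints of the ranges quoted above; real pKa values shift with environment.
side_chain_pka = {"Arg": 12.5, "Lys": 9.5}

for ph in (7.0, 9.5, 11.0):
    for aa, pka in side_chain_pka.items():
        print(f"pH {ph:4.1f}  {aa}: {fraction_protonated(pka, ph):.4f} protonated")
```

At neutral pH both side chains are almost fully protonated in this model, but Lys loses its charge much earlier than Arg as the pH rises. A non-aqueous environment could be mimicked by passing a lowered pKa for Lys.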

Lys has a very flexible side chain. Past the beta carbon it typically adopts many positions. Lys is very common on the surface of proteins, for solvation and entropy.
Arginine has a guanidinium group, which is rigid and planar. It has 5 H-bond donors. It often makes multiple H-bonds to distant regions of the protein, e.g. backbone groups, so it is important in holding the protein in its conformation. For this reason, Arg cannot always be replaced by Lys.

Histidine, His, H His

Aromatic, sometimes positively charged, pK is 6-7

At high pH His is neutral, with one of the two ring nitrogens protonated. At lower pH the other N will also be protonated, giving the ring a positive charge. The CH group between the N's can perhaps make a weak H-bond also. In the neutral state, His can donate 1 (or 2) H-bonds, and accept one. In the protonated state, they are all donors.

Tyrosine, Tyr, Y Tyr

Aromatic, sometimes negatively charged. The pKa of the OH group is about 10. Often found at the interface between the inside and the outside of a protein. Can be an H-bond donor or acceptor. Found in beta-sheets and irregular structures, less so in alpha-helices.

Phenylalanine, Phe, F Phe

Aromatic, completely hydrophobic, except for π-interactions or H-bonds with cations. Usually buried in proteins. Found in both alpha-helices and beta-sheets.

Tryptophan, Trp, W Trp

Largest side chain; aromatic and hydrophobic, except for π-interactions; 1 H-bond donor (the indole NH).



Copyright 2005, Steven B. Vik, Southern Methodist University

Last modified 2/4/05