Primary structure
In biochemistry, the primary structure (also known as the primary sequence) of a biological molecule is the exact specification of its atomic composition and the chemical bonds connecting those atoms (including stereochemistry). For a typical unbranched, un-crosslinked biopolymer (such as a molecule of DNA, RNA or a typical intracellular protein), the primary structure is equivalent to specifying the sequence of its monomeric subunits, e.g., the nucleotide or peptide sequence. The term "primary structure" was coined by Linderstrøm-Lang in his 1951 Lane Medical Lectures.
Primary structure of polypeptides
In general, polypeptides are unbranched polymers, so their primary structure can often be specified by the sequence of amino acids along their backbone. However, proteins can become cross-linked, most commonly by disulfide bonds, and the primary structure also requires specifying the cross-linking atoms, e.g., specifying the cysteines involved in the protein's disulfide bonds. Other crosslinks include desmosine...
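As a rough illustration of the point above, a primary structure can be modeled as a residue sequence plus an explicit list of cross-links. The sketch below is only a toy data structure: the class name `PrimaryStructure`, its fields, and the example peptide are hypothetical and do not correspond to any standard file format or library.

```python
from dataclasses import dataclass, field

# One-letter codes of the 20 standard amino acids.
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass
class PrimaryStructure:
    """Toy model: a residue sequence plus its disulfide cross-links."""

    sequence: str                                   # N- to C-terminus
    disulfides: list = field(default_factory=list)  # pairs of 1-based positions

    def __post_init__(self):
        unknown = set(self.sequence) - STANDARD_RESIDUES
        if unknown:
            raise ValueError(f"unknown residue codes: {sorted(unknown)}")
        used = set()
        for i, j in self.disulfides:
            for pos in (i, j):
                if not 1 <= pos <= len(self.sequence):
                    raise ValueError(f"position {pos} is outside the chain")
                if self.sequence[pos - 1] != "C":
                    raise ValueError(f"position {pos} is not a cysteine")
                if pos in used:
                    raise ValueError(f"cysteine {pos} is already cross-linked")
                used.add(pos)

    def free_cysteines(self):
        """Return 1-based positions of cysteines not in any disulfide bond."""
        used = {pos for pair in self.disulfides for pos in pair}
        return [k for k, aa in enumerate(self.sequence, start=1)
                if aa == "C" and k not in used]


# Made-up peptide with one disulfide bond between Cys2 and Cys7.
toy = PrimaryStructure("GCKACTCG", [(2, 7)])
print(toy.free_cysteines())
```

Two molecules with the same sequence but different disulfide pairings are distinct objects in this model, which mirrors the statement that the sequence alone does not fix the primary structure of a cross-linked protein.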
The chiral centers of a polypeptide chain can undergo racemization. In particular, the L-amino acids normally found in proteins can spontaneously isomerize at the Cα atom to form D-amino acids, which cannot be cleaved by most proteases.
Finally, the protein can undergo a variety of posttranslational modifications, which are briefly summarized here.
The N-terminal amino group of a polypeptide can be modified covalently, e.g.,
- acetylation [-C(=O)-CH3]
- The positive charge on the N-terminal amino group may be eliminated by changing it to an acetyl group (N-terminal blocking).
- formylation [-C(=O)H]
- The N-terminal methionine found after translation in bacteria carries a formyl group that blocks the N-terminus. The formyl group is removed by the enzyme deformylase; the methionine residue itself is sometimes removed as well (if followed by Gly or Ser), a step carried out by methionine aminopeptidase.
- pyroglutamate
- An N-terminal glutamine can attack itself, forming a cyclic pyroglutamate group.
- myristoylation [-C(=O)-(CH2)12-CH3]
- Similar to acetylation. Instead of a simple methyl group, the myristoyl group has a tail of 13 hydrophobic carbons, which makes it ideal for anchoring proteins to cellular membranes.
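The comparison between the acetyl and myristoyl groups can be checked with a little arithmetic. The sketch below assumes the usual saturated acyl formula -C(=O)-(CH2)n-CH3, with n = 0 for acetyl and n = 12 for myristoyl; the function name `acyl_group` is hypothetical.

```python
def acyl_group(n_methylene):
    """Describe a saturated acyl group -C(=O)-(CH2)n-CH3.

    Returns the condensed formula and the number of carbons in the
    hydrophobic tail, i.e. every carbon except the carbonyl carbon.
    """
    if n_methylene < 0:
        raise ValueError("n_methylene must be non-negative")
    middle = f"(CH2){n_methylene}-" if n_methylene else ""
    formula = f"-C(=O)-{middle}CH3"
    tail_carbons = n_methylene + 1  # n methylene carbons plus the methyl carbon
    return formula, tail_carbons


# Assumed methylene counts for the two groups compared in the text.
for name, n in {"acetyl": 0, "myristoyl": 12}.items():
    formula, tail = acyl_group(n)
    print(f"{name}: {formula}, tail carbons = {tail}")
```

With these assumed values the acetyl tail is a single methyl carbon, while the myristoyl tail has the 13 carbons mentioned above.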
The C-terminal carboxylate group of a polypeptide can also be modified, e.g.,
- amidation (see Figure)
- The C-terminus can also be blocked (thus neutralizing its negative charge) by amidation.
- glycosyl phosphatidylinositol (GPI) attachment
The amino acid side chains can also be modified covalently, e.g.,
- phosphorylation
- Aside from cleavage, phosphorylation is perhaps the most important chemical modification of proteins. A phosphate group can be attached to the sidechain hydroxyl group of serine, threonine and tyrosine residues, adding a negative charge at that site and producing a non-standard amino acid. Such reactions are catalyzed by kinases and the reverse reaction is catalyzed by phosphatases. The phosphorylated tyrosines are often used as "handles" by which proteins can bind to one another, whereas phosphorylation of Ser/Thr often induces conformational changes, presumably because of the introduced negative charge. The effects of phosphorylating Ser/Thr can sometimes be simulated by mutating the Ser/Thr residue to glutamate.
- glycosylation
- deamidation (succinimide formation)
- methylation
- acetylation
- Acetylation of the lysine amino groups is chemically analogous to the acetylation of the N-terminus. Functionally, however, the acetylation of lysine residues is used to regulate the binding of proteins to nucleic acids. The cancellation of the positive charge on the lysine weakens the electrostatic attraction for the (negatively charged) nucleic acids.
- sulfation
- prenylation and palmitoylation [palmitoyl group: -C(=O)-(CH2)14-CH3]
- carboxylation
- A relatively rare modification that adds an extra carboxylate group (and, hence, a double negative charge) to a glutamate side chain, producing a Gla residue. This is used to strengthen the binding to "hard" metal ions such as calcium.
- ADP-ribosylation
- ubiquitination and SUMOylation
Most of the polypeptide modifications listed above occur post-translationally, i.e., after the protein has been synthesized on the ribosome; many take place in the endoplasmic reticulum, a subcellular organelle of the eukaryotic cell.
Many other chemical reactions (e.g., cyanylation) have been applied to proteins by chemists, although they are not found in biological systems.
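The phosphorylation entry in the list above lends itself to a small sketch: listing the residues whose side chains carry a hydroxyl group (Ser, Thr, Tyr) and building the glutamate substitution that sometimes mimics Ser/Thr phosphorylation. The function names and the example sequence are hypothetical, and the listing only marks chemically possible acceptor residues; it does not model which sites a kinase would actually recognize.

```python
# Residues with a side-chain hydroxyl group that can accept a phosphate.
PHOSPHO_ACCEPTORS = {"S", "T", "Y"}


def phospho_acceptor_sites(sequence):
    """Return (1-based position, residue) for every Ser, Thr or Tyr."""
    return [(i, aa) for i, aa in enumerate(sequence, start=1)
            if aa in PHOSPHO_ACCEPTORS]


def phosphomimetic(sequence, position):
    """Replace the Ser/Thr at a 1-based position with glutamate (E)."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the chain")
    if sequence[position - 1] not in {"S", "T"}:
        raise ValueError("the glutamate substitution is described for Ser/Thr only")
    return sequence[:position - 1] + "E" + sequence[position:]


peptide = "MKSAYLTG"  # made-up example sequence
print(phospho_acceptor_sites(peptide))
print(phosphomimetic(peptide, 3))
```

The substitution is restricted to Ser/Thr here because that is the only case the text describes; whether a glutamate actually reproduces the effect of a phosphate group has to be established experimentally for each protein.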
Modifications of primary structure
In addition to those listed above, the most important modification of primary structure is peptide cleavage. Proteins are often synthesized in an inactive precursor form; typically, an N-terminal or C-terminal segment blocks the active site of the protein, inhibiting its function. The protein is activated by cleaving off the inhibitory peptide.
Some proteins even have the power to cleave themselves. Typically, the hydroxyl group of a serine (rarely, threonine) or the thiol group of a cysteine residue will attack the carbonyl carbon of the preceding peptide bond, forming a tetrahedrally bonded intermediate [classified as a hydroxyoxazolidine (Ser/Thr) or hydroxythiazolidine (Cys) intermediate]. This intermediate tends to revert to the amide form, expelling the attacking group, since the amide form is usually favored by free energy (presumably due to the strong resonance stabilization of the peptide group). However, additional molecular interactions may render the amide form less stable; in that case the amino group is expelled instead, resulting in an ester (Ser/Thr) or thioester (Cys) bond in place of the peptide bond. This chemical reaction is called an N-O acyl shift.
The ester/thioester bond can be resolved in several ways:
- Simple hydrolysis will split the polypeptide chain, where the displaced amino group becomes the new N-terminus. This is seen in the maturation of glycosylasparaginase.
- A β-elimination reaction also splits the chain, but results in a pyruvoyl group at the new N-terminus. This pyruvoyl group may be used as a covalently attached catalytic cofactor in some enzymes, especially decarboxylases such as S-adenosylmethionine decarboxylase (SAMDC) that exploit the electron-withdrawing power of the pyruvoyl group.
- Intramolecular transesterification results in a branched polypeptide. In inteins, the new ester bond is broken by an intramolecular attack by the soon-to-be C-terminal asparagine.
- Intermolecular transesterification can transfer a whole segment from one polypeptide to another, as is seen in Hedgehog protein autoprocessing.
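The simplest of the outcomes listed above, hydrolysis of the ester or thioester, can be sketched at the sequence level. The code below lists residues that could in principle act as the attacking group (Ser, Thr or Cys with a preceding peptide bond) and splits the chain so that the attacking residue becomes the new N-terminus. The function names and the example sequence are hypothetical, and the sketch says nothing about whether a given site is actually reactive, which depends on the additional molecular interactions mentioned earlier.

```python
# Attacking residue -> type of bond formed by the N-O (or N-S) acyl shift.
NUCLEOPHILES = {"S": "ester", "T": "ester", "C": "thioester"}


def candidate_acyl_shift_sites(sequence):
    """List (1-based position, residue, bond type) for Ser/Thr/Cys residues.

    Position 1 is skipped because it has no preceding peptide bond to attack.
    """
    return [(i, aa, NUCLEOPHILES[aa])
            for i, aa in enumerate(sequence, start=1)
            if i > 1 and aa in NUCLEOPHILES]


def hydrolyze_at(sequence, position):
    """Split the chain just before the attacking residue.

    Returns (N-terminal fragment, C-terminal fragment); the attacking
    residue is the first residue of the C-terminal fragment.
    """
    if not 2 <= position <= len(sequence):
        raise ValueError("position must have a preceding peptide bond")
    if sequence[position - 1] not in NUCLEOPHILES:
        raise ValueError("residue at this position is not Ser, Thr or Cys")
    return sequence[:position - 1], sequence[position - 1:]


chain = "AGDTLVK"  # made-up example sequence
print(candidate_acyl_shift_sites(chain))
print(hydrolyze_at(chain, 4))
```

The other three outcomes change the covalent structure in ways a plain string cannot represent (a pyruvoyl group, a branch point, or a transferred segment), so they are left out of this sketch.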
History of protein primary structure
The proposal that proteins were linear chains of α-amino acids was made nearly simultaneously by two scientists at the same conference in 1902, the 74th meeting of the Society of German Scientists and Physicians, held in Karlsbad. Franz Hofmeister made the proposal in the morning, based on his observations of the biuret reaction in proteins. Hofmeister was followed a few hours later by Emil Fischer, who had amassed a wealth of chemical details supporting the peptide-bond model. For completeness, the proposal that proteins contained amide linkages was made as early as 1882 by the French chemist E. Grimaux.
Despite these data and later evidence that proteolytically digested proteins yielded only oligopeptides, the idea that proteins were linear, unbranched polymers of amino acids was not accepted immediately. Some well-respected scientists such as William Astbury doubted that covalent bonds were strong enough to hold such long molecules together; they feared that thermal agitations would shake such long molecules asunder. Hermann Staudinger faced similar prejudices in the 1920s when he argued that rubber was composed of macromolecules.
Thus, several alternative hypotheses arose. The colloidal protein hypothesis stated that proteins were colloidal assemblies of smaller molecules. This hypothesis was disproven in the 1920s by ultracentrifugation measurements by The Svedberg, which showed that proteins had a well-defined, reproducible molecular weight, and by electrophoretic measurements by Arne Tiselius, which indicated that proteins were single molecules. A second hypothesis, the cyclol hypothesis advanced by Dorothy Wrinch, proposed that the linear polypeptide underwent a chemical cyclol rearrangement, C=O + HN → C(OH)-N, that crosslinked its backbone amide groups, forming a two-dimensional fabric. Other primary structures of proteins were proposed by various researchers, such as the diketopiperazine model of Emil Abderhalden and the pyrrol/piperidine model of Troensegaard in 1942. Although never given much credence, these alternative models were finally disproven by Frederick Sanger's successful sequencing of insulin and by the crystallographic determination of myoglobin and hemoglobin by John Kendrew and Max Perutz, respectively.
Relation to secondary and tertiary structure
The primary structure of a biological polymer to a large extent determines the three-dimensional shape known as the tertiary structure, but nucleic acid and protein folding are so complex that knowing the primary structure often does not help either to deduce the shape or to predict localized secondary structure, such as the formation of loops or helices. However, knowing the structure of a similar homologous sequence (for example, a member of the same protein family) can unambiguously identify the tertiary structure of the given sequence. Sequence families are often determined by sequence clustering, and structural genomics projects aim to produce a set of representative structures to cover the sequence space of possible non-redundant sequences.
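To make the idea of sequence clustering concrete, the sketch below groups sequences by percent identity to a cluster representative. It is a deliberately simplified illustration: it assumes the sequences are already aligned and of equal length, uses a greedy first-fit strategy, and the function names, example sequences and 75% threshold are all arbitrary choices rather than a description of any particular clustering tool.

```python
def percent_identity(a, b):
    """Percent of positions with identical residues in two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def greedy_cluster(sequences, threshold):
    """Assign each sequence to the first cluster whose representative
    (its first member) is at least `threshold` percent identical to it;
    otherwise start a new cluster."""
    clusters = []
    for seq in sequences:
        for cluster in clusters:
            if percent_identity(seq, cluster[0]) >= threshold:
                cluster.append(seq)
                break
        else:  # no existing cluster was similar enough
            clusters.append([seq])
    return clusters


# Made-up, equal-length example sequences.
seqs = ["MKTAYIAK", "MKTAYLAK", "GSHWLPQR", "MRTAYIAK"]
for family in greedy_cluster(seqs, threshold=75.0):
    print(family)
```

In this picture the first member of each cluster plays the role of the representative sequence; a structural genomics effort would aim to determine one structure per cluster. The result of a greedy scheme like this depends on the input order and on the threshold.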
Primary structure in other molecules
Any linear-chain heteropolymer can be said to have a "primary structure" by analogy to the usage of the term for proteins, but this usage is rare compared to the extremely common usage in reference to proteins. In RNA, which also has extensive secondary structure, the linear chain of bases is generally just referred to as the "sequence", as it is in DNA (which usually forms a linear double helix with little secondary structure). Other biological polymers such as polysaccharides can also be considered to have a primary structure, although the usage is not standard.
References
- Iwai K and Ando T. (1967) "N → O Acyl Rearrangement", Methods Enzymol., 11, 263-282.
- Perler FB, Xu MQ and Paulus H. (1997) "Protein Splicing and autoproteolysis mechanisms", Curr. Opin. Chem. Biol., 1, 292-299.
- Paulus H. "The chemical basis of protein splicing", Chem. Soc. Rev., 27, 375-386.
- Fischer E. (1902) Autoreferat. Chem. Ztg., 26, 93.
- Troensegaard N. (1942) Über die Struktur des Proteinmoleküls: eine chemische Untersuchung. E. Munksgaard, København (Copenhagen).
- Sanger F. (1952) "The arrangement of amino acids in proteins", Adv. Protein Chem., 7, 1-67.
- Fruton JS. (1979) "Early theories of protein structure", Ann. N.Y. Acad. Sci., 325, 1-18.
- Wieland T and Bodanszky M (1991) The World of Peptides, Springer Verlag. ISBN 038752830X
From Wikipedia, the Free Encyclopedia. All text is available under the terms of the GNU Free Documentation License.



