Non-proteinogenic amino acids are an important part of peptidic natural products because they greatly increase the structural diversity of these compounds. Since structure determines function, the non-proteinogenic amino acids are often critical to the biological activities of natural products. One non-proteinogenic amino acid, L-enduracididine (L-End), is relatively rare but is found in a number of interesting peptide natural products (Figure 1). For example, the antibiotics enduracidin and mannopeptimycin are both very effective against MRSA and vancomycin-resistant enterococci (VRE), but both are too toxic to mammalian cells to be useful therapeutic agents. The recently discovered antibiotic teixobactin generated a great deal of excitement in early 2015 owing to its ability to kill Gram-positive pathogens while avoiding (thus far) antibiotic resistance. Work aimed at altering enduracidin and mannopeptimycin to alleviate their toxicity problems has been hampered in part by a lack of the L-End building block. Most studies to date have used the peptide core as isolated from the producing organisms, so changes have been concentrated on the periphery of the pharmacophore. Having an abundant supply of L-End will allow researchers to explore the effects of changes to the peptide core. Likewise, should teixobactin become an important clinical antibiotic in the future, a facile route to L-End could simplify its manufacture.

Figure 1. Natural products containing the non-proteinogenic amino acid L-enduracididine (L-End).

Previous work with the enduracidin-producing strain, Streptomyces fungicidicus, showed that (1) L-End originates from L-Arg and (2) the process involves three enzymes: EndP, EndQ, and EndR. The first two enzymes, EndP and EndQ, share ~30 % sequence identity with known fold-type I PLP-dependent aminotransferases, though EndP is ~100 amino acids shorter than most aminotransferases. Acetoacetate decarboxylase is the only enzyme with significant sequence identity to EndR (~20 %). Given the chemical transformation from L-Arg to L-End, it is clear that at least one of the annotated functions must be incorrect. One of the major outcomes of this project was the assignment of activities to these enzymes of unknown function. Since the corresponding enzymes from the mannopeptimycin biosynthetic pathway in Streptomyces hygroscopicus were better behaved, our structural and biochemical studies focused on MppQ and MppR. Neither the S. fungicidicus EndP nor the S. hygroscopicus MppP expressed in soluble form.

Figure 2. The overall structure of MppQ (A) shows that the enzyme, as predicted from its sequence, belongs to the fold type I aminotransferase family. The tertiary structure of MppQ is most similar to the bacterial kynurenine aminotransferase (PDB ID 1X0M). The activity of MppQ (B) is in line with its structural similarity to the aminotransferases. As seen in this series of HPLC chromatograms of standards and various MppQ reaction mixtures, the enzyme catalyzes aminotransfer between arginine and either pyruvate or glyoxylate.

The X-ray crystal structure of MppQ (Figure 2A) showed that, as expected from the amino acid sequence, the enzyme is a fold type I aminotransferase similar to the well-studied aspartate aminotransferase. The arrangement of amino acids in the MppQ active site closely matches those of E. coli aspartate aminotransferase and Pyrococcus horikoshii kynurenine aminotransferase. Thus, it was not surprising to find that the enzyme catalyzed an aminotransfer reaction between L-Arg (but not D-Arg) and either glyoxylate or pyruvate (Figure 2B), but not the more common α-ketoglutarate or oxaloacetate. The steady state kinetic parameters for the reverse reaction, aminotransfer from alanine to “α-ketoarginine” (2KA; kcat = 0.1 s⁻¹, KM,αKA = 2.5 µM, kcat/KM = 4.0 × 10⁴ M⁻¹ s⁻¹), suggest that, while this is clearly an enzyme-catalyzed reaction, these are likely not the correct substrates. Glyoxylate appears from the HPLC analysis to be the better amino acceptor substrate, though this cannot be concluded from an equilibrium experiment. Unfortunately, the steady state kinetics could not be analyzed for the reaction with glyoxylate, since the lactate dehydrogenase used for the alanine/2KA reaction is not sufficiently fast with glyoxylate to be an effective coupling enzyme, and the other dehydrogenases we examined either have the same flaw or transform the product back into the substrate.
Given the activities of MppR and MppP, it is likely that MppQ is the last enzyme in the pathway, transforming the keto-form of L-End into the amino acid.
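
The specificity constant quoted above for the alanine/2KA reaction follows directly from the other two reported parameters. As a minimal arithmetic check (the function name is ours, introduced only for illustration), dividing kcat by KM after converting micromolar to molar reproduces the reported value:

```python
def specificity_constant(kcat_per_s: float, km_molar: float) -> float:
    """Return kcat/KM in M^-1 s^-1."""
    if km_molar <= 0:
        raise ValueError("KM must be positive")
    return kcat_per_s / km_molar


# Values reported in the text for the alanine/2KA reaction of MppQ.
kcat = 0.1    # s^-1
km = 2.5e-6   # M (2.5 µM)

print(f"kcat/KM = {specificity_constant(kcat, km):.1e} M^-1 s^-1")
# prints: kcat/KM = 4.0e+04 M^-1 s^-1
```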

The crystal structure of MppR shows that the overall fold is nearly identical to that of acetoacetate decarboxylase (ADC), even though it does not have decarboxylase activity. The “unliganded” structure has a HEPES molecule from the crystallization buffer bound in the active site. The location of the sulfonate group near the catalytic lysine residue (purple carbons in Figure 3A) suggested that the enzyme might bind α-keto acids. Thus, pyruvate was soaked into MppR crystals and the structure was determined (Figure 3B). Pyruvate binds covalently to the catalytic lysine (K156) as a Schiff base. The carboxylate group of pyruvate binds in the same area as the HEPES sulfonate group, in what we have dubbed the “carboxylate-binding site.” In fact, nearly every α-keto acid that fits in the active site becomes trapped on the enzyme in essentially the same conformation. One of these, α-ketoarginine (not shown), took on an apparently unnecessarily strained conformation that placed one of the guanidine nitrogen atoms ~3.2 Å from the γ-carbon atom of the arginine side chain. This led us to soak 2-oxo-4-hydroxy-5-guanidinovaleric acid (also known as 4-hydroxy-2-ketoarginine, 4HKA; Figure 3) into MppR crystals and determine the structure (Figure 3C). Surprisingly, the enzyme catalyzed C-N bond formation between the guanidine N atom and the γ-carbon to form the iminoimidazolidine ring of L-End. This work was published in Biochemistry (2013, 52(26), p 4492-4506).

Figure 3. The overall fold of MppR (A) matches that of acetoacetate decarboxylase (ADC) almost exactly in spite of the relatively low sequence identity between the two (~20 %). MppR does not react with acetoacetate, though the Schiff base chemistry of ADC is conserved. The enzyme reacts with pyruvate (B) to form a stable imine complex. The pyruvate carboxylate group binds in a pre-ordered carboxylate-binding site formed by R148 and Q152. MppR also forms an imine complex with 4(R/S)-hydroxy-2-ketoarginine, though the complex is observed to react further, forming the iminoimidazolidine ring of L-enduracididine (C).

The enduracididine project came close to ending at this point, but the deposition of the S. wadayamensis genome provided the breakthrough we needed to access MppP. As it happens, S. wadayamensis also produces mannopeptimycin, but its MppP homolog is 87 amino acids longer than S. hygroscopicus MppP or S. fungicidicus EndP. The S. wadayamensis MppP (SwMppP) expressed extremely well in E. coli, allowing us to determine the crystal structure (Figure 4A) and characterize its activity (Figure 4B). The tertiary structure of SwMppP places it in the same family as MppQ: the fold type I aminotransferases. However, several of the active site residues known to be important for the reaction of other aminotransferases are not conserved in SwMppP. Future work (see below) will determine the functional consequences, if any, of these residue changes.

Figure 4. Though the tertiary structure of MppP is very similar to the typical fold type I aminotransferases, there are significant differences in several key active site residues (A), suggesting that MppP does not have aminotransferase activity. The residues labeled in red differ from their counterparts in the active sites of true aminotransferases like aspartate aminotransferase and MppQ. MppP consumes dioxygen in a reaction with L-Arg (B). 1H NMR analysis of reaction mixtures shows that the products of this reaction are 2-ketoarginine and 4-hydroxy-2-ketoarginine.

UV-Vis analysis of reaction mixtures containing L-Arg and SwMppP (not shown) showed the accumulation of a species absorbing at 510 nm after ~3-5 minutes. Shaking the cuvette and returning it to the spectrophotometer caused this peak to disappear, but it returned after another 3-5 minutes. Running similar reactions on an oxygen electrode confirmed that the enzyme is truly consuming oxygen in a reaction with L-Arg. Interestingly, D-Arg causes spectral changes that suggest it can bind to the enzyme and react with the PLP cofactor, but the reaction appears to stop at that point. D-Arg also appears to be a weak competitive inhibitor of the reaction with L-Arg. Other amino acids, like L-Lys, L-Met, and L-Ala, show no sign of binding to SwMppP. Reaction mixtures containing SwMppP and L-Arg were analyzed by NMR spectroscopy (1D 1H, COSY, HSQC, and HMBC) to identify the product(s) of the reaction. We found that SwMppP produces a mixture of 4-hydroxy-2-ketoarginine and 2-ketoarginine. This is exciting, since it is the first example of a PLP-dependent hydroxylase. Additional NMR experiments on reaction mixtures containing both MppP and MppR, or all three enzymes (MppP, MppQ, and MppR), confirm that MppR is capable of cyclizing 4HKA to give the keto form of L-End, and that MppQ reacts with the product of MppR to form a product that remains to be confirmed but that we strongly suspect is L-End. These results are summarized in Scheme 2.


In our 2013 Biochemistry paper describing the structure and possible functions of MppR, we reported that MppR has weak but measurable aldolase-dehydratase activity with pyruvate and either imidazole 4-carboxaldehyde or 3-(2-furyl)acrolein. This observation led us to look more closely at the acetoacetate decarboxylase-like superfamily (ADCSF). Using sequence similarity networks and structural knowledge of the ADC and MppR active sites, we found that the ADCSF can be divided into six families (Figure 6). The long-term goal of this project is to define the range of substrate specificity and reactions catalyzed by ADCSF enzymes. Our first steps toward this goal were to examine the structures and functions of two enzymes from Family 5 (MppR-Like): Sbi00515 from Streptomyces bingchenggensis and Swit4259 from Sphingomonas wittichii.

The overall folds of both of these enzymes are nearly identical to that of MppR, and thus to that of acetoacetate decarboxylase. The active site architectures of Sbi00515 and Swit4259 are also very similar to that of MppR. The largest difference between the active sites of Sbi00515 and MppR is the substitution of a tyrosine residue at position 252 for E283 of MppR. This change slightly closes down the active site in the neighborhood of the catalytic lysine and alters the electrostatic environment. In the case of Swit4259, this residue is a phenylalanine. Swit4259 also has an asparagine residue at position 118, rather than the glutamine found in MppR (Q152) and Sbi00515 (Q118). This change appears to have significant functional consequences.

Figure 1. Sequence similarity network analysis of the ADCSF. Each red dot represents one sequence; the size of the dot represents the length of the protein sequence. Family assignments are based on analysis of the active site residues in representative sequences. For example, proteins in Family 1 have all of the active site characteristics of the known acetoacetate decarboxylases, Family 2 sequences all appear to be missing the catalytic lysine residue, and Family 5 (MppR-Like) proteins all appear to have an “α-carboxylate-binding site.”
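
The clustering idea behind a sequence similarity network can be sketched in a few lines. This is a toy illustration under stated assumptions, not the pipeline used for the ADCSF analysis: the sequence identifiers, pairwise scores, and threshold below are all invented, and the function name is hypothetical. Sequences are nodes, an edge is kept when a pairwise score meets the threshold, and clusters are the connected components. As noted in the caption, the family assignments themselves also relied on inspection of active site residues, which this sketch does not attempt.

```python
from collections import defaultdict


def cluster_by_threshold(scores, threshold):
    """Group sequences into connected components of a similarity network.

    scores: dict mapping (id_a, id_b) -> pairwise similarity score.
    An edge is kept only when the score is >= threshold.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        root_a, root_b = find(a), find(b)  # registers both nodes
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    clusters = defaultdict(set)
    for node in list(parent):
        clusters[find(node)].add(node)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Invented pairwise scores for five hypothetical sequences.
toy_scores = {
    ("seqA", "seqB"): 80,
    ("seqB", "seqC"): 75,
    ("seqC", "seqD"): 20,
    ("seqD", "seqE"): 90,
    ("seqA", "seqE"): 10,
}

print(cluster_by_threshold(toy_scores, threshold=50))
# prints: [['seqA', 'seqB', 'seqC'], ['seqD', 'seqE']]
```

Raising the threshold splits clusters and lowering it merges them, which is why the choice of cutoff matters when a superfamily is divided into families this way.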

Owing to the similarities between MppR and Sbi00515, the latter was tested for activity against a model α-keto acid substrate, benzylidenepyruvate (Scheme 1). Figure 2 shows that Sbi00515 hydrolyzes benzylidenepyruvate to give benzaldehyde and pyruvate. Given physiological concentrations of pyruvate (~10 mM) and benzaldehyde, the enzyme will also catalyze the aldol condensation and dehydration to produce benzylidenepyruvate (Figure 2C). While Sbi00515 is very specific for pyruvate (no other α-keto acids support the aldol condensation reaction observed with pyruvate), it is quite promiscuous in terms of aldehyde substrates. Benzaldehyde is actually quite a poor substrate (KM > 50 mM). Unsaturated, aliphatic aldehydes are significantly better and show a distinct trend of decreasing KM values with increasing chain length, from ~6 mM for pentenal to 0.035 mM for citral. The observation that the saturated hexanal is not a substrate for Sbi00515 shows that the unsaturated bond alpha to the carbonyl is required for the condensation reaction. Adding an aryl substituent to an unsaturated, aliphatic aldehyde results in improvements in both kcat and KM. The best substrate identified thus far is 4-nitro-cinnamaldehyde, with a kcat/KM value of 6.6 × 10⁶ M⁻¹ s⁻¹, which is in the range one expects for an enzyme working with its preferred substrate. This is not to say that 4-nitro-cinnamaldehyde is the physiological substrate of Sbi00515, but it is likely close in terms of general shape and physicochemical properties. Work is underway to learn more about the cellular function of Sbi00515.
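
To give a feel for what the spread in KM values means in practice, the following sketch compares how close the enzyme would be to its maximal rate with the weakest and tightest-binding aliphatic aldehydes quoted above. It assumes simple Michaelis-Menten behavior with respect to the aldehyde (that is, pyruvate held at a saturating concentration), and the 1 mM aldehyde concentration is an arbitrary illustrative choice rather than an experimental condition.

```python
def fractional_saturation(substrate_mM: float, km_mM: float) -> float:
    """Fraction of Vmax under Michaelis-Menten kinetics: [S] / (KM + [S])."""
    return substrate_mM / (km_mM + substrate_mM)


# KM values quoted in the text (the pentenal value is approximate).
km_values_mM = {"pentenal": 6.0, "citral": 0.035}
aldehyde_mM = 1.0  # assumed concentration, for illustration only

for name, km in km_values_mM.items():
    print(f"{name}: {fractional_saturation(aldehyde_mM, km):.2f} of Vmax")
# prints:
# pentenal: 0.14 of Vmax
# citral: 0.97 of Vmax
```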

Figure 2. UV-vis spectra show that Sbi00515 consumes benzylidenepyruvate (A) with a concomitant increase in A254nm. HPLC analysis of the reaction mixture and standards (B) suggests that Sbi00515 catalyzes the hydrolysis of benzylidenepyruvate to give benzaldehyde and pyruvate (benzylidenepyruvate: blue, benzaldehyde: red, benzylidenepyruvate and Sbi00515: green). Proton NMR spectra (C) of reaction mixtures and standards confirm that the products of the reaction between Sbi00515 and benzylidenepyruvate are benzaldehyde and pyruvate (benzylidenepyruvate: red, benzaldehyde with pyruvate: cyan, Sbi00515 and benzylidenepyruvate: green, and Sbi00515 with benzaldehyde and pyruvate: purple). Together, these data show that Sbi00515 has in vitro aldolase-dehydratase activity.

Pre-steady state analysis of active site mutants shows that E84 is important for catalysis, since for the E84 variant the kobs for formation of the 4-nitro-cinnamylidenepyruvate-enzyme intermediate is essentially 0 s⁻¹. Y252 seems to be important for the aldol chemistry. This is based on the observation that for the WT and Y82F variants the intermediate decays at significant rates (kobs ≈ 8 s⁻¹), but the Y252F variant turns it over too slowly to measure a rate (again, kobs is essentially 0 s⁻¹). Whether there is a catalytic role for Y252 (e.g. activating a water molecule to promote hydration of the double bond) or it simply recruits water into what is an otherwise non-polar active site remains unknown. Kinetic isotope effect studies have shown an inverse secondary effect in the reaction of Sbi00515 with ²H₃-pyruvate. This and additional pre-steady state data (not shown) have allowed us to propose a mechanism that is consistent with all of the data collected thus far. Single turnover and order of addition experiments suggest that Sbi00515 proceeds by a random-order mechanism.
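
For readers less familiar with pre-steady state rate constants, the sketch below converts an observed rate constant into a time scale. It assumes the intermediate decays as a single exponential, exp(-kobs·t), which is a common model for such traces but is our assumption here and not something stated above; the function names are ours.

```python
import math


def half_life(kobs_per_s: float) -> float:
    """Half-life (s) of a species decaying as exp(-kobs * t)."""
    if kobs_per_s <= 0:
        raise ValueError("kobs must be positive for a finite half-life")
    return math.log(2) / kobs_per_s


def fraction_remaining(kobs_per_s: float, t_s: float) -> float:
    """Fraction of the species left after t_s seconds."""
    return math.exp(-kobs_per_s * t_s)


kobs_wt = 8.0  # s^-1, approximate value quoted in the text for WT

print(f"half-life: {half_life(kobs_wt) * 1000:.0f} ms")
print(f"remaining after 0.5 s: {fraction_remaining(kobs_wt, 0.5):.3f}")
# prints:
# half-life: 87 ms
# remaining after 0.5 s: 0.018
```

Under this model a kobs near 8 s⁻¹ means the intermediate is essentially gone within half a second, whereas a kobs of essentially zero means it persists for the whole observation window.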

Neutron Diffraction

X-ray crystallography is a powerful tool for the study of enzyme mechanisms, but the inability, even at the highest X-ray resolutions achievable, to routinely locate hydrogen atoms in active site functional groups, substrates, or intermediates places limitations on the information available from these experiments. Neutron diffraction experiments are conceptually similar but are distinguished by the fact that neutrons are scattered by the nuclei of the atoms in the crystal. Thus, in neutron diffraction experiments deuterium atoms have approximately the same scattering power as carbon. When crystals of perdeuterated protein are used, the positions of the deuterium atoms are easily determined, permitting, in most cases, unequivocal assignment of the protonation states of ionizable groups. The major limitation of neutron diffraction has been the rather low intensity of available neutron sources, which necessitates extremely large protein crystals (> 1 mm³). However, the advent of spallation neutron sources, coupled with improved optics and neutron detector technology, has reduced the minimum crystal size approximately 10-fold (0.1 mm³), making neutron diffraction accessible to a larger number of proteins. It is my intention to aid efforts to bring neutron crystallography into the mainstream of structural biology. By uniting neutron and X-ray crystallography, structural biologists will be able to solve problems that neither technique could address alone.