Post-translational modification in prokaryotes and eukaryotes
- dbPSP 2.0, an updated database of protein phosphorylation sites in prokaryotes
- Post-translational modification
- Post-Translational Modifications Aid Archaeal Survival
dbPSP 2.0, an updated database of protein phosphorylation sites in prokaryotes
In prokaryotes, protein phosphorylation plays a critical role in regulating a broad spectrum of biological processes and occurs on a variety of amino acids, including serine (S), threonine (T), tyrosine (Y), arginine (R), aspartic acid (D), histidine (H) and cysteine (C) residues of protein substrates. Through literature curation and public database integration, here we report an updated database of phosphorylation sites (p-sites) in prokaryotes (dbPSP 2.0). To carefully annotate these phosphoproteins and p-sites, we integrated the knowledge from 88 publicly available resources that cover 9 aspects, namely, taxonomy annotation, genome annotation, function annotation, transcriptional regulation, sequence and structure information, family and domain annotation, interaction, orthologous information and biological pathway.
In contrast to version 1. We anticipate that dbPSP 2. As one of the most well-characterized and important post-translational modifications (PTMs), protein phosphorylation plays an essential role in almost all signalling pathways and biological processes, from eukaryotes to prokaryotes [1-5]. This reversible, dynamic process is precisely modulated by protein kinases (PKs) and protein phosphatases (PPs), which attach a phosphate group to, or remove it from, specific residues of protein substrates [1-5].
The first eukaryotic phosphoprotein was discovered by Olof Hammarsten, a Swedish biochemist, who detected phosphorus in a secreted protein, casein, from milk [6]. Although later studies demonstrated that many proteins can be phosphorylated in eukaryotes, it was long debated whether protein phosphorylation also exists in prokaryotes, until isocitrate dehydrogenase in Escherichia coli was identified as the first prokaryotic phosphoprotein [7, 8].
In contrast with eukaryotic phosphorylation, which occurs mainly at specific serine (S), threonine (T) and tyrosine (Y) residues of proteins [5], prokaryotic protein phosphorylation can occur at additional types of amino acids, such as arginine (R), aspartic acid (D), histidine (H) and cysteine (C) [1, 9, 10, 11, 12]. Given the importance of phosphorylation in the regulation of protein functions [11, 12, 13], the identification of novel phosphorylation sites (p-sites) in proteins is fundamental for understanding the molecular mechanisms and regulatory roles of prokaryotic phosphorylation.
Previously, experimental identification of p-sites with conventional biochemical assays was usually labour-intensive, time-consuming and expensive, and was accomplished in a low-throughput (LTP) manner. The quality of p-sites identified in LTP studies is higher, because multiple assays were usually performed and the biological functions of the p-sites were carefully analyzed.
Recently, advances in proteomic techniques using high-throughput mass spectrometry (HTP-MS) have enabled the large-scale phosphoproteomic identification of p-sites in prokaryotic proteins [18, 19, 20]. For example, Macek et al. For arginine phosphorylation, Elsholz et al. More recently, Lai et al. Because an increasing number of LTP and HTP p-site investigations have been reported, the collection, curation, integration and annotation of known phosphoproteins and p-sites in prokaryotes will provide invaluable information for better understanding host-pathogen interactions and for developing antimicrobial agents.
In , we developed a new database of phosphorylation sites in prokaryotes dbPSP 1. Compared with the second largest resource, the Phosphorylation Site Database, which curated approximately 1, prokaryotic p-sites 23 , dbPSP 1.
At that time, few annotations were provided, except limited information on p-sites. Due to the large number of prokaryotic p-sites found in recent studies, here we created dbPSP 2.
Furthermore, we carefully annotated these phosphoproteins and p-sites through integrating the knowledge from 88 publicly accessible databases, covering 9 aspects. In contrast with dbPSP 1. We confirmed that dbPSP 2. An overview of the dbPSP 2.
First, we manually re-curated all entries in version 1. Then, we mapped all phosphoproteins to public data sources for cross-referencing. In addition to basic information, we further integrated various annotations from 88 public databases that covered 9 aspects: (i) taxonomy annotation, (ii) genome annotation, (iii) function annotation, (iv) transcriptional regulation, (v) sequence and structure information, (vi) family and domain annotation, (vii) interaction, (viii) orthologous information, and (ix) biological pathway.
Compared with version 1. Through literature curation and public database integration, dbPSP 2. The derivation of In addition to version 1. In dbPSP 1. Due to the new data accumulation, known p-sites have been extended to prokaryotic species in 12 phyla by adding a new phylum, Bacteroidetes Fig. The distribution of numbers of p-sites among different phyla was analyzed, and it was observed that more p-sites were identified in Proteobacteria and Actinobacteria than in other phyla, with proportions of The Proteobacteria phylum comprises a number of extensively studied microorganisms, such as the most widely used model organism E.
In the Actinobacteria phylum, one of the most notorious species is Mycobacterium tuberculosis, which is the causative agent of tuberculosis (TB) and annually causes 1. Due to the high virulence of M. Additionally, we analyzed the distribution of p-sites on different types of amino acid residues and found that pS, pT and pY sites appear more frequently than other types of residues and occupy proportions of Moreover, the distribution of different types of p-sites among the 12 phyla was evaluated Fig.
The distribution of phosphoproteins and p-sites for different phyla and different residue types in prokaryotes. For each prokaryote, its proteome set was downloaded from UniProt [24] by searching the corresponding Proteome ID, e. Then the proportion of phosphoproteins among all protein products was calculated, and the 10 species with the highest coverage values were shown.
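The coverage calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' pipeline: the function names, accession strings and species labels are hypothetical, and it assumes that each proteome and each set of known phosphoproteins is available as a collection of protein accessions.

```python
def phosphoprotein_coverage(proteome_ids, phosphoprotein_ids):
    """Fraction of a species' proteome that has at least one known p-site.

    Both arguments are iterables of protein accessions (hypothetical input
    format). Phosphoproteins that are absent from the proteome are ignored.
    """
    proteome = set(proteome_ids)
    if not proteome:
        raise ValueError("proteome must contain at least one protein")
    return len(proteome & set(phosphoprotein_ids)) / len(proteome)


def top_species(coverage_by_species, n=10):
    """Return the n (species, coverage) pairs with the highest coverage."""
    ranked = sorted(coverage_by_species.items(),
                    key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Toy example with made-up accessions: 2 of 4 proteins are phosphoproteins.
example = phosphoprotein_coverage({"A", "B", "C", "D"}, {"A", "B", "X"})
print(example)  # 0.5
```

Using sets keeps the calculation robust to duplicate accessions and to phosphoproteins that map outside the downloaded proteome.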
From the results, we found that the coverage values of the 10 prokaryotes ranged from 8. Thus, as more phosphoproteomic studies are performed for prokaryotes, the coverage values of their phosphoproteins will undoubtedly increase. The coverage of phosphoproteins and the conservation of p-sites. For convenience, dbPSP 2.
To provide an integrative annotation of known phosphoproteins and p-sites, we provided a variety of cross-references to public data sources. The gene ontology (GO) annotations in the Gene Ontology resource [37] were also included if available. These resources covered 9 aspects, namely, taxonomy annotation, genome annotation, function annotation, transcriptional regulation, sequence and structure information, family and domain annotation, interaction, orthologous information and biological pathway Fig.
A brief summary of all public resources integrated in dbPSP 2. For each phosphoprotein with available 3D structures characterized by X-ray crystallography or NMR spectroscopy, a representative 3D structure was selected for intuitive visualization. Users can select all or specific p-sites to visualize their locations on protein structures. In phosphoproteomic studies, phosphopeptides were derived from mass spectrometry spectral datasets, usually with a false discovery rate (FDR) of 0. To pinpoint an exact p-site in a phosphopeptide, a localization probability (LP) score can be calculated by a variety of tools, such as MaxQuant. LP scores range from 0 to 1, and a higher LP score represents a higher probability that a detected site is a real p-site.
Thus, the aggregation of false positive identifications might result in a considerably higher false positive rate (FPR) in the cumulative dataset.
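The effect of aggregation can be illustrated with a small back-of-the-envelope calculation in Python. This is a deliberately simplified sketch under stated assumptions, not an analysis from the source: it assumes that every dataset recovers the same set of true p-sites, that false identifications never overlap between datasets, and that each dataset has the same FDR. The function name and all numbers are hypothetical.

```python
def cumulative_false_fraction(n_true_shared, per_dataset_fdr, n_datasets):
    """Fraction of false sites in the union of several datasets.

    Simplifying assumptions (hypothetical, for illustration only):
    - every dataset contains the same n_true_shared true sites;
    - false sites are distinct in every dataset;
    - each dataset has the same FDR, i.e. false / (true + false).
    """
    if not 0 <= per_dataset_fdr < 1:
        raise ValueError("per_dataset_fdr must be in [0, 1)")
    # Number of false sites per dataset implied by its FDR.
    false_per_dataset = n_true_shared * per_dataset_fdr / (1 - per_dataset_fdr)
    total_false = false_per_dataset * n_datasets
    return total_false / (n_true_shared + total_false)


# One dataset: the cumulative fraction equals the per-dataset FDR.
single = cumulative_false_fraction(990, 0.01, 1)
# Ten datasets: true sites are counted once, false sites accumulate,
# giving a fraction of roughly 0.092 under these assumptions.
pooled = cumulative_false_fraction(990, 0.01, 10)
print(single, pooled)
```

Real datasets overlap only partially in their true sites, so the actual inflation is smaller than in this worst-case picture, but the direction of the effect is the same.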
A re-analysis of all raw MS datasets under a unified platform will generate phosphopeptides with much higher quality, although such an effort is not within the scope of dbPSP 2.0. Here, potential orthologues of known phosphoproteins were obtained from Clusters of Orthologous Groups of proteins (COG). For each orthologous group, all protein sequences were multi-aligned using MUSCLE [48], and a conservation ratio was calculated for the sequences containing the same types of phosphorylatable residues against all sequences in the group.
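One plausible reading of the conservation ratio is sketched below in Python. It assumes that the ratio for a p-site is the fraction of sequences in the orthologous group that carry the same residue type at the aligned column; the helper names and the toy alignment are hypothetical, and the alignment is assumed to be already computed (for example by MUSCLE) and supplied as equal-length gapped strings.

```python
def alignment_column(aligned_seq, position):
    """Map a 1-based position in the ungapped sequence to a 0-based column."""
    seen = 0
    for column, char in enumerate(aligned_seq):
        if char != "-":
            seen += 1
            if seen == position:
                return column
    raise ValueError("position lies beyond the end of the sequence")


def conservation_ratio(aligned_seqs, column, residue):
    """Fraction of aligned sequences with `residue` at `column`.

    Gaps ("-") and other residues count as non-conserved.
    """
    if not aligned_seqs:
        raise ValueError("alignment must contain at least one sequence")
    matches = sum(1 for seq in aligned_seqs if seq[column] == residue)
    return matches / len(aligned_seqs)


# Toy alignment of four orthologues; the first sequence has a pS at position 3.
alignment = ["MKS-TR", "MRSATR", "MKT-TR", "MKS-SR"]
col = alignment_column(alignment[0], 3)
print(conservation_ratio(alignment, col, "S"))  # 0.75
```

Mapping the site position through the gapped reference sequence first is what keeps the comparison column-wise, so insertions in other orthologues do not shift the residue being compared.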
The distribution of the conservation ratio, which ranges from 0 to 1, was illustrated for all p-sites in the orthologous groups Fig. These highly conserved p-sites might be useful for the investigation of conserved functions of phosphorylation in prokaryotes. Here, we chose B. The user can click the phylum to link the taxonomic category of the given phylum Fig.
For example, by clicking the diagram of arginine, all proteins with pR sites are listed Fig. The browse options of the dbPSP 2. Due to the limited number of pC sites, here we only analyzed the sequence preferences of pS, pT, pY, pR, pD and pH sites by using pLogo [49] for bacteria and archaea Fig. For prokaryotic pS, pT and pY sites, we also compared their sequence preferences to those of eukaryotic phosphorylation, including , pS, , pT and 59, pY sites by integrating two previously developed databases, dbPAF [50] and dbPPT. Analyses of sequence preferences for p-sites in prokaryotes with pLogo. Due to data limitations, only bacterial pR sites were analyzed.
After the publication of dbPSP 1. For example, Garcia-Garcia et al. With the help of dbPSP, Venkat et al. Additionally, Lin et al. Moreover, Hasan et al. In addition, the phosphorylation data of representative prokaryotes from dbPSP was utilized for kinase motif enrichment analysis, and the results demonstrated that most eukaryotic phosphorylation motifs could not be recovered in prokaryotes In dbPSP 2.
Furthermore, dbPSP 2. In addition, the MSA results of orthologues were provided in this database and will be important for discovering conserved functional p-sites in prokaryote cells. Based on previous studies, dbPSP could work as a well-curated data resource of prokaryotic phosphoproteins to provide helpful support for phosphoproteomic analysis, tool development, and the investigation of prokaryotic phosphorylation events.
We anticipate that the updated dbPSP 2. Protein phosphorylation is one of the most well-studied PTMs and is reported to be involved in regulating numerous cellular processes in prokaryotic cells [8]. In , we collected 7, known p-sites of 3, proteins in 96 prokaryotes from published literature and developed dbPSP 1.
Due to the accumulation of phosphorylation information, here we released dbPSP 2. Furthermore, the rich annotations derived from 88 public databases were integrated. In total, dbPSP 2. In this study, to cover the diverse biological roles of prokaryotic phosphoproteins, we included multiple-layer knowledge from other databases to comprehensively annotate phosphoproteins.
For example, the prokaryotic ClpP enzyme plays an important role in modulating various biological processes, such as cellular stress response, pathogenesis and homeostasis. Inhibiting the function of ClpP was reported to affect the infectivity and virulence of microbial pathogens. Moreover, the arginine phosphorylation of ClpP was essential for maintaining its function [20, 21]. As shown in Fig.
Meanwhile, ClpP might interact with 9 partners and self-assemble in hexameric ring structures Fig. In particular, we found nearly 15, records from 6 orthologous databases to demonstrate that ClpP is a highly conserved subunit in prokaryotes, and the results are consistent with previous studies. In addition, the functional domain and p-site information of ClpP were also provided.
To understand how gene expression is regulated, we must first understand how a gene codes for a functional protein in a cell. The process occurs in both prokaryotic and eukaryotic cells, just in slightly different manners. Prokaryotic organisms are single-celled organisms that lack a cell nucleus, and their DNA therefore floats freely in the cell cytoplasm. To synthesize a protein, the processes of transcription and translation occur almost simultaneously. When the resulting protein is no longer needed, transcription stops. As a result, the primary method to control what type of protein and how much of each protein is expressed in a prokaryotic cell is the regulation of DNA transcription.
Post-Translational Modifications Aid Archaeal Survival
Since the pioneering work of Carl Woese, Archaea have fascinated biologists of almost all areas given their unique evolutionary status, wide distribution, high diversity, and ability to grow in special environments.
Post-translational modifications (PTMs) are the evolutionary solution to challenge and extend the boundaries of genetically predetermined proteomic diversity. As PTMs are highly dynamic, they also hold an enormous regulatory potential. It is therefore not surprising that out of the 20 proteinogenic amino acids, 15 can be post-translationally modified. Even the relatively inert guanidino group of arginine is subject to a multitude of mostly enzyme-mediated chemical changes. The resulting alterations can have a major influence on protein function. In this review, we will discuss how bacteria control their cellular processes and develop pathogenicity based on post-translational protein-arginine modifications. The complexity of proteomes is so tremendous that it exceeds the predictions based on genome data by up to three orders of magnitude (Walsh et al.).
Within the last few decades, scientists have discovered that the human proteome is vastly more complex than the human genome. While it is estimated that the human genome comprises between 20,000 and 25,000 genes, the total number of proteins in the human proteome is estimated at over 1 million. These estimates demonstrate that single genes encode multiple proteins. Genomic recombination, transcription initiation at alternative promoters, differential transcription termination, and alternative splicing of the transcript are mechanisms that generate different mRNA transcripts from a single gene. The increase in complexity from the level of the genome to the proteome is further facilitated by protein post-translational modifications (PTMs). PTMs are chemical modifications that play a key role in functional proteomics because they regulate activity, localization, and interaction with other cellular molecules such as proteins, nucleic acids, lipids and cofactors. Post-translational modifications are key mechanisms to increase proteomic diversity.