Characterization of peptide-protein relationships in protein ambiguity groups via bipartite graphs

In bottom-up proteomics, proteins are enzymatically digested into peptides before measurement with mass spectrometry. The relationship between proteins and their corresponding peptides can be represented by bipartite graphs. We conduct a comprehensive analysis of bipartite graphs using quantified peptides from measured data sets as well as theoretical peptides from an in silico digestion of the corresponding complete taxonomic protein sequence databases. The aim of this study is to characterize and structure the different types of graphs that occur and to compare them between data sets. We observed a large influence of the accepted minimum peptide length during in silico digestion. When changing from theoretical peptides to measured ones, the graph structures are subject to two opposite effects. On the one hand, the graphs based on measured peptides are on average smaller and less complex than graphs based on theoretical peptides. On the other hand, the proportion of protein nodes without unique peptides, which are a complicated case for protein inference and quantification, is considerably larger for measured data. Additionally, the proportion of graphs containing at least one protein node without unique peptides rises when going from database to quantitative level. The fraction of shared peptides and of proteins without unique peptides, as well as the complexity and size of the graphs, depend strongly on the data set and organism. Large differences between the structures of bipartite peptide-protein graphs have been observed between database and quantitative level as well as between analyzed species. In the analyzed measured data sets, the proportion of protein nodes without unique peptides ranged from 6.4% to 55.0%. This highlights the need for novel methods that can quantify proteins without unique peptides. The knowledge about the structure of the bipartite peptide-protein graphs gained in this study will be useful for the development of such algorithms.
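To make the graph terminology above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy mapping from protein accessions to peptide sequences (the names `P1` to `P4` and all sequences are invented for illustration), builds the bipartite peptide-protein relation, splits it into connected subgraphs, and lists the protein nodes that have no unique peptide.

```python
from collections import defaultdict

def build_peptide_index(protein_peptides):
    """Invert protein -> peptides into peptide -> set of proteins."""
    peptide_proteins = defaultdict(set)
    for protein, peptides in protein_peptides.items():
        for peptide in peptides:
            peptide_proteins[peptide].add(protein)
    return dict(peptide_proteins)

def connected_components(protein_peptides, peptide_proteins):
    """Return a list of (proteins, peptides) pairs, one per connected subgraph."""
    seen = set()
    components = []
    for start in sorted(protein_peptides):
        if start in seen:
            continue
        proteins, peptides = set(), set()
        stack = [start]
        seen.add(start)
        while stack:
            protein = stack.pop()
            proteins.add(protein)
            for peptide in protein_peptides[protein]:
                peptides.add(peptide)
                # Every protein sharing this peptide belongs to the same subgraph.
                for neighbour in peptide_proteins[peptide]:
                    if neighbour not in seen:
                        seen.add(neighbour)
                        stack.append(neighbour)
        components.append((proteins, peptides))
    return components

def proteins_without_unique_peptides(protein_peptides, peptide_proteins):
    """Proteins for which every peptide also maps to at least one other protein."""
    return {
        protein
        for protein, peptides in protein_peptides.items()
        if not any(len(peptide_proteins[p]) == 1 for p in peptides)
    }

# Hypothetical toy data: accessions and sequences are invented for illustration.
toy = {
    "P1": {"AAAK", "LLSK"},
    "P2": {"LLSK", "GGVR"},
    "P3": {"LLSK"},
    "P4": {"TTEK"},
}
pep2prot = build_peptide_index(toy)
components = connected_components(toy, pep2prot)
no_unique = proteins_without_unique_peptides(toy, pep2prot)

print(len(components))    # 2
print(sorted(no_unique))  # ['P3']
```

In this toy example, `P1`, `P2` and `P3` form one graph because they share the peptide `LLSK`, while `P4` forms a graph of its own. `P3` is the only protein node without a unique peptide, which is the case the abstract describes as complicated for protein inference and quantification.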

Metadata
Author:Karin Schork, Michael Andreas Turewicz, Julian Uszkoreit, Jörg Rahnenführer, Martin Eisenacher
URN:urn:nbn:de:hbz:294-104356
DOI:https://doi.org/10.1371/journal.pone.0276401
Parent Title (English):PLOS ONE
Publisher:PLOS
Place of publication:San Francisco
Document Type:Article
Language:English
Date of Publication (online):2023/11/21
Date of first Publication:2022/10/21
Publishing Institution:Ruhr-Universität Bochum, Universitätsbibliothek
Tag:Open Access Fonds
Volume:17
Issue:10, Article e0276401
First Page:e0276401-1
Last Page:e0276401-21
Note:
Article Processing Charge funded by the Deutsche Forschungsgemeinschaft (DFG) and the Open Access Publication Fund of Ruhr-Universität Bochum.
Institutes/Facilities:Medizinisches Proteom-Center
Zentrum für Protein-Diagnostik (PRODI)
Dewey Decimal Classification:Technology, medicine, applied sciences / Medicine, health
open_access (DINI-Set):open_access
faculties:Faculty of Medicine (Medizinische Fakultät)
Licence (English):Creative Commons - CC BY 4.0 - Attribution 4.0 International