In silico approach to designing rational metagenomic libraries for functional studies

Background: With the development of Next Generation Sequencing technologies, the number of predicted proteins from entire (meta-) genomes has risen exponentially. While protein function can be inferred from homology for some of these sequences, experimental characterization is still required to determine protein function. However, functional characterization of proteins cannot keep pace with our capacity to generate ever more sequence data.

Results: Here, we present an approach to reduce the number of proteins from entire (meta-) genomes to a reasonably small number for further experimental characterization without loss of important information. About 6.1 million predicted proteins from the Global Ocean Sampling Expedition Metagenome project were distributed into classes based either on homology to existing hidden Markov models (HMMs) of known families, or de novo by assessment of pairwise similarity. 5.1 million of these proteins could be classified in this way, yielding 18,437 families. For 4,129 protein families that did not match existing HMMs from databases, we could create novel HMMs. For each family, we then selected a representative protein, which showed the closest homology to all other proteins in this family. We then selected representatives of four families based on their homology to known and well-characterized lipases. From these four synthesized genes, we could obtain the novel esterase/lipase GOS54, validating our approach.

Conclusions: Using an in silico approach, we were able to improve the success rate of functional screening and make entire (meta-) genomes amenable to biochemical characterization.
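
The abstract does not specify how the representative with "the closest homology to all other proteins" was computed. One plausible reading is a medoid: the family member whose summed pairwise similarity to all other members is highest. The following is a minimal sketch under that assumption; the function name `select_representative`, the protein identifiers, and the similarity scores are hypothetical and are not taken from the article.

```python
from itertools import combinations


def select_representative(members, similarity):
    """Return the member with the highest summed similarity to all others.

    members:    list of protein identifiers in one family
    similarity: dict mapping an unordered pair (a, b) to a similarity score;
                missing pairs are treated as 0.0 (an assumption of this sketch)
    """
    totals = {m: 0.0 for m in members}
    for a, b in combinations(members, 2):
        # Accept the pair in either orientation.
        score = similarity.get((a, b), similarity.get((b, a), 0.0))
        totals[a] += score
        totals[b] += score
    # Iterate in sorted order so ties resolve to the alphabetically first name.
    return max(sorted(members), key=lambda m: totals[m])


# Hypothetical three-member family with made-up similarity scores.
family = ["p1", "p2", "p3"]
scores = {("p1", "p2"): 0.9, ("p1", "p3"): 0.8, ("p2", "p3"): 0.4}
representative = select_representative(family, scores)
```

In this toy family, `p1` has the largest summed similarity and is therefore chosen. A real pipeline would derive the scores from an alignment tool and may use a different criterion than the sum.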

Metadata
Author:Anna Kusnezowa, Lars I. Leichert
URN:urn:nbn:de:hbz:294-58968
DOI:https://doi.org/10.1186/s12859-017-1668-y
Parent Title (English):BMC bioinformatics
Document Type:Article
Language:English
Date of Publication (online):2018/07/10
Date of first Publication:2017/05/22
Publishing Institution:Ruhr-Universität Bochum, Universitätsbibliothek
Tag:Open Access Fonds
Functional metagenomics; GOS; Global ocean sampling project; Lipase; Protein function
Volume:18
Issue:1
First Page:267-1
Last Page:267-11
Note:
Article Processing Charge funded by the Deutsche Forschungsgemeinschaft (DFG) and the Open Access Publication Fund of Ruhr-Universität Bochum.
Institutes/Facilities:Institut für Biochemie und Pathobiochemie, Abteilung Biochemie der Mikroorganismen
Dewey Decimal Classification:Natural sciences and mathematics / Life sciences, biology, biochemistry
open_access (DINI-Set):open_access
faculties:Faculty of Medicine
Licence (English):Creative Commons - CC BY 4.0 - Attribution 4.0 International