Genome-wide enzyme annotation with precision control: catalytic families (CatFam) databases.

dc.contributor.author: Yu, Chenggang
dc.contributor.author: Zavaljevski, Nela
dc.contributor.author: Desai, Valmik
dc.contributor.author: Reifman, Jaques
dc.date.accessioned: 2020-02-10T17:16:30Z
dc.date.available: 2020-02-10T17:16:30Z
dc.date.issued: 2008-12-24
dc.description.abstract: In this article, we present a new method termed CatFam (Catalytic Families) to automatically infer the functions of catalytic proteins, which account for 20-40% of all proteins in living organisms and play a critical role in a variety of biological processes. CatFam is a sequence-based method that generates sequence profiles to represent and infer protein catalytic functions. CatFam generates profiles through a stepwise procedure that carefully controls profile quality and employs nonenzymes as negative samples to establish profile-specific thresholds associated with a predefined nominal false-positive rate (FPR) of predictions. The adjustable FPR allows for fine precision control of each profile and enables the generation of profile databases that meet different needs: function annotation with high precision and hypothesis generation with moderate precision but better recall. Multiple tests of CatFam databases (generated with distinct nominal FPRs) against enzyme and nonenzyme datasets show that the method's predictions have consistently high precision and recall. For example, a 1% FPR database predicts protein catalytic functions for a dataset of enzymes and nonenzymes with 98.6% precision and 95.0% recall. Comparisons of CatFam databases against other established profile-based methods for the functional annotation of 13 bacterial genomes indicate that CatFam consistently achieves higher precision and (in most cases) higher recall, and that (on average) CatFam provides 21.9% additional catalytic functions not inferred by the other similarly reliable methods. These results strongly suggest that the proposed method provides a valuable contribution to the automated prediction of protein catalytic functions. The CatFam databases and the database search program are freely available at http://www.bhsai.org/downloads/catfam.tar.gz
dc.identifier.uri: http://dx.doi.org/10.1002/prot.22167
dc.identifier.uri: https://lib.digitalsquare.io/xmlui/handle/123456789/25009
dc.relation.uri: Proteins
dc.title: Genome-wide enzyme annotation with precision control: catalytic families (CatFam) databases. (language: en)