Resource Description
Although there have been great advances in understanding bacterial pathogenesis, integrative information about what makes a bacterium a human pathogen is still lacking. In this work we determined presence/absence patterns of 814 different virulence-related genes among more than 600 finished bacterial genomes from both human pathogenic and non-pathogenic strains, belonging to different taxonomic groups. Classification of human pathogens versus non-pathogens reaches an accuracy of 95% under cross-validation with in-fold feature selection. An SVM-based classifier was built using a reduced subset of 120 genes. The statistical model is implemented in this software (BacFier v1.0), which displays not only the prediction (pathogen/non-pathogen) and an associated probability of pathogenicity, but also the presence/absence vector for the analyzed genes, making it possible to identify the subset of virulence genes responsible for the classification of the analyzed genome. The user