Application of comparative phylogenomics to study the evolution of Yersinia enterocolitica and to identify genetic differences relating to pathogenicity.
Yersinia enterocolitica, an important cause of human gastroenteritis that is generally acquired through the consumption of livestock, has traditionally been categorized into three groups with respect to pathogenicity: nonpathogenic (biotype 1A), low pathogenicity (biotypes 2 to 5), and highly pathogenic (biotype 1B). However, the genetic differences that explain variation in pathogenesis are largely unknown, as is whether different biotypes are associated with specific nonhuman hosts. In this study, we applied comparative phylogenomics (whole-genome comparisons of microbes with DNA microarrays combined with Bayesian phylogenies) to investigate a diverse collection of 94 strains of Y. enterocolitica consisting of 35 human, 35 pig, 15 sheep, and 9 cattle isolates from nonpathogenic, low-pathogenicity, and highly pathogenic biotypes. The analysis confirmed three distinct, statistically supported clusters: a nonpathogenic clade, a low-pathogenicity clade, and a highly pathogenic clade. Comparison of gene content identified 125 predicted coding sequences (CDSs) that were present in all highly pathogenic strains but absent from the other clades. These included several previously uncharacterized CDSs that may encode novel virulence determinants, including a hemolysin, a metalloprotease, and a type III secretion effector protein. Additionally, 27 CDSs were present in all 47 low-pathogenicity strains and in Y. enterocolitica 8081 but absent from all nonpathogenic 1A isolates. Analysis of the core gene set for Y. enterocolitica revealed that 20.8% of the genes were shared by all of the strains, confirming that this species is highly heterogeneous and adding to the case for the existence of three subspecies of Y. enterocolitica. Further analysis revealed that Y. enterocolitica strains do not cluster according to source (host).
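The gene-content comparisons described above (CDSs present in every strain of one group but absent from every strain of another, and the fraction of genes shared by all strains) can be illustrated with simple set operations. The following is a minimal sketch, not the authors' pipeline: it assumes that microarray hybridization results have already been reduced to a per-strain set of CDSs called present. The strain names, CDS labels, and function names are hypothetical, and the denominator used for the core fraction (all CDSs seen in any strain) is an assumption, since the abstract does not state how the 20.8% figure was normalized.

```python
from typing import Dict, Iterable, Set


def present_in_all_absent_from(
    presence: Dict[str, Set[str]],
    required: Iterable[str],
    excluded: Iterable[str],
) -> Set[str]:
    """Return CDSs present in every 'required' strain and in no 'excluded' strain."""
    required_sets = [presence[s] for s in required]
    if not required_sets:
        return set()
    # CDSs shared by every strain in the required group.
    shared = set.intersection(*required_sets)
    # CDSs found in at least one strain of the excluded group.
    seen_elsewhere = set().union(*(presence[s] for s in excluded))
    return shared - seen_elsewhere


def core_fraction(presence: Dict[str, Set[str]]) -> float:
    """Fraction of all observed CDSs that are present in every strain.

    Assumption: the denominator is the union of CDSs over all strains.
    """
    all_cds = set().union(*presence.values())
    core = set.intersection(*presence.values())
    return len(core) / len(all_cds)


# Hypothetical toy data: strain -> CDSs called present.
presence = {
    "hp1": {"a", "b", "c", "x"},
    "hp2": {"a", "b", "x", "y"},
    "lp1": {"a", "b", "d"},
    "np1": {"a", "e"},
}
high = ["hp1", "hp2"]   # stand-ins for highly pathogenic strains
others = ["lp1", "np1"]  # stand-ins for all remaining strains

high_specific = present_in_all_absent_from(presence, high, others)
print(sorted(high_specific))
print(round(core_fraction(presence), 3))
```

The same helper covers the second comparison in the abstract by passing the low-pathogenicity strains plus the reference strain as `required` and the nonpathogenic isolates as `excluded`.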