Matthew Ardito, Joanna Fueyo, Ryan Tassone, Frances Terry, Kristen DaSilva, Songhua Zhang, William Martin, Anne S. De Groot, Steven F. Moss and Leonard Moise
Background One useful application of pattern-matching algorithms is the identification of major histocompatibility complex (MHC) ligands and T-cell epitopes. Peptides that bind to MHC molecules and interact with T-cell receptors to stimulate the immune system are critical antigens for protection against infectious pathogens. We describe a genomes-to-vaccine approach to H. pylori vaccine design that uses immunoinformatics algorithms to rapidly identify T-cell epitope sequences from large genomic datasets.
Results To design a globally relevant vaccine, we used computational methods to identify a core genome comprising 676 open reading frames (ORFs) from among seven genetically and phenotypically diverse H. pylori strains from around the world. Of the 1,241,153 9-mer sequences encoded by these ORFs, 106,791 were identical among all seven genomes, and 23,654 scored in the top 5% of predicted HLA ligands for at least one of eight archetypal Class II HLA alleles when evaluated by EpiMatrix. To maximize the number of epitopes that can be assessed experimentally, we used a computational algorithm to increase epitope density in 20-25 amino acid stretches by assembling potentially immunogenic 9-mers so that each is positioned as it is in the native protein antigen. In total, 1,805 immunogenic consensus sequences (ICS) were generated. Of the selected ICS epitopes, 79% bound to a panel of 6 HLA Class II haplotypes, representing >90% of the global human population.
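The conservation step described in the Results (enumerating 9-mers and keeping those identical across all genomes) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `kmers` and `conserved_kmers`, the input layout (a dictionary mapping strain name to a list of protein sequences), and the toy sequences are all assumptions made for illustration. HLA binding prediction with EpiMatrix is not reproduced here.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int = 9) -> Set[str]:
    """Return the set of all overlapping k-mers in a protein sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def conserved_kmers(strains: Dict[str, List[str]], k: int = 9) -> Set[str]:
    """Return k-mers that occur in at least one protein of every strain.

    `strains` maps a strain name to its translated ORFs (hypothetical layout).
    """
    per_strain = []
    for proteins in strains.values():
        found: Set[str] = set()
        for protein in proteins:
            found |= kmers(protein, k)
        per_strain.append(found)
    if not per_strain:
        return set()
    # A k-mer is conserved only if it is present in every strain's set.
    return set.intersection(*per_strain)


# Toy input: two made-up "strains" differing at the last residue.
toy = {
    "strain_A": ["MKTLLVAGGAL"],
    "strain_B": ["MKTLLVAGGAV"],
}
conserved = conserved_kmers(toy)
print(sorted(conserved))
```

In a real analysis, each conserved 9-mer would then be scored against the HLA alleles of interest and only the top-scoring fraction retained; that scoring step is outside the scope of this sketch.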
Conclusions The breadth of H. pylori genome datasets was computationally assessed to rapidly and carefully determine a core set of genes. Application of immunoinformatics tools to this gene set accurately predicted epitopes with promising properties for T cell-based vaccine development.