Major histocompatibility complex (MHC) class II antigen presentation is a key component in eliciting a CD4+ T-cell response. Precise prediction of peptide-MHC (pMHC) interactions has thus become a cornerstone in defining epitope candidates for rational vaccine design. To date, pMHC prediction tools have primarily focused on inference from in vitro binding affinity data. In this study, we collate a large set of MHC class II eluted ligands generated by mass spectrometry to guide the prediction of MHC class II antigen presentation. We demonstrate that models developed on eluted ligands outperform those developed on pMHC binding affinity data.
Predictive performance can be further enhanced by combining the eluted ligand and pMHC affinity data in a single prediction model. Furthermore, including ligand data allows the prediction model to accurately learn the peptide length preference of MHC class II. Finally, we demonstrate that our model significantly outperforms the current state-of-the-art prediction method, NetMHCIIpan, on an external dataset of eluted ligands and appears superior in identifying CD4+ T-cell epitopes.