Document Type

Article

Publication Date

2024

DOI

10.1093/bib/bbae154

Publication Title

Briefings in Bioinformatics

Volume

25

Issue

3

Pages

bbae154 (1-11)

Abstract

Human leukocyte antigen (HLA) recognizes foreign threats and triggers immune responses by presenting peptides to T cells. Computationally modeling the binding patterns between peptides and HLA is important for the development of tumor vaccines. However, accurately predicting which peptides bind HLA molecules remains a major challenge. In this paper, we develop a new model, TripHLApan, for predicting peptides that bind HLA molecules by integrating a triple coding matrix, BiGRU + Attention models, and a transfer learning strategy. We have found the main interaction site regions between HLA molecules and peptides, as well as the correlation between HLA encoding and binding motifs. Based on these discoveries, we bring the preprocessing and coding closer to the natural biological process. Moreover, because the input is based on multiple types of features and the attention module focuses on the BiGRU hidden layer, TripHLApan learns more sequence-level binding information. The application of transfer learning strategies ensures the accuracy of prediction results for special lengths (peptides of length 8) and model scalability as data volumes grow. Compared with the current optimal models, TripHLApan exhibits strong predictive performance in various prediction environments with different positive and negative sample ratios. In addition, we validate the superiority and scalability of TripHLApan’s predictive performance using additional recent data sets, ablation experiments and binding reconstitution ability in the samples of a melanoma patient. The results show that TripHLApan is a powerful tool for predicting peptide binding to HLA-I and HLA-II molecules for the synthesis of tumor vaccines. TripHLApan is publicly available at https://github.com/CSUBioGroup/TripHLApan.git.

Rights

© The Authors 2024.

This is an open access article distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC 4.0) License, which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com

Data Availability

Article states: "The dataset used for TripHLApan training and testing is from the IEDB database (https://www.iedb.org/). Single patient cancer immunopeptidome data is from publications [54-56]."

Original Publication Citation

Wang, M., Lei, C., Wang, J., Li, Y., & Li, M. (2024). TripHLApan: Predicting HLA molecules binding peptides based on triple coding matrix and transfer learning. Briefings in Bioinformatics, 25(3), 1-11, Article bbae154. https://doi.org/10.1093/bib/bbae154
