Supplementary materials for "Positive-unlabeled learning identifies vaccine candidate antigens in the malaria parasite Plasmodium falciparum"

Files

README.txt (1.46 KB)
main_notebook.zip (49.47 MB)
other_files.zip (229.34 MB)
pf_reverse_vaccinology.sql.tar.gz (92.92 MB)
purf_models.zip (69.75 MB)

Date

2023

Abstract

Malaria vaccine development is hampered by the extensive antigenic variation and complex life stages of Plasmodium species. Vaccine development has focused on a small number of antigens identified before the P. falciparum genome became available. In this study, we implement a machine learning-based reverse vaccinology approach to predict potential new malaria vaccine candidate antigens. We assemble and analyze P. falciparum proteomic, structural, functional, immunological, genomic, and transcriptomic data, and use positive-unlabeled learning to predict potential antigens based on the properties of known antigens and the remaining proteins. We prioritize candidate antigens based on model performance on reference antigens with different levels of genetic diversity, and we quantify the protein properties that contribute the most to identifying top candidates. Candidate antigens are characterized by gene essentiality, gene ontology, and gene expression in different life stages to inform future vaccine development. This approach provides a framework for identifying and prioritizing candidate vaccine antigens for a broad range of pathogens.

Notes

The research aims to identify and prioritize previously unknown vaccine antigen candidates with potentially high efficacy against the most prevalent malaria parasite, Plasmodium falciparum. Positive-unlabeled random forest (PURF) was applied to learn from the small set of known P. falciparum antigens together with the remaining proteins, whose antigenic properties are unknown. The research notebook contains the data and code generated in the study "Positive-unlabeled learning identifies vaccine candidate antigens in the malaria parasite Plasmodium falciparum." It also includes instructions for installing the PURF package, retrieving protein variables from the database, and assembling the machine learning input, as well as code for the experimental analysis and plotting.
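
For readers unfamiliar with positive-unlabeled learning, the general idea can be illustrated with a generic "PU bagging" sketch. This is not the PURF package or its API, which is documented in the notebook. It assumes NumPy and scikit-learn are installed, and the function name pu_bagging_scores and its parameters are hypothetical. Each tree is trained on all known positives plus a bootstrap sample of unlabeled proteins treated as provisional negatives, and each unlabeled protein is scored by the trees that did not see it during training.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier


def pu_bagging_scores(X, is_positive, n_estimators=100, random_state=0):
    """Generic PU bagging (illustrative only; not the PURF implementation).

    X           : (n_samples, n_features) array of protein variables
    is_positive : boolean array, True for known antigens, False for unlabeled
    Returns an array of scores in [0, 1] for unlabeled samples; labeled
    positives (and any sample never left out of bag) are NaN.
    """
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    is_positive = np.asarray(is_positive, dtype=bool)
    pos_idx = np.flatnonzero(is_positive)
    unl_idx = np.flatnonzero(~is_positive)
    if pos_idx.size == 0 or unl_idx.size == 0:
        raise ValueError("need at least one positive and one unlabeled sample")

    score_sum = np.zeros(len(X))
    score_cnt = np.zeros(len(X))
    for _ in range(n_estimators):
        # Bootstrap as many unlabeled samples as there are positives and
        # treat them as provisional negatives for this tree only.
        boot = rng.choice(unl_idx, size=pos_idx.size, replace=True)
        train_idx = np.concatenate([pos_idx, boot])
        y = np.concatenate([np.ones(pos_idx.size), np.zeros(boot.size)])

        tree = DecisionTreeClassifier(
            max_features="sqrt",
            random_state=int(rng.integers(2**31 - 1)),
        )
        tree.fit(X[train_idx], y)

        # Score only the unlabeled samples this tree did not train on.
        oob = np.setdiff1d(unl_idx, boot)
        if oob.size:
            score_sum[oob] += tree.predict_proba(X[oob])[:, 1]
            score_cnt[oob] += 1

    scores = np.full(len(X), np.nan)
    seen = score_cnt > 0
    scores[seen] = score_sum[seen] / score_cnt[seen]
    return scores
```

Under these assumptions, a higher score means an unlabeled protein more closely resembles the known antigens. The scores rank candidates and should not be read as calibrated probabilities.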

Rights

Attribution-NonCommercial-ShareAlike 3.0 United States
http://creativecommons.org/licenses/by-nc-sa/3.0/us/