Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants

Yin, Rui; Feng, Brandon Y.; Varshney, Amitabh; Pierce, Brian G.

dc.contributor.author	Yin, Rui
dc.contributor.author	Feng, Brandon Y.
dc.contributor.author	Varshney, Amitabh
dc.contributor.author	Pierce, Brian G.
dc.date.accessioned	2023-09-27T18:00:57Z
dc.date.available	2023-09-27T18:00:57Z
dc.date.issued	2022-07-13
dc.description.abstract	High-resolution experimental structural determination of protein–protein interactions has led to valuable mechanistic insights, yet due to the massive number of interactions and experimental limitations there is a need for computational methods that can accurately model their structures. Here we explore the use of the recently developed deep learning method, AlphaFold, to predict structures of protein complexes from sequence. With a benchmark of 152 diverse heterodimeric protein complexes, multiple implementations and parameters of AlphaFold were tested for accuracy. Remarkably, many cases (43%) had near-native models (medium or high accuracy by Critical Assessment of Predicted Interactions [CAPRI] criteria) generated as top-ranked predictions by AlphaFold, greatly surpassing the performance of unbound protein–protein docking (9% success rate for near-native top-ranked models). However, AlphaFold modeling of antibody–antigen complexes within our set was unsuccessful. We identified sequence and structural features associated with lack of AlphaFold success, and we also investigated the impact of multiple sequence alignment input. Benchmarking of a multimer-optimized version of AlphaFold (AlphaFold-Multimer) with a set of recently released antibody–antigen structures confirmed a low rate of success for antibody–antigen complexes (11% success). We also found that T cell receptor–antigen complexes are likewise not accurately modeled by that algorithm, showing that adaptive immune recognition poses a challenge for the current AlphaFold algorithm and model. Overall, our study demonstrates that end-to-end deep learning can accurately model many transient protein complexes, and highlights areas of improvement for future developments to reliably model any protein–protein interaction of interest.
dc.description.uri	https://doi.org/10.1002/pro.4379
dc.identifier	https://doi.org/10.13016/dspace/xbwe-1rix
dc.identifier.citation	Yin, R, Feng, BY, Varshney, A, Pierce, BG. Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants. Protein Science. 2022; 31(8):e4379.
dc.identifier.uri	http://hdl.handle.net/1903/30597
dc.language.iso	en_US
dc.publisher	Wiley
dc.relation.isAvailableAt	Cell Biology & Molecular Genetics	en_us
dc.relation.isAvailableAt	Digital Repository at the University of Maryland	en_us
dc.relation.isAvailableAt	College of Computer, Mathematical & Natural Sciences	en_us
dc.relation.isAvailableAt	University of Maryland (College Park, MD)	en_us
dc.title	Benchmarking AlphaFold for protein complex modeling reveals accuracy determinants
dc.type	Article
local.equitableAccessSubmission	No

Files

Original bundle

Name: Protein Science - 2022 - Yin.pdf
Size: 5.06 MB
Format: Adobe Portable Document Format

Collections

Cell Biology & Molecular Genetics Research Works