Rapid Design of Knowledge-Based Scoring Potentials for Enrichment of Near-Native Geometries in Protein-Protein Docking
Protein-protein docking protocols aim to predict the structures of protein-protein complexes based on the structure of individual partners. Docking protocols usually include several steps of sampling, clustering, refinement and re-scoring. The scoring step is one of the bottlenecks in the performance of many state-of-the-art protocols. The performance of scoring functions depends on the quality of the generated structures and its coupling to the sampling algorithm. A tool kit, GRADSCOPT (GRid Accelerated Directly SCoring OPTimizing), was designed to allow rapid development and optimization of different knowledge-based scoring potentials for specific objectives in protein-protein docking. Different atomistic and coarse-grained potentials can be created by a grid-accelerated directly scoring dependent Monte-Carlo annealing or by a linear regression optimization. We demonstrate that the scoring functions generated by our approach are similar to or even outperform state-of-the-art scoring functions for predicting near-native solutions. Of additional importance, we find that potentials specifically trained to identify the native bound complex perform rather poorly on identifying acceptable or medium quality (near-native) solutions. In contrast, atomistic long-range contact potentials can increase the average fraction of near-native poses by up to a factor 2.5 in the best scored 1% decoys (compared to existing scoring), emphasizing the need of specific docking potentials for different steps in the docking protocol.