Motivation: The ability to reliably predict protein-protein and protein-ligand interactions is important for identifying druggable binding sites and for understanding how proteins communicate. Most currently available algorithms identify cavities on the protein surface as potential ligand recognition sites. The method described here does not explicitly look for cavities; instead, it uses small surface patches consisting of triplets of adjacent surface atomic groups that can be touched simultaneously by a probe sphere representing a solvent molecule. A total of 455 different triplet types can be identified. A training set of 309 X-ray structures of protein-ligand complexes was used to generate interface propensities for the triplets, which can then be used to predict their involvement in ligand-binding interactions.
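The propensity idea can be illustrated with a minimal Python sketch. The abstract does not give the propensity formula or the number of atomic group types, so both are assumptions here: propensity is taken as the frequency of a triplet type among interface triplets divided by its frequency among all surface triplets, and 13 group types is only an inference from the fact that 455 is the number of unordered triplets (with repetition) that can be drawn from 13 types. The function names `count_triplet_types` and `triplet_propensities` are hypothetical.

```python
from collections import Counter
from itertools import combinations_with_replacement


def count_triplet_types(n_group_types):
    """Count unordered triplets, repetition allowed, over n group types."""
    return sum(1 for _ in combinations_with_replacement(range(n_group_types), 3))


def _canonical(triplet):
    """Sort the three labels so that atom order within a triplet is ignored."""
    return tuple(sorted(triplet))


def triplet_propensities(surface_triplets, interface_triplets):
    """Assumed propensity: (interface frequency) / (surface frequency).

    surface_triplets: all triplets observed on the protein surfaces.
    interface_triplets: the subset that contacts a ligand.
    Values above 1 mean the type is enriched at ligand interfaces.
    """
    surface = Counter(_canonical(t) for t in surface_triplets)
    interface = Counter(_canonical(t) for t in interface_triplets)
    n_surface = sum(surface.values())
    n_interface = sum(interface.values())
    if n_surface == 0 or n_interface == 0:
        raise ValueError("need at least one surface and one interface triplet")
    return {
        t: (interface.get(t, 0) / n_interface) / (surface[t] / n_surface)
        for t in surface
    }


# Toy data with made-up group labels, not taken from the paper.
toy_surface = [("C", "C", "O")] * 2 + [("N", "O", "C")] * 2
toy_interface = [("C", "O", "C")]
toy_propensities = triplet_propensities(toy_surface, toy_interface)
```

In this toy case the `("C", "C", "O")` type makes up half of the surface triplets but all of the interface triplets, so its assumed propensity is 2, while the other type scores 0.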
Results: The success rate for locating protein-ligand binding sites on protein surfaces using this new surface triplet propensities (STP) algorithm is 88%, which compares well with currently available grid-based and energy-based approaches. The Q-SiteFinder dataset (Laurie and Jackson, 2005. Bioinformatics, 21, 1908-1916) was used to demonstrate the favorable performance of STP. An analysis of the different triplet types showed that higher ligand-binding propensity is associated with more polarizable surfaces. The interaction statistics between triplet atoms on the protein surface and ligand atoms were used to estimate statistical free energies of interaction. The ΔG(stat) for halogen atoms interacting with hydrophobic triplets is -0.6 kcal/mol, and an estimate of the maximal ΔG(stat) for a ligand atom interacting with a triplet in a binding pocket is -1.45 kcal/mol.
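To give a feel for the size of these statistical free energies, the sketch below applies an inverse Boltzmann relation, free energy = -RT ln(ratio), which is a common way of turning observed-to-expected contact ratios into energies. The abstract does not state which relation or temperature was used, so the formula, the 298.15 K temperature, and the function names are all assumptions made for illustration.

```python
import math

# Gas constant in kcal/(mol*K); the temperature is an assumed room temperature.
R_KCAL_PER_MOL_K = 0.0019872
TEMPERATURE_K = 298.15


def stat_free_energy(ratio, temperature=TEMPERATURE_K):
    """Assumed inverse Boltzmann relation: -RT ln(observed / expected)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return -R_KCAL_PER_MOL_K * temperature * math.log(ratio)


def enrichment_ratio(delta_g, temperature=TEMPERATURE_K):
    """Inverse of stat_free_energy: the ratio implied by a given free energy."""
    return math.exp(-delta_g / (R_KCAL_PER_MOL_K * temperature))


# Ratios implied by the two values quoted in the abstract, under these assumptions.
ratio_halogen = enrichment_ratio(-0.6)
ratio_maximal = enrichment_ratio(-1.45)
```

Under these assumptions, -0.6 kcal/mol corresponds to a contact observed a little under three times as often as expected, and -1.45 kcal/mol to an enrichment of roughly an order of magnitude.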