Binding pocket shape analysis for protein function prediction Richard J. Morris* , Abdullah Kahraman , Thomas Funkhouser , Rafael Najmanovich , Gareth Stockwell , Fabian Glaser , Roman Laskowski , and Janet M. Thornton EMBL-EBI, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK John Innes Centre, Norwich Research Park, Colney, Norwich NR4 7UH, UK Princeton University, 35 Olden Street, Princeton, NJ 08544, USA 1 Introduction Structural Genomics initiatives are generating an increasing number of protein structures with very limited biochemical characterisation. The analysis and functional assignment of protein structure is a key challenge and a major bottleneck towards the goal of well-annotated genomes. As shape plays a crucial rôle in biomolecular recognition and function, the development of shape analysis techniques is important for understanding protein structure-function relationships. Protein surfaces typically contain clefts of different shapes and sizes. It has been shown that for enzymes, the active site is commonly found in the largest cleft, Laskowski et al. (1996). This makes the detection of the active site relatively straightforward, and can be performed by a program such as SURFNET, Laskowski (1995). SURFNET does this by placing expandable spheres between protein atoms such that the radii of these spheres does not penetrate the two protein atoms or any nearby atoms. The union of all such spheres is used by SURFNET to describe the 3D cleft shape. Residue conservation can be used to improve the binding pocket prediction, Glaser et al. (2005). In this contribution, we present an efficient method for comparing protein binding pockets based on a spherical harmonics expansion. Binding pocket shapes are approximated as func tions on the unit sphere by describing each surface point by its spherical coordinates and setting , Morris et al. (2005). The same methodology can be applied to enhance the description by including other properties such as the electrostatic potential, Ritchie and Kemp (1999, 2000). 2 Shape and spherical harmonics Shape is a key concept in the molecular sciences. Especially in molecular biology, shape is a major factor in virtually all processes within the cell as shape plays an important rôle in molecular recognition. Shape is, however, not an absolute property at the molecular level, as probes with differing electronic properties will experience different interactions and therefore see different shapes. This complication will be ignored in the following and molecules will be treated as rigid bodies with well-defined boundaries defined by their van der Waals radii. Mathematically there exist many ways of describing and representing shapes including triangulations, iso-surfaces, polygons, equations, distance distributions and statistical landmark theory. Here we use explicit, global functions in the form of a real spherical harmonics ex , are single-valued, infinitely differentiable, complex pansion. Spherical harmonics, 91 functions of two variables, and , indexed by two integers, and . Spherical harmonics form a complete set of orthonormal functions and thus any function of and can be expanded as (2.1) The expansion coefficients, , are obtained from (2.2) In this manner, a spectral decomposition (Fourier analysis) of any function on the unit sphere may be performed. The lower values describe the rough low-resolution shape, whereas the higher values add high-resolution detail to the picture. Due to the orthonormality of the spherical harmonics, the expansion coefficients are unique and can therefore be used directly for describing shapes. This numerical integration can be a time-consuming undertaking and the results are often highly sensitive with respect to the integration layout and the weights. We circumvent such problems and speed up the procedure by using spherical-t-designs, Hardin and Sloane (1996). The full integration over the unit sphere for all and values up to can be computed in about one hundredth of a second on a standard single processor linux box. Figure 1: The method in pictures. Top left: The surface of the protein is analysed and binding pockets are predicted using the program SURFNET. Top right: The SURFNET binding pocket prediction is reduced based on residue conservation scores. Bottom left: The real location of the ligand. Bottom right: The shape of the binding pocket is expanded in spherical harmonics and these coefficients are used to compare against coefficients describing other binding pockets and ligands. Shape expansion coefficients have been computed for all ligands in the PDB and for a number of protein binding pockets predicted from deposited protein structures. Shapes can be compared using a standard "! metric 92 Figure 2: The results of matching 40 binding pocket shapes to 40 ligands. Hits on the diagonal indicate that the binding pocket shape matched best to the right ligand in the right conformation. Off-diagonal hits mean either the right ligand in a different conformation or the wrong ligand. The dimension is determined by place in 225-dimensional space. . For ! (2.3) these comparisons therefore take 3 Results We present the results for a test dataset consisting of 40 low-similarity proteins that bind to either ATP, NAD, a heme or a steroid. The shape matching matrix is shown in Fig. 2. As may be seen, shape matching using spherical harmonics provides in many cases valuable information about the ligand. The results shown here are for shape only. The inclusion of additional properties is likely to enhance the performance further. 4 Conclusions A fast shape-matching method has been described for capturing the global shape of a protein’s binding pocket or ligand. The method can also be applied directly to the protein itself. The surface is treated as a (single-valued) function of the spherical coordinates, and , that can then be expanded in a linear combination of real spherical harmonics. If two objects share a common orientation, these coefficients can be directly compared using the standard Euclidean distance ( ! ) metric in -dimensional space, where is determined by the order of the spherical harmonics chosen to represent the shape. We employ a heuristic that defines a standard frame of reference based on the first, second, and third moments of a 3D object. Although 93 spherical harmonics enjoy mathematically convenient rotational properties, it would be of advantage not have to perform any rotational superposition (optimisation) at all. In the field of Computer Science this has led to the development of 3D shape retrieval systems based on rotation invariant descriptors, Funkhouser at al. (2003), Kazhdan at al. (2003). This approach of course loses orientation information in that the original shape can obviously not be reconstructed from the rotation invariant descriptors, but overall the advantages of this method make it superior to registration dependent approaches. The main hurdle in this method is how to obtain an accurate prediction of the binding pocket. Another difficulty is the above mentioned alignment and also how to handle conformational changes in both protein and ligands. Although there are potential problems with this approach, we have shown that for predicting protein function, shape is a powerful concept that can be described and compared efficiently using spherical harmonics. References Funkhouser, T., Min, P., Kazhdan, M., Chen, J., Halderman, A., Dobkin, D. and Jacobs, D. (2003). A search engine for 3D models, ACM Transitions on Graphics, 22,1, 1-28. Glaser, F., Morris, R. J., Najmanovich, R. J., Laskowski, R., and Thornton, J. M. (2005). SURFNET + ConSurf: A two-stage methodology for accurate binding pocket prediction. Submitted. Hardin, R.H and Sloane, J. A. (1996). McLaren’s Improved Snub Cube and Other New Spherical Designs in Three Dimensions, Discrete and Computational Geometry, 15, 429-441. Laskowski, R. (1995). SURFNET: A program for visualizing molecular surfaces, cavities, and intermolecular interactions, J. Mol. Graph., 13, 323-330. Laskowski, R., Luscombe, N. M., Swindells, M. B. and Thornton, J. M. (1996). Protein clefts in molecular recognition and function, Protein Sci., 5. 2438-2452. Kazhdan, M., Funkhouser, T. and Rusinkiewicz, S. (2003). Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors, in Eurographics Symposium on Geometry Processing editors Kobbelt, Schröder and Hoppe. Morris, R. J., Najmanovich, R., Kahraman, A., Thornton, J. M. (2005). Real spherical harmonic expansion coefficients as 3D shape descriptors for protein binding pocket and ligand comparisons. Bioinformatics. 15 21(10):2347-55. Ritchie, D. W. and Kemp, G. J. L. (1999). Fast Computation, Rotation, and Comparison of Low Resolution Spherical Harmonic Molecular Surfaces, J. Comp. Chem., 20,4, 383395. Ritchie, D. W. and Kemp, G. J. L. (2000). Protein Docking Using Spherical Polar Fourier Correlations, Proteins: Structure, Function, and Genetics, 39,4, 178-194. 94
© Copyright 2026 Paperzz