Thesis

A robust machine learning approach for the prediction of allosteric binding sites

Creator
Rights statement
Awarding institution
  • University of Strathclyde
Date of award
  • 2016
Thesis identifier
  • T14537
Person Identifier (Local)
  • 201293005
Qualification Level
Qualification Name
Department, School or Faculty
Abstract
  • Allosteric regulatory sites are highly prized targets in drug discovery. They remain difficult to detect by conventional methods, with the vast majority of known examples being found serendipitously. Herein, a rigorous, wholly computational protocol is presented for the prediction of allosteric sites. Previous attempts to predict the location of allosteric sites by computational means drew on only a small amount of data. Moreover, no attempt was made to modify the initial crystal structure beyond the in silico deletion of the allosteric ligand. This practice can leave behind a conformation with a significant structural deformation, often betraying the location of the allosteric binding site. Despite this artificial advantage, modest success rates are observed at best. This work addresses both of these issues. A set of 60 protein crystal structures with known allosteric modulators was collected. To remove the imprint on protein structure caused by the presence of bound modulators, molecular dynamics was performed on each protein prior to analysis. A wide variety of analytical techniques were then employed to extract meaningful data from the trajectories. Once these data had been fused into a single, coherent dataset, random forest, a machine learning algorithm, was applied to train a high-performance classification model. After successive rounds of optimisation, the final model presented in this work correctly identified the allosteric site for 72% of the proteins tested. This is not only an improvement over alternative strategies in the literature; crucially, this method is unique among site prediction tools in that it does not abuse crystal structures containing imprints of bound ligands, which is of key importance when making live predictions, where no allosteric regulatory sites are known.
Advisor / supervisor
  • Dufton, Mark
  • Johnston, Blair
Resource Type
Note
  • Previously held under moratorium from 28 March 2017 until 28 March 2022
DOI
Date Created
  • 2016
Former identifier
  • 9912546383702996