Tutorial: PHYRE2

Here we will study the predicted structure of UniProt Q14654, gene name KCNJ11, the human ATP-sensitive inward rectifier potassium channel 11. In particular, the sequence variant Arg 201 to His (R201H) is associated with neonatal diabetes. There is no X-ray or NMR structure for the human protein, so we need a predicted structure to provide insight into the structural basis of this effect.
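
Variant shorthand such as R201H, used later in this tutorial, combines the one-letter code of the original residue, the sequence position, and the one-letter code of the replacement. As a minimal sketch (the helper name short_variant is our own and is not part of any tool used here), the conversion from the three-letter description could be written in Python as:

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def short_variant(wild_type, position, mutant):
    """Build one-letter shorthand from three-letter residue names.

    Hypothetical helper for illustration; input case is ignored.
    """
    wt = THREE_TO_ONE[wild_type.capitalize()]
    mt = THREE_TO_ONE[mutant.capitalize()]
    return f"{wt}{position}{mt}"


print(short_variant("Arg", 201, "His"))  # prints R201H
```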

1) Analysis using Genome3D

Open your favourite web browser (preferably not Internet Explorer!). Navigate to the Genome3D page for Q14654, either by opening that link or by following the instructions in the next paragraph.

Go to the Genome3D website, http://genome3d.eu, enter the UniProt ID Q14654 in the search box on the front page, and run the search. You will see “Displaying 1 matching UniProt entry”. Click on either the IRK11_HUMAN link or the number 8 beneath “Structural Predictions”.

You can see an overview of the predicted domains from the various Genome3D partners. Down the right of the page you can see each of the predicted superfamilies and which structural classification hierarchy it came from (SCOP or CATH). Note the gold, silver and bronze ratings: these indicate the degree to which the SCOP and CATH classifications agree.

Look at the top figure: the numbers along the bottom mark the protein sequence position, and the coloured bars show each partner's domain predictions. From roughly residues 50 to 155, three of the partner prediction methods agree on one domain (Voltage-gated potassium channel) and the CATH and SCOP domain assignments agree; there are no domain assignments from the other partners. For the region 156 to 360, all partners provide predictions, but on closer inspection the domain assignments differ considerably. However, if you click on b.1.18 at the top right, this will take you to SCOP, where the list of families includes the potassium channel (family number 10).

Return to the Genome3D page. Now look at the section below titled “Predicted 3D Structures”. Click on all the bars so that they are highlighted in orange, then launch a viewer (say PyMOL) to see the structures superposed. There is good agreement between all the models.

2) Analysis using Phyre2 link from Genome3D

On the right-hand side of either the “Predicted Domains” or the “Predicted 3D Structures” section, you will see links to the Genome3D modelling resources. Click on PHYRE2 to go to the details of the predicted Phyre2 model.

The page is divided into four main sections a-d, explained below:

  • a) Summary. This section displays an image of the highest confidence single model from Phyre2, information on the template (known structure) used to build the model, confidence in homology, coverage of the input query sequence by the model, and an option to view the model using JSmol (this should work on all browsers except Internet Explorer at the moment).
  • b) Secondary structure and disorder prediction. The sequence has been processed by the programs PSIPRED and DISOPRED to predict the locations of alpha helices, beta strands and disordered regions (shown with a question mark). Confidence in the predicted state for each position is shown using a rainbow colour code: red=high confidence, blue=low confidence.
  • c) Domain analysis. The next table shows which regions of the input sequence have been matched by known structures, colour coded by confidence in the match. This enables you to see the approximate domain structure of your protein.
  • d) Detailed template information. This is the main table of results, showing a ranked list of matches to known structures (the templates), information about each template, the region aligned and an image of the model produced. Clicking on the protein picture will download a PDB formatted file of the model of your protein based on the template shown. For even further detail, click on the “Alignment” button in a particular row.
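
If you download a model from table d), the PDB formatted file can be inspected with a few lines of Python. The sketch below assumes the standard fixed-column ATOM record layout of the PDB format. The function names are our own, and the coordinates in the demonstration are invented purely to exercise the parser; they do not come from a Phyre2 model.

```python
def residue_atoms(pdb_text, res_seq, chain=None):
    """Return (atom name, residue name, x, y, z) for one residue number.

    Reads only ATOM records and relies on the fixed PDB columns:
    atom name 13-16, residue name 18-20, chain 22, residue number 23-26,
    and x, y, z in columns 31-38, 39-46 and 47-54.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if int(line[22:26]) != res_seq:
            continue
        if chain is not None and line[21] != chain:
            continue
        atoms.append((
            line[12:16].strip(),
            line[17:20].strip(),
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        ))
    return atoms


def make_atom_line(serial, name, res_name, chain, res_seq, x, y, z):
    """Format a minimal ATOM record (hypothetical helper for the demo)."""
    return (f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_seq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


# Invented coordinates, for demonstration only.
demo = "\n".join([
    make_atom_line(1, "N", "ARG", "A", 201, 1.0, 2.0, 3.0),
    make_atom_line(2, "CA", "ARG", "A", 201, 1.5, 2.5, 3.5),
    make_atom_line(3, "N", "GLY", "A", 202, 4.0, 5.0, 6.0),
])
for atom in residue_atoms(demo, 201):
    print(atom)
```

With a real download, the text would instead be read from the saved file, for example with open("model.pdb").read(), where the file name is whatever you chose when saving.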

As Genome3D is a database of pre-computed models, if the Phyre2 information is valuable to you, the best procedure is to rerun Phyre2 from its web page to obtain the very latest predictions with the most recent analysis features.

For the purposes of this workshop, we have recently re-run UniProt Q14654 through a newer version of Phyre2. The results of this analysis can be found by scrolling to the top of the results page and clicking on a link entitled: “Click here for UPDATED RESULTS for Genome3D Workshop”. Click that link now.

In general, if you want to resubmit a sequence to Phyre2, just visit the main submission page. It can be found at http://www.sbg.bio.ic.ac.uk/phyre2/ or by searching for Phyre2 in Google.
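
If you need the amino acid sequence of the protein to paste into a submission form, it can be extracted from a FASTA record. The following is a minimal sketch: the function names are our own, and the UniProt address used in fetch_uniprot_fasta is an assumption about the UniProt REST service rather than something provided by this tutorial.

```python
from urllib.request import urlopen


def parse_fasta(text):
    """Return (header, sequence) for the first record in FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                break  # a second record starts here; stop reading
            header = line[1:]
        elif header is not None:
            chunks.append(line)
    if header is None:
        raise ValueError("no FASTA header line found")
    return header, "".join(chunks)


def fetch_uniprot_fasta(accession):
    """Download the FASTA text for one accession (needs network access)."""
    # Assumed address pattern for the UniProt REST service.
    url = f"https://rest.uniprot.org/uniprotkb/{accession}.fasta"
    with urlopen(url) as response:
        return response.read().decode("utf-8")


# Usage (not run here because it requires a network connection):
#   header, sequence = parse_fasta(fetch_uniprot_fasta("Q14654"))
#   print(header, len(sequence))
```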

3) Phyre Investigator

Once you have clicked on the “UPDATED RESULTS” link in Step 2, you will be taken to a newer version of the analysis. The layout is largely similar to that seen in Step 2 but with optional show/hide links to avoid screen clutter. In the “Detailed template information” section, each row has a new link in the rightmost column saying “View Investigator results”. (Normally a button would be present here for you to choose whether to run Phyre Investigator, but we have already run all these analyses for you.)

Go to the first entry in the table of results (template c3syaA_). Click on the “View Investigator results” link. This will take you to a page showing the results of running a large number of different analysis programs on the top-ranking model from Phyre2.

From left to right, the page displays the model in an interactive JSmol window, buttons providing a choice of analyses, and two bar graphs: one displaying the preferred amino acids at each position in the sequence, the other displaying the mutations likely to have a phenotypic effect as predicted by SuSPect.

At the bottom of the page is a sequence and secondary structure view. As you hover your mouse over a region of the sequence, the corresponding residue is highlighted in the model. Scroll the sequence pane to the right to find residue R201. When you hover over that residue in the sequence view, the mutations graph above shows that almost all changes to this residue (with the exception of an R->A mutation) are strongly predicted (tall red bars) to have a phenotypic effect.

If you click on this residue, it will be shown in spacefill in the JSmol window. Go to the “Analyses” panel, click on “Function” and then on “Mutational sensitivity”. This colours the entire structure by the average confidence from SuSPect of a mutation having a phenotypic effect. As you can see, the highlighted R201 is coloured orange, indicating high sensitivity to mutation.
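
The “Mutational sensitivity” colouring summarises, for each position, the scores of the individual substitutions shown in the mutations graph. As a rough sketch of that idea, assuming a plain arithmetic mean over the substitutions at each position, the calculation might look like this. The function names are our own and the scores are invented for illustration; they are not SuSPect output.

```python
def mutational_sensitivity(scores_by_position):
    """Average the per-substitution scores at each sequence position.

    scores_by_position maps position -> {mutant residue: score}.
    """
    return {
        position: sum(scores.values()) / len(scores)
        for position, scores in scores_by_position.items()
    }


def most_sensitive(scores_by_position, top=1):
    """Return the positions with the highest average score."""
    averages = mutational_sensitivity(scores_by_position)
    return sorted(averages, key=averages.get, reverse=True)[:top]


# Invented scores for two positions; NOT real predictions.
example = {
    1: {"A": 20, "G": 30, "H": 40},
    2: {"A": 80, "G": 90, "H": 100},
}
print(mutational_sensitivity(example))
print(most_sensitive(example))
```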

4) The SuSPect web server

SuSPect is our predictor of the phenotypic effect of amino acid variants, including nsSNPs. It uses a machine learning approach (an SVM) based on sequence and protein-protein interaction network features to make the prediction. The structures stored within SuSPect are taken either from the PDB or from Phyre2 predictions.

To explore the use of SuSPect click on the link:

 http://www.sbg.bio.ic.ac.uk/~suspect/

Enter Q14654 R201H and run the prediction to see the effect of this mutation.

You will see a score of 87, suggesting that this sequence change is disease-associated. Click on this score to see the features used by SuSPect to make this prediction.
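
When working with several variants, it can help to check that each query has the “accession variant” shape used above and to label scores consistently. The sketch below is illustrative only: the function names are our own, and the cutoff of 50 is an assumption made for this example, so check the SuSPect documentation for the recommended threshold.

```python
import re

# Accession, whitespace, then wild type, position, mutant (e.g. "Q14654 R201H").
QUERY_RE = re.compile(r"^(\S+)\s+([A-Z])(\d+)([A-Z])$")


def parse_query(line):
    """Split a query such as 'Q14654 R201H' into its four parts."""
    match = QUERY_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised query: {line!r}")
    accession, wild_type, position, mutant = match.groups()
    return accession, wild_type, int(position), mutant


def interpret_score(score, cutoff=50):
    """Label a score relative to a cutoff (the default of 50 is an assumption)."""
    if score >= cutoff:
        return "likely disease-associated"
    return "likely neutral"


print(parse_query("Q14654 R201H"))
print(interpret_score(87))
```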

References

  • Kelley, L.A. & M.J. Sternberg, Protein structure prediction on the Web: a case study using the Phyre server. Nat Protoc, 2009. 4: p. 363-71.
  • Yates, C.M., I. Filippis, L.A. Kelley & M.J. Sternberg, SuSPect: Enhanced Prediction of Single Amino Acid Variant (SAV) Phenotype Using Network Features. J Mol Biol, 2014. 426: p. 2692-701.