
5. The Sequence Utility


You might have a protein whose sequence is known but whose structure is unresolved. Or you might know a sequence motif that occurs in a familiar binding site and want to find it in other binding sites. In both cases you can use the Relibase sequence utility to find out about protein-ligand interactions in proteins that contain a similar sequence.

Suppose you are interested in the active site of the glutamate dehydrogenase of Plasmodium falciparum. (Glutamate dehydrogenase is an interesting target for the design of novel antimalarial drugs because it occupies a key position in maintaining an antioxidative environment in plasmodia.) You know that the region containing the active site is rather conserved among different species and has the following sequence in Plasmodium falciparum:


CFRVQ YNSAL GPYKG GLRFH PSVNL SIVKF LGFEQ IFKNS LTGLS MGGGK GGSDF DPKGK


Starting from anywhere in Relibase you can reach the sequence utility by hitting the Sequence button in the topmost frame.

To search for PDB entries containing your sequence you type or paste it into the form and hit the Submit button.
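The query above is printed in blocks of five residues for readability. If you prefer to paste it as one contiguous string, a few lines of Python can strip the spaces first. This is only a sketch and not part of Relibase: the helper name `to_fasta` and the header text `query` are hypothetical, and whether the form actually requires the spaces to be removed is an assumption.

```python
# Query sequence exactly as printed in this tutorial (blocks of five residues).
GROUPED = ("CFRVQ YNSAL GPYKG GLRFH PSVNL SIVKF "
           "LGFEQ IFKNS LTGLS MGGGK GGSDF DPKGK")


def to_fasta(grouped, header="query", width=60):
    """Hypothetical helper: drop whitespace and return a FASTA-style record."""
    sequence = "".join(grouped.split()).upper()
    # Break the sequence into lines of at most `width` residues.
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(lines)


record = to_fasta(GROUPED)
print(record)
```

The second line of the printed record is the contiguous sequence that can be pasted into the form.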

5_1

Relibase now performs a FASTA search over all proteins in the PDB and returns a list of chains containing a significantly similar sequence. For each chain, the FASTA score, the percentage of identical residues, the length of the similar sequence and any interacting ligands are listed.

5_2

If you want to know what the glutamate dehydrogenase of Plasmodium falciparum and the interaction with its substrate might look like, it is useful to look at the most similar chain containing a ligand. Here it is 1bgv. Before looking at the protein, you might want to check how well the sequences align. Clicking on the percentage of identity (here 79.661) displays an alignment between your query sequence and the hit. Conserved residues are displayed in blue, similar residues in red and all others in black. Most of the query sequence is either conserved or replaced by similar residues.
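To make the percentage of identity more concrete, the following Python sketch counts identical residues in a pairwise alignment. It is an illustration under stated assumptions, not the way FASTA or Relibase computes the value: the function `percent_identity` is hypothetical, gapped columns are simply skipped, and the toy alignment is made up rather than taken from the 1bgv hit.

```python
def percent_identity(aligned_query, aligned_hit, gap="-"):
    """Percentage of identical residues over columns without a gap.

    Both arguments are rows of one pairwise alignment and must have
    the same length. Hypothetical helper, not a Relibase function.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for q, h in zip(aligned_query, aligned_hit):
        if q == gap or h == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if q == h:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy alignment, invented for illustration: 4 of 5 compared columns match.
toy_identity = percent_identity("ACDE-G", "ACDFKG")
print(toy_identity)

# Pure arithmetic: 47 identical residues out of 59 would give 79.661 percent.
print(round(100.0 * 47 / 59, 3))
```

The value 79.661 reported for 1bgv is arithmetically consistent with 47 identical residues in a 59-residue overlap. That breakdown is inferred from the number alone and is not reported anywhere in the tutorial.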

5_3

By hitting back in your browser, you return to the list of similar chains. Clicking on the chain listing pdb1bgv now takes you to information about the selected protein. You can view the protein in your RasMol window. To take a closer look at the protein-ligand interaction, click on the picture of the ligand. Your browser now displays ligand information, and your RasMol window shows the binding site containing the ligand.

5_4



5_5

Now you might want to visualise the conservation of the binding site among all the similar chains you just got by searching for similar sequences. To achieve this you must start with a protein with a bound ligand. You have already done this by clicking on the ligand of 1bgv. Among other information the Similar binding sites button is displayed.

5_6

Clicking on this button brings you to a window with several default values for a new Relibase search. This time, Relibase will search for proteins similar to the entire reference chain you select. (Here you can select only one chain, pdb1bgv-A:DBN:reli1 Residues: 449, as only one chain is in contact with the ligand.) To carry over as many chains as possible from your first search into the search for similar binding sites, change the value for Minimum Sequence Identity from 95.0 to 30.0. If you leave the value for Maximum Resolution at 2.2, some of the proteins you want might not be displayed because of their lower resolution, so change the value to 3.0 to get them as well. (The other parameters and options affect the analysis of conservation performed when superimposing your structures.)
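The effect of relaxing the two thresholds can be sketched in Python. Everything in this example is assumed for illustration: the `ChainHit` record, the `filter_hits` function, the placeholder chain names and all numbers are invented, and treating both limits as inclusive is a guess about how Relibase applies them.

```python
from dataclasses import dataclass


@dataclass
class ChainHit:
    chain_id: str      # placeholder name, not a real PDB chain
    identity: float    # sequence identity to the reference chain, in percent
    resolution: float  # resolution in angstroms (larger value = lower resolution)


def filter_hits(hits, min_identity, max_resolution):
    """Keep hits that meet both thresholds (assumed inclusive)."""
    return [
        h for h in hits
        if h.identity >= min_identity and h.resolution <= max_resolution
    ]


# Invented example data.
HITS = [
    ChainHit("chain-1", 100.0, 1.9),
    ChainHit("chain-2", 97.5, 2.5),
    ChainHit("chain-3", 45.0, 2.1),
    ChainHit("chain-4", 33.0, 2.9),
    ChainHit("chain-5", 25.0, 2.0),
]

default_hits = filter_hits(HITS, min_identity=95.0, max_resolution=2.2)
relaxed_hits = filter_hits(HITS, min_identity=30.0, max_resolution=3.0)

print([h.chain_id for h in default_hits])
print([h.chain_id for h in relaxed_hits])
```

With the default values only the near-identical, well-resolved chain survives. The relaxed values also admit the more distant and the less well resolved chains, which is what the conservation analysis needs.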

Relibase again returns a list which contains the length of the aligned region, the percentage of the sequence identity, a list of interacting ligands and a Select button for each entry.

5_7

To superimpose the binding sites of the related proteins, you have to decide which ones to include. You can do this by clicking on the Select button at the very left of each list entry. The more PDB files you analyse, the longer it takes for Relibase to display the result, so be sure to deselect all chains that are redundant or that you are not interested in. In our example we have more than 30 entries, so let's deselect all those that are not dehydrogenases or that are redundant because they are just mutants of an already selected chain. (That means you deselect all but pdb1bgv-A, pdb1gtm-A, pdb1bvu-A, pdb1b26-A.)

By clicking on the Superimpose selected chains button you can submit your query to Relibase. After a few seconds (or a few more, depending on the number of chains to be analysed) the report is displayed in your browser and RasMol shows the binding site of 1bgv, the only chain with active protein and ligand buttons in the 3D Visualiser Toolbox.

By activating the Protein button for each desired enzyme, you can view the superimposed binding sites one by one.

5_8

Obviously, the binding site doesn't vary fundamentally among the homologous proteins.

Comparing the hit list from the sequence search with the one from the similar chain search shows that searching with just a rather conserved sequence stretch containing the active site also finds proteins of similar but not identical function (e.g. leucine dehydrogenase pdb1leh-A and phenylalanine dehydrogenase pdb1bw9-A). Because of their low overall sequence identity (24.7% and 22.2% respectively), they would be lost in the noise if only a similar chain search were performed. Thus, the sequence search is a useful tool when looking for similar binding sites in proteins with low overall homology.
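If you copy both hit lists out of the browser, the comparison described above amounts to a set difference. The Python sketch below assumes you have collected the chain identifiers by hand. The two sets are abbreviated to chains named in this tutorial and do not reproduce the full Relibase hit lists, and the variable names are hypothetical.

```python
# Abbreviated, hand-collected lists; the real hit lists are longer.
sequence_search_hits = {"pdb1bgv-A", "pdb1leh-A", "pdb1bw9-A"}
similar_chain_hits = {"pdb1bgv-A", "pdb1gtm-A", "pdb1bvu-A", "pdb1b26-A"}

# Chains found by the sequence search but absent from the similar chain search.
only_in_sequence_search = sorted(sequence_search_hits - similar_chain_hits)
print(only_in_sequence_search)
```

For these abbreviated lists, the result contains the leucine and phenylalanine dehydrogenase chains.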


