5. The Sequence Utility
You might have a protein whose sequence is known but whose structure is unresolved. Or you
might know a sequence motif that occurs in a familiar binding site and want to find it in other
binding sites. In either case you can use the sequence utility of Relibase to find out about
protein-ligand interactions in proteins containing a similar sequence.
Suppose you are interested in the active site of the glutamate dehydrogenase of
Plasmodium falciparum. (Glutamate dehydrogenase is an interesting target for the
design of novel antimalarial drugs because it occupies a key position in maintaining an
antioxidative environment in plasmodia.) You know that the region containing the active site is
rather conserved among different species and has the following sequence in
Plasmodium falciparum:
CFRVQ YNSAL GPYKG GLRFH PSVNL SIVKF LGFEQ IFKNS LTGLS MGGGK GGSDF DPKGK
Starting from anywhere in Relibase you can reach the sequence utility by hitting the
Sequence button in the topmost frame.
To search for PDB entries containing your sequence you type or paste it into the form
and hit the Submit button.
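If you keep the query in a file or script, it can help to tidy it before pasting. The following is a minimal sketch, not part of Relibase: it strips the spaces from the grouped sequence shown above and wraps it as a FASTA record. The function name to_fasta and the header text are made up for illustration, and whether the Relibase form accepts spaces or a FASTA header line is not stated here, so treat both as assumptions.

```python
import textwrap


def to_fasta(raw: str, header: str = "query", width: int = 60) -> str:
    """Remove all whitespace from a grouped sequence and return a FASTA record.

    Hypothetical helper for illustration; not a Relibase function.
    """
    residues = "".join(raw.split()).upper()
    if not residues.isalpha():
        raise ValueError("sequence must contain letters only")
    # Wrap long sequences to a fixed line width, as is usual for FASTA files.
    lines = textwrap.wrap(residues, width)
    return ">" + header + "\n" + "\n".join(lines)


# The query region from the tutorial, written in groups of five residues.
raw = ("CFRVQ YNSAL GPYKG GLRFH PSVNL SIVKF LGFEQ IFKNS "
       "LTGLS MGGGK GGSDF DPKGK")

record = to_fasta(raw, header="pf_gdh_active_site_region")
sequence_line = record.splitlines()[1]
print(record)
print(len(sequence_line), "residues")
```

The query region is 60 residues long, so it fits on a single FASTA line at the default width.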
Relibase now performs a FASTA search over all proteins in the PDB and returns a list of chains
containing a significantly similar sequence. For each chain, the FASTA score, the percentage
of identical residues, the length of the similar sequence and any interacting ligands are listed.
If you want to know what the glutamate dehydrogenase of Plasmodium falciparum and the interaction
with its substrate might look like, it is useful to look at the most similar chain containing a
ligand. Here it is 1bgv. Before looking at the protein, you might want to check how well the
sequences align. Clicking on the percentage of identity (here 79.661) displays an alignment between
your query sequence and the hit. Conserved residues are displayed in blue, similar
residues in red and all others in black. In this example most of the query sequence is either
conserved or replaced by similar residues.
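To make the percentage of identity concrete, here is a small sketch of how such a value can be computed from two aligned strings. It is an illustration only: the function percent_identity and the toy alignment are made up, and the exact definition used by the FASTA program behind Relibase (for example how gaps or the aligned length are counted) is an assumption here.

```python
def percent_identity(query: str, hit: str) -> float:
    """Percentage of alignment columns in which both residues are identical.

    Both strings must already be aligned to the same length; a gap is
    written as '-' and never counts as a match. Hypothetical helper.
    """
    if len(query) != len(hit):
        raise ValueError("aligned sequences must have the same length")
    if not query:
        raise ValueError("alignment is empty")
    matches = sum(1 for q, h in zip(query, hit) if q == h and q != "-")
    return 100.0 * matches / len(query)


# Toy alignment (not the real 1bgv alignment): 6 of 8 columns are identical.
toy_query = "GGLRFHPS"
toy_hit = "GGIRFHPT"
print(round(percent_identity(toy_query, toy_hit), 3))  # 75.0

# For comparison: 47 identical positions in an overlap of 59 would give
# the value shown in the example hitlist. The actual counts are not
# stated in this tutorial, so this is only a consistency check.
print(round(100.0 * 47 / 59, 3))  # 79.661
```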
Hitting the Back button in your browser returns you to the list of similar chains. Clicking on
the chain listing pdb1bgv now takes you to information about the selected protein.
You can view the protein in your RasMol window. To take a closer look at the protein-ligand
interaction, click on the picture of the ligand. Your browser now displays
ligand information, and your RasMol window shows the binding site containing the ligand.
Now you might want to visualise the conservation of the binding site among all the similar
chains you just got by searching for similar sequences. To achieve this you must start with
a protein with a bound ligand. You have already done this by clicking on the ligand of
1bgv. Among other information the Similar binding sites button is displayed.
Clicking on this button brings you to a window with several default values concerning a new
Relibase search. This time, Relibase will search for proteins similar to the entire reference
chain you select. (Here you can only select one chain,
pdb1bgv-A:DBN:reli1 Residues: 449, as only one chain is in contact with the
ligand.) To carry over as many chains as possible from your first search into the search for
similar binding sites, you have to change the value for
Minimum Sequence Identity from 95.0 to 30.0. If you leave the value for
Maximum Resolution at 2.2, some of your desired proteins might not be displayed because
of their lower resolution. So change the value to 3.0 to get them as well. (The other
parameters and options affect the analysis of conservation performed when superimposing your
structures.)
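The effect of the two thresholds can be pictured as a simple filter. The sketch below is not how Relibase is implemented; the Chain class, the passes function and all four entries are invented for illustration, and treating both limits as inclusive is an assumption. Note that a larger resolution value in Ångström means a lower-resolution structure, which is why raising Maximum Resolution lets more chains through.

```python
from dataclasses import dataclass


@dataclass
class Chain:
    name: str
    identity: float    # percent sequence identity to the reference chain
    resolution: float  # resolution in Angstrom (larger value = lower resolution)


def passes(chain: Chain, min_identity: float, max_resolution: float) -> bool:
    """True if the chain satisfies both thresholds (assumed inclusive)."""
    return chain.identity >= min_identity and chain.resolution <= max_resolution


# Invented entries, chosen only to show the effect of each threshold.
candidates = [
    Chain("hypothetical-1", 98.0, 1.9),
    Chain("hypothetical-2", 45.0, 2.1),
    Chain("hypothetical-3", 33.0, 2.8),
    Chain("hypothetical-4", 25.0, 2.0),
]

# Default form values versus the relaxed values used in this tutorial.
default_hits = [c.name for c in candidates if passes(c, 95.0, 2.2)]
relaxed_hits = [c.name for c in candidates if passes(c, 30.0, 3.0)]

print(default_hits)  # ['hypothetical-1']
print(relaxed_hits)  # ['hypothetical-1', 'hypothetical-2', 'hypothetical-3']
```

The fourth entry stays excluded even with the relaxed values because its identity is below 30.0, whatever its resolution.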
Relibase again returns a list which contains the length of the aligned region, the percentage
of the sequence identity, a list of interacting ligands and a Select button for each entry.
For superimposing the binding sites of the related proteins you have to decide which ones you
are going to include. You can do this by clicking on the Select button at the very left
of each list entry. The more PDB files you analyse, the longer it takes for Relibase to display
the result. So be sure that you have deselected all the chains which are redundant or which you
are not interested in. In our example we have more than 30 entries, so let's deselect all
those which are not dehydrogenases and those which are redundant because they are just mutants of
an already selected chain. (That means you deselect all but pdb1bgv-A,
pdb1gtm-A, pdb1bvu-A and pdb1b26-A.)
By clicking on the Superimpose selected chains button you can submit your query to
Relibase. After a few seconds (or a few more, depending on the number of chains to be analysed)
the report is displayed in your browser and RasMol shows the binding site of 1bgv, the
only chain with active protein and ligand buttons in the 3D Visualiser Toolbox.
By activating the Protein button for each desired enzyme you can view the
superimposed binding sites one by one.
Obviously, the binding site doesn't vary fundamentally among the homologous proteins.
Comparing the hitlist from the sequence search with that from the
similar chain search shows that searching for just a rather conserved sequence stretch
containing the active site also finds proteins of similar but not identical function
(e.g. leucine dehydrogenase pdb1leh-A and phenylalanine dehydrogenase pdb1bw9-A).
Because of their low overall sequence identity (24.7% and 22.2% respectively), they would be lost
in the noise in a similar chain search alone. Thus, the sequence search is a useful tool when
looking for similar binding sites in proteins with low overall homology.
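If you copy the chain identifiers from both hitlists, the comparison can be expressed as a set difference. The sketch below uses only the identifiers named in this section, so both lists are deliberately partial and the complete hitlists are not reproduced. The variable names are made up, and reading the quoted identities against the 30.0 threshold chosen earlier is an interpretation, not a statement about how Relibase ranks chains.

```python
# Chains named in the text as hits of the sequence search (partial list).
sequence_hits = {"pdb1bgv-A", "pdb1leh-A", "pdb1bw9-A"}

# Chains kept in the similar chain search of this tutorial (partial list).
chain_hits = {"pdb1bgv-A", "pdb1gtm-A", "pdb1bvu-A", "pdb1b26-A"}

# Chains that only the sequence search brings up.
sequence_only = sorted(sequence_hits - chain_hits)
print(sequence_only)  # ['pdb1bw9-A', 'pdb1leh-A']

# Overall sequence identities quoted in the text, in percent.
overall_identity = {"pdb1leh-A": 24.7, "pdb1bw9-A": 22.2}

# Both values lie below the Minimum Sequence Identity of 30.0 used above.
below_threshold = [name for name in sequence_only
                   if overall_identity[name] < 30.0]
print(below_threshold)  # ['pdb1bw9-A', 'pdb1leh-A']
```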







