FAST Alignment and Search Tool


 

Run FAST Online

Thanks to its speed and reliability, the FAST algorithm puts structural alignment just a few clicks away. The FAST web server is easy to use: upload your structures (in PDB format) using the "Browse" buttons, then click "Submit". The structural alignment is delivered to your browser almost instantly; depending on protein size, network traffic, and server load, it may take up to 10 seconds. Following the links on the result page, you can download the alignment result (named align.out) and the rasmol script for visualizing your alignment (named align.ras). The format and use of these files are described below. Result files are in the same format as those generated by running FAST offline.


Run FAST Offline

It is more convenient to run FAST on your own computer. After you have downloaded the FAST executable for your computer, type the command "fast --help" to get started. Below is the help message:


 FAST - FAST Alignment and Search Tool, is a protein structure
 alignment algorithm developed by Jianhua Zhu at the Bioinformatics
 Program of Boston University. Please visit the FAST homepage
 at http://bu.wenglab.org/.
 
 USAGE: $ fast protein1 protein2 [-r rasmol-script]
 
 Description:
 %1              fn      protein #1
 %2              fn      protein #2
 -o|--out        fn      output file name [stdout]
 -r|--rasmol     fn      generate a rasmol script for the alignment
 -p|--path       fn      path to protein structures
 -h|--help               display this message
 
 To visualize alignment under unix/linux:
         $ rasmol -script rasmol-script
 
 Email zhiping.weng@umassmed.edu for bug reports.

 

FAST takes two PDB files as input (for example, d1bhga1 and d1cs6a3). FAST reads only the ATOM lines of CA (alpha-carbon) atoms and ignores chain information, so you may want to isolate the segment or chain of interest before running FAST. For example, the following command extracts the CA lines of chain A from a PDB file:

$ grep '^ATOM.*CA.* A ' your.pdb
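The grep pattern above is a loose text match on the whole line. A stricter, column-based filter can be sketched in Python, assuming standard fixed-column PDB records (record name in columns 1-6, atom name in columns 13-16, chain identifier in column 22). The function name extract_ca_lines is hypothetical and is not part of FAST.

```python
def extract_ca_lines(pdb_lines, chain_id="A"):
    """Return the ATOM records of CA atoms that belong to one chain.

    Assumes standard fixed-column PDB formatting; this helper is an
    illustration only and is not shipped with FAST.
    """
    selected = []
    for line in pdb_lines:
        is_atom = line[0:6] == "ATOM  "          # columns 1-6: record name
        is_ca = line[12:16].strip() == "CA"      # columns 13-16: atom name
        in_chain = line[21:22] == chain_id       # column 22: chain identifier
        if is_atom and is_ca and in_chain:
            selected.append(line.rstrip("\n"))
    return selected
```

A typical use would be to read your.pdb with open("your.pdb") and write the returned lines to a new file that is then passed to FAST.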

If the PDB files are in another directory, you may specify the directory using the option "-p <path-to-pdb>". The alignment is sent to the standard output unless you specify an output file using "-o <output-file-name>". Use the option "-r <rasmol-script-name>" to generate a rasmol script for visualizing the alignment (see below). FAST uses a single set of thoroughly tested parameters, so no case-by-case adjustment is necessary. Below is the command line that the web server runs in the backend:

$ fast protein1 protein2 -o align.out -r align.ras
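When FAST is driven from a script, the same command line can be assembled programmatically. The following is a minimal sketch that uses only the options listed in the help message (-o, -r, -p); the function name build_fast_command is hypothetical, and the executable is assumed to be installed as "fast" on your PATH.

```python
def build_fast_command(protein1, protein2, out=None, rasmol=None, path=None):
    """Build an argument list for the fast executable.

    Only options shown in the help message are used. The executable
    name "fast" is an assumption about your installation.
    """
    cmd = ["fast", protein1, protein2]
    if out is not None:
        cmd += ["-o", out]        # output file name (default: stdout)
    if rasmol is not None:
        cmd += ["-r", rasmol]     # rasmol script for the alignment
    if path is not None:
        cmd += ["-p", path]       # directory containing the PDB files
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the Python standard library to launch the alignment.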


Interpreting the Alignment

Let's take the alignment of d1bhga1 and d1cs6a3 (both are taken from SCOP version 1.61) as an example:

FAST ALIGNMENT: d1bhga1 d1cs6a3
L=73 SX=5.649e+02 SN=5.835e+00 L1=103 L2=91 RMSD=3.295

 1: ----TYIDDIT---VTTSVEQDS-GLVNYQISVKGSNLFKLEVRLLDAENKVVAN--GTG
 2: RQYAPSIKAKFPADTYALT----GQMVTLECFAFGNPVPQIKWRKLDGSQTSK--WLSSE

 1: TQGQLKVPGVSLWWPYLMHERPAYL-------YSLEVQLTAQTSLGP-VSDFYTLPVGIR
 2: PLLHIQ-------------------NVDFEDEGTYECEAENI-----KGRDTYQGRIIIH

 1: T*
 2: A*

  226  THR T  P PRO   213 
  227  TYR Y  S SER   214 
  228  ILE I  I ILE   215 
  229  ASP D  K LYS   216 
  230  ASP D  A ALA   217 
  231  ILE I  K LYS   218 
  232  THR T  F PHE   219 
  233  VAL V  T THR   223 
  234  THR T  Y TYR   224 
  235  THR T  A ALA   225 
  236  SER S  L LEU   226 
  237  VAL V  T THR   227 
  242  GLY G  Q GLN   229 
  243  LEU L  M MET   230 
  244  VAL V  V VAL   231 
  245  ASN N  T THR   232 
  246  TYR Y  L LEU   233 
  247  GLN Q  E GLU   234 
  248  ILE I  C CYS   235 
  249  SER S  F PHE   236 
  250  VAL V  A ALA   237 
  251  LYS K  F PHE   238 
  252  GLY G  G GLY   239 
  253  SER S  N ASN   240 
  254  ASN N  P PRO   241 
  255  LEU L  V VAL   242 
  256  PHE F  P PRO   243 
  257  LYS K  Q GLN   244 
  258  LEU L  I ILE   245 
  259  GLU E  K LYS   246 
  260  VAL V  W TRP   247 
  261  ARG R  R ARG   248 
  262  LEU L  K LYS   249 
  263  LEU L  L LEU   250 
  264  ASP D  D ASP   251 
  265  ALA A  G GLY   252 
  266  GLU E  S SER   253 
  267  ASN N  Q GLN   254 
  268  LYS K  T THR   255 
  269  VAL V  S SER   256 
  270  VAL V  K LYS   257 
  273  GLY G  S SER   260 
  274  THR T  S SER   261 
  275  GLY G  E GLU   262 
  276  THR T  P PRO   263 
  277  GLN Q  L LEU   264 
  278  GLY G  L LEU   265 
  279  GLN Q  H HIS   266 
  280  LEU L  I ILE   267 
  281  LYS K  Q GLN   268 
  301  TYR Y  G GLY   276 
  302  SER S  T THR   277 
  303  LEU L  Y TYR   278 
  304  GLU E  E GLU   279 
  305  VAL V  C CYS   280 
  306  GLN Q  E GLU   281 
  307  LEU L  A ALA   282 
  308  THR T  E GLU   283 
  309  ALA A  N ASN   284 
  310  GLN Q  I ILE   285 
  316  VAL V  G GLY   287 
  317  SER S  R ARG   288 
  318  ASP D  D ASP   289 
  319  PHE F  T THR   290 
  320  TYR Y  Y TYR   291 
  321  THR T  Q GLN   292 
  322  LEU L  G GLY   293 
  323  PRO P  R ARG   294 
  324  VAL V  I ILE   295 
  325  GLY G  I ILE   296 
  326  ILE I  I ILE   297 
  327  ARG R  H HIS   298 
  328  THR T  A ALA   299 

 

The output file contains three components from top to bottom:

1. A header line beginning with "FAST ALIGNMENT:" that names the pair of protein structures aligned. The next line contains a basic description of the alignment. Each field is in the format "name=value", as explained in the following table:

Field   Meaning
L       Length, or the number of residue pairs aligned
SX      The raw score, or total similarity
SN      Normalized score
L1      Number of residues in protein #1
L2      Number of residues in protein #2
RMSD    Root-mean-square distance after superposition

Always use the normalized score to measure the significance of an alignment. Usually, SN greater than 1.5 indicates significant structural similarity. The raw score depends heavily on the sizes of the protein structures. As a result, larger proteins tend to produce bigger raw scores by chance.

2. The next block displays the structure alignment in a sequence alignment format. Lines beginning with "1:" are for protein #1 and "2:" for protein #2. The alignment ends with a pair of *'s.

3. The last block contains lines of aligned residues, one line per pair. For instance, the first line in our example means that "226 THR" in d1bhga1 is paired to "213 PRO" in d1cs6a3.
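The three components described above can be read back with a short parser. This is a minimal sketch that assumes the layout of the example output shown earlier; the function names parse_fast_output and is_significant are hypothetical, and the 1.5 cutoff is the rule of thumb quoted above rather than a hard limit. Residue numbers are kept as strings so that no assumption is made about their form.

```python
def parse_fast_output(text):
    """Parse FAST alignment text into names, fields, rows, and pairs.

    Assumes the layout of the example output in this document; this
    helper is an illustration and is not part of FAST.
    """
    result = {"proteins": None, "fields": {},
              "rows": {"1": "", "2": ""}, "pairs": []}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("FAST ALIGNMENT:"):
            result["proteins"] = line.split(":", 1)[1].split()
        elif line[:2] in ("1:", "2:"):
            # Sequence-style rows; the final row ends with '*'.
            result["rows"][line[0]] += line[2:].strip().rstrip("*")
        elif "=" in line and not result["fields"]:
            # Description line such as "L=73 SX=5.649e+02 ...".
            for item in line.split():
                name, value = item.split("=", 1)
                result["fields"][name] = float(value)
        else:
            tokens = line.split()
            if len(tokens) == 6:
                # e.g. "226 THR T P PRO 213"
                num1, res1, _, _, res2, num2 = tokens
                result["pairs"].append((num1, res1, num2, res2))
    return result


def is_significant(fields, threshold=1.5):
    """Apply the SN rule of thumb from the text (threshold is adjustable)."""
    return fields["SN"] > threshold
```

For a complete output file, the number of parsed pairs can be compared with the L field as a sanity check; in the example above both are 73.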


Visualize the Alignment

To visualize aligned structures, generate a rasmol script using the option "-r <rasmol-script-name>". When running FAST through the web server, just click the link to save the script. Rasmol is a powerful program for displaying molecular structures, and it is freely available.

Once you have installed Rasmol and have a rasmol script for your alignment, say align.ras, simply type "rasmol -script align.ras" to view it. The first protein is colored red and the second cyan. The aligned backbone segments are thickened. On Windows systems, the rasmol program is named "rw32b2a.exe". To invoke rasmol with the -script option there, type "rw32b2a -script your.script" either in a DOS window or via Start|Run.
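If the viewer is launched from a script, the platform-dependent program name can be chosen automatically. This is a small sketch using the two program names mentioned above; the function name rasmol_command is hypothetical, and the executable name on your system may differ depending on the Rasmol version installed.

```python
import sys


def rasmol_command(script, platform=sys.platform):
    """Return the argument list that opens a rasmol script.

    Uses "rw32b2a" on Windows and "rasmol" elsewhere, as named in this
    document; other installations may use a different executable name.
    """
    exe = "rw32b2a" if platform.startswith("win") else "rasmol"
    return [exe, "-script", script]
```

The returned list can be handed to subprocess.run from the Python standard library to open the viewer.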