Given a set of DNA sequences that share a common function, ab initio cis-element search programs can be used to identify which motifs are the best conserved in the set. Specialties and references to these programs are:
Sequences may be entered in FASTA, raw, or GenBank format. Any non-alphabetic characters in the sequence will be ignored, and any alphabetic characters other than A, C, G, and T (uppercase or lowercase) will be converted to 'n' and excluded from motif matching. If GenBank format is used, your program of choice will read and display any 'CDS' (protein-coding region) annotations. Limits: at most 50 sequences, with a total length of up to 100 kb.
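The character handling described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the code the server runs; the function names are made up for this example, and treating 100 kb as 100,000 bases is an assumption.

```python
MAX_SEQUENCES = 50
MAX_TOTAL_LENGTH = 100_000  # assumption: "100 kb" means 100,000 bases


def clean_sequence(raw: str) -> str:
    """Drop non-alphabetic characters; turn letters other than A/C/G/T into 'n'."""
    cleaned = []
    for ch in raw:
        if not ch.isalpha():
            continue  # digits, whitespace and punctuation are ignored
        cleaned.append(ch if ch in "ACGTacgt" else "n")
    return "".join(cleaned)


def within_limits(sequences) -> bool:
    """Check the stated limits: at most 50 sequences, total length up to 100 kb."""
    total_length = sum(len(seq) for seq in sequences)
    return len(sequences) <= MAX_SEQUENCES and total_length <= MAX_TOTAL_LENGTH


if __name__ == "__main__":
    print(clean_sequence("ACgt 12 NNry-x"))
```

Running a sequence through something like `clean_sequence` before submitting it makes it easy to see which positions will be masked as 'n'.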
Identifiers may be, for example, GenBank accession numbers (e.g. NC_001669), 'accession.version' numbers (e.g. NC_001669.1), or GI numbers (e.g. 9628421).
Generic format ('|' means "or", '[]' means optional):
>[sequence_1_property]
[substring_1.1|from_1.1-to_1.1,motif_1.1_name[,substring_1.1_property]]
...
|
>[sequence_1_property]
<motif_1.1_name[,motif_1.1_property_for_sequence_1]
[substring_1.1.1|from_1.1.1-to_1.1.1[,substring_1.1.1_property]]
...
|
<motif_1_name[,motif_1_property]
>[motif_1_property_for_sequence_1]
[substring_1.1.1|from_1.1.1-to_1.1.1[,substring_1.1.1_property]]
...
Enjoy this mess for now, examples coming soon :)
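In the meantime, here is a rough Python sketch of how the first of the three layouts might be read. It rests on one reading of the generic format: a '>' line starts a sequence, and each following line gives either a literal substring or a from-to range, then a motif name, then an optional property, separated by commas. The function name and the returned dictionary layout are invented for this illustration and are not part of MotifViz.

```python
import re

# A position range such as "12-17" (assumed to be two integers joined by a hyphen).
RANGE_PATTERN = re.compile(r"^(\d+)-(\d+)$")


def parse_first_layout(text):
    """Parse '>' headers followed by 'substring|from-to,motif_name[,property]' lines."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The sequence property after '>' is optional.
            current = {"property": line[1:].strip() or None, "hits": []}
            records.append(current)
            continue
        if current is None:
            raise ValueError("motif line found before any '>' header")
        fields = [field.strip() for field in line.split(",", 2)]
        if len(fields) < 2:
            raise ValueError("expected at least a location and a motif name: " + line)
        hit = {
            "motif": fields[1],
            "property": fields[2] if len(fields) > 2 else None,
        }
        match = RANGE_PATTERN.match(fields[0])
        if match:
            hit["range"] = (int(match.group(1)), int(match.group(2)))
        else:
            hit["substring"] = fields[0]
        current["hits"].append(hit)
    return records


if __name__ == "__main__":
    # Made-up input, only to exercise the parser.
    sample = ">seq1\nTATAAA,TATA\n12-17,GC-box,blue\n>\n5-9,TATA"
    for record in parse_first_layout(sample):
        print(record)
```

The sample input is invented purely to exercise the parser; the motif names and property values in it carry no meaning.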
-a | minimum alignment width (1) |
-b | maximum alignment width (10000) |
-c | cooling factor (1) |
-d | frequency of width-adjusting moves (1) |
-l | ("ell") filter lowercase letters |
-m | use modified Lam schedule (default = geometric schedule) |
-n | end each run after this many iterations without improvement (10000) |
-p | pseudocount weight (1.5) |
-q | pretend residue abundances = 1/4 |
-r | number of alignment runs (10) |
-s | seed for random number generator (1) |
-t | initial temperature (0.9) |
-u | use uniform pseudocounts: each pseudocount = p/4 |
-v | verbose: print suboptimal alignments |
-z | turn off ZOOPS (force every sequence to participate in the alignment) |
-1 | ("one") just examine forward strand (default = both strands) |
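As a small illustration of the -p and -u options, the sketch below computes pseudocounts in Python. The uniform case (each pseudocount = p/4) comes straight from the table above. The non-uniform case, where each pseudocount is the weight scaled by that residue's abundance, is an assumption about the default behaviour and is not confirmed here. The function name is hypothetical.

```python
def pseudocounts(weight=1.5, abundances=None, uniform=False):
    """Return one pseudocount per nucleotide.

    weight     -- pseudocount weight p (the -p option, default 1.5)
    abundances -- residue counts or frequencies keyed by 'a', 'c', 'g', 't'
    uniform    -- mimic -u: every pseudocount is p/4
    """
    bases = "acgt"
    if uniform or abundances is None:
        return {base: weight / 4 for base in bases}
    # ASSUMPTION: without -u, pseudocounts are proportional to residue abundances.
    total = sum(abundances[base] for base in bases)
    return {base: weight * abundances[base] / total for base in bases}


if __name__ == "__main__":
    print(pseudocounts(uniform=True))
    print(pseudocounts(abundances={"a": 3, "c": 2, "g": 2, "t": 3}))
```

Under this assumption the pseudocounts always sum to the weight p, so -u changes only how that total is split among the four bases.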
To save computing time, you can upload a file containing previously saved text output for visualization, after you have entered the corresponding sequence information. Please make sure the text file is complete and unmodified. To obtain such a file, either redirect the command-line program's output to a text file, or save the text output shown at the end of the MotifViz result page.
Please visit the following pages for directions on downloading currently-supported programs:
Internet Explorer 5.2+ is recommended for Mac users.
Form entries | Gene Regulation Hub | Suggestions to: Yutao Fu 07/27/06 14:00 EDT