AIMotifViz: Ab Initio Motif Search and Visualization


1. Pick a program:
GLAM (Gapless Local Alignment of Multiple sequences)
Visualization only (For GLAM and others)

2. Input query sequences:
Enter DNA sequences or GenBank identifiers:
AND/OR upload from a file:

3. Set GLAM options:
Command-line options:
Excluded alignment (optional):  
Display full sequence details. (Number of bases per line:)
Display consensus sequence logo.

Instructions


Program of choice

Given a set of DNA sequences that share a common function, ab initio cis-element search programs can identify the motifs that are best conserved in the set. The program currently offered is GLAM (Gapless Local Alignment of Multiple sequences).

Sequence Format

Sequences may be entered in FASTA, raw, or GenBank format. Any non-alphabetic characters in a sequence are ignored, and any alphabetic characters other than A, C, G and T (uppercase or lowercase) are converted to 'n' and excluded from motif matching. If GenBank format is used, your program of choice will read and display any 'CDS' (protein-coding region) annotations. Limits: at most 50 sequences, with a total length of up to 100 kb.
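The character handling and size limits described above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the server's actual code: the function names are hypothetical, and "100 kb" is read here as 100,000 bases, which is an assumption.

```python
MAX_SEQUENCES = 50
MAX_TOTAL_LENGTH = 100_000  # assumption: "100 kb" taken as 100,000 bases


def normalize_sequence(raw: str) -> str:
    """Drop non-alphabetic characters; replace letters other than A/C/G/T with 'n'."""
    cleaned = []
    for ch in raw:
        if not ch.isalpha():
            continue  # digits, whitespace and punctuation are ignored
        cleaned.append(ch if ch in "ACGTacgt" else "n")
    return "".join(cleaned)


def check_limits(sequences: list[str]) -> None:
    """Raise ValueError if the input exceeds the documented limits."""
    if len(sequences) > MAX_SEQUENCES:
        raise ValueError(f"too many sequences: {len(sequences)} > {MAX_SEQUENCES}")
    total = sum(len(s) for s in sequences)
    if total > MAX_TOTAL_LENGTH:
        raise ValueError(f"total length {total} exceeds {MAX_TOTAL_LENGTH}")


if __name__ == "__main__":
    seqs = [normalize_sequence("ACGT 12 acgu-RY\n")]
    check_limits(seqs)
    print(seqs[0])
```

Case is preserved for A, C, G and T because the GLAM option list includes a filter for lowercase letters.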

GenBank Identifiers

Accepted identifiers include GenBank accession numbers (e.g. NC_001669), 'accession.version' numbers (e.g. NC_001669.1), and GI numbers (e.g. 9628421).
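If you are preparing a list of identifiers, a quick check like the following can tell the three kinds apart before submission. It is a rough sketch: the pattern for accession numbers (letters, an optional underscore, then digits) is a loose assumption that fits the examples above, not the official GenBank rules, and the function name is hypothetical.

```python
import re

# Assumption: a loose accession shape that fits e.g. NC_001669.
ACCESSION = r"[A-Za-z]+_?[0-9]+"


def classify_identifier(token: str) -> str:
    """Return 'gi', 'accession.version', 'accession', or 'unknown'."""
    token = token.strip()
    if re.fullmatch(r"[0-9]+", token):
        return "gi"
    if re.fullmatch(ACCESSION + r"\.[0-9]+", token):
        return "accession.version"
    if re.fullmatch(ACCESSION, token):
        return "accession"
    return "unknown"


if __name__ == "__main__":
    for example in ("NC_001669", "NC_001669.1", "9628421"):
        print(example, classify_identifier(example))
```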

Motif Feature Format

Generic format ('|' means "or", '[]' means optional):

>[sequence_1_property]
[substring_1.1|from_1.1-to_1.1,motif_1.1_name[,substring_1.1_property]]
...
|
>[sequence_1_property]
<motif_1.1_name[,motif_1.1_property_for_sequence_1]
[substring_1.1.1|from_1.1.1-to_1.1.1[,substring_1.1.1_property]]
...
|
<motif_1_name[,motif_1_property]
>[motif_1_property_for_sequence_1]
[substring_1.1.1|from_1.1.1-to_1.1.1[,substring_1.1.1_property]]
...

Enjoy this mess for now, examples coming soon :)

Quick match site sequences


IUPAC_symbol_sequence_1[ IUPAC_symbol_sequence_2 ...]

IUPAC_symbol=A/C/G/T/R/Y/S/W/K/M/H/B/V/D/N/X
R=A/G Y=C/T S=C/G W=A/T K=G/T M=A/C B=C/G/T D=A/G/T H=A/C/T V=A/C/G N/X=A/C/G/T
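To see how a site written in IUPAC symbols expands into concrete bases, here is a small Python sketch that turns each symbol into a character class and scans a sequence for matches. The function names are hypothetical, and two choices are assumptions made for illustration: only the forward strand is scanned, and positions are reported 1-based. This is not the matching code used by the server.

```python
import re

# Symbol table taken from the list above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT", "X": "ACGT",
}


def iupac_to_regex(site: str) -> str:
    """Translate an IUPAC site such as 'TATAWR' into a regular expression."""
    parts = []
    for symbol in site.upper():
        bases = IUPAC[symbol]  # KeyError for symbols outside the table
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)


def find_sites(site: str, sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched text) for every match, overlaps included."""
    # A lookahead lets overlapping matches be reported.
    pattern = re.compile(f"(?=({iupac_to_regex(site)}))", re.IGNORECASE)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(sequence)]


if __name__ == "__main__":
    # Several sites are separated by spaces, as in the format line above.
    for site in "TATAWR CCAAT".split():
        print(site, find_sites(site, "ggCCAATccTATAAAgg"))
```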

GLAM options

-a  minimum alignment width (1)
-b  maximum alignment width (10000)
-c  cooling factor (1)
-d  frequency of width-adjusting moves (1)
-l  ("ell") filter lowercase letters
-m  use modified Lam schedule (default = geometric schedule)
-n  end each run after this many iterations without improvement (10000)
-p  pseudocount weight (1.5)
-q  pretend residue abundances = 1/4
-r  number of alignment runs (10)
-s  seed for random number generator (1)
-t  initial temperature (0.9)
-u  use uniform pseudocounts: each pseudocount = p/4
-v  verbose: print suboptimal alignments
-z  turn off ZOOPS (force every sequence to participate in the alignment)
-1  ("one") just examine forward strand (default = both strands)
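As a convenience, the string for the "Command-line options" field can be assembled programmatically. The sketch below is hedged on two points: the split between flags that take a value and flags that are simple switches is inferred from the descriptions above, and the "-a 6" spacing between a flag and its value is an assumption about how GLAM parses its arguments. The function name is hypothetical.

```python
# Inferred from the option descriptions above (assumption).
VALUE_FLAGS = set("abcdnprst")   # flags that carry a number
SWITCH_FLAGS = set("lmquvz1")    # flags that are simply present or absent


def build_glam_options(options: dict[str, object]) -> str:
    """Build an option string such as '-a 6 -b 20 -z' from a flag -> value mapping."""
    parts = []
    for flag, value in options.items():
        if flag in VALUE_FLAGS:
            parts.append(f"-{flag} {value}")
        elif flag in SWITCH_FLAGS:
            if value:  # include a switch only when it is turned on
                parts.append(f"-{flag}")
        else:
            raise ValueError(f"unknown GLAM option: -{flag}")
    return " ".join(parts)


if __name__ == "__main__":
    # Motifs 6 to 20 bases wide, 5 runs, every sequence must participate.
    print(build_glam_options({"a": 6, "b": 20, "r": 5, "z": True, "1": False}))
```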

Visualization only

To save computing time, you can upload a file containing previously saved text output and visualize it directly, after you enter the corresponding sequence information. Please make sure the text file is complete and unmodified. To create it, either redirect the command-line program's output to a text file, or save the text output shown at the end of the MotifViz results page.

Please visit the following pages for directions on downloading currently-supported programs:

  1. GLAM -- http://zlab.bu.edu/glam/

Internet Explorer 5.2+ is recommended for Mac users.

Form entries | Gene Regulation Hub | Suggestions to: Yutao Fu   07/27/06 14:00 EDT