ROVER is a tool for determining whether one or more of a group of transcription factors is likely to regulate a group of genes. It was designed for use with promoters from groups of genes that are suspected of being co-regulated, such as those identified in a microarray study. ROVER compares two groups of promoters (a suspected co-regulated group and a non-regulated group) by determining the relative over-abundance of likely binding sites for a particular transcription factor (TF) in one group versus the other. For each TF, ROVER calculates the significance of any over-abundance of binding sites and reports the probability that it occurred by chance. The lower this probability, the stronger the evidence that the TF regulates the group of genes in question. Likely binding sites are found by looking for high-scoring matches to a Position-Specific Scoring Matrix (PSSM), which represents known binding sites for a transcription factor. In addition to the significance of each TF, ROVER reports the subset of sequences likely to be regulated by each TF and the specific significant binding sites.
ROVER is available as a command-line Java program (download below). A web version of ROVER is also available as part of the MotifViz web site. There is also a C++ version, which is no longer maintained.
>YBL002W HTB2
TACCCAATAGCTTGTTCAATTCATCATCATTTCTGATGGCCAATTGTAAATGTCTTGGAATAATTCTGGTTTTTTTGTTATCTCTAGCAGCATTACCAGCCAATTCTAAAATTTCAGCAGCCAAATATTCTAAGACAGCAGTTAGATAGACTGGAGCACCAGAACCAATTCTCTGGGCGTAGTTACCTCTTCTTAGCAATCTGTGCACTCTACCAACTGGGAATGTTAAACCAGCTTTAGCAGATCTAGATTGAGAAGCTTTAGCAGCTGAACCAGCTTTACCACCTTTACCACCGGACATTATATATTAAATTTGCTCTTGTTCTGTACTTTCCTAATTCTTATGTAAAAAGACAAGAATTTATGATACTATTTAATAACAAAAAACTACCTAAGAAAAGCATCATGCAGTCGAAATTGAAATCGAAAAGTAAAACTTTAACGGAACATGTTTGAAATTCTAAGAAAGCATACATCTTCATCCCTTATATATAGAGTTATGTTTGATATTAGTAGTCATGTTGTAATCTCTGGCCTAAGTATACGTAACGAAAATGGTAGCACGTCGCGTTTATGGCCCCCAGGTTAATGTGTTCTCTGAAATTCGCATCACTTTGAGAAATAATGGGAACACCTTACGCGTGAGCTGTGCCCACCGCTTCGCCTAATAAAGCGGTGTTCTCAAAATTTCTCCCCGTTTTCAGGATCACGAGCGCCATCTAGTTCTGGTAAAATCGCGCTTACAAGAACAAAGAAAAGAAACATCGCGTAATGCAACAGTGAGACACTTGCCGTCATATATAAGGTTTTGGATCAGTAACCGTTATTTGAGCATAACACAGGTTTTTAAATATATTATTATATATCATGGTATATGTGTAAAATTTTTTTGCTGACTGGTTTTGTTTATTTATTTAGCTTTTTAAAAATTTTACTTTCTTCTTGTTAATTTTTTCTGATTGCTCTATACTCAAACCAACAACAACTTACTCTACAACTA
>YDR311W TFB1
TCTTTTATATGAAGCGGATTTGAACCAAAACCAGAGCCAACTTGTCGTTTTATATCAGAATCATCACTGACTGGTATGTCTGTGATGGATGGCAAAGCTTTAGCGTTCGCATCTGTATCTAGCTTCCTCAAACTATTAGCTTGATTTTGAGCACTGGTAAGTGCTAACGTATCTACGTCATCTTTGGGTCCAGACGGAAGTCTCTGTTCATTGGTTATGTTATCAGAAGGGGCTGTGGTGTTCTCAGACATCCCCGCAACAAACGAATTTTGTTAATTATGTATGAAACTTTTCGTTTGATCTCAATAATACCACTAGCGACTAAATTTTTATGATACTTAGCTACTTTAAACAAGTCCCTTGTGCTCTGTTTGCTGACACTTTTGATAAAATATGCCTGTGTATAATTCTTTTAGCAGTTTATTTCAAACACAAATGGTATTAAAAGGATAGATGAAAAAAAAAAAAAAAATTAAAGCCACTAGTAATGATACAATCGTGGTATCACAAGCGCTGAATGAAACAAGTGTGGCTATCTATAGCGGATGCAAGTGGAGAACTTGTGAATCCAAACTGAAATATTTTGCCATCATTTGTTGTCCTTTCCCTTTTCCATTCAGGAAAAAAAAAAAAAATTTGACGTCGCCGTCGCGTCGCAGTCATATAATTACAGCAATTTATCTTGTTGAACGACGCAAATTAATGGAAATTGTGACTTACATAGTAAGTATTAGTAAACGTAGTTAAGGCCACGTGGGAAAGATATGAAAGGAGTGTAAGTAATGGATATCGGTCTAACGAAAATGGAAACCAATCTTTAAAAATGATAGTATGATTCGACAGTAAACTAGAAAAGCCACAACCCGTGGGACATGATAAGGCTGCTCGTTTTTGACGCAATTTTTAGACAATACTGAAATTTAGCATAATAAGCTTTCCCAGTGAAAGTAATAATATTTAACCTAGGGTAGGGGTAGGGAAAAAATAAAAGTAAACCATA
>M00713 TBP
0 8 1 4 0 23 3 20 2
8 6 7 0 2 0 0 0 1
3 2 0 1 0 0 0 0 4
12 7 15 18 21 0 20 3 16
>M00728 ROX1
0 0 1 16 0 0 0 0 0
8 9 7 0 0 0 0 0 1
2 5 1 0 0 0 17 0 1
7 3 8 1 17 17 0 17 15
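To illustrate how count matrices like those above can be used to find high-scoring matches, here is a minimal Python sketch. It is not ROVER's implementation. It assumes that each matrix holds four equal-length rows of counts in A, C, G, T order, uses flat base frequencies (as with the -F option), adds a pseudo-count of 0.375 to each cell (the documented default for -C), and scores each window as a simple sum of log-odds values. The function names are hypothetical.

```python
import math

# ROX1 counts copied from the example PSSM file.
PSSM_TEXT = """\
>M00728 ROX1
0 0 1 16 0 0 0 0 0
8 9 7 0 0 0 0 0 1
2 5 1 0 0 0 17 0 1
7 3 8 1 17 17 0 17 15
"""

BASES = "ACGT"  # assumed row order; not confirmed by the ROVER documentation


def parse_pssms(text):
    """Return {name: four rows of counts}, one entry per '>' record."""
    matrices = {}
    for record in text.split(">")[1:]:
        tokens = record.split()
        name = tokens[1]                      # tokens[0] is the accession
        counts = [int(t) for t in tokens[2:]]
        width = len(counts) // 4              # assumes four equal-length rows
        matrices[name] = [counts[i * width:(i + 1) * width] for i in range(4)]
    return matrices


def log_odds(rows, pseudo=0.375, background=0.25):
    """Convert counts to per-column log2 odds against a flat background."""
    scores = []
    for col in range(len(rows[0])):
        total = sum(rows[b][col] for b in range(4)) + 4 * pseudo
        scores.append({
            BASES[b]: math.log2((rows[b][col] + pseudo) / total / background)
            for b in range(4)
        })
    return scores


def best_window(scores, sequence):
    """Return (start, site) of the highest-scoring window (A/C/G/T only)."""
    width = len(scores)
    start = max(
        range(len(sequence) - width + 1),
        key=lambda i: sum(scores[j][sequence[i + j]] for j in range(width)),
    )
    return start, sequence[start:start + width]


rox1 = parse_pssms(PSSM_TEXT)["ROX1"]
print(best_window(log_odds(rox1), "GGGGCCTATTGTTCC"))  # (4, 'CCTATTGTT')
```

Because the parser reads whitespace-separated tokens, it does not depend on how the counts are broken across lines. Turning a window score into a P-value, as ROVER does, is outside the scope of this sketch.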
Usage: java -Xmx250m -jar rover.jar [-C] [-F] [-f] [-h] [-P pvalue] [-p pvalue]
 -C                          Pseudo-counts to add to each PWM cell. Default is 0.375.
 -p                          Cutoff for single site P-value. Default is 0.001.
 -P                          Cutoff for whole sequence P-value. Default is 0.01.
 -B                          File containing Fasta formatted background sequences.
 -F,--flat_base_frequencies  ACGT have equal (or flat) background frequencies.
 -M                          File containing PSSMs.
 -S                          File containing Fasta formatted sequences.
 -f,--filter                 Filter out lower case characters (masked repeats).
 -h,--help                   Print help message.
The argument -Xmx250m tells Java to let ROVER use up to 250 MB of memory (the maximum Java heap size). You can change 250m to another value to suit your system and data set.
The default sequence significance P-value cutoff (-P) is 0.01. This option affects only the output: it sets the cutoff for the overall significance of a sequence, which may come from multiple hits or from a single high-scoring hit.
The default cutoff for individual cis-element significance (-p) is 0.001. This works well for promoters that are each 1000 bases long. We recommend adjusting this cutoff to approximately 1 / promoter length.
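The 1 / promoter length rule of thumb can be sketched in Python as follows. This is only an illustration: the helper names and the file names (pssms.txt, regulated.fasta, background.fasta) are hypothetical, and it assumes that the -M, -S, and -B options each take a file name as their argument and that -p takes the P-value, as the usage text suggests.

```python
def site_cutoff(promoter_length):
    """Single-site P-value cutoff: approximately 1 / promoter length."""
    if promoter_length <= 0:
        raise ValueError("promoter length must be positive")
    return 1.0 / promoter_length


def rover_command(pssm_file, sequence_file, background_file,
                  promoter_length, heap="250m"):
    """Assemble a ROVER command line as an argument list (nothing is run)."""
    cutoff = site_cutoff(promoter_length)
    return [
        "java", f"-Xmx{heap}", "-jar", "rover.jar",
        "-M", pssm_file,          # file containing PSSMs
        "-S", sequence_file,      # Fasta formatted sequences
        "-B", background_file,    # Fasta formatted background sequences
        "-p", f"{cutoff:g}",      # single-site cutoff from the rule of thumb
    ]


if __name__ == "__main__":
    # Hypothetical file names; promoters of length 500 give a cutoff of 0.002.
    print(" ".join(rover_command("pssms.txt", "regulated.fasta",
                                 "background.fasta", 500)))
```

For promoters of length 1000 the same function returns the default cutoff of 0.001.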