Nucleotide Count Matrices for a Selection of Cis-elements

Each matrix contains counts of adenines, cytosines, guanines, and thymines observed at each position in a sample of cis-elements of one type.
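
As an illustration of how such a matrix can be handled in code, the sketch below stores one matrix as a list of rows and derives per-position frequencies and a consensus string. It assumes that each row is one position and that the four counts are in A, C, G, T order, as the column headers suggest. The values are those of the Myc matrix in this list; the names MYC_COUNTS, position_frequencies and consensus are hypothetical and chosen only for this example.

```python
# Minimal sketch: one count matrix as a list of rows.
# Assumption: each row is one motif position; counts are in A, C, G, T order.
BASES = "ACGT"

# Counts from the Myc matrix in this list (10 positions, 14 sites).
MYC_COUNTS = [
    (2, 4, 8, 0),
    (3, 7, 3, 1),
    (0, 14, 0, 0),
    (14, 0, 0, 0),
    (0, 10, 0, 4),
    (0, 0, 14, 0),
    (0, 2, 0, 12),
    (0, 0, 14, 0),
    (2, 6, 4, 2),
    (3, 7, 2, 2),
]


def position_frequencies(counts):
    """Convert each row of counts to relative frequencies that sum to 1."""
    freqs = []
    for row in counts:
        total = sum(row)
        freqs.append(tuple(c / total for c in row))
    return freqs


def consensus(counts):
    """Most frequent base at each position (ties go to the first in ACGT order)."""
    return "".join(BASES[row.index(max(row))] for row in counts)


if __name__ == "__main__":
    print(consensus(MYC_COUNTS))
    for freq_row in position_frequencies(MYC_COUNTS):
        print(" ".join(f"{f:.2f}" for f in freq_row))
```

The consensus of the Myc matrix contains the CACGTG core. Frequencies rather than raw counts are the usual starting point when matrices built from different numbers of sites need to be compared.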

With current data, it is not possible to construct accurate matrices for each of the thousands of human transcription factors, or for the tens of thousands of dimers. Fortunately, transcription factors naturally belong to families that possess similar, though generally not identical, DNA binding properties. These matrices therefore approximate the DNA binding preferences of selected families of transcription factors. This list will grow in the future, and it will be necessary to accommodate factors that bind to motifs of varying length or half-site organization.

TATA box

  A    C    G    T
 61  145  152   31
 16   46   18  309
352    0    2   35
  3   10    2  374
354    0    5   30
268    0    0  121
360    3   20    6
222    2   44  121
155   44  157   33
 56  135  150   48
 83  147  128   31
 82  127  128   52
 82  118  128   61
 68  107  139   75
 77  101  140   71

Source: Bucher P (1990) J Mol Biol 212, 563-578, Table 3
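
One common use of a count matrix is to score candidate sites in a sequence. The sketch below is only an illustration under stated assumptions and is not necessarily how any tool associated with this list works: it converts the TATA box counts into log-odds scores using a pseudocount of 1 and a uniform background of 0.25 per base (both assumptions), then slides the matrix along a sequence. The names log_odds_matrix and best_window are hypothetical.

```python
import math

BASES = "ACGT"

# TATA box counts (A, C, G, T per position), as listed above.
TATA_COUNTS = [
    (61, 145, 152, 31),
    (16, 46, 18, 309),
    (352, 0, 2, 35),
    (3, 10, 2, 374),
    (354, 0, 5, 30),
    (268, 0, 0, 121),
    (360, 3, 20, 6),
    (222, 2, 44, 121),
    (155, 44, 157, 33),
    (56, 135, 150, 48),
    (83, 147, 128, 31),
    (82, 127, 128, 52),
    (82, 118, 128, 61),
    (68, 107, 139, 75),
    (77, 101, 140, 71),
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Log2 odds of each base versus a uniform background.

    The pseudocount (an assumption of this sketch) keeps zero counts
    from producing log(0).
    """
    matrix = []
    for row in counts:
        total = sum(row) + 4 * pseudocount
        matrix.append(
            [math.log2(((c + pseudocount) / total) / background) for c in row]
        )
    return matrix


def best_window(sequence, matrix):
    """Return (score, offset) of the highest-scoring window, or None.

    The sequence must contain only A, C, G, T in upper case;
    any other character raises ValueError.
    """
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][BASES.index(b)] for i, b in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best


if __name__ == "__main__":
    # Made-up test sequence with the matrix consensus planted at offset 5.
    seq = "CCGCC" + "GTATAAAAGGCGGGG" + "CCGCC"
    score, offset = best_window(seq, log_odds_matrix(TATA_COUNTS))
    print(offset, round(score, 2))
```

Only the forward strand is scanned here; a fuller treatment would also score the reverse complement and use a background that reflects the base composition of the sequences being searched.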

CCAAT box

  A    C    G    T
 52   47   41   22
 22   51   38   52
 16   67   40   41
 80   19   60    5
 68    7   78   11
  0  164    0    0
  2  161    1    0
160    0    1    3
164    0    0    0
  0    1    0  163
 20   88   50    6
 96   10   55    3
 21   28   98   17
 58   57   33   16
 47   11   72   34
 34   53   40   35

Source: Mantovani R (1999) Gene 239(1), 15-27

Sp1

 A   C    G   T
32  21   35  20
24  20   56   8
14  10   65  19
17   1   89   1
 0   0  108   0
 0   2  106   0
19  80    0   9
 2   5   99   2
 0   1   99   8
21   5   76   6
17  10   72   9
 3  55   21  29
 9  40   32  27

Source: TRANSFAC 5.0 accession # M00196

AP-1 (Activator Protein 1)

 A   C   G   T
14  12  23   7
18  17  18   3
 0   0   0  56
 0   0  56   0
55   1   0   0
 3  44   5   4
 4   5   7  40
13  30   5   8
38  10   5   3
11  13  18  14
11  15  10  20

Source: TRANSFAC 5.0 accession # M00174

CRE (cAMP Response Element)

 A   C   G   T
 2   6   7   5
 1   7  11   1
 0   0   0  20
 0   0  20   0
20   0   0   0
 0  20   0   0
 1   0  19   0
 1   5   0  14
10   8   0   2
14   1   4   1
 2   6   6   6
 3   9   6   2

Source: TRANSFAC 5.0 accession # M00178

Ets

ACGT
73105
155125
217181
29730
00390
00390
39000
33006
102261
68618
71132

Source: Mimeault M (2000) Crit Rev Oncog 11(3-4), 227-53

ERE (Estrogen Response Element)

Matrix updated 1.29.03

 A   C   G   T
16   3   4   2
 2   0  23   0
 5   2  18   0
 2   0   3  20
 0  21   4   0
22   0   3   0
 0  14   7   4
 3  16   3   3
 6   7   5   7
 3   0   2  20
 1   0  24   0
19   3   3   0
 0  25   0   0
 1  20   2   2

Source: O'Lone R, Frith MC, Hansen U (in preparation)

GATA

ACGT
1511110
10111215
112593
252219
00480
48000
00048
48000
271164
118245
1281810
1513164
8131412

Source: TRANSFAC 5.0 accession # M00128

Myc

Matrix corrected 11.18.02

 A   C   G   T
 2   4   8   0
 3   7   3   1
 0  14   0   0
14   0   0   0
 0  10   0   4
 0   0  14   0
 0   2   0  12
 0   0  14   0
 2   6   4   2
 3   7   2   2

Source: Grandori C & Eisenman RN (1997) TIBS 22 177-181, Table 1

Myf (Myogenin / MyoD family)

  A    C    G    T
3.5    4  0.5    0
4.5    0  3.5    0
  2    1    5    0
  0  7.5  0.5    0
  8    0    0    0
3.5    0  4.5    0
  0  7.5  0.5    0
  3    0    0    5
  0    0    8    0
  0    5    3    0
  3    0    0    5
  0    0    8    0

Source: Wasserman WW, Fickett JW (1998) J Mol Biol 278, 167-81

E2F

ACGT
44433
42336
02439
123210
04410
04500
00450
032130
113265
241155
261810
245124

Source: Kel AE, Kel-Margoulis OV et al. (2001) J Mol Biol 309(1), 99-120

NF1

ACGT
17181426
17241420
511158
00174
00750
10740
56721
3013819
23202012
16163211
20191818
3471915
10172721
1041915
11401212
3018918
27122115
22142019

Source: TRANSFAC 5.0 accession # M00193

LSF

Matrix corrected 12.09.02 - thanks to Vivek Ramaswamy. (The web tools used the correct matrix all along.)

ACGT
50113
01720
20017
00172
01180
2368
12115
4627
23104
6643
5194
01720
16210
5185
05140

Source: Frith MC, Hansen U, Weng Z (2001) Bioinformatics 17(10), 878-889

Mef-2 (Myocyte Enhancer Factor 2)

ACGT
504.51.5
01100
0614
00011
11000
00011
3008
10010
1.5009.5
2009
11000
5051

Source: Wasserman WW, Fickett JW (1998) J Mol Biol 278, 167-81

SRF (Serum Response Factor)

   A     C     G     T
 3.5   2.5   3.5     1
 4.5     1     2     3
   0  10.5     0     0
   0   8.5     0     2
   9     0     0   1.5
 4.5     0     0     6
 7.5   0.5     2   0.5
   4     0     1   5.5
10.5     0     0     0
   7     1     0   2.5
   0     0  10.5     0
   0     0  10.5     0
 3.5     4     3     0

Source: Wasserman WW, Fickett JW (1998) J Mol Biol 278, 167-81

Tef (Transcription Enhancer Factor)

  A    C    G    T
0.5    3  0.5    2
4.5    0  1.5    0
  0    6    0    0
  6    0    0    0
  0    0    0    6
  0    0    0    6
  0    6    0    0
  0  5.5    0  0.5
2.5    0    0  3.5
0.5  3.5    2    0
  1    2  1.5  1.5
  0    1    4    1

Source: Wasserman WW, Fickett JW (1998) J Mol Biol 278, 167-81