TRANSFAC Motif Format

Description

The TRANScription FACtor database (TRANSFAC) is a manually curated set of motifs managed by the company BIOBASE. The most up-to-date version of the TRANSFAC database must be licensed, but older versions are available free of charge to non-commercial users.

Most MEME Suite programs cannot read the TRANSFAC 'matrix.dat' motif file directly, but the transfac2meme script converts it into MEME format motifs.

Format Specification

For MEME Suite purposes, the essential portions of the TRANSFAC motif format are:

ID motif name
BF species name
P0 letter_1 ... letter_n
01 count_1,1 ... count_1,n consensus letter
02 count_2,1 ... count_2,n consensus letter
... more rows here ...
nn count_nn,1 ... count_nn,n consensus letter
XX
//

The P0 row labels the columns with the letters of the sequence alphabet. The numbered rows (01...nn) each contain one count per letter of the sequence alphabet for that position in the motif. Each count should give the number of times that letter appears at the given position in known examples of the motif, so the counts in every numbered row should add up to the same total. The last column in each count row gives the consensus letter for that position in the motif, and is ignored by MEME Suite programs.
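
The fields described above can be read with a small parser. The following Python code is a minimal sketch, not the code used by any MEME Suite program. The function name parse_transfac and the dictionary keys are hypothetical, and the sketch assumes that every record starts with an ID line and that fields are separated by whitespace. It ignores all other TRANSFAC line types as well as the trailing consensus letter.

```python
def parse_transfac(text):
    """Parse the essential TRANSFAC fields into a list of motif dicts.

    Hypothetical helper: only ID, BF, P0, numbered rows and // are handled.
    """
    motifs = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "XX":
            continue  # blank lines and XX separators carry no data
        if line == "//":
            # End of the current record.
            if current is not None:
                motifs.append(current)
                current = None
            continue
        parts = line.split(None, 1)
        tag = parts[0]
        rest = parts[1] if len(parts) > 1 else ""
        if tag == "ID":
            current = {"id": rest, "species": None, "alphabet": [], "counts": []}
        elif current is None:
            continue  # assumption: data before the first ID line is skipped
        elif tag == "BF":
            current["species"] = rest
        elif tag == "P0":
            current["alphabet"] = rest.split()
        elif tag.isdigit():
            # Keep one count per alphabet letter; the consensus letter
            # in the last column is dropped.
            n = len(current["alphabet"])
            current["counts"].append([float(x) for x in rest.split()[:n]])
    if current is not None:
        motifs.append(current)  # tolerate a missing final //
    return motifs
```

Counts are stored as floats because the sketch makes no assumption that they are whole numbers.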

Here is an example of a TRANSFAC format motif file containing two motifs suitable for MEME Suite programs:

ID any_old_name_for_motif_1
BF species_name_for_motif_1
P0      A      C      G      T
01      1      2      2      0      S
02      2      1      2      0      R
03      3      0      1      1      A
04      0      5      0      0      C
05      5      0      0      0      A
06      0      0      4      1      G
07      0      1      4      0      G
08      0      0      0      5      T
09      0      0      5      0      G
10      0      1      2      2      K
11      0      2      0      3      Y
12      1      0      3      1      G
XX
//
ID any_old_name_for_motif_2
BF species_name_for_motif_2
P0      A      C      G      T
01      2      1      2      0      R
02      1      2      2      0      S
03      0      5      0      0      C
04      3      0      1      1      A
05      0      0      4      1      G
06      5      0      0      0      A
07      0      1      4      0      G
08      0      0      5      0      G
09      0      0      0      5      T
10      0      2      0      3      Y
11      0      1      2      2      K
12      1      0      3      1      G
XX
//
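
The requirement that every count row adds up to the same total can be checked directly on the first motif of the example above. The Python code below is a sketch with hypothetical function names. Its to_frequencies function is a plain per-row normalization without pseudocounts or background frequencies, so it is not necessarily what the conversion script computes.

```python
# Count rows of the first example motif, columns in the order A C G T.
motif_1 = [
    [1, 2, 2, 0],
    [2, 1, 2, 0],
    [3, 0, 1, 1],
    [0, 5, 0, 0],
    [5, 0, 0, 0],
    [0, 0, 4, 1],
    [0, 1, 4, 0],
    [0, 0, 0, 5],
    [0, 0, 5, 0],
    [0, 1, 2, 2],
    [0, 2, 0, 3],
    [1, 0, 3, 1],
]


def row_totals(counts):
    """Return the sum of the counts in each row."""
    return [sum(row) for row in counts]


def has_consistent_totals(counts, tol=1e-6):
    """True if every row adds up to the same total (within tol)."""
    totals = row_totals(counts)
    return all(abs(t - totals[0]) <= tol for t in totals)


def to_frequencies(counts):
    """Divide each count by its row total (no pseudocounts applied)."""
    freqs = []
    for row in counts:
        total = sum(row)
        if total == 0:
            raise ValueError("a count row must not sum to zero")
        freqs.append([c / total for c in row])
    return freqs


print(row_totals(motif_1))             # every row of the example sums to 5
print(has_consistent_totals(motif_1))  # True
```

Running the same check on the second motif only requires replacing the list of rows.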
      

See Also

The TRANSFAC motif format can be converted to MEME format with the transfac2meme script.