jaspar2meme [options] <file_name | directory_name>
Convert a file of motifs in JASPAR 2014 or 2016 PFM format, or a directory of JASPAR files in one of the three older JASPAR formats (SITES, PFM or CM), into MEME motif format suitable for use with MEME Suite programs.
file_name
The file contains motifs in JASPAR 2014 or 2016 PFM format. In these formats, each motif is preceded by a header line that begins with '>' followed by a unique identifier (e.g., 'MA0001.1'). An optional second identifier can follow the first (e.g., 'SEP4'). JASPAR 2016 PFM format additionally includes the letter at the beginning of each line and square brackets around the line of counts.
directory_name
A directory containing one or more JASPAR motif files in one of the following three formats.
The PFM format describes a motif in terms of a count matrix where the rows correspond to A, C, G and T respectively. The JASPAR count file names are expected to end with the .pfm extension.
The SITES format describes a motif in terms of a multiple alignment of sites, given in modified FASTA format. Only capitalized sequence letters are part of the alignment. The SITES file names are expected to end with the .sites extension.
The CM format describes a motif in terms of a count matrix with each row preceded by the letters A|, C|, G| and T|. The CM count file names are expected to end with the .cm extension.
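The rule that only capitalized letters belong to the alignment can be illustrated with a minimal Python sketch. It is not the tool's own code: the function name sites_to_counts is hypothetical, and the sketch assumes a DNA alphabet and that all sites have the same length once lowercase flanking letters are dropped. The input sequences are the E2F sites from the SITES example on this page.

```python
def sites_to_counts(sequences, alphabet="ACGT"):
    """Build a count matrix (one row per letter) from aligned sites.

    Only uppercase letters are treated as part of the alignment;
    lowercase letters are discarded as flanking sequence.
    """
    aligned = ["".join(ch for ch in seq if ch.isupper()) for seq in sequences]
    widths = {len(site) for site in aligned}
    if len(widths) != 1:
        raise ValueError("aligned sites must all have the same length")
    width = widths.pop()

    counts = {letter: [0] * width for letter in alphabet}
    for site in aligned:
        for pos, ch in enumerate(site):
            if ch in counts:  # letters outside the alphabet are skipped (assumption)
                counts[ch][pos] += 1
    return counts


# E2F sites (MA0024) taken from the SITES example on this page.
sites = [
    "aTTTGGCGC", "TTTGGCGC", "TTTGGCGC", "TTTGGCGC", "TTTCGCGC",
    "TTTCGCGC", "TTTCGCGC", "TTTGCCGC", "TTTCCCGC", "TTTGGCGG",
]
counts = sites_to_counts(sites)
for letter in "ACGT":
    print(letter, counts[letter])
```

The leading lowercase 'a' of the first site is dropped, so all ten sites contribute eight aligned positions each.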
Writes MEME motif format to standard output.
A probability matrix and optionally a log-odds matrix are output for each motif in the file. The probability matrix is computed using pseudocounts: each letter's pseudocount is its background frequency (see -bg, below) multiplied by the total pseudocounts (see -pseudo, below). The log-odds matrix divides each probability by the corresponding background frequency and takes the logarithm to base 2.
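The computation described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's implementation: the function names are hypothetical, each column is normalized by its own count total plus the total pseudocounts, a probability of zero is mapped to negative infinity, and any scaling or rounding the tool may apply to the log-odds values is not modeled.

```python
import math


def counts_to_probabilities(counts, background, pseudo=0.0):
    """Convert per-letter count rows to probabilities.

    Each count receives a pseudocount of pseudo * background[letter];
    each column is divided by its count total plus pseudo (assumed form).
    """
    letters = list(counts)
    width = len(counts[letters[0]])
    probs = {letter: [] for letter in letters}
    for pos in range(width):
        total = sum(counts[letter][pos] for letter in letters)
        for letter in letters:
            value = counts[letter][pos] + pseudo * background[letter]
            probs[letter].append(value / (total + pseudo))
    return probs


def log_odds(probs, background):
    """Base-2 log of probability over background frequency."""
    return {
        letter: [
            math.log2(p / background[letter]) if p > 0 else float("-inf")
            for p in row
        ]
        for letter, row in probs.items()
    }


# First two columns of the RUNX1 (MA0002.1) example, uniform background.
counts = {"A": [10, 12], "C": [2, 2], "G": [3, 1], "T": [11, 11]}
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

probs = counts_to_probabilities(counts, uniform, pseudo=1.0)
scores = log_odds(probs, uniform)
print(probs["A"][0], scores["A"][0])
```

With a total pseudocount of 1 and a uniform background, each count is increased by 0.25 before the column is normalized, so every column still sums to 1.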
Option | Parameter | Description | Default Behaviour |
---|---|---|---|
General Options | | | |
-bundle | | Read motifs in JASPAR 2014 or JASPAR 2016 PFM format from the file named file_name. The lines may be in any order but the MEME matrices will be output with the lines in the standard order (e.g., ACGT for DNA). | Read JASPAR SITES files (.sites) from directory_name. |
-pfm | | Read JASPAR PFM files (.pfm) from directory_name. | Read JASPAR SITES files (.sites) from directory_name. |
-cm | | Read JASPAR CM files (.cm) with line labels A\| etc. from directory directory_name. | Read JASPAR SITES files (.sites) from directory_name. |
-strands | 1\|2 | Specify whether a single strand or both strands were considered to create the motif. | Defaults to reporting that both strands were scanned. |
-numbers | | Use a number based on the position in the input instead of the JASPAR ID as the motif identifier. | The JASPAR ID is used as the motif identifier. |
-bg | background file | The background file should be a Markov background model. It contains the background frequencies of letters used for assigning pseudocounts. The background frequencies will be included in the resulting MEME file. | Uses uniform background frequencies. |
-pseudo | total pseudocounts | Add total pseudocounts times letter background to each frequency. | No pseudocount is added. |
-logodds | | Include a log-odds matrix in the output. This is not required for versions of the MEME Suite ≥ 4.7.0. | The log-odds matrix is not included in the output. |
-url | website | The provided website URL will be stored with the motif and can be used by MEME Suite programs to provide a direct link to that information in their output. If website contains the keyword MOTIF_NAME, the motif name is substituted in place of MOTIF_NAME in the output. For example, if the URL is http://big-box-of-motifs.com/motifs/MOTIF_NAME.html and the motif name is MA0024, the motif will contain a link to http://big-box-of-motifs.com/motifs/MA0024.html. | The output does not include a URL with the motifs. |
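The MOTIF_NAME substitution described for -url amounts to a plain keyword replacement. The short Python sketch below illustrates that behaviour; the function name motif_url is hypothetical, and the sketch assumes that every occurrence of the keyword is replaced, which the description above does not state explicitly.

```python
def motif_url(template, motif_name):
    """Return the URL for one motif by substituting the MOTIF_NAME keyword.

    Assumption: every occurrence of MOTIF_NAME in the template is replaced;
    a template without the keyword is returned unchanged.
    """
    return template.replace("MOTIF_NAME", motif_name)


# Template and motif name taken from the -url description.
template = "http://big-box-of-motifs.com/motifs/MOTIF_NAME.html"
print(motif_url(template, "MA0024"))
```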
JASPAR 2014 PFM format:
>MA0002.1 RUNX1
10 12 4 1 2 2 0 0 0 8 13
2 2 7 1 0 8 0 0 1 2 2
3 1 1 0 23 0 26 26 0 0 4
11 11 14 24 1 16 0 0 25 16 7

JASPAR PFM format (.pfm):
0 3 79 40 66 48 65 11 65 0
94 75 4 3 1 2 5 2 3 3
1 0 3 4 1 0 5 3 28 88
2 19 11 50 29 47 22 81 1 6

JASPAR SITES format (.sites):
>MA0024 E2F 1
aTTTGGCGC
>MA0024 E2F 2
TTTGGCGC
>MA0024 E2F 3
TTTGGCGC
>MA0024 E2F 4
TTTGGCGC
>MA0024 E2F 5
TTTCGCGC
>MA0024 E2F 6
TTTCGCGC
>MA0024 E2F 7
TTTCGCGC
>MA0024 E2F 8
TTTGCCGC
>MA0024 E2F 9
TTTCCCGC
>MA0024 E2F 10
TTTGGCGG

JASPAR CM format (.cm):
A| 0 3 79 40 66 48 65 11 65 0
C| 94 75 4 3 1 2 5 2 3 3
G| 1 0 3 4 1 0 5 3 28 88
T| 2 19 11 50 29 47 22 81 1 6
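The header lines and letter-labeled count rows described on this page can be read with a few lines of Python. The sketch below is an illustration only: the function names are hypothetical, and the bracketed row layout 'A [ 0 3 79 ]' used for JASPAR 2016 is an assumed rendering of the description "the letter at the beginning of each line and square brackets around the line of counts".

```python
import re

# A letter, an optional '|', an optional '[', the counts, an optional ']'.
ROW_PATTERN = re.compile(r"\s*([A-Za-z])\s*\|?\s*\[?([^\]]*)\]?\s*$")


def parse_header(line):
    """Split a '>' header line into (identifier, optional second identifier)."""
    if not line.startswith(">"):
        raise ValueError("header lines must begin with '>'")
    fields = line[1:].split()
    if not fields:
        raise ValueError("header line has no identifier")
    return fields[0], (fields[1] if len(fields) > 1 else None)


def parse_count_row(line):
    """Return (letter, counts) for rows such as 'A| 0 3 79' or 'A [ 0 3 79 ]'."""
    match = ROW_PATTERN.match(line)
    if match is None:
        raise ValueError("unrecognized count row: " + repr(line))
    letter = match.group(1).upper()
    counts = [float(token) for token in match.group(2).split()]
    return letter, counts


print(parse_header(">MA0002.1 RUNX1"))
print(parse_count_row("A| 0 3 79 40 66 48 65 11 65 0"))
print(parse_count_row("A [ 0 3 79 ]"))  # assumed JASPAR 2016 row layout
```

Because each row carries its own letter, rows parsed this way can be stored by letter and emitted in the standard order regardless of their order in the input.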