iupac2meme [options] <iupac_motifs>+
Convert an IUPAC motif into MEME motif format suitable for use with MEME Suite programs.
The program accepts 1 or more IUPAC motifs.
An IUPAC motif represents frequencies using either an exact letter, meaning that letter occurs at all sites, or an ambiguous letter, representing an equal frequency of all the letters represented by that letter. This program additionally supports regular expression bracket expressions (character classes), in which multiple letters are grouped into a single position using square brackets. Negated character classes are also supported and are indicated by a caret ('^') immediately following the opening bracket.
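The expansion described above can be illustrated with a minimal Python sketch for the DNA alphabet. This is not the program's actual implementation: the function name `expand_motif` and the table `IUPAC_DNA` are hypothetical, and the table simply lists the standard IUPAC nucleotide codes.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
# (Illustrative table; iupac2meme reads its codes from the chosen alphabet.)
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_motif(motif, codes=IUPAC_DNA):
    """Return, for each motif position, a sorted string of allowed bases."""
    # Core letters are the codes that stand only for themselves.
    core = {c for c, v in codes.items() if v == c}
    positions = []
    i = 0
    while i < len(motif):
        if motif[i] == "[":
            end = motif.index("]", i)  # raises ValueError if the bracket is unclosed
            body = motif[i + 1:end]
            negated = body.startswith("^")
            if negated:
                body = body[1:]
            allowed = set()
            for letter in body:
                allowed.update(codes[letter])
            if negated:
                # A negated class allows every core letter not listed.
                allowed = core - allowed
            i = end + 1
        else:
            allowed = set(codes[motif[i]])
            i += 1
        positions.append("".join(sorted(allowed)))
    return positions


if __name__ == "__main__":
    print(expand_motif("ACGGWN[ACGT]YCGT"))
```

Each entry of the returned list names the letters that share that position's frequency equally. An exact letter yields a single base, while `W`, `N`, and a bracket expression yield several.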
A background frequency file modifies the assumption of equal probability of all alternative letters.
A probability matrix and optionally a log-odds matrix are output for each motif provided on the command line. The probability matrix is computed using pseudo-counts consisting of the background frequency (see -bg, below) multiplied by the total pseudocounts (see -pseudo, below). The log-odds matrix uses the background frequencies in the denominator and is log base 2.
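As a rough illustration of the computation described above, the following Python sketch derives the probabilities and log-odds scores for one motif position. It rests on assumptions that are not confirmed by this page: the helper names `position_probs` and `log_odds` are hypothetical, the site counts are split equally among the allowed letters, and the result is normalized by the number of sites plus the total pseudocounts. The exact arithmetic and any score scaling used by the program may differ.

```python
import math


def position_probs(allowed, alphabet="ACGT", bg=None, numseqs=20, pseudo=0.0):
    """Probabilities for one position whose allowed letters are `allowed`.

    Assumed formula: (count + pseudo * bg[a]) / (numseqs + pseudo),
    where the numseqs sites are split equally among the allowed letters.
    """
    if bg is None:
        # Uniform background, matching the documented default for -bg.
        bg = {a: 1.0 / len(alphabet) for a in alphabet}
    share = numseqs / len(allowed)
    probs = {}
    for a in alphabet:
        count = share if a in allowed else 0.0
        probs[a] = (count + pseudo * bg[a]) / (numseqs + pseudo)
    return probs


def log_odds(probs, bg):
    """Base-2 log-odds with the background frequency in the denominator."""
    return {
        a: math.log2(p / bg[a]) if p > 0 else float("-inf")
        for a, p in probs.items()
    }


if __name__ == "__main__":
    uniform = {a: 0.25 for a in "ACGT"}
    w = position_probs("AT", pseudo=1.0)  # the ambiguous letter W
    print(w)
    print(log_odds(w, uniform))
```

With no pseudocount, letters that are not allowed have probability zero and therefore a log-odds score of negative infinity. A positive pseudocount keeps every score finite.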
DNA IUPAC motif:
ACGGWN[ACGT]YCGT
protein IUPAC motif:
IKLVB[^ILVM]ZYXXHG
Writes MEME motif format to standard output.
Option | Parameter | Description | Default Behaviour |
---|---|---|---|
General Options | | | |
-dna | | Use the DNA alphabet. Note that this is actually the default. | The DNA alphabet is used. |
-protein | | Use the protein alphabet. | The DNA alphabet is used. |
-alph | alphabet file | Use the alphabet defined in the file. | The DNA alphabet is used. |
-numseqs | count | Assume frequencies are based on count sequence sites. | The motif is created as if it was made from 20 sequence sites. |
-bg | background file | The background file should be a Markov background model. It contains the background frequencies of letters used for assigning pseudocounts. The background frequencies will be included in the resulting MEME file. | Uses uniform background frequencies. |
-pseudo | total pseudocounts | Add total pseudocounts times letter background to each frequency. | No pseudocount is added. |
-logodds | | Include a log-odds matrix in the output. This is not required for versions of the MEME Suite ≥ 4.7.0. | The log-odds matrix is not included in the output. |
-url | website | The provided website URL will be stored with the motif, and MEME Suite programs can use it to provide a direct link to that information in their output. If website contains the keyword MOTIF_NAME, the IUPAC code is substituted in place of MOTIF_NAME in the output. For example, if the URL is http://big-box-of-motifs.com/motifs/MOTIF_NAME.html and the IUPAC code is ATGATG, the motif will contain a link to http://big-box-of-motifs.com/motifs/ATGATG.html. | The output does not include a URL with the motifs. |
-nosort | | Don't change the order of the motifs. | Sort the motifs alphabetically. |
-named | | The program will expect the name of each motif to follow its regular expression. | The motifs will be named based on the regular expression used to create them. |
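
The MOTIF_NAME substitution described for -url amounts to a plain keyword replacement. A minimal Python sketch, with the hypothetical helper name `motif_url` and the URL taken from the example in the table:

```python
def motif_url(template, iupac):
    """Substitute the IUPAC code for the MOTIF_NAME keyword in a URL template."""
    return template.replace("MOTIF_NAME", iupac)


if __name__ == "__main__":
    print(motif_url("http://big-box-of-motifs.com/motifs/MOTIF_NAME.html", "ATGATG"))
```

This sketch returns a template without the keyword unchanged. How the program handles IUPAC codes containing characters that are not URL-safe, such as square brackets, is not covered here.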