sites2meme

Usage:

sites2meme [options] <directory of files containing sites>

Description

Convert a directory of files containing sites into a MEME motif file suitable for use with MEME Suite programs.

Input

A directory which contains files of sites. A motif will be generated for each file with the expected file extension (default ".txt"). The motif will use the name of the file, minus the file extension, as the identifier.
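The file selection and motif naming described above can be sketched in Python. This is an illustration only, not the tool's own code: the function name find_site_files is hypothetical, and the sketch assumes that only regular files directly inside the directory are considered.

```python
from pathlib import Path


def find_site_files(directory, ext=".txt"):
    """Map motif ID -> path for every file in `directory` ending in `ext`.

    Hypothetical helper: the motif ID is the file name minus the extension,
    and files with any other extension are ignored.
    """
    motifs = {}
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.name.endswith(ext):
            # Strip the extension from the end of the name to get the ID.
            motif_id = path.name[: len(path.name) - len(ext)]
            motifs[motif_id] = path
    return motifs
```

With a directory holding MA0001.txt and notes.md, the function would return a single entry keyed by MA0001.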

Example Sites file
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
AAGGTCA
CGGGTCA
CGGGTCA
CGGGTCA
GGGGTCG

Output

Writes MEME motif format to standard output.

A probability matrix and, optionally, a log-odds matrix are output for each motif. The probability matrix is computed using pseudocounts: each letter's background frequency (see -bg, below) multiplied by the total pseudocounts (see -pseudo, below). The log-odds matrix is the base-2 logarithm of each probability divided by the corresponding background frequency.
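As a rough illustration of the calculation described above, the following Python sketch builds both matrices from the example sites. It is not the tool's implementation: the function name site_matrices is hypothetical, dividing by the number of sites plus the total pseudocounts is an assumption about the normalisation, and any scaling or rounding the tool applies to its output is not reproduced.

```python
import math


def site_matrices(sites, alphabet="ACGT", bg=None, pseudo=0.0):
    """Return (probabilities, log_odds), one dict per motif position.

    Assumes all sites have equal length and use upper-case letters
    from `alphabet`. A uniform background is used when `bg` is None.
    """
    if bg is None:
        bg = {a: 1.0 / len(alphabet) for a in alphabet}
    n = len(sites)
    probs, logodds = [], []
    for pos in range(len(sites[0])):
        counts = {a: 0 for a in alphabet}
        for site in sites:
            counts[site[pos]] += 1
        # Pseudocount for a letter = background frequency * total pseudocounts.
        row = {a: (counts[a] + pseudo * bg[a]) / (n + pseudo) for a in alphabet}
        probs.append(row)
        # Log base 2 of probability over background; -inf where probability is 0.
        logodds.append({
            a: math.log2(row[a] / bg[a]) if row[a] > 0 else float("-inf")
            for a in alphabet
        })
    return probs, logodds


# The 14 sites from the example sites file.
sites = ["AAGGTCA"] * 10 + ["CGGGTCA"] * 3 + ["GGGGTCG"]
probs, logodds = site_matrices(sites, pseudo=1.0)
```

In the example, 10 of the 14 sites start with A, so with no pseudocounts the probability of A at the first position is 10/14. Without pseudocounts, letters that never occur at a position have probability zero and no finite log-odds value, which is one reason to add them.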

Options

Each option is listed with its parameter (if any), a description and its default behaviour.

General Options

-ext <file extension>
    The file extension (with '.') of the sites files. Any files without this extension will be ignored. The file name minus the extension will be used as the motif ID.
    Default: Files with the extension .txt will be used as sites files.

-map <ID mapping file>
    The ID mapping file contains space separated pairs of the motif ID and the motif name with one entry per line.
    Default: Only the motif IDs are used.

-protein
    Expect all inputs to use the protein alphabet and produce protein motifs.
    Default: Expect all inputs to use the DNA alphabet and produce DNA motifs.

-alph <file>
    Expect all inputs to use the alphabet defined in the file and produce motifs of the same alphabet.
    Default: Expect all inputs to use the DNA alphabet and produce DNA motifs.

-bg <background file>
    The background file should be a Markov background model. It contains the background frequencies of letters used for assigning pseudocounts. The background frequencies will be included in the resulting MEME file.
    Default: Uses uniform background frequencies.

-pseudo <total pseudocounts>
    Add total pseudocounts times letter background to each frequency.
    Default: No pseudocount is added.

-logodds
    Include a log-odds matrix in the output. This is not required for versions of the MEME Suite ≥ 4.7.0.
    Default: The log-odds matrix is not included in the output.

-url <website>
    The provided website URL will be stored with the motif and this can be used by MEME Suite programs to provide a direct link to that information in their output. If website contains the keyword MOTIF_NAME, the motif ID is substituted in place of MOTIF_NAME in the output.
    For example, if the URL is
        http://big-box-of-motifs.com/motifs/MOTIF_NAME.html
    and the motif ID is motif_id, the motif will contain a link to
        http://big-box-of-motifs.com/motifs/motif_id.html
    Default: The output does not include a URL with the motifs.
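
Two of the options above involve simple text handling: the ID mapping file and the website URL with its MOTIF_NAME keyword. The Python sketch below illustrates both. The function names parse_id_map and motif_url are hypothetical, and splitting each mapping line at the first run of whitespace is an assumption about how the pairs are read.

```python
def parse_id_map(lines):
    """Read 'motif_ID motif_name' pairs, one per line, into a dict.

    Assumption: the ID ends at the first whitespace and the rest of the
    line is the name. Lines without both fields are skipped.
    """
    mapping = {}
    for line in lines:
        parts = line.split(None, 1)
        if len(parts) == 2:
            mapping[parts[0]] = parts[1].strip()
    return mapping


def motif_url(website, motif_id):
    """Substitute the motif ID for the MOTIF_NAME keyword in the URL."""
    return website.replace("MOTIF_NAME", motif_id)


link = motif_url("http://big-box-of-motifs.com/motifs/MOTIF_NAME.html", "motif_id")
# link is "http://big-box-of-motifs.com/motifs/motif_id.html"
```

A URL that does not contain MOTIF_NAME is returned unchanged by this sketch.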