Usage:

meme-chip [options] [-db <motif database>]* <sequences>

Description

Input

Motif Database (optional but recommended)

File(s) containing MEME formatted motifs. Outputs from MEME and DREME are supported, as well as Minimal MEME Format. You can convert many other motif formats to MEME format using conversion scripts available with the MEME Suite. These database(s) will be used by Tomtom and CentriMo.

Sequences

A set of sequences in FASTA format. Ideally the sequences should all be the same length, between 100 and 500 base-pairs long, and centrally enriched for motifs. The immediate regions around individual ChIP-seq "peaks" from a transcription factor (TF) ChIP-seq experiment are ideal. The suggested 100 base-pair minimum size is based on the typical resolution of ChIP-seq peaks, but it is useful to have more of the surrounding sequence to give CentriMo the power to tell if a motif is centrally enriched. We recommend that you "repeat mask" your sequences, replacing repeat regions with the "N" character.
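
The length guidance above can be checked before running MEME-ChIP. The following is a minimal sketch, not part of the MEME Suite; the function names and the warning wording are assumptions made for illustration, and the 100 and 500 limits are simply the recommended range quoted above.

```python
def read_fasta_lengths(lines):
    """Return {sequence_id: length} for FASTA text given as an iterable of lines."""
    lengths = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Take the first whitespace-delimited token after '>' as the ID.
            parts = line[1:].split()
            current = parts[0] if parts else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def check_lengths(lengths, min_len=100, max_len=500):
    """Return a list of warnings based on the recommended sequence lengths."""
    warnings = []
    if len(set(lengths.values())) > 1:
        warnings.append("sequences are not all the same length")
    for seq_id, n in lengths.items():
        if not min_len <= n <= max_len:
            warnings.append(f"{seq_id}: length {n} outside {min_len}-{max_len}")
    return warnings
```

To check a file, pass an open file handle, for example `check_lengths(read_fasta_lengths(open("peaks.fa")))`, where `peaks.fa` is a hypothetical file name. An empty list means the sequences follow the length recommendations.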

Output

MEME-ChIP runs each program in its analysis in a different folder in the output directory. The main file, meme-chip.html, is an interactive HTML file that contains links to the other output files produced by MEME-ChIP. A tab-separated values (TSV) file suitable for parsing by scripts and viewing in Excel is also created (summary.tsv), as well as an associated file containing all the motifs identified by MEME-ChIP (combined.meme) in MEME Motif Format.

Options

Each option is listed with its parameter (if any), followed by its description and its default behaviour.

General Options

-db <file>
    Use a file containing a database of DNA motifs in MEME format. This database will be used by Tomtom and CentriMo. This option may be used multiple times to pass multiple databases.
    Default: When no databases are provided, Tomtom can't suggest similar motifs and CentriMo is limited to the discovered motifs.

-neg <file>
    MEME-ChIP will look for motifs enriched in the primary sequences relative to this control set of sequences in FASTA format. These sequences will be input as control sequences to DREME and CentriMo, and used as input to psp-gen to create a position-specific prior for use by MEME. When this option is used the primary and control sequences should all be the same length; otherwise CentriMo E-values will be inaccurate. If the primary sequences are ChIP-seq peak regions from a transcription factor ChIP-seq experiment, similar regions from a knockout cell line or organism are a possible choice for control sequences. The control sequences should be prepared in exactly the same way (e.g., repeat-masking) as the primary sequences.
    Default: No control sequences are used for MEME and CentriMo. For DREME, the positive sequences are shuffled to create the control set.

-dna
    Use the standard DNA alphabet. Any motif databases provided must use the DNA alphabet.
    Default: The standard DNA alphabet is used. Motif databases retain their native alphabet.

-rna
    Use the standard RNA alphabet. Any motif databases provided must use the RNA alphabet.
    Default: The standard DNA alphabet is used. Motif databases retain their native alphabet.

-protein
    Not recommended. Use the standard protein alphabet. Any motif databases provided must use the protein alphabet. Note that while MEME-ChIP does work with protein sequences, it was not originally designed to, and some of the called programs, like DREME, don't work well with the protein alphabet.
    Default: The standard DNA alphabet is used. Motif databases retain their native alphabet.

-bfile <file>
    Pass the file specifying a background model in Markov Background Model Format to programs that support a background model (MEME, CentriMo, FIMO, SpaMo and Tomtom). Consult the documentation for those programs for details on how they use the background model.
    Default: A background model is calculated from the input sequences and passed by MEME-ChIP to the programs that support it.

-order <order>
    Set the order of the Markov background model that is generated from the input sequences when a background model is not specified via -bfile.
    Default: A Markov background model of order 1 is generated and passed to the programs that support it.

-nmeme <limit>
    The upper bound on the number of sequences that are passed to MEME. This is required because MEME takes too long to run for very large sequence sets. All input sequences are passed to MEME if there are not more than limit.
    Default: The number of sequences passed to MEME will be limited to 600.

-seed <seed>
    The seed for the randomized selection of sequences for MEME.
    Default: A seed value of 1 is used.

-norand
    Disable the random selection of sequences for MEME and select the maximum allowed number (see -nmeme) of sequences in input file order.
    Default: Sequences are selected randomly without replacement.

-ccut <size>
    The maximum length of a sequence to use before it is trimmed to a central region of this size. A value of 0 indicates that sequences should not be trimmed.
    Default: A maximum size of 100 is used.

-group-thresh <gthr>
    Main threshold for clustering highly similar motifs in MEME-ChIP output. All motifs in a group will have a Tomtom E-value less than or equal to gthr when compared to the seed motif for the group, which is the most significant motif in the group.
    Default: A value of 0.05 is used.

-group-weak <wthr>
    Secondary threshold for clustering highly similar motifs in MEME-ChIP output. If this is specified by the user, groups will be merged into a more significant group if all their motifs are weakly similar to the seed motif of the more significant group. wthr specifies the Tomtom E-value threshold for merging groups.
    Default: Set to be equal to twice the value of the main clustering threshold: 2 * gthr.

-filter-thresh <fthr>
    E-value threshold for including motifs in the output.
    Default: A value of 0.05 is used.

-time <minutes>
    The maximum time that MEME-ChIP is allowed to run before terminating itself gracefully.
    Default: There is no time limit.

-desc <description>
    A description of the MEME-ChIP run which is displayed in the summary file.
    Default: No description is displayed in the summary file.

-fdesc <file>
    A file containing a description of the MEME-ChIP run which will be included in the summary file. The summary file will try to preserve some of the formatting by presenting blocks of text separated by multiple new lines as different paragraphs and replacing single new line characters with line breaks. Only the first 500 characters are used.
    Default: No description is displayed in the summary file.

-norc
    Find motifs in the given strand only.
    Default: Find motifs in both strands.

-old-clustering
    Pick seed motifs for clustering based only on significance.
    Default: Discovered motifs are preferentially used as seed motifs for clustering.

-noecho
    Don't echo the commands run.
    Default: Echo the commands run to standard output.

-help
    Display a usage message.
    Default: Run as normal.

-version
    Display the version and exit.
    Default: Run as normal.
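
The interaction of -nmeme, -seed, -norand and -ccut described above can be pictured with a short sketch. This is an illustration only and not MEME-ChIP's own code: the function names are hypothetical, Python's random number generator will not pick the same sequences as MEME-ChIP for a given seed, and the rounding used to locate the central region is an assumption.

```python
import random


def select_for_meme(seqs, limit=600, seed=1, norand=False):
    """Pick at most `limit` sequences, in the spirit of -nmeme, -seed and -norand."""
    seqs = list(seqs)
    if len(seqs) <= limit:
        # All input sequences are passed on when there are not more than `limit`.
        return seqs
    if norand:
        # -norand: take the first `limit` sequences in input file order.
        return seqs[:limit]
    # Default: seeded random selection without replacement.
    return random.Random(seed).sample(seqs, limit)


def trim_center(seq, size=100):
    """Trim `seq` to its central `size` characters; a size of 0 disables trimming."""
    if size == 0 or len(seq) <= size:
        return seq
    start = (len(seq) - size) // 2  # assumed rounding for odd differences
    return seq[start:start + size]
```

With the default values, at most 600 sequences of at most 100 characters each would be passed on, which is where the largest possible dataset size of 60000 mentioned under -meme-maxsize comes from.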

MEME Specific Options

-meme-mod <oops|zoops|anr>
    The number of motif sites that MEME will find per sequence.
    oops - One Occurrence Per Sequence,
    zoops - Zero or One Occurrence Per Sequence,
    anr - Any Number of Repetitions.
    See -mod in the MEME command-line documentation.
    Default: MEME defaults to using zoops mode.

-meme-minw <width>
    The minimum motif width that MEME should find.
    Default: A minimum width of 6 is used unless the maximum width has been set to be less than 6, in which case the maximum width is used.

-meme-maxw <width>
    The maximum motif width that MEME should find.
    Default: A maximum width of 30 is used unless the minimum width has been set to be larger than 30, in which case the minimum width is used.

-meme-nmotifs <num>
    The number of motifs that MEME should search for. If num is 0, MEME will not be run.
    Default: MEME will find 3 motifs.

-meme-minsites <sites>
    The minimum number of sites that MEME needs to find for a motif.
    Default: MEME doesn't require any minimum number of sites for a motif.

-meme-maxsites <sites>
    The maximum number of sites that MEME will find for a motif.
    Default: MEME doesn't limit the number of sites it will find for a motif.

-meme-p <np>
    Use the faster, parallel version of MEME with np processors. The parameter np may be a number or it may be a quoted string starting with a number and followed by arguments to the particular MPI run command for your installation (e.g., mpirun).
    Default: Use a single processor.

-meme-maxsize <size>
    Change the largest allowed dataset to be size. Note that the default maximum size exists to warn users that their dataset is possibly too large to process in a reasonable time, so please consider carefully before increasing this value.
    Default: The maximum dataset size is 100000. This should be fine with the default settings for -nmeme and -ccut, as the largest possible dataset size would be 60000.

-meme-pal
    Restrict MEME to searching for palindromes only.
    Default: MEME searches for any motif, not just palindromes.
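
The default rules for -meme-minw and -meme-maxw, and the dataset size bound quoted under -meme-maxsize, can be written out as a small sketch. The function names are hypothetical and the code only restates the rules given above; it is not taken from MEME-ChIP.

```python
def effective_widths(minw=None, maxw=None):
    """Resolve the MEME width range when one or both widths are left unset."""
    if minw is None and maxw is None:
        return 6, 30
    if minw is None:
        # Default minimum is 6 unless the maximum was set below 6.
        return min(6, maxw), maxw
    if maxw is None:
        # Default maximum is 30 unless the minimum was set above 30.
        return minw, max(30, minw)
    return minw, maxw


def largest_possible_dataset(nmeme=600, ccut=100):
    """Upper bound on the dataset size passed to MEME, or None if untrimmed."""
    if ccut == 0:
        # With trimming disabled the bound depends on the input lengths.
        return None
    return nmeme * ccut
```

With the defaults, `largest_possible_dataset()` gives 600 * 100 = 60000, which is below the default -meme-maxsize of 100000. Raising -nmeme or -ccut, or disabling trimming with -ccut 0, can push the dataset past that limit.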

DREME Specific Options

-dreme-e <E-value>
    Stop searching for more motifs if the next best motif found has an E-value worse than this threshold.
    Default: An E-value threshold of 0.05 is used.

-dreme-m <count>
    Stop searching for more motifs if count motifs have been found. If count is 0, DREME will not be run.
    Default: There is no limit on the number of motifs.

CentriMo Specific Options

-centrimo-local
    CentriMo will perform local motif enrichment analysis, computing enrichment in every possible sequence region.
    Default: CentriMo will perform central motif enrichment analysis, computing enrichment in centered regions only.

-centrimo-score <score>
    Set the minimum accepted score for a match.
    Default: A minimum score of 5 is used.

-centrimo-maxreg <region>
    Set the maximum region size tested.
    Default: CentriMo will test all valid region sizes.

-centrimo-ethresh <E-value>
    Set the E-value threshold for reporting enriched central regions.
    Default: An E-value threshold of 10 will be used.

-centrimo-noseq
    Do not store sequence IDs in the output of CentriMo.
    Default: CentriMo stores a list of the sequence IDs with matches in the best region for each motif.

-centrimo-flip
    Reflect the positions of matches on the reverse strand around the center.
    Default: Matches on the reverse strand are counted where they occur in the sequence.

SpaMo Specific Options

-spamo-skip
    Do not run SpaMo. Can be combined with -meme-nmotifs 0, -dreme-m 0, and -fimo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs.
    Default: Run SpaMo using the most significant motif from each cluster as primary.

FIMO Specific Options

-fimo-skip
    Do not run FIMO. Can be combined with -meme-nmotifs 0, -dreme-m 0, and -spamo-skip to use MEME-ChIP to run CentriMo and cluster the significant motifs.
    Default: Run FIMO using the most significant motif from each cluster to scan input sequences.
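
The combination of -meme-nmotifs 0, -dreme-m 0, -spamo-skip and -fimo-skip described above can be assembled from a script. The sketch below only builds the argument list following the usage line at the top of this page; the function name and the file names in the usage note are hypothetical, and nothing is executed.

```python
import shlex


def centrimo_only_command(sequences, databases):
    """Build a meme-chip argument list that skips MEME, DREME, SpaMo and FIMO."""
    argv = ["meme-chip",
            "-meme-nmotifs", "0",   # MEME will not be run
            "-dreme-m", "0",        # DREME will not be run
            "-spamo-skip",
            "-fimo-skip"]
    for db in databases:
        # -db may be given multiple times to pass multiple databases.
        argv += ["-db", db]
    argv.append(sequences)          # the sequences file comes last
    return argv
```

For example, `shlex.join(centrimo_only_command("peaks.fa", ["motifs.meme"]))` produces a command line that can be pasted into a shell, and the list itself can be passed to `subprocess.run`. Since no motifs are discovered in this mode, at least one motif database is presumably needed for CentriMo to have anything to test.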

Citing

If you use MEME-ChIP in your research, please cite the following paper:
Philip Machanick and Timothy L. Bailey, "MEME-ChIP: motif analysis of large DNA datasets", Bioinformatics, 27(12):1696-1697, 2011.