CentriMo

CentriMo identifies known or user-provided motifs that show a significant preference for particular locations in your sequences. CentriMo can also show whether the local enrichment is significant relative to control sequences. See the CentriMo manual or tutorial for more information.

Input

Motif File(s)

File(s) containing MEME formatted motifs. Outputs from MEME and DREME are supported, as well as Minimal MEME Format. You can convert many other motif formats to MEME format using conversion scripts available with the MEME Suite.
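
As an illustration of what a small motif file can look like, the sketch below writes a single DNA motif in the layout commonly documented for Minimal MEME Format. The function name `minimal_meme`, the motif ID, the `nsites` value and the uniform background are hypothetical choices for the example; check the header lines against the MEME Suite motif format documentation before relying on them.

```python
def minimal_meme(motif_id, rows, nsites=20, background=None):
    """Return the text of a one-motif DNA file in a Minimal-MEME-style layout.

    rows is a list of [A, C, G, T] probability rows, one per motif position.
    The exact header wording is an assumption based on the published format.
    """
    alphabet = "ACGT"
    if background is None:
        # Assumed uniform background; real files usually carry measured values.
        background = {letter: 0.25 for letter in alphabet}
    lines = [
        "MEME version 4",
        "",
        "ALPHABET= ACGT",
        "",
        "strands: + -",
        "",
        "Background letter frequencies",
        " ".join(f"{letter} {background[letter]:.3f}" for letter in alphabet),
        "",
        f"MOTIF {motif_id}",
        f"letter-probability matrix: alength= 4 w= {len(rows)} nsites= {nsites} E= 0",
    ]
    for row in rows:
        # Each position must give one probability per letter, summing to 1.
        if len(row) != 4 or abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("each row needs four probabilities that sum to 1")
        lines.append(" ".join(f"{p:.6f}" for p in row))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(minimal_meme("demo", [[0.7, 0.1, 0.1, 0.1], [0.25, 0.25, 0.25, 0.25]]))
```

The `nsites` value written here is the one that the --motif-pseudo option (see the Options table) uses to turn frequencies back into counts.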

Sequence File

A file containing FASTA formatted sequences, ideally all of the same length. The sequences in this file are referred to as the "primary sequences" when a second set of (control) sequences is provided using the --neg option (see below).
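
Because sequences whose length differs from the selected length are ignored (see --seqlen in the Options table), it can be useful to check a FASTA file before running CentriMo. The following is a minimal sketch of that filtering rule, not CentriMo's own code; the helper names `parse_fasta` and `keep_length` are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (id, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def keep_length(records, length=None):
    """Keep records of the given length; default to the first record's length."""
    if not records:
        return []
    if length is None:
        length = len(records[0][1])
    return [(name, seq) for name, seq in records if len(seq) == length]


if __name__ == "__main__":
    demo = ">s1\nACGTAC\n>s2\nACG\n>s3\nTTGCAA\n"
    kept = keep_length(parse_fasta(demo))
    print([name for name, _ in kept])
```

Comparing the number of records before and after filtering shows how many sequences would be dropped from the analysis.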

Output

CentriMo outputs an HTML file named centrimo.html that lets you interactively select the motifs whose positional distributions are plotted and control smoothing and other plotting parameters. CentriMo also outputs two text files: centrimo.txt, a tab-delimited version of the results, and site_counts.txt, which lists, for each motif and each offset, the number of sequences in which the best match of the motif occurs at that offset.
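
To tie the inputs and outputs together, the sketch below assembles a command line from the --oc, --neg, --disc and --ethresh options described in the Options table. It only builds the argument list and does not run anything. The executable name `centrimo`, the order of the positional arguments (sequence file first, then motif files) and all file names are assumptions, since the usage line is not reproduced on this page; `centrimo_argv` is a hypothetical helper.

```python
def centrimo_argv(sequences, motifs, out_dir="centrimo_out",
                  control=None, discriminative=False, ethresh=None):
    """Build an argument list for a CentriMo run (assumed argument order)."""
    if discriminative and control is None:
        # --disc is documented as requiring control sequences via --neg.
        raise ValueError("--disc requires control sequences supplied with --neg")
    argv = ["centrimo", "--oc", out_dir]
    if control is not None:
        argv += ["--neg", control]
    if discriminative:
        argv.append("--disc")
    if ethresh is not None:
        argv += ["--ethresh", str(ethresh)]
    argv.append(sequences)   # assumed: sequence file comes first
    argv.extend(motifs)      # assumed: one or more motif files follow
    return argv


if __name__ == "__main__":
    print(" ".join(centrimo_argv("peaks.fa", ["motifs.meme"],
                                 control="control.fa", discriminative=True)))
```

The resulting list can be passed to `subprocess.run` once the argument order has been confirmed against the installed version's usage message.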

Options

Option	Parameter	Description	Default Behaviour
Input/Output
--o	dir	Create a folder called dir and write output files in it. This option is not compatible with --oc as only one output folder is allowed.	The program behaves as if `--oc centrimo_out` had been specified.
--oc	dir	Create a folder called dir but if it already exists allow overwriting the contents. This option is not compatible with --o as only one output folder is allowed.	The program behaves as if `--oc centrimo_out` had been specified.
--neg	control sequence file	Plot the motif distributions in this set (the control sequences) as well.
--disc		For each enriched region in the primary sequences, the significance of the relative enrichment of the motif in that region in the primary versus control sequences is evaluated using Fisher's exact test. Requires the control sequences to be supplied with the --neg option.	Use the binomial test on the primary sequences to evaluate motif enrichment.
--xalph		If the input motifs are in a different alphabet than the input sequences, and the motif alphabet is a subset of the sequence alphabet, you can specify an alphabet file containing the sequence alphabet definition. The input motifs will be converted to this new alphabet, with the probabilities for the new symbols set to zero prior to applying pseudocounts.	Motifs retain the alphabet defined in the motif file.
--bfile	file	Specify the source of a background model for converting a frequency matrix to a log-odds score matrix and for use in estimating the p-values of match scores. The value of file is either the path to a file in Markov Background Model Format, or one of the keywords `motif-file`, `--motif--` or `--uniform--`. The first two keywords cause the 0-order letter frequencies contained in the first motif file to be used, and `--uniform--` causes uniform letter frequencies to be used.	The frequencies of the letters in the sequences are used as the background model.
--motif-pseudo	pseudocount	Add this total pseudocount to the counts in each motif column when converting a frequency matrix to a log-odds score matrix. The pseudocount added to each count is pseudocount times the background frequency of the letter (see option --bfile, above). Note: Counts are computed from MEME formatted motifs by multiplying the frequency of the letter by the value of `nsites` given in the motif `letter-probability matrix` header line.	The program applies a pseudocount of 0.1.
--motif	ID	Select the motif with the ID for scanning. This option may be repeated to select multiple motifs.	The program scans with all the motifs.
--seqlen	length	Use only sequences of length length, ignoring all other sequences in the input file(s).	Use sequences with the same length as the first sequence, ignoring all other sequences in the input file(s).
Scanning
--score	S	The score threshold for PWMs, in bits. Sequences without a match with score ≥ S are ignored.	A score of 5 is used.
--optimize_score		Search for the optimal score above the minimum threshold given by the --score option.	The minimum score threshold is used.
--maxreg	max region	The maximum region size to consider.	Try all region sizes up to the sequence width.
--minreg	min region	The minimum region size to consider. Must be less than max region.	Try regions 1 bp and larger.
--norc		Do not scan with the reverse complement motif.	Scans with the reverse complement motif.
--flip		Reverse complement matches appear 'reflected' around sequence centers.	Do not 'flip' the sequence; use the reverse complement of the motif instead.
--local		Compute enrichment of all regions.	Compute enrichment of central regions.
Output filtering
--ethresh	thresh	Limit the results to motifs with an enriched region whose E-value is less than thresh. Enrichment E-values are computed by first adjusting the binomial p-value of a region for the number of regions tested using the Bonferroni correction, and then multiplying the adjusted p-value by the number of motifs in the input to CentriMo.	Include motifs with E-values up to 10.
Miscellaneous
--desc	description	Include the text description in the HTML output.	No description in the HTML output.
--dfile	desc file	Include the first 500 characters of text from the file desc file in the HTML output.	No description in the HTML output.
--noseq		Do not store sequence IDs in the output of CentriMo.	CentriMo stores a list of the sequence IDs with matches in the best region for each motif. This can potentially make the file size much larger.
--verbosity	1|2|3|4|5	A number that regulates the verbosity level of the output information messages. If set to 1 (quiet) then it will only output error messages whereas the other extreme 5 (dump) outputs lots of mostly useless information.	The verbosity level is set to 2 (normal).
--version		Display the version and exit.	Run as normal.
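
Two of the calculations described in the table can be sketched in a few lines: the pseudocount adjustment from --motif-pseudo and the E-value from --ethresh. This is an illustrative reading of those descriptions, not CentriMo's implementation. The renormalisation by `nsites + pseudocount`, the use of base-2 logarithms (chosen because --score is given in bits) and the capping of the adjusted p-value at 1 are assumptions; the function names are hypothetical.

```python
import math


def column_log_odds(freqs, background, nsites, pseudocount=0.1):
    """Log-odds scores (bits) for one motif column after adding pseudocounts.

    freqs and background map each letter to a frequency.
    """
    scores = {}
    for letter, freq in freqs.items():
        count = freq * nsites                         # frequency times nsites
        added = pseudocount * background[letter]      # share of the total pseudocount
        adjusted = (count + added) / (nsites + pseudocount)  # assumed renormalisation
        scores[letter] = math.log2(adjusted / background[letter])
    return scores


def region_evalue(p_value, n_regions, n_motifs):
    """Bonferroni-adjust a region p-value, then scale by the number of motifs."""
    adjusted = min(1.0, p_value * n_regions)  # capping at 1 is an assumption
    return adjusted * n_motifs


if __name__ == "__main__":
    bg = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    col = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
    print(column_log_odds(col, bg, nsites=20))
    print(region_evalue(1e-6, n_regions=100, n_motifs=50))
```

The pseudocount keeps letters with an observed frequency of zero from receiving a score of negative infinity, and the E-value grows with both the number of regions tested and the number of motifs supplied.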
