The name of the motif.
The alternate name of the motif.
The width of the motif. No gaps are allowed in motifs supplied to MCAST as it only works for motifs of a fixed width.
The name of the (FASTA) sequence database file.
The name of the position specific priors file.
The name of the binned distribution of priors file.
The number of sequences in the database.
The number of letters in the sequence database.
The name of the file containing the (MEME-formatted) motifs used in the search.
The block diagram shows the motif matches comprising a motif cluster detected by MCAST.
The score for the match of a position in a sequence to a motif is computed by summing the appropriate entry from each column of the position-dependent scoring matrix that represents the motif. Sequences shorter than one or more of the motifs are skipped.
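The column-wise summation described above can be sketched as follows. This is a minimal illustration, not MCAST's implementation: the matrix values, the function names `match_score` and `scan`, and the dictionary-per-column layout are all hypothetical.

```python
# Hypothetical 3-column scoring matrix: one dict (letter -> score) per motif column.
# The numbers are made up for illustration only.
pssm = [
    {"A": 1.2, "C": -0.8, "G": -1.5, "T": -0.3},
    {"A": -1.0, "C": 1.5, "G": -0.7, "T": -1.1},
    {"A": -0.5, "C": -1.2, "G": 1.4, "T": -0.9},
]


def match_score(sequence, start, pssm):
    """Sum one entry per column: the entry for the letter aligned to that column.

    `start` is a 0-based offset into `sequence`. Returns None if the motif
    does not fit at that offset.
    """
    width = len(pssm)
    window = sequence[start:start + width]
    if len(window) < width:
        return None
    return sum(column[letter] for column, letter in zip(pssm, window))


def scan(sequence, pssm):
    """Score every offset; a sequence shorter than the motif is skipped."""
    width = len(pssm)
    if len(sequence) < width:
        return []
    return [(i, match_score(sequence, i, pssm))
            for i in range(len(sequence) - width + 1)]
```

Calling `scan("TACGT", pssm)` scores the three possible offsets, and `max(..., key=lambda hit: hit[1])` picks out the best-scoring one.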
The p-value of a motif match is the probability of a single random subsequence of the length of the motif scoring at least as well as the observed match.
A selected portion of the input sequence with the matching motifs displayed above it.
For each matching motif, the strand of the match (+/-), the consensus sequence of the motif, the p-value of the individual motif match (see also the help button for "Cluster Score") and the sequence logo of the motif are shown.
You can select the portion of the sequence to be displayed by sliding the two buttons below the sequence block diagram so that the portion you want to see is between the two needles attached to the buttons. By default the two buttons move together, but you can drag one individually by holding shift before you start the drag.
The name of the alphabet symbol.
The frequency of the alphabet symbol as defined by the background model.
The full sequence identifier.
This lists the name of the sequence, typically the chromosome or contig name.
MCAST was run with --parse-genomic-coord specified and has split the sequence identifier into sequence name, sequence start and sequence end.
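As a rough illustration of this kind of split, the sketch below parses an identifier into name, start and end. The `name:start-end` layout (for example `chr1:1000-2000`) is an assumption made for this example and is not stated in the text above; the function name `split_identifier` is hypothetical.

```python
import re

# Assumed identifier layout "name:start-end", e.g. "chr1:1000-2000".
# This layout is an assumption for illustration, not taken from the text above.
COORD_PATTERN = re.compile(r"^(?P<name>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)$")


def split_identifier(identifier):
    """Return (name, start, end); start and end are None if no coordinates are found."""
    found = COORD_PATTERN.match(identifier)
    if found is None:
        return identifier, None, None
    return found.group("name"), int(found.group("start")), int(found.group("end"))
```

An identifier without coordinates, such as `seq42`, is returned unchanged with `None` for both offsets.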
This lists the first genomic offset (1-based) of the displayed region.
MCAST was run with --parse-genomic-coord specified and has split the sequence identifier into sequence name, sequence start and sequence end in genome coordinates.
This lists the first sequence offset (1-based) of the displayed region.
This lists the last genomic offset (1-based) of the displayed region.
MCAST was run with --parse-genomic-coord specified and has split the sequence identifier into sequence name, sequence start and sequence end in genome coordinates.
This lists the last sequence offset (1-based) of the displayed region.
The start of the motif cluster relative to the start of the sequence (1-based).
The last position of the motif cluster relative to the start of the sequence (1-based).
The score that the hidden Markov model created by MCAST assigned to the motif cluster.
This is the sum of the scores of the individual motif matches in the cluster, plus a gap penalty, g, multiplied by the total size of the inter-motif gaps in the cluster. Individual motif match scores are log2(P(s)/p), where s is the log-odds score of the motif match, P(s) is the p-value of the motif match, and p is the user-specified p-value threshold (default: 0.0005).
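The arithmetic of combining match scores with the gap term can be sketched as below. This is only an illustration under stated assumptions: the per-match scores are taken as already computed, matches are given as hypothetical `(start, end, score)` tuples with 1-based inclusive coordinates, and the gap penalty is assumed to be a negative number so that larger gaps lower the score.

```python
def cluster_score(matches, gap_penalty):
    """Sum of match scores plus gap_penalty times the total inter-motif gap size.

    matches: list of (start, end, score) tuples, 1-based inclusive coordinates,
             assumed not to overlap (hypothetical layout for this sketch).
    gap_penalty: per-position penalty g; assumed negative here.
    """
    ordered = sorted(matches)
    total_match_score = sum(score for _, _, score in ordered)
    # Gap between consecutive matches = positions strictly between them.
    total_gap = sum(
        max(0, next_start - prev_end - 1)
        for (_, prev_end, _), (next_start, _, _) in zip(ordered, ordered[1:])
    )
    return total_match_score + gap_penalty * total_gap
```

For two matches at 23-33 and 40-50 there are 6 gap positions (34 through 39), so with illustrative scores of 5.0 and 3.0 and g = -0.1 the result is 8.0 - 0.6 = 7.4.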
The p-value of the motif cluster score.
MCAST estimates p-values by fitting an exponential distribution to the observed motif cluster scores.
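To make the idea concrete, here is a minimal sketch of fitting an exponential distribution and reading a p-value from its tail. It uses the standard maximum-likelihood estimate (rate = 1 / mean); how MCAST actually performs the fit (which scores it uses, any shifting or censoring) is not described here, so treat the details as assumptions. The function names are hypothetical.

```python
import math


def fit_exponential_rate(scores):
    """Maximum-likelihood rate of an exponential distribution: 1 / sample mean.

    Assumes the scores are positive; this is a textbook fit, not necessarily
    the procedure MCAST uses.
    """
    mean = sum(scores) / len(scores)
    return 1.0 / mean


def exponential_pvalue(score, rate):
    """Tail probability P(S >= score) under the fitted exponential."""
    return math.exp(-rate * max(score, 0.0))
```

With a fitted rate, a higher cluster score always maps to a smaller p-value, which is the behaviour expected of a tail probability.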
The E-value of the motif cluster score.
MCAST estimates this by multiplying the p-value of the motif cluster score by the (estimated) number of random matches found in the search.
The q-value of the motif cluster score.
MCAST estimates q-values from the motif cluster score p-values using the method of Benjamini and Hochberg (Journal of the Royal Statistical Society B, 57:289-300, 1995).
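For readers unfamiliar with the procedure, the sketch below shows the textbook Benjamini-Hochberg adjustment applied to a list of p-values. It is a generic illustration; MCAST's own implementation may differ in details, and the function name `benjamini_hochberg` is hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values, in the same order as the input.

    For the p-value of rank r (1 = smallest) among n, the raw adjusted value is
    p * n / r; a running minimum taken from the largest p-value downward keeps
    the result monotone in p.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    q_values = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * n / rank)
        q_values[index] = running_min
    return q_values
```

The adjusted value for each cluster is never smaller than its p-value, and clusters with smaller p-values never receive larger adjusted values.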
Motif | 1 |
---|---|
p-value | 8.23e-7 |
Start | 23 |
End | 33 |
Change the portion of annotated sequence by dragging the buttons; hold shift to drag them individually.
For further information on how to interpret these results or to get a copy of the MEME software please access http://meme-suite.org.
If you use MCAST in your research please cite the following paper:
Timothy Bailey and William Stafford Noble,
"Searching for statistically significant regulatory modules",
Bioinformatics (Proceedings of the European Conference on Computational Biology),
19(Suppl. 2):ii16-ii25, 2003.
The following sequence databases were supplied to MCAST.
Database | PSP/Wig file | PSP Distribution file | Sequence Count | Letter Count |
---|---|---|---|---|
The following motif databases were supplied to MCAST.
Database |
---|
These databases contained the following motifs.