GOMO File Format

Tag	Child of	Description
<gomo>	Nothing	Information about this run of gomo. version - the version of gomo that generated the XML file. release - the release date of the version that generated the XML file.
<program>	<gomo>	Information about the state of the program when it ran. name - the name of the program. cmd - the command line passed to the program. gene_url - the URL used to look up further information on the gene IDs. Ampersands (&) in the URL are converted into &amp; and the place where the gene ID should be substituted is marked by !!GENEID!!. outdir - the output directory that the program wrote to. clobber - true if gomo was allowed to overwrite the output directory. text_only - true if gomo wrote to stdout; in that case this file would not exist, so the value must be false. use_e_values - true if gomo used E-values (converted from p-values) as input scores, false if gomo used gene scores. score_e_thresh - if gomo used E-values, the threshold above which gomo assigned a gene the worst E-value (p-value = 1.0) to smooth out noise. min_gene_count - the minimum number of genes that a GO term had to be annotated with before gomo would calculate a score for it. motifs - if present, a space-delimited list of the motifs that gomo calculated a score for; otherwise gomo scored all motifs. shuffle_scores - the number of times gomo generated a shuffled mapping of gene ID to gene ID, used to generate scores from the null model. q_threshold - the q-value threshold; gomo filtered the results to show only those with a better (smaller) q-value.
<gomapfile>	<program>	Information about the GO mapping file. path - the path to the mapping file.
<seqscorefile>	<program>	Information about the sequence scoring file. path - the path to the sequence scoring file.
<motif>	<gomo>	Information about the motif. id - the motif identifier. genecount - the number of scored sequences that were used to compute the result.
<goterm>	<motif>	Information about the GO term. id - the GO identifier. score - the geometric mean across all species of the rank-sum test p-value. pvalue - the empirically calculated p-value. qvalue - the empirically calculated q-value. annotated - the number of genes annotated with the GO term. group - the subgroup that the term belongs to. For the Gene Ontology, b = biological process, c = cellular component and m = molecular function. nabove - the number of more general terms that link to this one. nbelow - the number of more specific terms that link from this one. implied - whether the GO term is implied by other significant GO terms. Allowed values are 'y', 'n' or 'u' (default) for yes, no or unknown. description - the GO term description.
<gene>	<goterm>	Information about the GO term's annotated genes for the primary species. id - the gene identifier. rank - the rank of the scored gene.
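
The nesting described in the table can be illustrated with a short Python sketch. It assumes that the fields listed for each tag are stored as XML attributes, which the table does not state explicitly. The sample document, its values, the example.org URL and the helper names (gene_link, read_gomo) are all invented for illustration and are not taken from real gomo output.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document. Every value is made up, and storing the
# fields as attributes is an assumption about the layout.
SAMPLE = """<gomo version="0.0.0" release="unknown">
  <program name="gomo" cmd="(command line)"
           gene_url="http://example.org/gene?id=!!GENEID!!&amp;view=full"
           outdir="gomo_out" clobber="true" text_only="false"
           use_e_values="true" score_e_thresh="0.01" min_gene_count="1"
           shuffle_scores="1000" q_threshold="0.05">
    <gomapfile path="go_map.csv"/>
    <seqscorefile path="scores.xml"/>
  </program>
  <motif id="motif_1" genecount="3">
    <goterm id="GO:0000001" score="1.0e-5" pvalue="1.0e-4" qvalue="1.0e-2"
            annotated="2" group="b" nabove="3" nbelow="0" implied="u"
            description="example term">
      <gene id="GENE_B" rank="2"/>
      <gene id="GENE_A" rank="1"/>
    </goterm>
  </motif>
</gomo>"""

# Group codes as listed for the Gene Ontology in the table above.
GROUPS = {
    "b": "biological process",
    "c": "cellular component",
    "m": "molecular function",
}


def gene_link(url_template, gene_id):
    """Substitute the gene ID for the !!GENEID!! placeholder."""
    return url_template.replace("!!GENEID!!", gene_id)


def read_gomo(xml_text):
    """Return one dict per <goterm>, walking <gomo> -> <motif> -> <goterm> -> <gene>."""
    root = ET.fromstring(xml_text)
    # The XML parser turns the escaped &amp; back into a plain ampersand.
    url_template = root.find("program").get("gene_url")
    rows = []
    for motif in root.findall("motif"):
        for term in motif.findall("goterm"):
            # Order the annotated genes by their rank attribute.
            genes = sorted(term.findall("gene"), key=lambda g: int(g.get("rank")))
            rows.append({
                "motif": motif.get("id"),
                "goterm": term.get("id"),
                "qvalue": float(term.get("qvalue")),
                "group": GROUPS.get(term.get("group"), "unknown"),
                "gene_links": [gene_link(url_template, g.get("id")) for g in genes],
            })
    return rows


if __name__ == "__main__":
    for row in read_gomo(SAMPLE):
        print(row["motif"], row["goterm"], row["group"], row["qvalue"])
        for link in row["gene_links"]:
            print("  ", link)
```

Reading gene_url through an XML parser yields a plain ampersand, so the only remaining step before using the link is to replace the !!GENEID!! placeholder.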


The MEME Suite

Motif-based sequence analysis tools