gendb [options] <sequence count>
gendb generates the specified number of sequences using a Markov model. The sequence lengths are selected uniformly at random within the range specified by --minseq and --maxseq.
No inputs are required.
Writes the sequences in FASTA format to standard output.
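
The generation procedure described above can be illustrated with a minimal Python sketch. It is not gendb's implementation. It assumes a 0-order model (each symbol drawn independently), a uniform background over the 20 standard amino acids when none is supplied, and inclusive length bounds. The function names and the `seq_N` header format are hypothetical and chosen only for illustration.

```python
import random


def generate_sequences(count, minseq=50, maxseq=2000, background=None, seed=None):
    """Return `count` random sequences drawn from a 0-order background.

    Illustrative only: gendb's own model, defaults and output may differ.
    """
    if background is None:
        # Assumption: uniform frequencies over the 20 standard amino acids.
        background = {aa: 1.0 for aa in "ACDEFGHIKLMNPQRSTVWY"}
    rng = random.Random(seed)  # a fixed seed makes the output reproducible
    symbols = list(background)
    weights = list(background.values())
    sequences = []
    for _ in range(count):
        # Length chosen uniformly at random; both bounds are treated as inclusive.
        length = rng.randint(minseq, maxseq)
        sequences.append("".join(rng.choices(symbols, weights=weights, k=length)))
    return sequences


def to_fasta(sequences, width=60):
    """Format sequences as FASTA text with hypothetical `seq_N` headers."""
    lines = []
    for i, seq in enumerate(sequences, start=1):
        lines.append(f">seq_{i}")
        lines.extend(seq[j:j + width] for j in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Calling `to_fasta(generate_sequences(3, minseq=10, maxseq=20, seed=1))` returns three short records as one FASTA string. Passing the same seed again reproduces the same sequences.
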
Option | Parameter | Description | Default Behaviour |
---|---|---|---|
General Options | | | |
--alph | file | Generate random sequences using the alphabet defined in file, which must be an alphabet definition file. This overrides the --type option. | Protein sequences are generated unless overridden using the --type option. |
--ambig | ambig fraction | Sets the fraction of symbols that will be ambiguous (overrides --type) | The default depends on the --type option. |
--bfile | file | Sets the background model used to generate the sequences from a file in background model format. | For the standard DNA and Protein alphabets a built-in 0-order background is used. If a non-standard alphabet is provided without a background then a uniform frequency distribution is used. |
--order | n | Load the background model up to order n. | Load the background model completely. |
--type | 0\|1\|2\|3\|4 | Selects the type of sequences to generate. Allowed types are 0, 1, 2, 3 and 4. | If an alphabet is not specified with the --alph option then protein sequences are generated. |
--minseq | min | Minimum sequence length. | The minimum sequence length is 50. |
--maxseq | max | Maximum sequence length. | The maximum sequence length is 2,000. |
--dummy | | Print a "dummy" sequence record before the generated sequences. The "dummy" sequence record is a FASTA header line that lists the gendb parameters and is not followed by any sequence lines. | |
--seed | seed | Seed for the random number generator. | |
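
Because the record printed by --dummy has a header but no sequence lines, downstream code that reads the output may want to skip it. The following is a hedged sketch of a small FASTA reader that drops sequence-less records. The function name `read_fasta` and the header text in the usage note are hypothetical, and the exact contents of gendb's dummy header are not assumed here.

```python
def read_fasta(lines, skip_empty=True):
    """Parse FASTA lines into a list of (header, sequence) tuples.

    With skip_empty=True, records that have no sequence lines
    (such as a "dummy" header-only record) are dropped.
    """
    records = []
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    if skip_empty:
        records = [(h, s) for h, s in records if s]
    return records
```

For example, given the lines `>parameters`, `>seq_1`, `ACDE`, `FG`, the reader returns only the `seq_1` record with sequence `ACDEFG`. With `skip_empty=False` the header-only record is kept with an empty sequence.
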