getsize

Usage:

getsize <sequences> [options]

Description

The program getsize prints statistics about sequences read from a FASTA file.

When counting letters, each alias symbol is first converted to its core symbol.
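
The alias conversion described above can be illustrated with a minimal Python sketch. This is not the getsize implementation; the alias table below (lowercase letters mapping to uppercase core symbols) and the function name count_letters are assumptions made purely for illustration, since the real aliases depend on the alphabet in use.

```python
from collections import Counter

# Hypothetical alias table for illustration only: each lowercase letter is
# treated as an alias of its uppercase core symbol. The real aliases are
# defined by the alphabet that the program is using.
ALIASES = {"a": "A", "c": "C", "g": "G", "t": "T"}


def count_letters(sequence, aliases=ALIASES):
    """Count letters after replacing each alias with its core symbol."""
    return Counter(aliases.get(symbol, symbol) for symbol in sequence)


counts = count_letters("ACgtAc")
print(dict(counts))  # prints {'A': 2, 'C': 2, 'G': 1, 'T': 1}
```

Converting before counting means that "g" and "G" contribute to the same total, so the reported frequencies cover core symbols only.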

Input

Sequences

A file in FASTA format. If the filename is given as "-", then the program reads from standard input.
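
As a rough sketch of the "-" convention, the following Python code chooses between a named file and standard input and then reads FASTA records from whichever stream was selected. The function names open_sequences and read_fasta are hypothetical, and the parser handles only simple FASTA (a ">" header line followed by sequence lines); it is not how getsize itself reads its input.

```python
import sys


def open_sequences(filename):
    """Return standard input when the name is "-", otherwise open the file."""
    return sys.stdin if filename == "-" else open(filename)


def read_fasta(stream):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in stream:
        line = line.strip()
        if line.startswith(">"):
            # A new header ends the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    # Emit the final record once the stream is exhausted.
    if header is not None:
        yield header, "".join(chunks)
```

With these helpers, `read_fasta(open_sequences("-"))` would consume sequences piped in from another command.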

Output

Writes statistics about the sequences to standard output.

Options

Option Parameter Description Default Behaviour
General Options
-dna  Assume sequences are DNA and print letter frequencies in background file format.
-rna  Assume sequences are RNA and print letter frequencies in background file format.
-prot  Assume sequences are protein and print letter frequencies in background file format.
-alphfile Use the alphabet definition in the specified file and print frequencies in background file format.
-f  Print the letter frequencies as a C array.
-ft  Print the letter frequencies in a LaTeX table.
-l  Just print the length of each sequence.
-nd  Do not print warnings about duplicate sequences.
-x  Translate DNA in 6 frames (use with -f or -ft) and print the protein letter frequencies in one C array for each frame.
-codons  As for -x, and also print frame 0 codon usage as a C array.
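
To make the idea behind -x concrete, here is a minimal Python sketch of six-frame translation followed by a per-frame letter frequency count. It is an illustration under stated assumptions, not the getsize implementation: it uses the standard genetic code, orders the frames as three forward offsets followed by three reverse-complement offsets, writes stop codons as "*" and codons containing unknown letters as "X", and returns plain dictionaries rather than C arrays. All function names are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; codons with unknown letters become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna):
    """Return six translations: offsets 0-2 forward, then 0-2 reverse."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    return [
        translate(strand[offset:])
        for strand in (dna, reverse)
        for offset in range(3)
    ]


def frequencies(protein):
    """Return the relative frequency of each letter in a translation."""
    counts = Counter(protein)
    total = sum(counts.values())
    return {letter: n / total for letter, n in counts.items()} if total else {}


for frame, protein in enumerate(six_frames("ATGGCC")):
    print(frame, protein, frequencies(protein))
```

Each frame gets its own frequency table, which mirrors the one-array-per-frame output described for -x.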