PairwiseAlignments-io {Biostrings} | R Documentation |
The writePairwiseAlignments
function writes a
PairwiseAlignments object to a file.
Only the "pair" format is supported at the moment.
writePairwiseAlignments(x, file="", Matrix=NA, block.width=50)
x |
A PairwiseAlignments object, typically returned by the
|
file |
A connection, or a character string naming the file to print
to. If |
Matrix |
A single string containing the name of the substitution matrix
(e.g. |
block.width |
A single integer specifying the maximum number of sequence letters (including the "-" letter, which represents gaps) per line. |
The "pair" format is one of the numerous pairwise sequence alignment formats supported by the EMBOSS software. See http://emboss.sourceforge.net/docs/themes/AlignFormats.html for a brief (and rather informal) description of this format.
This brief description of the "pair" format suggests that it is best suited for global pairwise alignments, because, in that case, the original pattern and subject sequences can be inferred (by just removing the gaps).
However, even though the "pair" format can also be used for non global
pairwise alignments (i.e. for global-local, local-global,
and local pairwise alignments), in that case the original
pattern and subject sequences cannot be inferred. This is because
the alignment written to the file doesn't necessarily span the entire
pattern (if type(x)
is local-global
or local
)
or the entire subject (if type(x)
is global-local
or local
).
As a consequence, the writePairwiseAlignments
function can be
used on a PairwiseAlignments object x
containing non global
alignments (i.e. with type(x) != "global"
), but with the 2 following
caveats:
The type of the alignments (type(x)
) is not written
to the file.
The original pattern and subject sequences cannot be inferred. Furthermore, there is no way to infer their lengths (because we don't know whether they were trimmed or not).
Also note that the pairwiseAlignment
function
interprets the gapOpening
and gapExtension
arguments
differently than most other alignment tools. As a consequence
the values of the Gap_penalty and Extend_penalty fields written to
the file are not the same as the values that were passed to the
gapOpening
and gapExtension
arguments. With the
following relationship:
Gap_penalty = gapOpening + gapExtension
Extend_penalty = gapExtension
H. Pagès
http://emboss.sourceforge.net/docs/themes/AlignFormats.html
## --------------------------------------------------------------------- ## A. WITH ONE PAIR ## --------------------------------------------------------------------- pattern <- DNAString("CGTACGTAACGTTCGT") subject <- DNAString("CGTCGTCGTCCGTAA") pa1 <- pairwiseAlignment(pattern, subject) pa1 writePairwiseAlignments(pa1) writePairwiseAlignments(pa1, block.width=10) ## The 2 bottom-right numbers (16 and 15) are the lengths of ## the original pattern and subject, respectively. pa2 <- pairwiseAlignment(pattern, subject, type="global-local") pa2 # score is different! writePairwiseAlignments(pa2) ## By just looking at the file, we can't tell the length of the ## original subject! Could be 13, could be more... pattern <- DNAString("TCAACTTAACTT") subject <- DNAString("GGGCAACAACGGG") pa3 <- pairwiseAlignment(pattern, subject, type="global-local", gapOpening=-2, gapExtension=-1) writePairwiseAlignments(pa3) ## --------------------------------------------------------------------- ## B. WITH MORE THAN ONE PAIR (AND NAMED PATTERNS) ## --------------------------------------------------------------------- pattern <- DNAStringSet(c(myp1="ACCA", myp2="ACGCA", myp3="ACGGCA")) pa4 <- pairwiseAlignment(pattern, subject) pa4 writePairwiseAlignments(pa4) ## --------------------------------------------------------------------- ## C. REPRODUCING THE ALIGNMENT SHOWN AT ## http://emboss.sourceforge.net/docs/themes/alnformats/align.pair ## --------------------------------------------------------------------- pattern <- c("TSPASIRPPAGPSSRPAMVSSRRTRPSPPGPRRPTGRPCCSAAPRRPQAT", "GGWKTCSGTCTTSTSTRHRGRSGWSARTTTAACLRASRKSMRAACSRSAG", "SRPNRFAPTLMSSCITSTTGPPAWAGDRSHE") subject <- c("TSPASIRPPAGPSSRRPSPPGPRRPTGRPCCSAAPRRPQATGGWKTCSGT", "CTTSTSTRHRGRSGWRASRKSMRAACSRSAGSRPNRFAPTLMSSCITSTT", "GPPAWAGDRSHE") pattern <- unlist(AAStringSet(pattern)) subject <- unlist(AAStringSet(subject)) pattern # original pattern subject # original subject data(BLOSUM62) pa5 <- pairwiseAlignment(pattern, subject, substitutionMatrix=BLOSUM62, gapOpening=9.5, gapExtension=0.5) pa5 writePairwiseAlignments(pa5, Matrix="BLOSUM62")