AlignedXStringSet-class {Biostrings} | R Documentation
The AlignedXStringSet and QualityAlignedXStringSet classes are containers for storing an aligned XStringSet.
Before we define the notion of alignment, we introduce the notion of "filled-with-gaps subsequence". A "filled-with-gaps subsequence" of a string string1 is obtained by inserting 0 or any number of gaps in a subsequence of string1. For example, L-A--ND and A--N-D are "filled-with-gaps subsequences" of LAND. An alignment between two strings string1 and string2 results in two strings (align1 and align2) that have the same length and are "filled-with-gaps subsequences" of string1 and string2, respectively.
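The definition above can be checked mechanically. The following base R sketch assumes that "subsequence" means a contiguous substring (as with subseq in Biostrings) and that the gap symbol is "-". The helper is_filled_with_gaps_subsequence is a hypothetical name introduced for illustration; it is not part of Biostrings.

```r
# Hypothetical helper (not a Biostrings function): TRUE if `candidate`
# can be obtained by inserting gaps into a contiguous substring of `string1`.
is_filled_with_gaps_subsequence <- function(candidate, string1, gap = "-") {
  # Remove every gap character to recover the underlying subsequence
  degapped <- gsub(gap, "", candidate, fixed = TRUE)
  # fixed = TRUE matches the text literally rather than as a regular expression
  grepl(degapped, string1, fixed = TRUE)
}

is_filled_with_gaps_subsequence("L-A--ND", "LAND")  # degapped: "LAND"
is_filled_with_gaps_subsequence("A--N-D", "LAND")   # degapped: "AND"
is_filled_with_gaps_subsequence("L-D", "LAND")      # degapped: "LD", not a substring
```

The first two calls correspond to the examples given in the text; the third shows a string that does not qualify under the contiguity assumption.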
For example, this is an alignment between LAND and LEAVES:
    L-A
    LEA
An alignment can be seen as a compact representation of one set of basic operations that transforms string1 into string2. There are 3 different kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), and "replacements". The above alignment represents the following basic operations:
    insert E at pos 2
    insert V at pos 4
    insert E at pos 5
    replace by S at pos 6 (N is replaced by S)
    delete at pos 7 (D is deleted)
Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.
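The listed operations can be replayed on LAND in base R to confirm that they yield LEAVES. This is only a sketch: insert_at, replace_at, and delete_at are hypothetical helper names introduced for illustration and are not part of Biostrings.

```r
# Hypothetical helpers (not Biostrings functions), using 1-based positions.

# "insert x at pos i": letters at positions >= i shift one place to the right
insert_at <- function(s, x, i) {
  paste0(substr(s, 1, i - 1), x, substr(s, i, nchar(s)))
}

# "replace by x at pos i": overwrite the single letter at position i
replace_at <- function(s, x, i) {
  substr(s, i, i) <- x
  s
}

# "delete at pos i": drop the letter at position i
delete_at <- function(s, i) {
  paste0(substr(s, 1, i - 1), substr(s, i + 1, nchar(s)))
}

s <- "LAND"
s <- insert_at(s, "E", 2)   # "LEAND"
s <- insert_at(s, "V", 4)   # "LEAVND"
s <- insert_at(s, "E", 5)   # "LEAVEND"
s <- replace_at(s, "S", 6)  # "LEAVESD"
s <- delete_at(s, 7)        # "LEAVES"
s
```

Each position refers to the string as it stands after the previous operations, which is why the operations must be applied in the order given.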
There are many possible alignments between two given strings string1 and string2, and a common problem is to find the one (or ones) with the highest score, i.e. with the lowest total cost in terms of basic operations.
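To make the idea of a score concrete, the base R sketch below scores two candidate alignments of LAND and LEAVES column by column. Everything here is an assumption made for illustration: the function name score_alignment is hypothetical, the match, mismatch, and gap values are arbitrary, and the per-column gap penalty is simpler than the substitution matrix plus gap opening and extension scheme that pairwiseAlignment accepts. The first candidate is one full-length alignment consistent with the basic operations listed earlier.

```r
# Hypothetical scoring function (not a Biostrings function).
# Each column contributes `gap` if either string has a gap there,
# otherwise `match` or `mismatch` depending on whether the letters agree.
score_alignment <- function(align1, align2, match = 1, mismatch = -1, gap = -2) {
  stopifnot(nchar(align1) == nchar(align2))  # aligned strings share one length
  a <- strsplit(align1, "")[[1]]
  b <- strsplit(align2, "")[[1]]
  is_gap <- a == "-" | b == "-"
  sum(ifelse(is_gap, gap, ifelse(a == b, match, mismatch)))
}

score_alignment("L-A--ND", "LEAVES-")  # -7: 2 matches, 1 mismatch, 4 gap columns
score_alignment("LAND--", "LEAVES")    # -6: 1 match, 3 mismatches, 2 gap columns
```

Under these illustrative values the second candidate scores higher, which shows how the preferred alignment depends on the scoring scheme chosen.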
In the code snippets below, x is an AlignedXStringSet or QualityAlignedXStringSet object.
unaligned(x): The original string.
aligned(x, degap = FALSE): If degap = FALSE, the "filled-with-gaps subsequence" representing the aligned substring. If degap = TRUE, the "gap-less subsequence" representing the aligned substring.
ranges(x): The bounds of the aligned substring.
start(x): The start of the aligned substring.
end(x): The end of the aligned substring.
width(x): The width of the aligned substring, ignoring gaps.
indel(x): The positions, in the form of an IRanges object, of the insertions or deletions (depending on what x represents).
nindel(x): A two-column matrix containing the length and sum of the widths for each of the elements returned by indel.
length(x): The length of aligned(x).
nchar(x): The nchar of aligned(x).
alphabet(x): Equivalent to alphabet(unaligned(x)).
as.character(x): Converts aligned(x) to a character vector.
toString(x): Equivalent to toString(as.character(x)).
x[i]: Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the selected elements.
rep(x, times): Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the repeated elements.
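The kind of summary reported by indel and nindel can be illustrated in base R by locating the runs of gaps in a single gapped string. This is a conceptual sketch only: gap_runs is a hypothetical helper, not a Biostrings function, and the position convention used by indel in Biostrings may differ from the simple 1-based offsets into the gapped string used here.

```r
# Hypothetical helper (not a Biostrings function): start and width of each
# run of consecutive gap characters in a gapped string.
gap_runs <- function(aligned, gap = "-") {
  chars <- strsplit(aligned, "")[[1]]
  runs <- rle(chars == gap)          # consecutive runs of gap / non-gap
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  data.frame(start = starts[runs$values], width = runs$lengths[runs$values])
}

runs <- gap_runs("L-A--ND")
runs                                  # one row per gap run
# Number of gap runs and total gapped width, analogous to the two
# columns described for nindel
c(Length = nrow(runs), WidthSum = sum(runs$width))
```

For "L-A--ND" there are two gap runs (one of width 1 and one of width 2), so the summary is a length of 2 and a width sum of 3.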
Author(s): P. Aboyoun
See also: pairwiseAlignment, PairwiseAlignments-class, XStringSet-class
library(Biostrings)

# Globally align two amino acid sequences
pattern <- AAString("LAND")
subject <- AAString("LEAVES")
nw1 <- pairwiseAlignment(pattern, subject, substitutionMatrix = "BLOSUM50",
                         gapOpening = 3, gapExtension = 1)

# Extract the aligned pattern as an AlignedXStringSet and inspect it
alignedPattern <- pattern(nw1)
unaligned(alignedPattern)
aligned(alignedPattern)
as.character(alignedPattern)
nchar(alignedPattern)