AlignedXStringSet-class {Biostrings}    R Documentation

AlignedXStringSet and QualityAlignedXStringSet objects

Description

The AlignedXStringSet and QualityAlignedXStringSet classes are containers for storing an aligned XStringSet.

Details

Before we define the notion of alignment, we introduce the notion of "filled-with-gaps subsequence". A "filled-with-gaps subsequence" of a string string1 is obtained by inserting zero or more gaps in a subsequence of string1. For example, L-A--ND and A--N-D are "filled-with-gaps subsequences" of LAND. An alignment between two strings string1 and string2 results in two strings (align1 and align2) that have the same length and are "filled-with-gaps subsequences" of string1 and string2, respectively.
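The definition above can be checked mechanically. The following is a minimal base R sketch, not part of Biostrings: the helper name is_filled_with_gaps_subseq is hypothetical, "-" is assumed to be the gap letter, and "subsequence" is assumed to mean a contiguous run of letters.

```r
# Hypothetical helper (not a Biostrings function): tests whether `candidate`
# is a "filled-with-gaps subsequence" of `string`.
# Assumptions: "-" is the gap letter; a subsequence is a contiguous substring.
is_filled_with_gaps_subseq <- function(candidate, string, gap = "-") {
  degapped <- gsub(gap, "", candidate, fixed = TRUE)  # remove every gap
  grepl(degapped, string, fixed = TRUE)               # contiguous match in string?
}

is_filled_with_gaps_subseq("L-A--ND", "LAND")  # TRUE
is_filled_with_gaps_subseq("A--N-D", "LAND")   # TRUE
is_filled_with_gaps_subseq("L-D", "LAND")      # FALSE under the contiguity assumption
```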

For example, this is an alignment between LAND and LEAVES:

    L-A--ND
    LEAVES-
  

An alignment can be seen as a compact representation of one set of basic operations that transforms string1 into string2. There are three kinds of basic operations: "insertions" (gaps in align1), "deletions" (gaps in align2), and "replacements". The above alignment represents the following basic operations:

    insert E at pos 2
    insert V at pos 4
    insert E at pos 5
    replace by S at pos 6 (N is replaced by S)
    delete at pos 7 (D is deleted)
  

Note that "insert X at pos i" means that all letters at a position >= i are moved 1 place to the right before X is actually inserted.
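The operations listed above can be replayed step by step. This is a small base R sketch for illustration only; the helper names insert_at, replace_at and delete_at are hypothetical and are not provided by Biostrings.

```r
# Hypothetical helpers (not Biostrings functions) for the three basic operations.
insert_at <- function(s, letter, pos) {
  # Letters at positions >= pos move one place to the right.
  paste0(substr(s, 1, pos - 1), letter, substr(s, pos, nchar(s)))
}
replace_at <- function(s, letter, pos) {
  substr(s, pos, pos) <- letter
  s
}
delete_at <- function(s, pos) {
  paste0(substr(s, 1, pos - 1), substr(s, pos + 1, nchar(s)))
}

s <- "LAND"
s <- insert_at(s, "E", 2)   # "LEAND"
s <- insert_at(s, "V", 4)   # "LEAVND"
s <- insert_at(s, "E", 5)   # "LEAVEND"
s <- replace_at(s, "S", 6)  # "LEAVESD" (N is replaced by S)
s <- delete_at(s, 7)        # "LEAVES"  (D is deleted)
s
```

Applied in this order, the five operations turn LAND into LEAVES.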

There are many possible alignments between two given strings string1 and string2, and a common problem is to find the one (or ones) with the highest score, i.e. with the lowest total cost in terms of basic operations.
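To make the idea of scoring concrete, here is a simplified base R sketch that scores two candidate alignments of LAND and LEAVES. The function score_alignment and its match, mismatch and gap values are assumptions chosen for illustration. pairwiseAlignment itself uses a substitution matrix and separate gap opening and extension penalties, which this sketch does not reproduce.

```r
# Hypothetical scoring scheme: fixed match/mismatch scores and a flat
# per-gap penalty. Not the scheme used by pairwiseAlignment.
score_alignment <- function(align1, align2, match = 1, mismatch = -1, gap = -2) {
  a <- strsplit(align1, "")[[1]]
  b <- strsplit(align2, "")[[1]]
  stopifnot(length(a) == length(b))  # aligned strings must have equal length
  is_gap <- a == "-" | b == "-"
  sum(ifelse(is_gap, gap, ifelse(a == b, match, mismatch)))
}

score_alignment("L-A--ND", "LEAVES-")  # -7
score_alignment("LAND--", "LEAVES")    # -6
```

Under this particular scheme the second alignment scores higher, which shows that the preferred alignment depends on the costs assigned to each basic operation.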

Accessor methods

In the code snippets below, x is an AlignedXStringSet or QualityAlignedXStringSet object.

unaligned(x): The original string.

aligned(x, degap = FALSE): If degap = FALSE, the "filled-with-gaps subsequence" representing the aligned substring. If degap = TRUE, the "gap-less subsequence" representing the aligned substring.

ranges(x): The bounds of the aligned substring.

start(x): The start of the aligned substring.

end(x): The end of the aligned substring.

width(x): The width of the aligned substring, ignoring gaps.

indel(x): The positions, in the form of an IRanges object, of the insertions or deletions (depending on what x represents).

nindel(x): A two-column matrix containing the length and sum of the widths for each of the elements returned by indel.

length(x): The length of aligned(x).

nchar(x): The number of characters in each element of aligned(x).

alphabet(x): Equivalent to alphabet(unaligned(x)).

as.character(x): Converts aligned(x) to a character vector.

toString(x): Equivalent to toString(as.character(x)).
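The accessors above can be tried on the same LAND/LEAVES alignment used in the Examples section. This sketch assumes the Biostrings version documented here is installed and that the default (global) alignment type is used; output is not shown because it depends on the installed version.

```r
library(Biostrings)  # assumes the Biostrings version documented on this page

pattern <- AAString("LAND")
subject <- AAString("LEAVES")
aln <- pairwiseAlignment(pattern, subject, substitutionMatrix = "BLOSUM50",
                         gapOpening = 3, gapExtension = 1)
x <- pattern(aln)              # AlignedXStringSet for the pattern side

ranges(x)                      # bounds of the aligned substring
c(start(x), end(x), width(x))  # the same bounds as plain integers
aligned(x)                     # aligned substring, with gaps
aligned(x, degap = TRUE)       # aligned substring, gaps removed
indel(x)                       # positions of the gaps
nindel(x)                      # number and total width of those gaps
alphabet(x)                    # same as alphabet(unaligned(x))
toString(x)                    # same as toString(as.character(x))
```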

Subsetting methods

x[i]: Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the selected elements.

rep(x, times): Returns a new AlignedXStringSet or QualityAlignedXStringSet object made of the repeated elements.
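A short sketch of the subsetting methods, assuming the Biostrings version documented here is installed. The three pattern strings are made up for illustration; aligning several patterns against one subject gives an object with one element per pattern.

```r
library(Biostrings)  # assumes the Biostrings version documented on this page

patterns <- AAStringSet(c("LAND", "LEAVE", "LEAD"))  # illustrative input
subject <- AAString("LEAVES")
aln <- pairwiseAlignment(patterns, subject, substitutionMatrix = "BLOSUM50",
                         gapOpening = 3, gapExtension = 1)
x <- pattern(aln)  # one aligned element per pattern

length(x)          # number of elements
x[2]               # keep only the second element
x[c(3, 1)]         # select and reorder elements
rep(x, 2)          # repeat the whole set twice
```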

Author(s)

P. Aboyoun

See Also

pairwiseAlignment, PairwiseAlignments-class, XStringSet-class

Examples

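## Align two short amino acid sequences (global alignment by default)
pattern <- AAString("LAND")
subject <- AAString("LEAVES")
nw1 <- pairwiseAlignment(pattern, subject, substitutionMatrix = "BLOSUM50",
                         gapOpening = 3, gapExtension = 1)

## Extract the AlignedXStringSet object for the pattern
alignedPattern <- pattern(nw1)
unaligned(alignedPattern)     # the original string
aligned(alignedPattern)       # the aligned substring, with gaps
as.character(alignedPattern)  # aligned(alignedPattern) as a character vector
nchar(alignedPattern)         # number of characters in aligned(alignedPattern)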

[Package Biostrings version 2.46.0 Index]