seqinfo {GenomeInfoDb} | R Documentation |
A set of generic functions for getting/setting/modifying the sequence information stored in an object.
seqinfo(x) seqinfo(x, new2old=NULL, pruning.mode=c("error", "coarse", "fine", "tidy")) <- value seqnames(x) seqnames(x) <- value seqlevels(x) seqlevels(x, pruning.mode=c("error", "coarse", "fine", "tidy")) <- value sortSeqlevels(x, X.is.sexchrom=NA) seqlevelsInUse(x) seqlevels0(x) seqlengths(x) seqlengths(x) <- value isCircular(x) isCircular(x) <- value genome(x) genome(x) <- value
x |
The object from/on which to get/set the sequence information. |
new2old |
The
If |
pruning.mode |
When some of the seqlevels to drop from
See the "B. DROP SEQLEVELS FROM A LIST-LIKE OBJECT" section in the examples below for an extensive illustration of these pruning modes. |
value |
Typically a Seqinfo object for the Either a named or unnamed character vector for the A vector containing the sequence information to store for the other setters. |
X.is.sexchrom |
A logical indicating whether X refers to the sexual chromosome
or to chromosome with Roman Numeral X. If |
The Seqinfo class plays a central role for the functions described in this man page because:
All these functions (except seqinfo
, seqlevelsInUse
,
and seqlevels0
) work on a Seqinfo object.
For classes that implement it, the seqinfo
getter should
return a Seqinfo object.
Default seqlevels
, seqlengths
, isCircular
,
and genome
getters and setters are provided.
By default, seqlevels(x)
does seqlevels(seqinfo(x))
,
seqlengths(x)
does seqlengths(seqinfo(x))
,
isCircular(x)
does isCircular(seqinfo(x))
,
and genome(x)
does genome(seqinfo(x))
.
So any class with a seqinfo
getter will have all the above
getters work out-of-the-box. If, in addition, the class defines
a seqinfo
setter, then all the corresponding setters will
also work out-of-the-box.
Examples of containers that have a seqinfo
getter and setter:
the GRanges and GRangesList
classes in the GenomicRanges package;
the SummarizedExperiment class in the
SummarizedExperiment package;
the GAlignments,
GAlignmentPairs,
and GAlignmentsList classes in the
GenomicAlignments package;
the TxDb class in the
GenomicFeatures package;
the BSgenome class in the
BSgenome package; etc...
The GenomicRanges package defines seqinfo
and
seqinfo<-
methods for these low-level data types:
List
, RangesList
and RangedData
. Those
objects do not have the means to formally store sequence
information. Thus, the wrappers simply store the Seqinfo
object within metadata(x)
. Initially, the metadata
is empty, so there is some effort to generate a reasonable
default Seqinfo
. The names of any List
are
taken as the seqnames
, and the universe
of
RangesList
or RangedData
is taken as the
genome
.
The full list of methods defined for a given generic can
be seen with e.g. showMethods("seqinfo")
or
showMethods("seqnames")
(for the getters),
and showMethods("seqinfo<-")
or showMethods("seqnames<-")
(for the setters aka replacement methods).
Please be aware that this shows only methods defined in packages
that are currently attached.
H. Pagès
The seqlevelsStyle generic getter and setter.
Seqinfo objects.
GRanges and GRangesList objects in the GenomicRanges package.
SummarizedExperiment objects in the SummarizedExperiment package.
GAlignments, GAlignmentPairs, and GAlignmentsList objects in the GenomicAlignments package.
TxDb objects in the GenomicFeatures package.
BSgenome objects in the BSgenome package.
seqlevels-wrappers for convenience wrappers to the
seqlevels
getter and setter.
rankSeqlevels
, on which sortSeqlevels
is
based.
## Finding overlaps, comparing, and matching operations between objects ## containing genomic ranges require the objects to have the same ## seqlevels, or they fail. So before one can perform these operations, ## it is often necessary to modify the seqlevels in some of the objects ## involved in the operation so that all the objects have the same ## seqlevels. This is typically done with the seqlevels() setter. It can ## rename, drop, add and reorder seqlevels of an object. Examples below ## show how to mofify the seqlevels of GRanges, GRangesList, and TxDb ## objects but the approach is the same for any object that has seqlevels. ## --------------------------------------------------------------------- ## A. MODIFY THE SEQLEVELS OF A GRanges OBJECT ## --------------------------------------------------------------------- library(GenomicRanges) gr <- GRanges(rep(c("chr2", "chr3", "chrM"), 2), IRanges(1:6, 10)) ## Add new seqlevels: seqlevels(gr) <- c("chr1", seqlevels(gr), "chr4") seqlevels(gr) seqlevelsInUse(gr) ## Reorder existing seqlevels: seqlevels(gr) <- rev(seqlevels(gr)) seqlevels(gr) ## Drop all unused seqlevels: seqlevels(gr) <- seqlevelsInUse(gr) ## Drop some seqlevels in use: seqlevels(gr, pruning.mode="coarse") <- setdiff(seqlevels(gr), "chr3") gr ## Rename, add, and reorder the seqlevels all at once: seqlevels(gr) <- c("chr1", chr2="chr2", chrM="Mitochondrion") seqlevels(gr) ## --------------------------------------------------------------------- ## B. DROP SEQLEVELS FROM A LIST-LIKE OBJECT ## --------------------------------------------------------------------- grl0 <- GRangesList(A=GRanges("chr2", IRanges(3:2, 5)), B=GRanges(c("chr2", "chrMT"), IRanges(7:6, 15)), C=GRanges(c("chrY", "chrMT"), IRanges(17:16, 25)), D=GRanges()) grl0 grl1 <- grl0 seqlevels(grl1, pruning.mode="coarse") <- c("chr2", "chr5") grl1 # grl0[[2]] was fully removed! (even if it had a range on chr2) ## If what is desired is to remove the 2nd range in grl0[[2]] only (i.e. ## the chrMT:6-15 range), or, more generally speaking, to remove the ## ranges within each list element that are located on the seqlevels to ## drop, then use pruning.mode="fine" or pruning.mode="tidy": grl2 <- grl0 seqlevels(grl2, pruning.mode="fine") <- c("chr2", "chr5") grl2 # grl0[[2]] not removed, but chrMT:6-15 range removed from it ## Like pruning.mode="fine" but also removes grl0[[3]]. grl3 <- grl0 seqlevels(grl3, pruning.mode="tidy") <- c("chr2", "chr5") grl3 library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene ## Pruning mode "coarse" is particularly well suited on a GRangesList ## object that contains exons grouped by transcript: ex_by_tx <- exonsBy(txdb, by="tx") seqlevels(ex_by_tx) seqlevels(ex_by_tx, pruning.mode="coarse") <- "chr2L" seqlevels(ex_by_tx) ## Pruning mode "tidy" is particularly well suited on a GRangesList ## object that contains transcripts grouped by gene: tx_by_gene <- transcriptsBy(txdb, by="gene") seqlevels(tx_by_gene) seqlevels(tx_by_gene, pruning.mode="tidy") <- "chr2L" seqlevels(tx_by_gene) ## --------------------------------------------------------------------- ## C. RENAME THE SEQLEVELS OF A TxDb OBJECT ## --------------------------------------------------------------------- library(TxDb.Dmelanogaster.UCSC.dm3.ensGene) txdb <- TxDb.Dmelanogaster.UCSC.dm3.ensGene seqlevels(txdb) seqlevels(txdb) <- sub("chr", "", seqlevels(txdb)) seqlevels(txdb) seqlevels(txdb) <- paste0("CH", seqlevels(txdb)) seqlevels(txdb) seqlevels(txdb)[seqlevels(txdb) == "CHM"] <- "M" seqlevels(txdb) ## Restore original seqlevels: seqlevels(txdb) <- seqlevels0(txdb) seqlevels(txdb) ## --------------------------------------------------------------------- ## D. SORT SEQLEVELS IN "NATURAL" ORDER ## --------------------------------------------------------------------- sortSeqlevels(c("11", "Y", "1", "10", "9", "M", "2")) seqlevels <- c("chrXI", "chrY", "chrI", "chrX", "chrIX", "chrM", "chrII") sortSeqlevels(seqlevels) sortSeqlevels(seqlevels, X.is.sexchrom=TRUE) sortSeqlevels(seqlevels, X.is.sexchrom=FALSE) seqlevels <- c("chr2RHet", "chr4", "chrUextra", "chrYHet", "chrM", "chrXHet", "chr2LHet", "chrU", "chr3L", "chr3R", "chr2R", "chrX") sortSeqlevels(seqlevels) gr <- GRanges() seqlevels(gr) <- seqlevels sortSeqlevels(gr) ## --------------------------------------------------------------------- ## E. SUBSET OBJECTS BY SEQLEVELS ## --------------------------------------------------------------------- tx <- transcripts(txdb) seqlevels(tx) ## Drop 'M', keep all others. seqlevels(tx, pruning.mode="coarse") <- seqlevels(tx)[seqlevels(tx) != "M"] seqlevels(tx) ## Drop all except 'ch3L' and 'ch3R'. seqlevels(tx, pruning.mode="coarse") <- c("ch3L", "ch3R") seqlevels(tx) ## --------------------------------------------------------------------- ## F. FINDING METHODS ## --------------------------------------------------------------------- showMethods("seqinfo") showMethods("seqinfo<-") showMethods("seqnames") showMethods("seqnames<-") showMethods("seqlevels") showMethods("seqlevels<-") if (interactive()) { library(GenomicRanges) ?`GRanges-class` }