Get an annotation data frame from org db packages
Source: R/gene_annotation.R
Arguments
- de_container
An object containing the data for a Differential Expression workflow (e.g. DESeq2, edgeR or limma). Currently, this can be a DESeqDataSet object, normally obtained after running your data through the DESeq2 framework.
- orgdb_package
Character string, named as the org.XX.eg.db package, which should be available in Bioconductor.
- id_type
Character, the ID type of the genes as in the row names of the de_container, to be used in the call to AnnotationDbi::mapIds().
- key_for_genenames
Character, corresponding to the column name for the key in the orgDb package containing the official gene name (often called gene symbol). This parameter defaults to "SYMBOL", but can be adjusted in case the key is not found in the annotation package (e.g. for org.Sc.sgd.db).
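The way id_type and key_for_genenames relate to AnnotationDbi::mapIds() can be illustrated with a minimal sketch. It is not the package's internal implementation: the three gene IDs, the variable names (gene_ids, gene_names, anno_sketch) and the choice of multiVals = "first" are assumptions made for illustration only.

```r
library("AnnotationDbi")
library("org.Hs.eg.db")

# Inspect which ID types and columns this annotation package offers,
# e.g. to confirm that "ENSEMBL" and "SYMBOL" are available.
keytypes(org.Hs.eg.db)
columns(org.Hs.eg.db)

# A few Ensembl gene IDs, chosen here only for illustration
gene_ids <- c("ENSG00000000003", "ENSG00000000005", "ENSG00000000419")

gene_names <- mapIds(
  org.Hs.eg.db,
  keys = gene_ids,
  column = "SYMBOL",    # plays the role of key_for_genenames
  keytype = "ENSEMBL",  # plays the role of id_type
  multiVals = "first"   # assumption: keep one name per ID if the mapping is 1:many
)

# Assemble a two-column data frame in the shape described under Value
anno_sketch <- data.frame(
  gene_id = gene_ids,
  gene_name = unname(gene_names),
  row.names = gene_ids,
  stringsAsFactors = FALSE
)
anno_sketch
```

If the column passed as key_for_genenames is not listed by columns() for a given annotation package, that is the situation in which the parameter would need to be adjusted.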
Value
A data frame to be used for annotation of genes, with the main information encoded in the gene_id and gene_name columns.
Examples
library("macrophage")
library("DESeq2")
library("org.Hs.eg.db")
# dds object
data(gse, package = "macrophage")
dds_macrophage <- DESeqDataSet(gse, design = ~ line + condition)
#> using counts and average transcript lengths from tximeta
rownames(dds_macrophage) <- substr(rownames(dds_macrophage), 1, 15)
anno_df <- get_annotation_orgdb(dds_macrophage, "org.Hs.eg.db", "ENSEMBL")
#> 'select()' returned 1:many mapping between keys and columns
head(anno_df)
#>                         gene_id gene_name
#> ENSG00000000003 ENSG00000000003    TSPAN6
#> ENSG00000000005 ENSG00000000005      TNMD
#> ENSG00000000419 ENSG00000000419      DPM1
#> ENSG00000000457 ENSG00000000457     SCYL3
#> ENSG00000000460 ENSG00000000460     FIRRM
#> ENSG00000000938 ENSG00000000938       FGR
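Once such a data frame is available, the gene_id and gene_name columns can be used for simple lookups. The following is a sketch in base R on a toy data frame: the first two rows are copied from the output above, while the third ID ("ENSG_HYPOTHETICAL") and the label column are made up here to show how an identifier without a gene name (NA) might be handled.

```r
# Toy annotation data frame; the last row is hypothetical and stands in
# for an identifier that could not be mapped to a gene name.
anno_toy <- data.frame(
  gene_id = c("ENSG00000000003", "ENSG00000000005", "ENSG_HYPOTHETICAL"),
  gene_name = c("TSPAN6", "TNMD", NA),
  stringsAsFactors = FALSE
)
rownames(anno_toy) <- anno_toy$gene_id

# Look up the gene name for a single identifier via the row names
anno_toy["ENSG00000000005", "gene_name"]

# Count identifiers that have no gene name
n_unmapped <- sum(is.na(anno_toy$gene_name))

# Build display labels that fall back to the identifier when the name is missing
anno_toy$label <- ifelse(
  is.na(anno_toy$gene_name),
  anno_toy$gene_id,
  anno_toy$gene_name
)
anno_toy
```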