Create input file from a DESeq2 object
data("example_data")
example_data
#> class: DESeqDataSet
#> dim: 58051 32
#> metadata(1): version
#> assays(4): counts mu H cooks
#> rownames(58051): DDX11L1 WASH7P ... AC213203.1 FAM231C.1
#> rowData names(58): baseMean baseVar ... deviance maxCooks
#> colnames(32): FLS_Pat1_IFNb FLS_Pat1_IFNg ... FLS_Pat4_TNFa
#> FLS_Pat4_untreated
#> colData names(4): Cell Pat Treatment sizeFactor
summary(example_data)
#> [1] "DESeqDataSet object of length 58051 with 58 metadata columns"
The DESeq2 object has the samples, named by patient and treatment condition, as colnames and the genes as rownames. We can extract a results table for each comparison against the untreated samples.
res <- results(example_data, contrast = c("Treatment", "IFNb", "untreated"))
head(res)
#> log2 fold change (MLE): Treatment IFNb vs untreated
#> Wald test p-value: Treatment IFNb vs untreated
#> DataFrame with 6 rows and 6 columns
#> baseMean log2FoldChange lfcSE stat pvalue padj
#> <numeric> <numeric> <numeric> <numeric> <numeric> <numeric>
#> DDX11L1 0.000000 NA NA NA NA NA
#> WASH7P 0.109984 0.776279 5.94781 0.130515 0.896159 NA
#> MIR6859-1 0.000000 NA NA NA NA NA
#> MIR1302-2 0.000000 NA NA NA NA NA
#> FAM138A 0.000000 NA NA NA NA NA
#> OR4G4P 0.000000 NA NA NA NA NA
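The contrast above names two levels of the Treatment column. As a minimal sketch of how the levels that can be compared against "untreated" might be listed, the snippet below uses a small stand-in data frame rather than the real colData(example_data). The sample names come from the output above; the Treatment values assigned to them, and the names col_data and treatment_levels, are assumptions made for illustration.

```r
# Stand-in for as.data.frame(colData(example_data)); only four samples,
# with Treatment values assumed from the sample names
col_data <- data.frame(
  Treatment = c("IFNb", "IFNg", "TNFa", "untreated"),
  row.names = c("FLS_Pat1_IFNb", "FLS_Pat1_IFNg",
                "FLS_Pat4_TNFa", "FLS_Pat4_untreated")
)

# Every treatment except the reference level is a candidate for the contrast
treatment_levels <- setdiff(unique(as.character(col_data$Treatment)), "untreated")
treatment_levels
```

On the real object, the same idea would presumably apply to colData(example_data)$Treatment.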
We can generate a comprehensive results dataframe by looping through each condition. We can optionally filter the results by log2 fold change (log2fold) and adjusted p-value (padj).
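library(DESeq2)   # results()
library(dplyr)    # %>%, select(), mutate(), rename()
library(stringr)  # str_glue()

condition_list <- c("IFNb", "IFNg", "IL17", "IL1b", "IL4")
counter <- 0
for (cond in condition_list) {
  counter <- counter + 1
  # Extract the contrast against untreated and keep the columns of interest
  res_i <- results(example_data, contrast = c("Treatment", cond, "untreated")) %>%
    as.data.frame() %>%
    tibble::rownames_to_column("gene") %>%
    dplyr::select(gene, log2FoldChange, padj) %>%
    dplyr::mutate(comparison = str_glue(cond, "_vs_untreated")) %>%
    dplyr::rename(log2fold = log2FoldChange)
  # Start the combined table on the first pass, then append to it
  if (counter == 1) {res <- res_i}
  if (counter > 1) {res <- rbind(res, res_i)}
}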
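The counter-based loop can also be written without tracking the first iteration. The following is only a sketch of that pattern in base R: get_results() is a hypothetical stand-in that returns toy values in place of the results() pipeline, and res_list and res_all are names introduced here.

```r
# Hypothetical stand-in for the results() pipeline; returns toy values
get_results <- function(cond) {
  data.frame(
    gene = c("geneA", "geneB"),
    log2fold = c(1.5, -0.2),
    padj = c(0.01, 0.8),
    comparison = paste0(cond, "_vs_untreated")
  )
}

condition_list <- c("IFNb", "IFNg", "IL17", "IL1b", "IL4")

# Build one table per condition, then stack them in a single call
res_list <- lapply(condition_list, get_results)
res_all <- do.call(rbind, res_list)

nrow(res_all)
unique(res_all$comparison)
```

Collecting the tables in a list and binding once avoids the special case for the first condition and the repeated copying that comes from growing a data frame inside a loop.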
# Filter significant results: drop NAs, keep |log2fold| > 1 and padj <= 0.05
res_df_signif <- res %>%
  dplyr::filter(!is.na(log2fold) & !is.na(padj) & abs(log2fold) > 1 & padj <= 0.05)
head(res_df_signif)
#> gene log2fold padj comparison
#> 1 MTND1P23 1.321531 4.332564e-02 IFNb_vs_untreated
#> 2 PLEKHN1 1.797689 3.147407e-05 IFNb_vs_untreated
#> 3 HES4 3.647470 1.744922e-05 IFNb_vs_untreated
#> 4 ISG15 6.963267 4.082580e-77 IFNb_vs_untreated
#> 5 TNFRSF14 1.094154 1.905473e-04 IFNb_vs_untreated
#> 6 IFI6 6.486454 1.722375e-70 IFNb_vs_untreated
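A quick way to check the filtered table is to count the significant genes per comparison and direction of change. The sketch below rebuilds only the six rows printed above as a toy res_df_signif, so the counts describe that excerpt and not the full result. The direction column and the signif_counts name are additions for illustration.

```r
# The six rows shown in the head() output above, rebuilt as a toy table
res_df_signif <- data.frame(
  gene = c("MTND1P23", "PLEKHN1", "HES4", "ISG15", "TNFRSF14", "IFI6"),
  log2fold = c(1.321531, 1.797689, 3.647470, 6.963267, 1.094154, 6.486454),
  padj = c(4.332564e-02, 3.147407e-05, 1.744922e-05,
           4.082580e-77, 1.905473e-04, 1.722375e-70),
  comparison = "IFNb_vs_untreated"
)

# Label each gene by the sign of its fold change
res_df_signif$direction <- ifelse(res_df_signif$log2fold > 0, "up", "down")

# Cross-tabulate comparisons against direction
signif_counts <- table(res_df_signif$comparison, res_df_signif$direction)
signif_counts
```

Applied to the full res_df_signif, the same table would have one row per comparison.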