Runs differential analyses and enrichment pipelines
cinaR(
matrix,
contrasts,
experiment.type = "ATAC-Seq",
DA.choice = 1,
DA.fdr.threshold = 0.05,
DA.lfc.threshold = 0,
comparison.scheme = "OVO",
save.DA.peaks = FALSE,
DA.peaks.path = NULL,
norm.method = "cpm",
filter.method = "custom",
library.threshold = 2,
cpm.threshold = 1,
TSS.threshold = 50000,
show.annotation.pie = FALSE,
reference.genome = NULL,
batch.correction = FALSE,
batch.information = NULL,
additional.covariates = NULL,
sv.number = NULL,
run.enrichment = TRUE,
enrichment.method = NULL,
enrichment.FDR.cutoff = 1,
background.genes.size = 20000,
geneset = NULL,
verbose = TRUE
)
either bed formatted consensus peak matrix (peaks by 3+samples) CHR, START, STOP and raw peak counts OR count matrix (genes by 1+samples).
user-defined contrasts for comparing samples
The type of experiment either set to "ATAC-Seq" or "RNA-Seq"
determines which pipeline to run: (1) edgeR, (2) limma-voom, (3) limma-trend, (4) DEseq2. Note: Use limma-trend if consensus peaks are already normalized, otherwise use other methods.
fdr cut-off for differential analyses
log-fold change cutoff for differential analyses
either one-vs-one (OVO) or one-vs-all (OVA) comparisons.
saves differentially accessible peaks to an excel file
the path which the excel file of the DA peaks will be saved, if not set it will be saved to current directory.
normalization method for consensus peaks
filtering method for low expressed peaks
number of libraries a peak occurs so that it is not filtered default set to 2
count per million threshold for not to filter a peak
Distance to transcription start site in base-pairs. Default set to 50,000.
shows the annotation pie chart produced with ChipSeeker
genome of interested species. It should be 'hg38', 'hg19' or 'mm10'.
logical, if set will run unsupervised batch correction via sva (default) or if the batch information is known `batch.information` argument should be provided by user.
character vector, given by user.
vector or data.frame, this parameter will be directly added to design matrix before running the differential analyses, therefore won't affect the batch corrections but adjust the results in down-stream analyses.
number of surrogate variables to be calculated using SVA, best left untouched.
logical, turns off enrichment pipeline
There are two methodologies for enrichment analyses, Hyper-geometric p-value (HPEA) or Geneset Enrichment Analyses (GSEA).
FDR cut-off for enriched terms, p-values are corrected by Benjamini-Hochberg procedure
number of background genes for hyper-geometric p-value calculations. Default is 20,000.
Pathways to be used in enrichment analyses. If not set vp2008 (Chaussabel, 2008) immune modules will be used. This can be set to any geneset using `read.gmt` function from `qusage` package. Different modules are available: https://www.gsea-msigdb.org/gsea/downloads.jsp.
prints messages through running the pipeline
returns differentially accessible peaks
# \donttest{
data(atac_seq_consensus_bm) # calls 'bed'
# a vector for comparing the examples
contrasts <- sapply(strsplit(colnames(bed), split = "-", fixed = TRUE),
function(x){x[1]})[4:25]
results <- cinaR(bed, contrasts, reference.genome = "mm10")
#> >> Experiment type: ATAC-Seq
#> >> Matrix is filtered!
#>
#> >> preparing features information... 2022-05-06 16:30:12
#> >> identifying nearest features... 2022-05-06 16:30:12
#> >> calculating distance from peak to TSS... 2022-05-06 16:30:13
#> >> assigning genomic annotation... 2022-05-06 16:30:13
#> >> assigning chromosome lengths 2022-05-06 16:30:27
#> >> done... 2022-05-06 16:30:27
#> >> Method: edgeR
#> FDR:0.05& abs(logFC)<0
#> >> Estimating dispersion...
#> >> Fitting GLM...
#> >> DA peaks are found!
#> >> No `geneset` is specified so immune modules (Chaussabel, 2008) will be used!
#> >> enrichment.method` is not selected. Hyper-geometric p-value (HPEA) will be used!
#> >> Mice gene symbols are converted to human symbols!
#> >> Enrichment results are ready...
#> >> Done!
# }