This function handles multiple quality control methods for single-cell RNA-seq data.
Usage
RunCellQC(
  srt,
  assay = "RNA",
  split.by = NULL,
  return_filtered = FALSE,
  qc_metrics = c("doublets", "outlier", "umi", "gene", "mito", "ribo", "ribo_mito_ratio",
    "species"),
  db_method = "scDblFinder",
  db_rate = NULL,
  db_coefficient = 0.01,
  outlier_threshold = c("log10_nCount:lower:2.5", "log10_nCount:higher:5",
    "log10_nFeature:lower:2.5", "log10_nFeature:higher:5", "featurecount_dist:lower:2.5"),
  outlier_n = 1,
  UMI_threshold = 3000,
  gene_threshold = 1000,
  mito_threshold = 20,
  mito_pattern = c("MT-", "Mt-", "mt-"),
  mito_gene = NULL,
  ribo_threshold = 50,
  ribo_pattern = c("RP[SL]\\d+\\w{0,1}\\d*$", "Rp[sl]\\d+\\w{0,1}\\d*$",
    "rp[sl]\\d+\\w{0,1}\\d*$"),
  ribo_gene = NULL,
  ribo_mito_ratio_range = c(1, Inf),
  species = NULL,
  species_gene_prefix = NULL,
  species_percent = 95,
  seed = 11
)Arguments
- srt
- A Seurat object. 
- assay
- The name of the assay to be used for doublet-calling. Default is "RNA". 
- split.by
- Name of the sample variable to split the Seurat object. Default is NULL. 
- return_filtered
- Logical indicating whether to return a cell-filtered Seurat object. Default is FALSE. 
- qc_metrics
- A character vector specifying the quality control metrics to be applied. Default is `c("doublets", "outlier", "umi", "gene", "mito", "ribo", "ribo_mito_ratio", "species")`. 
- db_method
- Doublet-calling methods used. Can be one of - scDblFinder,- Scrublet,- DoubletDetection,- scds_cxds,- scds_bcds,- scds_hybrid
- db_rate
- The expected doublet rate. By default this is assumed to be 1% per thousand cells captured (so 4% among 4000 thousand cells), which is appropriate for 10x datasets. 
- db_coefficient
- The coefficient used to calculate the doublet rate. Default is 0.01. Doublet rate is calculated as`ncol(srt) / 1000 * db_coefficient` 
- outlier_threshold
- A character vector specifying the outlier threshold. Default is `c("log10_nCount:lower:2.5", "log10_nCount:higher:5", "log10_nFeature:lower:2.5", "log10_nFeature:higher:5", "featurecount_dist:lower:2.5")`. See isOutlier. 
- outlier_n
- Minimum number of outlier metrics that meet the conditions for determining outlier cells. Default is 1. 
- UMI_threshold
- UMI number threshold. Cells that exceed this threshold will be considered as kept. Default is 3000. 
- gene_threshold
- Gene number threshold. Cells that exceed this threshold will be considered as kept. Default is 1000. 
- mito_threshold
- Percentage of UMI counts of mitochondrial genes. Cells that exceed this threshold will be considered as discarded. Default is 20. 
- mito_pattern
- Regex patterns to match the mitochondrial genes. Default is `c("MT-", "Mt-", "mt-")`. 
- mito_gene
- A defined mitochondrial genes. If features provided, will ignore the - mito_patternmatching. Default is- NULL.
- ribo_threshold
- Percentage of UMI counts of ribosomal genes. Cells that exceed this threshold will be considered as discarded. Default is 50. 
- ribo_pattern
- Regex patterns to match the ribosomal genes. Default is `c("RP[SL]\d+\w0,1\d*$", "Rp[sl]\d+\w0,1\d*$", "rp[sl]\d+\w0,1\d*$")`. 
- ribo_gene
- A defined ribosomal genes. If features provided, will ignore the - ribo_patternmatching. Default is- NULL.
- ribo_mito_ratio_range
- A numeric vector specifying the range of ribosomal/mitochondrial gene expression ratios for ribo_mito_ratio outlier cells. Default is c(1, Inf). 
- species
- Species used as the suffix of the QC metrics. The first is the species of interest. Default is - NULL.
- species_gene_prefix
- Species gene prefix used to calculate QC metrics for each species. Default is - NULL.
- species_percent
- Percentage of UMI counts of the first species. Cells that exceed this threshold will be considered as kept. Default is 95. 
- seed
- Set a random seed. Default is 11. 
Examples
data("pancreas_sub")
pancreas_sub <- RunCellQC(pancreas_sub)
#> Warning: The following arguments are not used: drop
#> Warning: The following arguments are not used: drop
#> Error in as.vector(data): no method for coercing this S4 class to a vector
CellStatPlot(
  srt = pancreas_sub,
  stat.by = c(
    "db_qc", "outlier_qc", "umi_qc", "gene_qc",
    "mito_qc", "ribo_qc", "ribo_mito_ratio_qc", "species_qc"
  ),
  plot_type = "upset", stat_level = "Fail"
)
#> Error in StatPlot(meta.data = meta.data, stat.by = stat.by, group.by = group.by,     split.by = split.by, bg.by = bg.by, flip = flip, NA_color = NA_color,     NA_stat = NA_stat, keep_empty = keep_empty, individual = individual,     stat_level = stat_level, plot_type = plot_type, stat_type = stat_type,     position = position, palette = palette, palcolor = palcolor,     alpha = alpha, bg_palette = bg_palette, bg_palcolor = bg_palcolor,     bg_alpha = bg_alpha, label = label, label.size = label.size,     label.fg = label.fg, label.bg = label.bg, label.bg.r = label.bg.r,     aspect.ratio = aspect.ratio, title = title, subtitle = subtitle,     xlab = xlab, ylab = ylab, legend.position = legend.position,     legend.direction = legend.direction, theme_use = theme_use,     theme_args = theme_args, combine = combine, nrow = nrow,     ncol = ncol, byrow = byrow, force = force, seed = seed): db_qc is not in the meta.data.
table(pancreas_sub$CellQC)
#> Error in x[[i, drop = TRUE]]: ‘CellQC’ not found in this Seurat object
#>  Did you mean "CellType"?
data("ifnb_sub")
ifnb_sub <- RunCellQC(ifnb_sub, split.by = "stim", UMI_threshold = 1000, gene_threshold = 550)
#> Warning: The following arguments are not used: drop
#> Warning: The following arguments are not used: drop
#> Error in as.vector(data): no method for coercing this S4 class to a vector
CellStatPlot(
  srt = ifnb_sub,
  stat.by = c(
    "db_qc", "outlier_qc", "umi_qc", "gene_qc",
    "mito_qc", "ribo_qc", "ribo_mito_ratio_qc", "species_qc"
  ),
  plot_type = "upset", stat_level = "Fail"
)
#> Error in StatPlot(meta.data = meta.data, stat.by = stat.by, group.by = group.by,     split.by = split.by, bg.by = bg.by, flip = flip, NA_color = NA_color,     NA_stat = NA_stat, keep_empty = keep_empty, individual = individual,     stat_level = stat_level, plot_type = plot_type, stat_type = stat_type,     position = position, palette = palette, palcolor = palcolor,     alpha = alpha, bg_palette = bg_palette, bg_palcolor = bg_palcolor,     bg_alpha = bg_alpha, label = label, label.size = label.size,     label.fg = label.fg, label.bg = label.bg, label.bg.r = label.bg.r,     aspect.ratio = aspect.ratio, title = title, subtitle = subtitle,     xlab = xlab, ylab = ylab, legend.position = legend.position,     legend.direction = legend.direction, theme_use = theme_use,     theme_args = theme_args, combine = combine, nrow = nrow,     ncol = ncol, byrow = byrow, force = force, seed = seed): db_qc is not in the meta.data.
table(ifnb_sub$CellQC)
#> Error in x[[i, drop = TRUE]]: ‘CellQC’ not found in this Seurat object
#>