scran_qc
Simple quality control on single-cell data
scran_qc::ChooseFilterThresholdsOptions Struct Reference

Options for choose_filter_thresholds().

#include <choose_filter_thresholds.hpp>

Public Attributes

bool lower = true

bool upper = true

double num_mads = 3

double min_diff = 0

bool log = false

Detailed Description

Member Data Documentation

◆ lower

bool scran_qc::ChooseFilterThresholdsOptions::lower = true

Should low values be considered as potential outliers? If false, no lower threshold is applied when defining outliers.

◆ upper

bool scran_qc::ChooseFilterThresholdsOptions::upper = true

Should high values be considered as potential outliers? If false, no upper threshold is applied when defining outliers.
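
The effect of the lower and upper flags can be illustrated with a small self-contained sketch. It does not call the library: SketchThresholds, restrict_sides() and is_outlier() are hypothetical names, and representing a disabled side by an infinite threshold is an assumption made for illustration.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for a pair of thresholds; not the library's result type.
struct SketchThresholds {
    double lower;
    double upper;
};

// Disable one side by pushing its threshold to infinity (an assumption of this sketch),
// so that no finite value can fail the comparison on that side.
inline SketchThresholds restrict_sides(double lower_candidate, double upper_candidate, bool lower, bool upper) {
    assert(lower_candidate <= upper_candidate);
    constexpr double inf = std::numeric_limits<double>::infinity();
    SketchThresholds out;
    out.lower = lower ? lower_candidate : -inf;
    out.upper = upper ? upper_candidate : inf;
    return out;
}

// A value is treated as an outlier if it falls outside [lower, upper].
inline bool is_outlier(double value, const SketchThresholds& t) {
    return value < t.lower || value > t.upper;
}
```

With lower = false, a very small value is never flagged; with upper = false, a very large value is never flagged.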

◆ num_mads

double scran_qc::ChooseFilterThresholdsOptions::num_mads = 3

Number of MADs from the median beyond which a value is considered an outlier. Larger values result in more relaxed thresholds. The default of 3 MADs is motivated by the low probability (less than 1%) of observing a value that far from the median in normally distributed data.
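
As a rough illustration of how num_mads widens the thresholds, the sketch below computes median ± num_mads × MAD for a vector of values. It is not the library's implementation: sketch_median(), sketch_mad() and sketch_thresholds() are hypothetical helpers, and scaling the MAD by 1.4826 (so that it estimates the standard deviation under normality) is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a copy of the input; a full sort is simple rather than optimal.
inline double sketch_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Median absolute deviation from the median, scaled by 1.4826 (assumed here)
// so that it is comparable to a standard deviation for normal data.
inline double sketch_mad(const std::vector<double>& x, double median) {
    std::vector<double> dev;
    dev.reserve(x.size());
    for (double v : x) {
        dev.push_back(std::abs(v - median));
    }
    return 1.4826 * sketch_median(std::move(dev));
}

// Returns {lower, upper} = median -/+ num_mads * MAD.
inline std::pair<double, double> sketch_thresholds(const std::vector<double>& x, double num_mads) {
    double med = sketch_median(x);
    double delta = num_mads * sketch_mad(x, med);
    return {med - delta, med + delta};
}
```

For the values 1, 2, 3, 4, 5, the median is 3 and the scaled MAD is 1.4826, so num_mads = 3 places the thresholds 4.4478 units either side of the median; a larger num_mads moves both thresholds further out.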

◆ min_diff

double scran_qc::ChooseFilterThresholdsOptions::min_diff = 0

Minimum difference from the median required to define an outlier. This enforces a more relaxed threshold in cases where the MAD may be too small. If ChooseFilterThresholdsOptions::log = true, this difference is interpreted on the natural log scale.
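
One plausible way to read min_diff is as a floor on the distance between the median and each threshold. The sketch below shows that reading; sketch_half_width() is a hypothetical helper, and combining the two quantities with a maximum is an assumption rather than a statement about the library's code.

```cpp
#include <algorithm>
#include <cassert>

// Distance from the median to each threshold: the MAD-based distance,
// but never smaller than min_diff (assumed combination rule).
inline double sketch_half_width(double mad, double num_mads, double min_diff) {
    assert(mad >= 0 && num_mads >= 0 && min_diff >= 0);
    return std::max(min_diff, num_mads * mad);
}
```

When most values are identical, the MAD is zero and any deviation from the median would otherwise be flagged; a positive min_diff keeps the thresholds at least that far from the median.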

◆ log

bool scran_qc::ChooseFilterThresholdsOptions::log = false

Whether to compute the median and MAD on the log-scale. If true, the threshold is calculated in log-space, and the log-transformation is reversed before the function returns; this ensures that the reported thresholds are on the original scale of the metrics and can be directly compared to the per-cell values of the metrics.

Using a log-transformation makes the outlier definition focus on the fold-change from the median. This is beneficial for right-skewed distributions of (mostly) positive values, where the log-transformation symmetrizes the distribution and makes it more normal-like, so that ChooseFilterThresholdsOptions::num_mads can be interpreted more meaningfully. When defining a lower threshold, the log-transformation also ensures that the threshold is always positive.
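
The log-scale behaviour described above can be sketched as follows: take logs, compute median and MAD, build the thresholds, then exponentiate. This is a minimal illustration under stated assumptions, not the library's code: log_sketch_median() and log_scale_thresholds() are hypothetical names, the 1.4826 scaling is assumed, and the sketch requires strictly positive inputs instead of handling zeros.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Median of a copy of the input.
inline double log_sketch_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Thresholds computed in log-space and returned on the original scale.
inline std::pair<double, double> log_scale_thresholds(const std::vector<double>& metrics, double num_mads) {
    std::vector<double> logged;
    logged.reserve(metrics.size());
    for (double v : metrics) {
        assert(v > 0); // this sketch does not handle zeros or negative values
        logged.push_back(std::log(v));
    }
    double med = log_sketch_median(logged);

    std::vector<double> dev;
    dev.reserve(logged.size());
    for (double v : logged) {
        dev.push_back(std::abs(v - med));
    }
    double mad = 1.4826 * log_sketch_median(dev); // scaling factor is an assumption

    double delta = num_mads * mad;
    // Reverse the transformation so the thresholds are comparable to the raw metrics.
    return {std::exp(med - delta), std::exp(med + delta)};
}
```

Because the thresholds are symmetric around the median in log-space, they sit at the same fold-change below and above the median on the original scale, and the lower threshold is positive by construction.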

Some caution is required for distributions with values close to zero, e.g., proportions. The conversion of near-zero values to large negative log-values can unexpectedly inflate the MAD. This could be mitigated by adding a pseudo-count prior to log-transformation, but a large pseudo-count would cause the log-transformation to converge to a linear transformation, rendering this option meaningless for distributions consisting of small values.
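
The remark about large pseudo-counts follows from log(x + c) = log(c) + log(1 + x/c), which is approximately log(c) + x/c when x is much smaller than c. The sketch below compares the two numerically; shifted_log() and linear_approx() are hypothetical helpers introduced only for this illustration.

```cpp
#include <cassert>
#include <cmath>

// Log-transformation after adding a pseudo-count.
inline double shifted_log(double x, double pseudo) {
    assert(x + pseudo > 0);
    return std::log(x + pseudo);
}

// First-order (linear in x) approximation of shifted_log(), accurate when x << pseudo.
inline double linear_approx(double x, double pseudo) {
    assert(pseudo > 0);
    return std::log(pseudo) + x / pseudo;
}
```

For a proportion such as 0.05, a pseudo-count of 100 makes the two functions nearly indistinguishable, so the transformation is effectively linear; with a pseudo-count of 0.001 they differ substantially and the log-transformation retains its effect.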


The documentation for this struct was generated from the following file: