scran_qc
Simple quality control on single-cell data
scran_qc::ChooseFilterThresholdsOptions Struct Reference

Options for choose_filter_thresholds().

#include <choose_filter_thresholds.hpp>

Public Attributes

bool lower = true
 
bool upper = true
 
double num_mads = 3
 
double min_diff = 0
 
bool log = false
 

Detailed Description

Member Data Documentation

◆ lower

bool scran_qc::ChooseFilterThresholdsOptions::lower = true

Should low values be considered as potential outliers? If false, no lower threshold is applied when defining outliers.

◆ upper

bool scran_qc::ChooseFilterThresholdsOptions::upper = true

Should high values be considered as potential outliers? If false, no upper threshold is applied when defining outliers.
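To illustrate how the two flags might interact with a pair of thresholds, here is a minimal sketch. It does not use the scran_qc API: `SidedOptions`, `SidedThresholds`, `sketch_apply_sides()` and `sketch_keep()` are hypothetical names, and representing a disabled side as an infinite threshold, as well as treating the bounds as inclusive, are assumptions made for this example.

```cpp
#include <cassert>
#include <limits>

// Hypothetical stand-in for the two flags documented above.
struct SidedOptions {
    bool lower = true;
    bool upper = true;
};

// Hypothetical result type: one threshold per side.
struct SidedThresholds {
    double lower;
    double upper;
};

// 'delta' is the allowed distance from the median on either side.
// Assumption: a disabled side is represented by an infinite threshold,
// so that no value can ever fall beyond it.
SidedThresholds sketch_apply_sides(double median, double delta, const SidedOptions& opt) {
    assert(delta >= 0);
    const double inf = std::numeric_limits<double>::infinity();
    return {
        opt.lower ? median - delta : -inf,
        opt.upper ? median + delta : inf
    };
}

// Assumption: values exactly on a threshold are retained.
bool sketch_keep(double value, const SidedThresholds& t) {
    return value >= t.lower && value <= t.upper;
}
```

With `lower = false` in this sketch, arbitrarily small values are retained and only large values can be flagged as outliers.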

◆ num_mads

double scran_qc::ChooseFilterThresholdsOptions::num_mads = 3

Number of MADs from the median to use to define outliers. Larger values result in more relaxed thresholds. The default of 3 MADs is motivated by the low probability (less than 1%) of observing a value that far from the median in normally distributed data.
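The effect of this setting can be sketched with a small standalone computation. This is not the scran_qc implementation: `sketch_median()`, `sketch_mad()` and `sketch_thresholds()` are hypothetical helpers, and scaling the MAD by 1.4826 (so that it estimates the standard deviation for normal data) is an assumption of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a copy of the input.
double sketch_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

// Hypothetical helper: median absolute deviation from 'median'.
// Assumption: scaled by 1.4826 to be comparable to a standard deviation.
double sketch_mad(const std::vector<double>& x, double median) {
    std::vector<double> dev;
    dev.reserve(x.size());
    for (double v : x) {
        dev.push_back(std::abs(v - median));
    }
    return 1.4826 * sketch_median(dev);
}

struct SketchThresholds {
    double lower;
    double upper;
};

// Thresholds placed 'num_mads' MADs on either side of the median.
SketchThresholds sketch_thresholds(const std::vector<double>& x, double num_mads) {
    const double med = sketch_median(x);
    const double mad = sketch_mad(x, med);
    return { med - num_mads * mad, med + num_mads * mad };
}
```

Raising `num_mads` in this sketch moves both thresholds further from the median, so fewer values are treated as outliers.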

◆ min_diff

double scran_qc::ChooseFilterThresholdsOptions::min_diff = 0

Minimum difference from the median to define outliers. This enforces a more relaxed threshold in cases where the MAD may be too small. If ChooseFilterThresholdsOptions::log = true, this difference is interpreted as a difference on the natural log-scale.
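One plausible reading of this option is that the distance from the median is never allowed to drop below `min_diff`. The sketch below shows that reading; `SketchBounds` and `sketch_apply_min_diff()` are hypothetical names, and taking the maximum of the two distances is an assumption rather than a statement about the library's code.

```cpp
#include <algorithm>
#include <cassert>

struct SketchBounds {
    double lower;
    double upper;
};

// Assumption: the distance from the median is the larger of
// 'num_mads * mad' and 'min_diff', so a tiny MAD cannot produce
// thresholds that hug the median.
SketchBounds sketch_apply_min_diff(double median, double mad, double num_mads, double min_diff) {
    assert(mad >= 0 && num_mads >= 0 && min_diff >= 0);
    const double delta = std::max(min_diff, num_mads * mad);
    return { median - delta, median + delta };
}
```

For example, with a MAD of zero (more than half of the values identical), `min_diff = 2` still leaves a band of width 2 on either side of the median in this sketch.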

◆ log

bool scran_qc::ChooseFilterThresholdsOptions::log = false

Whether the median and MAD should be computed on the log-scale, i.e., FindMedianMadOptions::log = true. (Or, for the overload that accepts a FindMedianMadResult, whether the median and MAD were already computed on the log-scale.)

Using a log-transformation instructs the outlier definition to focus on the fold-change from the median. This has several benefits for right-skewed distributions of (mostly) positive values, where the log-transformation symmetrizes the distribution and makes it more normal-like. As a result, ChooseFilterThresholdsOptions::num_mads can be interpreted more meaningfully. When defining a lower threshold, the log-transformation also ensures that the defined threshold is always positive.

Some caution is required for distributions close to zero, e.g., proportions. The conversion of near-zero values to large negative log-values can unexpectedly inflate the MAD. This could be mitigated by adding a pseudo-count prior to log-transformation, but a large pseudo-count would cause the log-transformation to converge to a linear transformation, rendering this option meaningless for distributions consisting of small values.

If this is true, the reported thresholds are still converted back to the original scale of the metrics.
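The log-scale behaviour described above can be sketched as follows. This is an illustration only and not the library's code: `log_sketch_median()` and `log_sketch_thresholds()` are hypothetical names, the MAD scaling factor of 1.4826 is an assumption, and the sketch requires strictly positive inputs rather than handling zeros.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of a copy of the input.
double log_sketch_median(std::vector<double> x) {
    assert(!x.empty());
    std::sort(x.begin(), x.end());
    const std::size_t n = x.size();
    return (n % 2 == 1) ? x[n / 2] : (x[n / 2 - 1] + x[n / 2]) / 2.0;
}

struct LogSketchThresholds {
    double lower;
    double upper;
};

// Compute the median and MAD of log(values), place the thresholds on the
// log-scale, then exponentiate to return to the original scale.
// Assumption: all values are strictly positive.
LogSketchThresholds log_sketch_thresholds(const std::vector<double>& values, double num_mads) {
    std::vector<double> logged;
    logged.reserve(values.size());
    for (double v : values) {
        assert(v > 0);
        logged.push_back(std::log(v));
    }

    const double med = log_sketch_median(logged);

    std::vector<double> dev;
    dev.reserve(logged.size());
    for (double l : logged) {
        dev.push_back(std::abs(l - med));
    }
    const double mad = 1.4826 * log_sketch_median(dev); // assumed scaling

    // exp() of any finite number is positive, so the lower threshold is too.
    return { std::exp(med - num_mads * mad), std::exp(med + num_mads * mad) };
}
```

In this sketch the two thresholds sit at the same fold-change below and above the back-transformed median, which is the sense in which the log-transformation focuses on fold-changes.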


The documentation for this struct was generated from the following file: