Options for center_size_factors() and center_size_factors_blocked().

#include <center_size_factors.hpp>

Public Attributes
CenterBlockMode	block_mode = CenterBlockMode::LOWEST

bool	ignore_invalid = true

Detailed Description

Options for center_size_factors() and center_size_factors_blocked().

Member Data Documentation

◆ block_mode

CenterBlockMode scran_norm::CenterSizeFactorsOptions::block_mode = CenterBlockMode::LOWEST

Strategy for handling blocks in center_size_factors_blocked().

With the PER_BLOCK strategy, size factors are scaled separately for each block so that they have a mean of 1 within each block. The scaled size factors are identical to those obtained by calling center_size_factors() separately on the size factors for each block. This can be desirable to ensure consistency with independent analyses of each block; otherwise, the centering of one block would depend on the size factors in other blocks. However, any systematic differences in the size factors between blocks are lost, i.e., systematic changes in coverage between blocks will not be normalized.
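The PER_BLOCK behaviour described above can be illustrated with a minimal standalone sketch. It does not call scran_norm; the function name center_per_block_sketch and its signature are hypothetical, and it assumes that all size factors are finite and positive and that block IDs are integers in [0, num_blocks).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; NOT part of scran_norm.
// Scales the size factors of each block so that their mean within that block is 1.
// Assumes every size factor is finite and positive, and that every entry of
// 'block' lies in [0, num_blocks).
inline void center_per_block_sketch(std::vector<double>& size_factors,
                                    const std::vector<int>& block,
                                    int num_blocks) {
    assert(size_factors.size() == block.size());

    // Accumulate the sum and count of size factors in each block.
    std::vector<double> sum(num_blocks, 0.0);
    std::vector<std::size_t> count(num_blocks);
    for (std::size_t i = 0; i < size_factors.size(); ++i) {
        sum[block[i]] += size_factors[i];
        ++count[block[i]];
    }

    // Divide each size factor by the mean of its own block.
    for (std::size_t i = 0; i < size_factors.size(); ++i) {
        const int b = block[i];
        size_factors[i] /= sum[b] / static_cast<double>(count[b]);
    }
}
```

For size factors {1, 3, 10, 30} with blocks {0, 0, 1, 1}, this sketch yields {0.5, 1.5, 0.5, 1.5}: each block now has a mean of 1, and the tenfold difference in coverage between the two blocks is no longer visible in the size factors.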

With the LOWEST strategy, we compute the mean size factor for each block and divide all size factors by the lowest of these means. In effect, normalization downscales every block to match the coverage of the lowest-coverage block. This is useful for datasets where coverage varies widely between blocks, as it avoids egregious upscaling of the low-coverage blocks. Strong upscaling allows the log-transformed values to escape the shrinkage from the pseudo-count. This is problematic as it inflates differences between cells at log-values derived from low counts, increasing noise and overstating log-fold changes. Downscaling is safer as it allows the pseudo-count to shrink the log-differences between cells towards zero at low counts. This sacrifices some information in the higher-coverage blocks so that they can be compared to the low-coverage blocks, which is preferable to exaggerating the informativeness of the latter for comparison to the former.
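As a rough illustration of the LOWEST calculation, the following standalone sketch computes per-block means and divides every size factor by the smallest one. It is not the library's implementation: the name center_lowest_sketch, its signature, and the decision to skip empty blocks are assumptions made for this example, and all size factors are assumed to be finite and positive.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; NOT part of scran_norm.
// Divides all size factors by the lowest per-block mean and returns that mean.
// Assumes every size factor is finite and positive, and that every entry of
// 'block' lies in [0, num_blocks). Blocks without any cells are skipped
// (an assumption of this sketch).
inline double center_lowest_sketch(std::vector<double>& size_factors,
                                   const std::vector<int>& block,
                                   int num_blocks) {
    assert(size_factors.size() == block.size());

    // Accumulate the sum and count of size factors in each block.
    std::vector<double> sum(num_blocks, 0.0);
    std::vector<std::size_t> count(num_blocks);
    for (std::size_t i = 0; i < size_factors.size(); ++i) {
        sum[block[i]] += size_factors[i];
        ++count[block[i]];
    }

    // Find the lowest mean across all non-empty blocks.
    double lowest = 0.0;
    bool found = false;
    for (int b = 0; b < num_blocks; ++b) {
        if (count[b] == 0) {
            continue;
        }
        const double mean = sum[b] / static_cast<double>(count[b]);
        if (!found || mean < lowest) {
            lowest = mean;
            found = true;
        }
    }

    // Apply the same divisor to every size factor, regardless of block.
    if (found) {
        for (double& sf : size_factors) {
            sf /= lowest;
        }
    }
    return lowest;
}
```

For size factors {1, 3, 10, 30} with blocks {0, 0, 1, 1}, the block means are 2 and 20, so every size factor is divided by 2 to give {0.5, 1.5, 5, 15}. The lowest-coverage block ends up with a mean of 1 while the other block keeps a mean of 10, so dividing counts by these size factors downscales the higher-coverage block rather than upscaling the lower-coverage one.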

◆ ignore_invalid

bool scran_norm::CenterSizeFactorsOptions::ignore_invalid = true

Whether to ignore invalid size factors when computing the mean size factor. Invalid size factors (infinite, NaN, or non-positive values) may occur in datasets that have not been properly filtered to remove low-quality cells. If such values might be present, we can check for and ignore them during the mean calculations.
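The kind of check implied here might look like the sketch below, which computes a mean over only the finite, positive size factors and reports how many values were skipped. This is a standalone illustration, not the library's code: the name mean_ignoring_invalid_sketch, its signature, and the choice to return 0 when no valid size factor exists are all assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; NOT part of scran_norm.
// Computes the mean of the valid (finite and positive) size factors.
// 'num_invalid' is set to the number of skipped values. The input is not
// modified, so any invalid values are still present afterwards.
inline double mean_ignoring_invalid_sketch(const std::vector<double>& size_factors,
                                           std::size_t& num_invalid) {
    double sum = 0.0;
    std::size_t num_valid = 0;
    num_invalid = 0;

    for (const double sf : size_factors) {
        // std::isfinite() is false for NaN and for +/- infinity.
        if (std::isfinite(sf) && sf > 0.0) {
            sum += sf;
            ++num_valid;
        } else {
            ++num_invalid;
        }
    }

    // Sanity check: every value is either valid or invalid.
    assert(num_valid + num_invalid == size_factors.size());

    // Returning 0 when nothing is valid is an assumption of this sketch.
    return num_valid > 0 ? sum / static_cast<double>(num_valid) : 0.0;
}
```

For the input {1, 3, 0, -2, NaN, infinity}, only 1 and 3 contribute, giving a mean of 2 with four values skipped. Without the check, the NaN and infinity would propagate into the mean and hence into every centered size factor.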

Note that this setting does not actually remove any of the invalid size factors. If these are present, users should call sanitize_size_factors() after centering. The diagnostics value in center_size_factors() and center_size_factors_blocked() can be used to determine whether such a call is necessary. (In general, sanitization should be performed after centering so that the replacement size factors do not interfere with the mean calculations.)

If users know that invalid size factors cannot be present, they can set this flag to false for greater efficiency.

The documentation for this struct was generated from the following file:

scran_norm/center_size_factors.hpp
