Options for center_size_factors_blocked().

#include <center_size_factors.hpp>

Public Attributes
bool	ignore_invalid = true

SizeFactorDiagnostics *	diagnostics = NULL

CenterBlockMode	block_mode = CenterBlockMode::LOWEST

std::optional< std::vector< double > >	custom_centers

bool	report_final = false

Detailed Description

Options for center_size_factors_blocked().

Member Data Documentation

◆ ignore_invalid

bool scran_norm::CenterSizeFactorsBlockedOptions::ignore_invalid = true

Whether to ignore invalid size factors when computing the mean size factor; see ComputeMeanSizeFactorOptions::ignore_invalid for details. Note that setting this option to true does not actually remove any of the invalid size factors; see the comments at CenterSizeFactorsOptions::ignore_invalid.
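The distinction between ignoring and removing can be illustrated with a minimal standalone sketch. It does not call scran_norm; the helper names (is_valid_factor, mean_ignoring_invalid, center_ignoring_invalid) are hypothetical, and the definition of "invalid" as non-finite or non-positive is an assumption made for illustration, as is the choice to divide the invalid entries by the mean along with the valid ones.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm.
// Assumption: a size factor is valid if it is finite and positive.
inline bool is_valid_factor(double x) {
    return std::isfinite(x) && x > 0;
}

// Mean over the valid entries only; returns 0 if there are none.
inline double mean_ignoring_invalid(const std::vector<double>& factors) {
    double sum = 0;
    std::size_t count = 0;
    for (double f : factors) {
        if (is_valid_factor(f)) {
            sum += f;
            ++count;
        }
    }
    return count > 0 ? sum / static_cast<double>(count) : 0.0;
}

// Center in place using the mean of the valid entries.
// Invalid entries are skipped in the mean but stay in the vector.
inline void center_ignoring_invalid(std::vector<double>& factors) {
    const double mean = mean_ignoring_invalid(factors);
    if (mean == 0) {
        return; // nothing sensible to divide by
    }
    for (double& f : factors) {
        f /= mean;
    }
}
```

In this sketch, an input of {1, 3, 0, -2} has a mean of 2 over its two valid entries, and the vector still holds four entries after centering.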

◆ diagnostics

SizeFactorDiagnostics* scran_norm::CenterSizeFactorsBlockedOptions::diagnostics = NULL

Pointer to diagnostics for invalid size factors, passed to ComputeMeanSizeFactorOptions::diagnostics. Ignored if CenterSizeFactorsBlockedOptions::ignore_invalid = false.

◆ block_mode

CenterBlockMode scran_norm::CenterSizeFactorsBlockedOptions::block_mode = CenterBlockMode::LOWEST

Strategy for handling blocks in center_size_factors_blocked().

With PER_BLOCK, size factors are scaled separately for each block so that they have a mean of 1 within each block. The scaled size factors are identical to those obtained by separate invocations of center_size_factors() on the size factors for each block. This can be desirable to ensure consistency with independent analyses of each block; otherwise, the centering would depend on the size factors in other blocks. However, any systematic differences in the size factors between blocks are lost, i.e., systematic changes in coverage between blocks will not be normalized.
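The arithmetic behind PER_BLOCK can be sketched as follows. This is a standalone illustration rather than the library's implementation; block_means and center_per_block are hypothetical names, and blocks are assumed to be encoded as integer indices in [0, num_blocks).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm: mean size factor per block.
// block[i] is the block index of cell i, assumed to lie in [0, num_blocks).
inline std::vector<double> block_means(const std::vector<double>& factors,
                                       const std::vector<int>& block,
                                       int num_blocks) {
    assert(factors.size() == block.size());
    std::vector<double> sums(num_blocks, 0.0);
    std::vector<std::size_t> counts(num_blocks, 0);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        sums[block[i]] += factors[i];
        ++counts[block[i]];
    }
    for (int b = 0; b < num_blocks; ++b) {
        if (counts[b] > 0) {
            sums[b] /= static_cast<double>(counts[b]);
        }
    }
    return sums;
}

// Sketch of PER_BLOCK: divide each size factor by the mean of its own block.
inline void center_per_block(std::vector<double>& factors,
                             const std::vector<int>& block,
                             int num_blocks) {
    const std::vector<double> means = block_means(factors, block, num_blocks);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        const double m = means[block[i]];
        if (m > 0) { // blocks with a zero mean are left untouched
            factors[i] /= m;
        }
    }
}
```

For size factors {1, 3, 10, 30} with block assignments {0, 0, 1, 1}, both blocks end up as {0.5, 1.5}: the tenfold difference in coverage between the two blocks is no longer visible in the centered size factors.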

With LOWEST, we compute the mean size factor for each block and divide all size factors in all blocks by the lowest of the per-block means. This normalization strategy downscales all blocks to match the coverage of the lowest-coverage block. It is useful for datasets with large differences in coverage between blocks as it avoids egregious upscaling of low-coverage blocks. Specifically, strong upscaling inflates the normalized values to the point where the pseudo-count no longer provides any meaningful shrinkage in the log-transformation. This is problematic as it inflates differences between cells at log-values derived from low counts, increasing noise and overstating log-fold changes. Downscaling is safer as it allows the pseudo-count to shrink the log-differences between cells towards zero at low counts. This effectively sacrifices some information in the higher-coverage blocks so that they can be compared to the low-coverage blocks, which is preferable to exaggerating the informativeness of the latter for comparison to the former.
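A minimal standalone sketch of the LOWEST arithmetic, together with the effect of the pseudo-count, is given here. It does not use scran_norm; block_means, center_lowest and log_normalize are hypothetical names. Two assumptions are made for illustration: only blocks with a positive mean are considered when finding the lowest mean, and the downstream log-transformation has the form log2(count / size_factor + pseudo_count).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm: mean size factor per block.
inline std::vector<double> block_means(const std::vector<double>& factors,
                                       const std::vector<int>& block,
                                       int num_blocks) {
    assert(factors.size() == block.size());
    std::vector<double> sums(num_blocks, 0.0);
    std::vector<std::size_t> counts(num_blocks, 0);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        sums[block[i]] += factors[i];
        ++counts[block[i]];
    }
    for (int b = 0; b < num_blocks; ++b) {
        if (counts[b] > 0) {
            sums[b] /= static_cast<double>(counts[b]);
        }
    }
    return sums;
}

// Sketch of LOWEST: divide every size factor by the smallest per-block mean.
// Assumption: blocks with a zero mean do not take part in the minimum.
inline void center_lowest(std::vector<double>& factors,
                          const std::vector<int>& block,
                          int num_blocks) {
    const std::vector<double> means = block_means(factors, block, num_blocks);
    double lowest = 0;
    for (double m : means) {
        if (m > 0 && (lowest == 0 || m < lowest)) {
            lowest = m;
        }
    }
    if (lowest == 0) {
        return; // no block has a positive mean
    }
    for (double& f : factors) {
        f /= lowest;
    }
}

// Assumed form of the log-transformation applied after normalization.
inline double log_normalize(double count, double size_factor,
                            double pseudo_count = 1.0) {
    return std::log2(count / size_factor + pseudo_count);
}
```

For size factors {1, 3, 10, 30} with block assignments {0, 0, 1, 1}, the per-block means are 2 and 20, so everything is divided by 2 to give {0.5, 1.5, 5, 15}. The per-block means become 1 and 10, so the difference in coverage between blocks is preserved. The second helper shows why upscaling is risky under the assumed transformation: a count of 1 versus 0 gives a log-difference of 1 at a size factor of 1, but more than 3 at a size factor of 0.125.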

With CUSTOM, size factors are scaled such that the mean for each block is equal to that specified in CenterSizeFactorsBlockedOptions::custom_centers. This is occasionally useful for ensuring that different sets of size factors are scaled to the same per-block mean, e.g., to ensure that average abundances are comparable between spike-in transcripts and endogenous genes in center_spike_in_factors_blocked().
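The CUSTOM scaling can be sketched in the same standalone style. The names block_means and center_custom are hypothetical, and the sketch assumes that the custom centers are supplied as one value per block, indexed by block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm: mean size factor per block.
inline std::vector<double> block_means(const std::vector<double>& factors,
                                       const std::vector<int>& block,
                                       int num_blocks) {
    assert(factors.size() == block.size());
    std::vector<double> sums(num_blocks, 0.0);
    std::vector<std::size_t> counts(num_blocks, 0);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        sums[block[i]] += factors[i];
        ++counts[block[i]];
    }
    for (int b = 0; b < num_blocks; ++b) {
        if (counts[b] > 0) {
            sums[b] /= static_cast<double>(counts[b]);
        }
    }
    return sums;
}

// Sketch of CUSTOM: rescale each block so that its mean equals centers[b].
// Assumption: 'centers' holds one target mean per block.
inline void center_custom(std::vector<double>& factors,
                          const std::vector<int>& block,
                          const std::vector<double>& centers) {
    const int num_blocks = static_cast<int>(centers.size());
    const std::vector<double> means = block_means(factors, block, num_blocks);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        const double m = means[block[i]];
        if (m > 0) { // blocks with a zero mean are left untouched
            factors[i] = factors[i] / m * centers[block[i]];
        }
    }
}
```

With size factors {1, 3, 10, 30}, block assignments {0, 0, 1, 1} and target means {1, 4}, the result is {0.5, 1.5, 2, 6}. Applying the same target means to a second set of size factors would give it identical per-block means, which is the property described above.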

In all cases, if the mean of the input size factors for any block is zero, no centering is attempted for that block.

◆ custom_centers

std::optional<std::vector<double> > scran_norm::CenterSizeFactorsBlockedOptions::custom_centers

Target mean of the size factors in each block after centering. Only used if CenterSizeFactorsBlockedOptions::block_mode = CenterBlockMode::CUSTOM.

◆ report_final

bool scran_norm::CenterSizeFactorsBlockedOptions::report_final = false

Whether to report the final mean of the size factors in each block, i.e., after centering. If false, the means of the input size factors are reported instead.
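The difference between the two reporting choices can be illustrated with a standalone sketch based on PER_BLOCK-style centering. How center_size_factors_blocked() actually hands the means back to the caller is not described in this section, so returning them as a vector is an assumption; block_means and center_per_block_reporting are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm: mean size factor per block.
inline std::vector<double> block_means(const std::vector<double>& factors,
                                       const std::vector<int>& block,
                                       int num_blocks) {
    assert(factors.size() == block.size());
    std::vector<double> sums(num_blocks, 0.0);
    std::vector<std::size_t> counts(num_blocks, 0);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        sums[block[i]] += factors[i];
        ++counts[block[i]];
    }
    for (int b = 0; b < num_blocks; ++b) {
        if (counts[b] > 0) {
            sums[b] /= static_cast<double>(counts[b]);
        }
    }
    return sums;
}

// Center each block to a mean of 1 and return the per-block means.
// report_final = false: means of the input size factors (before centering).
// report_final = true: means of the size factors after centering.
inline std::vector<double> center_per_block_reporting(
        std::vector<double>& factors,
        const std::vector<int>& block,
        int num_blocks,
        bool report_final) {
    const std::vector<double> input_means =
        block_means(factors, block, num_blocks);
    for (std::size_t i = 0; i < factors.size(); ++i) {
        const double m = input_means[block[i]];
        if (m > 0) {
            factors[i] /= m;
        }
    }
    return report_final ? block_means(factors, block, num_blocks)
                        : input_means;
}
```

For size factors {1, 3, 10, 30} with block assignments {0, 0, 1, 1}, this sketch returns {2, 20} when report_final is false and {1, 1} when it is true.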

The documentation for this struct was generated from the following file:

scran_norm/center_size_factors.hpp