scran_norm
Scaling normalization of single-cell data
Options for center_size_factors() and center_size_factors_blocked().
#include <center_size_factors.hpp>

Public Attributes
| bool | ignore_invalid = true |
| SizeFactorDiagnostics * | diagnostics = NULL |
| CenterBlockMode | block_mode = CenterBlockMode::LOWEST |
| std::optional< std::vector< double > > | custom_centers |
| bool | report_final = false |
| bool scran_norm::CenterSizeFactorsBlockedOptions::ignore_invalid = true |
Whether to ignore invalid size factors when computing the mean size factor, see ComputeMeanSizeFactorOptions::ignore_invalid for details.
Note that this setting does not actually remove any of the invalid size factors; see the comments at CenterSizeFactorsOptions::ignore_invalid.
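The distinction between ignoring and removing can be illustrated with a minimal standalone sketch. It does not call the library: the helper names are hypothetical, and the definition of "invalid" used here (non-finite or non-positive) is an assumption made for illustration only.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Assumption for this sketch: a size factor is valid if it is finite and positive.
inline bool is_valid_size_factor_sketch(double x) {
    return std::isfinite(x) && x > 0;
}

// Mean of the size factors, optionally skipping invalid entries.
inline double mean_size_factor_sketch(const std::vector<double>& sf, bool ignore_invalid) {
    double sum = 0;
    std::size_t count = 0;
    for (double x : sf) {
        if (ignore_invalid && !is_valid_size_factor_sketch(x)) {
            continue; // skipped in the mean only
        }
        sum += x;
        ++count;
    }
    return count ? sum / count : 0.0;
}

// Centers in place and returns the mean that was used.
// Invalid entries are excluded from the mean but stay in the vector.
inline double center_ignoring_invalid_sketch(std::vector<double>& sf) {
    const double mean = mean_size_factor_sketch(sf, true);
    if (mean > 0) {
        for (double& x : sf) {
            x /= mean;
        }
    }
    return mean;
}
```

In this sketch the output vector keeps its original length, so any invalid entries still need to be handled by the caller before the size factors are used for normalization.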
| SizeFactorDiagnostics* scran_norm::CenterSizeFactorsBlockedOptions::diagnostics = NULL |
Pointer to diagnostics for invalid size factors, passed to ComputeMeanSizeFactorOptions::diagnostics. Ignored if CenterSizeFactorsBlockedOptions::ignore_invalid = false.
| CenterBlockMode scran_norm::CenterSizeFactorsBlockedOptions::block_mode = CenterBlockMode::LOWEST |
Strategy for handling blocks in center_size_factors_blocked().
With the PER_BLOCK strategy, size factors are scaled separately for each block so that they have a mean of 1 within each block. The scaled size factors are identical to those obtained by separate invocations of center_size_factors() on the size factors for each block. This can be desirable to ensure consistency with independent analyses of each block - otherwise, the centering would depend on the size factors in other blocks. However, any systematic differences in the size factors between blocks are lost, i.e., systematic changes in coverage between blocks will not be normalized.
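As a rough illustration of the PER_BLOCK arithmetic, the following standalone sketch scales each block to a mean of 1. The function name and signature are hypothetical and are not the library's API; the sketch also assumes that all size factors are valid and that `block[i]` holds the block index of cell `i` in `[0, num_blocks)`.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm: per-block centering to a mean of 1.
inline void center_per_block_sketch(std::vector<double>& sf, const std::vector<int>& block, int num_blocks) {
    std::vector<double> sum(num_blocks, 0.0);
    std::vector<std::size_t> count(num_blocks, 0);
    for (std::size_t i = 0; i < sf.size(); ++i) {
        sum[block[i]] += sf[i];
        ++count[block[i]];
    }

    for (std::size_t i = 0; i < sf.size(); ++i) {
        const int b = block[i];
        const double mean = sum[b] / count[b]; // count[b] > 0 because cell i is in block b
        if (mean > 0) { // blocks with a zero mean are left untouched
            sf[i] /= mean;
        }
    }
}
```

For size factors `{1, 3, 10, 30}` with the first two cells in block 0 and the last two in block 1, both blocks end up as `{0.5, 1.5}`: the tenfold difference in coverage between the blocks is no longer visible in the size factors.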
With the LOWEST strategy, we compute the mean size factor for each block and divide all size factors in all blocks by the lowest of the per-block means. This downscales all blocks to match the coverage of the lowest-coverage block. It is useful for datasets with large differences in coverage between blocks, as it avoids egregious upscaling of low-coverage blocks. Specifically, strong upscaling allows the log-transformation to ignore any shrinkage from the pseudo-count. This is problematic as it inflates differences between cells at log-values derived from low counts, increasing noise and overstating log-fold changes. Downscaling is safer as it allows the pseudo-count to shrink the log-differences between cells towards zero at low counts. This effectively sacrifices some information in the higher-coverage blocks so that they can be compared to the low-coverage blocks, which is preferable to exaggerating the informativeness of the latter for comparison to the former.
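The LOWEST arithmetic can be sketched in a few lines of standalone C++. The function below is a hypothetical helper rather than the library's implementation. It assumes valid size factors, and it assumes that empty blocks and blocks with a zero mean are skipped when searching for the lowest mean.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm.
// Divides every size factor by the lowest per-block mean and returns that mean
// (or 0 if no block has a positive mean, in which case nothing is scaled).
inline double center_lowest_sketch(std::vector<double>& sf, const std::vector<int>& block, int num_blocks) {
    std::vector<double> sum(num_blocks, 0.0);
    std::vector<std::size_t> count(num_blocks, 0);
    for (std::size_t i = 0; i < sf.size(); ++i) {
        sum[block[i]] += sf[i];
        ++count[block[i]];
    }

    bool found = false;
    double lowest = 0;
    for (int b = 0; b < num_blocks; ++b) {
        if (count[b] == 0) {
            continue; // assumption: empty blocks do not contribute
        }
        const double mean = sum[b] / count[b];
        if (mean > 0 && (!found || mean < lowest)) {
            lowest = mean;
            found = true;
        }
    }

    if (found) {
        for (double& x : sf) {
            x /= lowest; // one common scaling for all blocks
        }
    }
    return lowest;
}
```

For size factors `{1, 3, 10, 30}` with two cells per block, the lowest per-block mean is 2 and the result is `{0.5, 1.5, 5, 15}`. The low-coverage block has a mean of 1 while the other block keeps a mean of 10, so its counts are scaled down during normalization rather than the low-coverage counts being scaled up.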
With the CUSTOM strategy, the size factors are scaled such that the mean for each block is equal to that specified in CenterSizeFactorsBlockedOptions::custom_centers. This is occasionally useful for ensuring that different sets of size factors are scaled to the same per-block mean, e.g., to ensure that average abundances are comparable between spike-in transcripts and endogenous genes.
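A minimal standalone sketch of the CUSTOM arithmetic follows. The function name is hypothetical and the code is not the library's implementation; it assumes valid size factors and one entry in `custom_centers` per block, indexed by the block identifier.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm.
// Rescales each block so that its mean equals custom_centers[b].
inline void center_custom_sketch(std::vector<double>& sf, const std::vector<int>& block, const std::vector<double>& custom_centers) {
    const std::size_t num_blocks = custom_centers.size();
    std::vector<double> sum(num_blocks, 0.0);
    std::vector<std::size_t> count(num_blocks, 0);
    for (std::size_t i = 0; i < sf.size(); ++i) {
        sum[block[i]] += sf[i];
        ++count[block[i]];
    }

    for (std::size_t i = 0; i < sf.size(); ++i) {
        const int b = block[i];
        const double mean = sum[b] / count[b];
        if (mean > 0) { // blocks with a zero mean are left untouched
            sf[i] = sf[i] / mean * custom_centers[b];
        }
    }
}
```

With size factors `{1, 3, 10, 30}`, two cells per block and centers `{1, 2}`, the result is `{0.5, 1.5, 1, 3}`. Applying the same centers to a second set of size factors for the same cells, e.g., one derived from spike-in transcripts, would give both sets identical per-block means.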
In all cases, if the mean of the input size factors for any block is zero, no centering is attempted for that block.
| std::optional<std::vector<double> > scran_norm::CenterSizeFactorsBlockedOptions::custom_centers |
Target mean of the size factors in each block after centering, used with the CUSTOM strategy in CenterSizeFactorsBlockedOptions::block_mode. Each mean should almost always be 1, to ensure that the normalized expression values are on roughly the same scale as the original counts. Nonetheless, expert users can change it to some non-unity value.
| bool scran_norm::CenterSizeFactorsBlockedOptions::report_final = false |
Whether to report the final mean of the size factors in each block, i.e., after centering. If false, the means of the input size factors are reported instead.
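The difference between the two reporting modes can be shown with a small standalone sketch. It uses per-block centering to a mean of 1 as the example strategy; the function name, signature and return type are hypothetical and do not describe how the library returns its results.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of scran_norm.
// Centers each block to a mean of 1 and returns the per-block means,
// computed either before centering (report_final = false) or after it (true).
inline std::vector<double> center_and_report_sketch(std::vector<double>& sf, const std::vector<int>& block, int num_blocks, bool report_final) {
    // Computes the current per-block means of 'sf'; empty blocks report 0.
    auto block_means = [&]() {
        std::vector<double> means(num_blocks, 0.0);
        std::vector<std::size_t> count(num_blocks, 0);
        for (std::size_t i = 0; i < sf.size(); ++i) {
            means[block[i]] += sf[i];
            ++count[block[i]];
        }
        for (int b = 0; b < num_blocks; ++b) {
            means[b] = count[b] ? means[b] / count[b] : 0.0;
        }
        return means;
    };

    const std::vector<double> input_means = block_means();
    for (std::size_t i = 0; i < sf.size(); ++i) {
        const double mean = input_means[block[i]];
        if (mean > 0) {
            sf[i] /= mean;
        }
    }

    return report_final ? block_means() : input_means;
}
```

For size factors `{1, 3, 10, 30}` with two cells per block, this sketch returns `{2, 20}` when `report_final` is false and `{1, 1}` when it is true.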