scran_blocks
Blocking utilities for libscran
scran_blocks Namespace Reference

Blocking utilities for libscran.

Classes

class  SingleQuantile
 Calculate a single quantile from a container.
 
class  SingleQuantileVariable
 Calculate a single quantile for containers of variable length.
 
struct  VariableWeightParameters
 Parameters for compute_variable_weight().
 

Enumerations

enum class  WeightPolicy : char { NONE , SIZE , VARIABLE , EQUAL }
 

Functions

double compute_variable_weight (const double s, const VariableWeightParameters &params)
 
template<typename Size_ , typename Weight_ >
void compute_weights (const std::size_t num_blocks, const Size_ *const sizes, const WeightPolicy policy, const VariableWeightParameters &variable, Weight_ *const weights)
 
template<typename Weight_ = double, typename Size_ >
std::vector< Weight_ > compute_weights (const std::vector< Size_ > &sizes, const WeightPolicy policy, const VariableWeightParameters &variable)
 
template<typename Stat_ , typename Output_ >
void parallel_means (const std::size_t n, std::vector< Stat_ * > in, Output_ *const out, const bool skip_nan)
 
template<typename Output_ = double, typename Stat_ >
std::vector< Output_ > parallel_means (const std::size_t n, std::vector< Stat_ * > in, const bool skip_nan)
 
template<typename Stat_ , typename Weight_ , typename Output_ >
void parallel_weighted_means (const std::size_t n, std::vector< Stat_ * > in, const Weight_ *const w, Output_ *const out, const bool skip_nan)
 
template<typename Output_ = double, typename Stat_ , typename Weight_ >
std::vector< Output_ > parallel_weighted_means (const std::size_t n, std::vector< Stat_ * > in, const Weight_ *const w, const bool skip_nan)
 
template<typename Stat_ , typename Output_ >
void parallel_quantiles (const std::size_t n, const std::vector< Stat_ * > &in, const double quantile, Output_ *const out, const bool skip_nan)
 
template<typename Output_ = double, typename Stat_ >
std::vector< Output_ > parallel_quantiles (const std::size_t n, const std::vector< Stat_ * > &in, const double quantile, const bool skip_nan)
 

Detailed Description

Blocking utilities for libscran.

Enumeration Type Documentation

◆ WeightPolicy

enum class scran_blocks::WeightPolicy : char
strong

Policy for weighting blocks based on their size, i.e., the number of cells in each block. This determines the nature of the weight calculations in compute_weights().

  • SIZE: blocks are weighted in proportion to their size. Larger blocks will contribute more to the weighted average.
  • EQUAL: each non-empty block is assigned equal weight, regardless of its size. Equivalent to averaging across non-empty blocks without weights.
  • VARIABLE: each block is weighted using the logic in compute_variable_weight(). This penalizes small blocks with unreliable statistics while equally weighting all large blocks.
  • NONE: a deprecated alias for SIZE.

Function Documentation

◆ compute_variable_weight()

double scran_blocks::compute_variable_weight ( const double s,
const VariableWeightParameters & params )
inline

Compute a variable weight for a block of cells, for use in computing a weighted average across blocks. The weight is calculated from the size of the block.

Blocks that are "large enough" (i.e., above the upper bound) are considered to be equally trustworthy and receive the same weight, ensuring that each block contributes equally to the weighted average. By comparison, very small blocks receive lower weight as their statistics are generally less stable.

Parameters
s: Size of the block, in terms of the number of cells in that block.
params: Parameters for the weight calculation, consisting of the lower and upper bounds.
Returns
Weight of the block, to use for computing a weighted average across blocks.
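
The behavior described above can be illustrated with a minimal, self-contained sketch. It does not use the library itself: `SketchVariableWeightParameters`, its field names, its default values, and `sketch_variable_weight()` are hypothetical stand-ins, and the linear ramp between the two bounds is an assumption, as the text above only states that small blocks receive lower weight and that blocks above the upper bound are weighted equally.

```cpp
#include <cassert>

// Hypothetical stand-in for VariableWeightParameters.
// Field names and defaults are assumptions made for this sketch.
struct SketchVariableWeightParameters {
    double lower_bound = 0;
    double upper_bound = 1000;
};

// Sketch of a size-dependent block weight:
// - empty blocks and blocks below the lower bound get zero weight,
// - blocks at or above the upper bound all get the same weight of 1,
// - blocks in between get a linearly increasing weight (assumed shape).
inline double sketch_variable_weight(const double s, const SketchVariableWeightParameters& params) {
    assert(params.lower_bound < params.upper_bound);
    if (s <= 0 || s < params.lower_bound) {
        return 0;
    }
    if (s >= params.upper_bound) {
        return 1;
    }
    return (s - params.lower_bound) / (params.upper_bound - params.lower_bound);
}
```

With bounds of 10 and 110 in this sketch, a block of 60 cells sits halfway up the ramp and gets half the weight of any block with 110 or more cells.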

◆ compute_weights() [1/2]

template<typename Size_ , typename Weight_ >
void scran_blocks::compute_weights ( const std::size_t num_blocks,
const Size_ *const sizes,
const WeightPolicy policy,
const VariableWeightParameters & variable,
Weight_ *const weights )

Compute weights for multiple blocks based on their size and the weighting policy. For variable weights, this function will call compute_variable_weight() for each block.

Weights should be interpreted as relative values within a single compute_weights() call, i.e., weights from different calls may not be comparable. They are typically used in functions like parallel_weighted_means() to compute a weighted average of statistics across blocks.

Template Parameters
Size_: Numeric type of the block size.
Weight_: Floating-point type of the output weights.
Parameters
num_blocks: Number of blocks.
[in] sizes: Pointer to an array of length num_blocks, containing the size of each block.
policy: Policy for weighting blocks of different sizes.
variable: Parameters for the variable block weights.
[out] weights: Pointer to an array of length num_blocks. On output, this is filled with the weight of each block.
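
To make the SIZE and EQUAL policies concrete, the following standalone sketch mimics the pointer-based interface with hypothetical names (`SketchPolicy`, `sketch_compute_weights()`). It is not the library's code, it omits the VARIABLE policy, and the zero weight for empty blocks under EQUAL is inferred from the "each non-empty block" wording in the WeightPolicy description.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical two-value policy used only in this sketch.
enum class SketchPolicy : char { SIZE, EQUAL };

// Fill weights[b] for each of the num_blocks blocks.
// SIZE: weight proportional to the block size.
// EQUAL: weight of 1 for every non-empty block, 0 for empty blocks (assumed).
template<typename Size_, typename Weight_>
void sketch_compute_weights(
    const std::size_t num_blocks,
    const Size_* const sizes,
    const SketchPolicy policy,
    Weight_* const weights)
{
    for (std::size_t b = 0; b < num_blocks; ++b) {
        if (policy == SketchPolicy::SIZE) {
            weights[b] = static_cast<Weight_>(sizes[b]);
        } else {
            weights[b] = (sizes[b] > 0 ? 1 : 0);
        }
    }
}
```

For block sizes of 0, 10 and 1000, this sketch yields weights of (0, 10, 1000) under SIZE and (0, 1, 1) under EQUAL. Only the ratios between weights matter when they are later used in a weighted average.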

◆ compute_weights() [2/2]

template<typename Weight_ = double, typename Size_ >
std::vector< Weight_ > scran_blocks::compute_weights ( const std::vector< Size_ > & sizes,
const WeightPolicy policy,
const VariableWeightParameters & variable )

A convenience overload for compute_weights() that accepts and returns vectors.

Template Parameters
Size_: Numeric type of the block size.
Weight_: Floating-point type of the output weights.
Parameters
sizes: Vector containing the size of each block.
policy: Policy for weighting blocks of different sizes.
variable: Parameters for the variable block weights.
Returns
Vector of block weights.

◆ parallel_means() [1/2]

template<typename Output_ = double, typename Stat_ >
std::vector< Output_ > scran_blocks::parallel_means ( const std::size_t n,
std::vector< Stat_ * > in,
const bool skip_nan )

Overload of parallel_means() that allocates an output vector of averaged values.

Template Parameters
Output_: Floating-point output type.
Stat_: Type of the input statistic, typically floating point.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the mean. If false, it is assumed that no NaNs are present.
Returns
Vector of length n, where the i-th element is the mean of (in.front()[i], in[1][i], ..., in.back()[i]).

◆ parallel_means() [2/2]

template<typename Stat_ , typename Output_ >
void scran_blocks::parallel_means ( const std::size_t n,
std::vector< Stat_ * > in,
Output_ *const out,
const bool skip_nan )

Mean of parallel elements across multiple arrays. This is equivalent to calling parallel_weighted_means() with equal weights for each array.

Template Parameters
Stat_: Type of the input statistic, typically floating point.
Output_: Floating-point output type.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
[out] out: Pointer to an output array of length n. On completion, out[i] is filled with the mean of (in.front()[i], in[1][i], ..., in.back()[i]).
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the mean. If false, it is assumed that no NaNs are present.
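
As an illustration of what "mean of parallel elements" means here, the sketch below averages the i-th element of every input array. `sketch_parallel_means()` is a hypothetical name and this is not the library's implementation. Returning NaN when every value at a position is skipped is an assumption of the sketch.

```cpp
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// For each position i, average in[0][i], in[1][i], ..., in.back()[i].
// If skip_nan is true, NaN values are dropped before averaging.
template<typename Stat_, typename Output_>
void sketch_parallel_means(
    const std::size_t n,
    const std::vector<Stat_*>& in,
    Output_* const out,
    const bool skip_nan)
{
    for (std::size_t i = 0; i < n; ++i) {
        Output_ total = 0;
        std::size_t count = 0;
        for (const auto ptr : in) {
            const double val = static_cast<double>(ptr[i]);
            if (skip_nan && std::isnan(val)) {
                continue;
            }
            total += static_cast<Output_>(val);
            ++count;
        }
        // Assumed convention: no usable values gives NaN.
        out[i] = (count > 0
            ? total / static_cast<Output_>(count)
            : std::numeric_limits<Output_>::quiet_NaN());
    }
}
```

Given the arrays (1, 2, NaN) and (3, NaN, NaN) with skip_nan set to true, the sketch fills the output with (2, 2, NaN): the second position averages only the single non-NaN value.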

◆ parallel_quantiles() [1/2]

template<typename Output_ = double, typename Stat_ >
std::vector< Output_ > scran_blocks::parallel_quantiles ( const std::size_t n,
const std::vector< Stat_ * > & in,
const double quantile,
const bool skip_nan )

Overload of parallel_quantiles() that allocates memory for the output array.

Template Parameters
Output_: Floating-point type of the output quantile.
Stat_: Type of the input statistic, typically floating point.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
quantile: Quantile to compute, in $[0, 1]$.
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the quantile. If false, it is assumed that no NaNs are present.
Returns
Vector of length n, where the i-th element is the quantile of (in.front()[i], in[1][i], ..., in.back()[i]).

◆ parallel_quantiles() [2/2]

template<typename Stat_ , typename Output_ >
void scran_blocks::parallel_quantiles ( const std::size_t n,
const std::vector< Stat_ * > & in,
const double quantile,
Output_ *const out,
const bool skip_nan )

Compute the quantile for parallel elements across multiple arrays. This can be used as an alternative to parallel_means() to summarize statistics across blocks, e.g., by computing the median with quantile = 0.5. The quantile is type 7, consistent with the default in R's quantile function.

Template Parameters
Stat_: Type of the input statistic, typically floating point.
Output_: Floating-point type of the output quantile.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
quantile: Quantile to compute, in $[0, 1]$.
[out] out: Pointer to an output array of length n. On completion, out[i] is filled with the quantile of (in.front()[i], in[1][i], ..., in.back()[i]).
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the quantile. If false, it is assumed that no NaNs are present.
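
For reference, a type 7 quantile sorts the values and interpolates linearly at position (N - 1) * quantile, where N is the number of values. The sketch below shows that calculation and applies it across parallel elements. The function names are hypothetical, the code is independent of the library, and it assumes that no NaNs are present (as with skip_nan set to false).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Type 7 quantile of a non-empty set of values (taken by copy so it can be sorted).
inline double sketch_quantile_type7(std::vector<double> values, const double quantile) {
    assert(!values.empty());
    assert(quantile >= 0 && quantile <= 1);
    std::sort(values.begin(), values.end());
    const double position = static_cast<double>(values.size() - 1) * quantile;
    const std::size_t lower = static_cast<std::size_t>(std::floor(position));
    if (lower + 1 >= values.size()) {
        return values.back(); // quantile == 1, or only one value.
    }
    const double fraction = position - static_cast<double>(lower);
    return values[lower] + fraction * (values[lower + 1] - values[lower]);
}

// For each position i, compute the quantile of in[0][i], ..., in.back()[i].
inline std::vector<double> sketch_parallel_quantiles(
    const std::size_t n,
    const std::vector<const double*>& in,
    const double quantile)
{
    std::vector<double> out(n);
    std::vector<double> buffer;
    for (std::size_t i = 0; i < n; ++i) {
        buffer.clear();
        for (const auto ptr : in) {
            buffer.push_back(ptr[i]);
        }
        out[i] = sketch_quantile_type7(buffer, quantile);
    }
    return out;
}
```

For the values (4, 1, 3, 2), this gives 2.5 at quantile = 0.5 and 1.75 at quantile = 0.25.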

◆ parallel_weighted_means() [1/2]

template<typename Output_ = double, typename Stat_ , typename Weight_ >
std::vector< Output_ > scran_blocks::parallel_weighted_means ( const std::size_t n,
std::vector< Stat_ * > in,
const Weight_ *const w,
const bool skip_nan )

Overload of parallel_weighted_means() that allocates an output vector of averaged values.

Template Parameters
Output_: Floating-point output type.
Weight_: Type of the weight, typically floating point.
Stat_: Type of the input statistic, typically floating point.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
[in] w: Pointer to an array of length equal to in.size(), containing the weight to use for each input array. Weights should be non-negative and finite.
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the mean. If false, it is assumed that no NaNs are present.
Returns
Vector of length n, where the i-th element is the weighted mean of (in.front()[i], in[1][i], ..., in.back()[i]).

◆ parallel_weighted_means() [2/2]

template<typename Stat_ , typename Weight_ , typename Output_ >
void scran_blocks::parallel_weighted_means ( const std::size_t n,
std::vector< Stat_ * > in,
const Weight_ *const w,
Output_ *const out,
const bool skip_nan )

Compute a weighted average of parallel elements across multiple arrays. For example, we can average statistics across blocks using weights computed with compute_weights().

Template Parameters
Stat_: Type of the input statistic, typically floating point.
Weight_: Type of the weight, typically floating point.
Output_: Floating-point output type.
Parameters
n: Length of each array.
[in] in: Vector of pointers to input arrays of length n.
[in] w: Pointer to an array of length equal to in.size(), containing the weight to use for each input array. Weights should be non-negative and finite.
[out] out: Pointer to an output array of length n. On completion, out[i] is filled with the weighted mean of (in.front()[i], in[1][i], ..., in.back()[i]).
skip_nan: Whether to check for NaNs. If true, NaNs are removed before computing the mean. If false, it is assumed that no NaNs are present.
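
The weighted average across parallel elements can be sketched as follows. `sketch_parallel_weighted_means()` is a hypothetical name and the code is not taken from the library. In this sketch, a skipped NaN also removes its weight from the denominator, and a position with zero total weight yields NaN. Both choices are assumptions.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// For each position i, compute sum_b(w[b] * in[b][i]) / sum_b(w[b]).
// If skip_nan is true, NaN values and their weights are excluded from both sums.
template<typename Stat_, typename Weight_, typename Output_>
void sketch_parallel_weighted_means(
    const std::size_t n,
    const std::vector<Stat_*>& in,
    const Weight_* const w,
    Output_* const out,
    const bool skip_nan)
{
    for (std::size_t i = 0; i < n; ++i) {
        Output_ total = 0;
        Output_ total_weight = 0;
        for (std::size_t b = 0; b < in.size(); ++b) {
            const double val = static_cast<double>(in[b][i]);
            if (skip_nan && std::isnan(val)) {
                continue;
            }
            total += static_cast<Output_>(val * static_cast<double>(w[b]));
            total_weight += static_cast<Output_>(w[b]);
        }
        // For floating-point Output_, zero total weight gives 0/0 = NaN.
        out[i] = total / total_weight;
    }
}
```

With arrays (1, 10) and (3, NaN), weights (1, 3), and skip_nan set to true, the first output is (1 * 1 + 3 * 3) / 4 = 2.5 and the second is 10, as only the first array contributes there.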