scran_aggregate
Aggregate expression values across cells
scran_aggregate Namespace Reference

Aggregate single-cell expression values.

Classes

struct AggregateAcrossCellsBuffers
 Buffers for aggregate_across_cells().

struct AggregateAcrossCellsOptions
 Options for aggregate_across_cells().

struct AggregateAcrossCellsResults
 Results of aggregate_across_cells().

struct AggregateAcrossGenesBuffers
 Buffers for aggregate_across_genes().

struct AggregateAcrossGenesOptions
 Options for aggregate_across_genes().

struct AggregateAcrossGenesResults
 Results of aggregate_across_genes().
 

Functions

template<typename Data_ , typename Index_ , typename Factor_ , typename Sum_ , typename Detected_ >
void aggregate_across_cells (const tatami::Matrix< Data_, Index_ > &input, const Factor_ *factor, const AggregateAcrossCellsBuffers< Sum_, Detected_ > &buffers, const AggregateAcrossCellsOptions &options)
 
template<typename Sum_ = double, typename Detected_ = int, typename Data_ , typename Index_ , typename Factor_ >
AggregateAcrossCellsResults< Sum_, Detected_ > aggregate_across_cells (const tatami::Matrix< Data_, Index_ > &input, const Factor_ *factor, const AggregateAcrossCellsOptions &options)
 
template<typename Data_ , typename Index_ , typename Gene_ , typename Weight_ , typename Sum_ >
void aggregate_across_genes (const tatami::Matrix< Data_, Index_ > &input, const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &gene_sets, const AggregateAcrossGenesBuffers< Sum_ > &buffers, const AggregateAcrossGenesOptions &options)
 
template<typename Sum_ = double, typename Data_ , typename Index_ , typename Gene_ , typename Weight_ >
AggregateAcrossGenesResults< Sum_ > aggregate_across_genes (const tatami::Matrix< Data_, Index_ > &input, const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &gene_sets, const AggregateAcrossGenesOptions &options)
 
template<typename Factor_ , typename Output_ >
std::vector< Factor_ > clean_factor (size_t n, const Factor_ *factor, Output_ *cleaned)
 
template<typename Factor_ , typename Combined_ >
std::vector< std::vector< Factor_ > > combine_factors (size_t n, const std::vector< const Factor_ * > &factors, Combined_ *combined)
 
template<typename Factor_ , typename Number_ , typename Combined_ >
std::vector< std::vector< Factor_ > > combine_factors_unused (size_t n, const std::vector< std::pair< const Factor_ *, Number_ > > &factors, Combined_ *combined)
 

Detailed Description

Aggregate single-cell expression values.

Function Documentation

◆ aggregate_across_cells() [1/2]

template<typename Data_ , typename Index_ , typename Factor_ , typename Sum_ , typename Detected_ >
void scran_aggregate::aggregate_across_cells ( const tatami::Matrix< Data_, Index_ > &  input,
const Factor_ *  factor,
const AggregateAcrossCellsBuffers< Sum_, Detected_ > &  buffers,
const AggregateAcrossCellsOptions &  options
)

Aggregate expression values across groups of cells for each gene. We report the sum of expression values and the number of cells with detected (i.e., positive) expression values in each group. This is typically used to create pseudo-bulk expression profiles for cluster/sample combinations. Expression values are generally expected to be counts, though the same function can be used to compute the average log-expression.

Template Parameters
    Data_      Type of data in the input matrix, should be numeric.
    Index_     Integer type of index in the input matrix.
    Factor_    Integer type of the factor.
    Sum_       Type of the sum, usually the same as Data_.
    Detected_  Type for the number of detected cells, usually integer.
Parameters
    input          The input matrix where rows are features and columns are cells.
    [in] factor    Grouping factor. This is a pointer to an array of length equal to the number of columns of input, containing the factor level (i.e., assigned group) for each cell. All levels should be integers in \([0, N)\) where \(N\) is the number of unique levels/groups.
    [out] buffers  Collection of buffers in which to store the aggregate statistics (e.g., sums, number of detected cells) for each level and gene.
    options        Further options.
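
The per-group statistics described above can be illustrated with a small standalone sketch. It is not the library's implementation: it assumes a dense row-major `std::vector<double>` in place of a `tatami::Matrix`, and `GroupStats` and `sketch_aggregate_across_cells` are hypothetical names introduced only for this illustration. The layout of the output (one vector per group, one entry per gene) is likewise an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for the per-group statistics; not part of scran_aggregate.
struct GroupStats {
    // sums[g][r] is the sum of row (gene) r over all cells in group g.
    std::vector<std::vector<double> > sums;
    // detected[g][r] is the number of cells in group g with a positive value in row r.
    std::vector<std::vector<int> > detected;
};

// 'values' is a dense row-major matrix with 'nrow' genes and 'ncol' cells.
// 'factor' has length 'ncol' and holds group assignments in [0, ngroups).
inline GroupStats sketch_aggregate_across_cells(
    const std::vector<double>& values,
    std::size_t nrow,
    std::size_t ncol,
    const std::vector<int>& factor,
    std::size_t ngroups)
{
    assert(values.size() == nrow * ncol);
    assert(factor.size() == ncol);

    GroupStats out;
    out.sums.assign(ngroups, std::vector<double>(nrow, 0.0));
    out.detected.assign(ngroups, std::vector<int>(nrow, 0));

    for (std::size_t r = 0; r < nrow; ++r) {
        for (std::size_t c = 0; c < ncol; ++c) {
            const double x = values[r * ncol + c];
            const std::size_t g = static_cast<std::size_t>(factor[c]);
            assert(g < ngroups);
            out.sums[g][r] += x;      // accumulate the group sum
            if (x > 0) {
                ++out.detected[g][r]; // count cells with positive expression
            }
        }
    }
    return out;
}
```

Dividing each sum by the number of cells in its group would give a per-group average, which is how log-expression values could be averaged with the same kind of aggregation.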

◆ aggregate_across_cells() [2/2]

template<typename Sum_ = double, typename Detected_ = int, typename Data_ , typename Index_ , typename Factor_ >
AggregateAcrossCellsResults< Sum_, Detected_ > scran_aggregate::aggregate_across_cells ( const tatami::Matrix< Data_, Index_ > &  input,
const Factor_ *  factor,
const AggregateAcrossCellsOptions &  options
)

Overload of aggregate_across_cells() that allocates memory for the results.

Template Parameters
    Sum_       Type of the sum, should be numeric.
    Detected_  Type for the number of detected cells, usually integer.
    Data_      Type of data in the input matrix, should be numeric.
    Index_     Integer type of index in the input matrix.
    Factor_    Integer type of the factor.
Parameters
    input        The input matrix where rows are features and columns are cells.
    [in] factor  Grouping factor. This is a pointer to an array of length equal to the number of columns of input, containing the factor level (i.e., assigned group) for each cell. All levels should be integers in \([0, N)\) where \(N\) is the number of unique levels/groups.
    options      Further options.
Returns
Results of the aggregation, where the available statistics depend on AggregateAcrossCellsOptions.

◆ aggregate_across_genes() [1/2]

template<typename Data_ , typename Index_ , typename Gene_ , typename Weight_ , typename Sum_ >
void scran_aggregate::aggregate_across_genes ( const tatami::Matrix< Data_, Index_ > &  input,
const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &  gene_sets,
const AggregateAcrossGenesBuffers< Sum_ > &  buffers,
const AggregateAcrossGenesOptions &  options
)

Aggregate expression values across gene sets for each cell. This is used to compute a sum/mean of expression values for one or more gene sets/signatures. Each gene in the set can also be weighted, e.g., to account for the strength of regulatory relationships.

Template Parameters
    Data_    Type of data in the input matrix, should be numeric.
    Index_   Integer type of index in the input matrix.
    Gene_    Integer type for the indices of genes in each set.
    Weight_  Floating-point type for the weights of genes in each set.
    Sum_     Floating-point type of the sum.
Parameters
    input          The input matrix where rows are features and columns are cells.
    gene_sets      Vector of gene sets. Each tuple corresponds to a set and contains (i) the number of genes in the set, (ii) a pointer to the row indices of the genes in the set, and (iii) a pointer to the weights of the genes in the set. The weight pointer may be NULL, in which case all weights are set to 1.
    [out] buffers  Collection of buffers in which to store the aggregate statistics (e.g., sums) for each gene set and cell.
    options        Further options.
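
The tuple layout of `gene_sets` and the NULL-weight convention can be illustrated with the following standalone sketch. It is not the library's implementation: it assumes a dense row-major `std::vector<double>` in place of a `tatami::Matrix`, computes only the (weighted) sum, and `sketch_aggregate_across_genes` is a hypothetical name used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// 'values' is a dense row-major matrix with 'nrow' genes and 'ncol' cells.
// Each gene set is (number of genes, pointer to row indices, pointer to weights or nullptr).
// Returns one vector per gene set, each of length 'ncol'.
inline std::vector<std::vector<double> > sketch_aggregate_across_genes(
    const std::vector<double>& values,
    std::size_t nrow,
    std::size_t ncol,
    const std::vector<std::tuple<std::size_t, const int*, const double*> >& gene_sets)
{
    assert(values.size() == nrow * ncol);

    std::vector<std::vector<double> > sums;
    sums.reserve(gene_sets.size());

    for (const auto& gs : gene_sets) {
        const std::size_t num = std::get<0>(gs);
        const int* genes = std::get<1>(gs);
        const double* weights = std::get<2>(gs);

        std::vector<double> current(ncol, 0.0);
        for (std::size_t i = 0; i < num; ++i) {
            const std::size_t r = static_cast<std::size_t>(genes[i]);
            assert(r < nrow);
            // A null weight pointer means that every gene has a weight of 1.
            const double w = (weights == nullptr ? 1.0 : weights[i]);
            for (std::size_t c = 0; c < ncol; ++c) {
                current[c] += w * values[r * ncol + c];
            }
        }
        sums.push_back(std::move(current));
    }
    return sums;
}
```

For a 3-by-2 matrix with rows (1, 2), (3, 4) and (5, 6), an unweighted set of rows 0 and 2 gives per-cell sums of 6 and 8. A mean would additionally need a normalization step, which is left out here because its exact definition is not stated in this page.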

◆ aggregate_across_genes() [2/2]

template<typename Sum_ = double, typename Data_ , typename Index_ , typename Gene_ , typename Weight_ >
AggregateAcrossGenesResults< Sum_ > scran_aggregate::aggregate_across_genes ( const tatami::Matrix< Data_, Index_ > &  input,
const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &  gene_sets,
const AggregateAcrossGenesOptions &  options
)

Overload of aggregate_across_genes() that allocates memory for the results.

Template Parameters
    Sum_     Floating-point type of the sum.
    Data_    Type of data in the input matrix, should be numeric.
    Index_   Integer type of index in the input matrix.
    Gene_    Integer type for the indices of genes in each set.
    Weight_  Floating-point type for the weights of genes in each set.
Parameters
    input      The input matrix where rows are features and columns are cells.
    gene_sets  Vector of gene sets. Each tuple corresponds to a set and contains (i) the number of genes in the set, (ii) a pointer to the row indices of the genes in the set, and (iii) a pointer to the weights of the genes in the set. The weight pointer may be NULL, in which case all weights are set to 1.
    options    Further options.
Returns
Results of the aggregation.

◆ clean_factor()

template<typename Factor_ , typename Output_ >
std::vector< Factor_ > scran_aggregate::clean_factor ( size_t  n,
const Factor_ *  factor,
Output_ *  cleaned 
)

Clean up a categorical factor by removing unused levels. This yields the same results as combine_factors() with a single factor.

Template Parameters
    Factor_  Factor type. Any type may be used here as long as it is hashable and has an equality operator.
    Output_  Integer type for the cleaned factor.
Parameters
    n              Number of observations (i.e., cells).
    [in] factor    Pointer to an array of length n containing a factor.
    [out] cleaned  Pointer to an array of length n in which the cleaned factor is to be stored. All values are integers in \([0, N)\) where \(N\) is the length of the output vector; all integers in this range are guaranteed to be present at least once in cleaned.
Returns
A sorted vector of the original levels that were observed at least once in factor. For any observation i, it is guaranteed that output[cleaned[i]] == factor[i].
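
The relationship between the returned levels and `cleaned` can be illustrated with a standalone sketch. It is not the library's implementation: `sketch_clean_factor` is a hypothetical name, and for simplicity the sketch assumes that `Factor_` supports `operator<` and `operator==` instead of relying on hashing.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the sorted unique levels of 'factor' and fills 'cleaned' with
// the position of each observation's level in that sorted vector.
template<typename Factor_, typename Output_>
std::vector<Factor_> sketch_clean_factor(std::size_t n, const Factor_* factor, Output_* cleaned) {
    // Collect the observed levels, then sort and deduplicate them.
    std::vector<Factor_> levels(factor, factor + n);
    std::sort(levels.begin(), levels.end());
    levels.erase(std::unique(levels.begin(), levels.end()), levels.end());

    // Map each observation to the index of its level via binary search.
    for (std::size_t i = 0; i < n; ++i) {
        auto it = std::lower_bound(levels.begin(), levels.end(), factor[i]);
        assert(it != levels.end());
        cleaned[i] = static_cast<Output_>(it - levels.begin());
    }
    return levels;
}
```

For an input factor of (7, 3, 7, 10, 3), this sketch returns the levels (3, 7, 10) and sets `cleaned` to (1, 0, 1, 2, 0), so that `levels[cleaned[i]] == factor[i]` holds for every observation.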

◆ combine_factors()

template<typename Factor_ , typename Combined_ >
std::vector< std::vector< Factor_ > > scran_aggregate::combine_factors ( size_t  n,
const std::vector< const Factor_ * > &  factors,
Combined_ *  combined 
)

Combine multiple categorical factors into a single factor, where each level of the combined factor corresponds to a unique observed combination of levels from the input factors.

Template Parameters
    Factor_    Factor type. Any type may be used here as long as it implements the comparison operators.
    Combined_  Integer type for the combined factor. This should be large enough to hold the number of unique combinations.
Parameters
    n               Number of observations (i.e., cells).
    [in] factors    Vector of pointers to arrays of length n, each containing a different factor.
    [out] combined  Pointer to an array of length n in which the combined factor is to be stored. On output, each entry determines the corresponding observation's combination of levels by indexing into the inner vectors of the returned object, i.e., j := combined[i] represents the combination (output[0][j], output[1][j], ...).
Returns
Vector of vectors containing each unique combination of factor levels. Each inner vector corresponds to a factor in factors, and all inner vectors have the same length. Corresponding entries of the inner vectors define a particular combination of levels, i.e., the first combination is defined as (output[0][0], output[1][0], ...), the second combination is defined as (output[0][1], output[1][1], ...), and so on. Combinations are guaranteed to be sorted by the first factor, then the second, etc.
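
The structure of the returned object and its link to `combined` can be illustrated with a standalone sketch. It is not the library's implementation: `sketch_combine_factors` is a hypothetical name, and the approach (sorting copies of each observation's combination) is chosen for clarity, not efficiency.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Fills 'combined' with the index of each observation's combination of levels,
// and returns the unique combinations as one inner vector per input factor.
template<typename Factor_, typename Combined_>
std::vector<std::vector<Factor_> > sketch_combine_factors(
    std::size_t n,
    const std::vector<const Factor_*>& factors,
    Combined_* combined)
{
    const std::size_t nfac = factors.size();

    // Gather the combination of levels for each observation.
    std::vector<std::vector<Factor_> > combos(n, std::vector<Factor_>(nfac));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t f = 0; f < nfac; ++f) {
            combos[i][f] = factors[f][i];
        }
    }

    // Lexicographic sort orders by the first factor, then the second, and so on.
    std::vector<std::vector<Factor_> > uniq(combos);
    std::sort(uniq.begin(), uniq.end());
    uniq.erase(std::unique(uniq.begin(), uniq.end()), uniq.end());

    // Transpose so that output[f][j] is the level of factor f in combination j.
    std::vector<std::vector<Factor_> > output(nfac);
    for (std::size_t f = 0; f < nfac; ++f) {
        output[f].reserve(uniq.size());
        for (const auto& u : uniq) {
            output[f].push_back(u[f]);
        }
    }

    // Look up the index of each observation's combination.
    for (std::size_t i = 0; i < n; ++i) {
        auto it = std::lower_bound(uniq.begin(), uniq.end(), combos[i]);
        assert(it != uniq.end());
        combined[i] = static_cast<Combined_>(it - uniq.begin());
    }
    return output;
}
```

With two factors (1, 0, 1, 0, 1) and (5, 5, 6, 5, 6), the unique combinations are (0, 5), (1, 5) and (1, 6), so the sketch returns inner vectors (0, 1, 1) and (5, 5, 6) and sets `combined` to (1, 0, 2, 0, 2).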

◆ combine_factors_unused()

template<typename Factor_ , typename Number_ , typename Combined_ >
std::vector< std::vector< Factor_ > > scran_aggregate::combine_factors_unused ( size_t  n,
const std::vector< std::pair< const Factor_ *, Number_ > > &  factors,
Combined_ *  combined 
)

This function is a variation of combine_factors() that considers unobserved combinations of factor levels.

Template Parameters
    Factor_    Factor type. Any type may be used here as long as it is comparable.
    Number_    Integer type for the number of levels in each factor.
    Combined_  Integer type for the combined factor. This should be large enough to hold the number of unique (possibly unused) combinations.
Parameters
    n               Number of observations (i.e., cells).
    [in] factors    Vector of pairs, each of which corresponds to a factor. The first element of the pair is a pointer to an array of length n, containing the factor level for each observation. The second element is the total number of levels for this factor, which may be greater than the largest observed level.
    [out] combined  Pointer to an array of length n in which the combined factor is to be stored. On output, each entry determines the corresponding observation's combination of levels by indexing into the inner vectors of the returned object; see the argument of the same name in combine_factors() for more details.
Returns
Vector of vectors containing each unique combination of factor levels. This has the same structure as the output of combine_factors(), with the only difference being that unobserved combinations are also reported.
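
The effect of reporting unobserved combinations can be illustrated with a standalone sketch. It is not the library's implementation: `sketch_combine_factors_unused` is a hypothetical name, and the sketch assumes that the levels of each factor are integers in the range from 0 up to (but excluding) the stated number of levels, and that combinations are ordered by the first factor, then the second, as in combine_factors().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Enumerates every possible combination of levels (observed or not) and fills
// 'combined' with the index of each observation's combination.
template<typename Factor_, typename Number_, typename Combined_>
std::vector<std::vector<Factor_> > sketch_combine_factors_unused(
    std::size_t n,
    const std::vector<std::pair<const Factor_*, Number_> >& factors,
    Combined_* combined)
{
    const std::size_t nfac = factors.size();

    // Total number of combinations is the product of the numbers of levels.
    std::size_t ncombos = 1;
    for (const auto& f : factors) {
        assert(f.second > 0);
        ncombos *= static_cast<std::size_t>(f.second);
    }

    std::vector<std::vector<Factor_> > output(nfac, std::vector<Factor_>(ncombos));
    std::fill_n(combined, n, static_cast<Combined_>(0));

    // Treat each combination as a mixed-radix number in which the first factor
    // is the most significant digit, so combinations are sorted by it first.
    std::size_t stride = ncombos;
    for (std::size_t f = 0; f < nfac; ++f) {
        const std::size_t nlevels = static_cast<std::size_t>(factors[f].second);
        stride /= nlevels;
        for (std::size_t j = 0; j < ncombos; ++j) {
            output[f][j] = static_cast<Factor_>((j / stride) % nlevels);
        }
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t level = static_cast<std::size_t>(factors[f].first[i]);
            assert(level < nlevels);
            combined[i] += static_cast<Combined_>(level * stride);
        }
    }
    return output;
}
```

With a first factor (0, 1, 0) that has 2 levels and a second factor (2, 0, 2) that has 3 levels, the sketch reports all 6 combinations, including those that never occur, and sets `combined` to (2, 3, 2).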