scran_aggregate
Aggregate expression values across cells
Aggregate single-cell expression values.
Classes | |
struct | AggregateAcrossCellsBuffers |
Buffers for aggregate_across_cells(). | |
struct | AggregateAcrossCellsOptions |
Options for aggregate_across_cells(). | |
struct | AggregateAcrossCellsResults |
Results of aggregate_across_cells(). | |
struct | AggregateAcrossGenesBuffers |
Buffers for aggregate_across_genes(). | |
struct | AggregateAcrossGenesOptions |
Options for aggregate_across_genes(). | |
struct | AggregateAcrossGenesResults |
Results of aggregate_across_genes(). | |
Functions | |
template<typename Data_ , typename Index_ , typename Factor_ , typename Sum_ , typename Detected_ > | |
void | aggregate_across_cells (const tatami::Matrix< Data_, Index_ > &input, const Factor_ *factor, const AggregateAcrossCellsBuffers< Sum_, Detected_ > &buffers, const AggregateAcrossCellsOptions &options) |
template<typename Sum_ = double, typename Detected_ = int, typename Data_ , typename Index_ , typename Factor_ > | |
AggregateAcrossCellsResults< Sum_, Detected_ > | aggregate_across_cells (const tatami::Matrix< Data_, Index_ > &input, const Factor_ *factor, const AggregateAcrossCellsOptions &options) |
template<typename Data_ , typename Index_ , typename Gene_ , typename Weight_ , typename Sum_ > | |
void | aggregate_across_genes (const tatami::Matrix< Data_, Index_ > &input, const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &gene_sets, const AggregateAcrossGenesBuffers< Sum_ > &buffers, const AggregateAcrossGenesOptions &options) |
template<typename Sum_ = double, typename Data_ , typename Index_ , typename Gene_ , typename Weight_ > | |
AggregateAcrossGenesResults< Sum_ > | aggregate_across_genes (const tatami::Matrix< Data_, Index_ > &input, const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > &gene_sets, const AggregateAcrossGenesOptions &options) |
template<typename Factor_ , typename Output_ > | |
std::vector< Factor_ > | clean_factor (size_t n, const Factor_ *factor, Output_ *cleaned) |
template<typename Factor_ , typename Combined_ > | |
std::vector< std::vector< Factor_ > > | combine_factors (size_t n, const std::vector< const Factor_ * > &factors, Combined_ *combined) |
template<typename Factor_ , typename Number_ , typename Combined_ > | |
std::vector< std::vector< Factor_ > > | combine_factors_unused (size_t n, const std::vector< std::pair< const Factor_ *, Number_ > > &factors, Combined_ *combined) |
void scran_aggregate::aggregate_across_cells | ( | const tatami::Matrix< Data_, Index_ > & | input, |
const Factor_ * | factor, | ||
const AggregateAcrossCellsBuffers< Sum_, Detected_ > & | buffers, | ||
const AggregateAcrossCellsOptions & | options | ||
) |
Aggregate expression values across groups of cells for each gene. We report the sum of expression values and the number of cells with detected (i.e., positive) expression values in each group. This is typically used to create pseudo-bulk expression profiles for cluster/sample combinations. Expression values are generally expected to be counts, though the same function can be used to compute the average log-expression.
Data_ | Type of data in the input matrix, should be numeric. |
Index_ | Integer type of index in the input matrix. |
Factor_ | Integer type of the factor. |
Sum_ | Type of the sum, usually the same as Data_. |
Detected_ | Type for the number of detected cells, usually integer. |
input | The input matrix where rows are features and columns are cells. | |
[in] | factor | Grouping factor. This is a pointer to an array of length equal to the number of columns of input, containing the factor level (i.e., assigned group) for each cell. All levels should be integers in \([0, N)\) where \(N\) is the number of unique levels/groups. |
[out] | buffers | Collection of buffers in which to store the aggregate statistics (e.g., sums, number of detected cells) for each level and gene. |
options | Further options. |
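The aggregation described above can be illustrated with a small standalone sketch. It does not call scran_aggregate or tatami: the matrix is assumed to be a dense row-major `std::vector`, and the names `GroupStats` and `sum_and_detect_by_group` are hypothetical, introduced only for this example. It shows the per-group sums and detected counts for each gene, not the library's actual implementation or buffer layout.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container for this sketch: one vector per group,
// each of length equal to the number of genes.
struct GroupStats {
    std::vector<std::vector<double> > sums;
    std::vector<std::vector<int> > detected;
};

// Assumes 'matrix' is dense and row-major (genes x cells), and that
// every entry of 'factor' is an integer in [0, num_groups).
GroupStats sum_and_detect_by_group(
    const std::vector<double>& matrix,
    std::size_t num_genes,
    std::size_t num_cells,
    const std::vector<int>& factor,
    std::size_t num_groups)
{
    assert(matrix.size() == num_genes * num_cells);
    assert(factor.size() == num_cells);

    GroupStats out;
    out.sums.assign(num_groups, std::vector<double>(num_genes, 0.0));
    out.detected.assign(num_groups, std::vector<int>(num_genes, 0));

    for (std::size_t g = 0; g < num_genes; ++g) {
        for (std::size_t c = 0; c < num_cells; ++c) {
            assert(factor[c] >= 0 && static_cast<std::size_t>(factor[c]) < num_groups);
            const std::size_t group = static_cast<std::size_t>(factor[c]);
            const double value = matrix[g * num_cells + c];
            out.sums[group][g] += value;
            if (value > 0) {
                ++out.detected[group][g]; // "detected" means a positive value
            }
        }
    }
    return out;
}
```

With two genes, four cells and `factor = {0, 1, 0, 1}`, each group's sum for a gene is the total over that group's two cells, and the detected count is the number of those cells with a positive value.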
AggregateAcrossCellsResults< Sum_, Detected_ > scran_aggregate::aggregate_across_cells | ( | const tatami::Matrix< Data_, Index_ > & | input, |
const Factor_ * | factor, | ||
const AggregateAcrossCellsOptions & | options | ||
) |
Overload of aggregate_across_cells() that allocates memory for the results.
Sum_ | Type of the sum, should be numeric. |
Detected_ | Type for the number of detected cells, usually integer. |
Data_ | Type of data in the input matrix, should be numeric. |
Index_ | Integer type of index in the input matrix. |
Factor_ | Integer type of the factor. |
input | The input matrix where rows are features and columns are cells. | |
[in] | factor | Grouping factor. This is a pointer to an array of length equal to the number of columns of input, containing the factor level (i.e., assigned group) for each cell. All levels should be integers in \([0, N)\) where \(N\) is the number of unique levels/groups. |
options | Further options. |
AggregateAcrossCellsOptions.
void scran_aggregate::aggregate_across_genes | ( | const tatami::Matrix< Data_, Index_ > & | input, |
const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > & | gene_sets, | ||
const AggregateAcrossGenesBuffers< Sum_ > & | buffers, | ||
const AggregateAcrossGenesOptions & | options | ||
) |
Aggregate expression values across gene sets for each cell. This is used to compute a sum/mean of expression values for one or more gene sets/signatures. Each gene in the set can also be weighted, e.g., to account for the strength of regulatory relationships.
Data_ | Type of data in the input matrix, should be numeric. |
Index_ | Integer type of index in the input matrix. |
Gene_ | Integer type for the indices of genes in each set. |
Weight_ | Floating-point type for the weights of genes in each set. |
Sum_ | Floating-point type of the sum. |
input | The input matrix where rows are features and columns are cells. | |
gene_sets | Vector of gene sets. Each tuple corresponds to a set and contains (i) the number of genes in the set, (ii) a pointer to the row indices of the genes in the set, and (iii) a pointer to the weights of the genes in the set. The weight pointer may be NULL, in which case all weights are set to 1. | |
[out] | buffers | Collection of buffers in which to store the aggregate statistics (e.g., sums) for each gene set and cell. |
options | Further options. |
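The gene set format described for gene_sets can be illustrated with a standalone sketch. It does not use scran_aggregate or tatami: the matrix is assumed to be a dense row-major `std::vector`, and `weighted_set_sums` is a hypothetical helper written for this example. It computes only the weighted sum per gene set and cell, treating a null weight pointer as weights of 1; it is not the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical helper: returns one vector of per-cell weighted sums
// for each gene set. 'matrix' is assumed dense, row-major, genes x cells.
std::vector<std::vector<double> > weighted_set_sums(
    const std::vector<double>& matrix,
    std::size_t num_genes,
    std::size_t num_cells,
    const std::vector<std::tuple<std::size_t, const int*, const double*> >& gene_sets)
{
    assert(matrix.size() == num_genes * num_cells);

    std::vector<std::vector<double> > sums;
    sums.reserve(gene_sets.size());

    for (const auto& set : gene_sets) {
        const std::size_t set_size = std::get<0>(set); // (i) number of genes
        const int* genes = std::get<1>(set);           // (ii) row indices
        const double* weights = std::get<2>(set);      // (iii) weights, may be null

        std::vector<double> current(num_cells, 0.0);
        for (std::size_t i = 0; i < set_size; ++i) {
            assert(genes[i] >= 0 && static_cast<std::size_t>(genes[i]) < num_genes);
            const std::size_t row = static_cast<std::size_t>(genes[i]);
            const double w = (weights == nullptr ? 1.0 : weights[i]);
            for (std::size_t c = 0; c < num_cells; ++c) {
                current[c] += w * matrix[row * num_cells + c];
            }
        }
        sums.push_back(current);
    }
    return sums;
}
```

Dividing each sum by the number of genes in the set (or by the total weight) would give a mean instead; which of these the library reports is controlled through AggregateAcrossGenesOptions and is not reproduced here.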
AggregateAcrossGenesResults< Sum_ > scran_aggregate::aggregate_across_genes | ( | const tatami::Matrix< Data_, Index_ > & | input, |
const std::vector< std::tuple< size_t, const Gene_ *, const Weight_ * > > & | gene_sets, | ||
const AggregateAcrossGenesOptions & | options | ||
) |
Overload of aggregate_across_genes() that allocates memory for the results.
Sum_ | Floating-point type of the sum. |
Data_ | Type of data in the input matrix, should be numeric. |
Index_ | Integer type of index in the input matrix. |
Gene_ | Integer type for the indices of genes in each set. |
Weight_ | Floating-point type for the weights of genes in each set. |
input | The input matrix where rows are features and columns are cells. |
gene_sets | Vector of gene sets. Each tuple corresponds to a set and contains (i) the number of genes in the set, (ii) a pointer to the row indices of the genes in the set, and (iii) a pointer to the weights of the genes in the set. The weight pointer may be NULL, in which case all weights are set to 1. |
options | Further options. |
std::vector< Factor_ > scran_aggregate::clean_factor | ( | size_t | n, |
const Factor_ * | factor, | ||
Output_ * | cleaned | ||
) |
Clean up a categorical factor by removing unused levels. This yields the same results as combine_factors() with a single factor.
Factor_ | Factor type. Any type may be used here as long as it is hashable and has an equality operator. |
Output_ | Integer type for the cleaned factor. |
n | Number of observations (i.e., cells). | |
[in] | factor | Pointer to an array of length n containing a factor. |
[out] | cleaned | Pointer to an array of length n in which the cleaned factor is to be stored. All values are integers in \([0, N)\) where \(N\) is the length of the output vector; all integers in this range are guaranteed to be present at least once in cleaned. |
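The contract on cleaned can be illustrated with a standalone sketch. The helper `clean_factor_sketch` is hypothetical and fixes both template types to `int` for brevity. It assumes the unique levels are reported in sorted order, which is inferred from the stated equivalence with combine_factors() rather than from the library's source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper mimicking the documented contract: returns the
// unique levels of 'factor' (sorted, by assumption) and fills 'cleaned'
// with each observation's index into that vector.
std::vector<int> clean_factor_sketch(std::size_t n, const int* factor, int* cleaned) {
    assert(n == 0 || (factor != nullptr && cleaned != nullptr));

    // Collect the sorted unique levels.
    std::vector<int> levels(factor, factor + n);
    std::sort(levels.begin(), levels.end());
    levels.erase(std::unique(levels.begin(), levels.end()), levels.end());

    // Map each observation to the position of its level.
    for (std::size_t i = 0; i < n; ++i) {
        const auto it = std::lower_bound(levels.begin(), levels.end(), factor[i]);
        cleaned[i] = static_cast<int>(it - levels.begin());
    }
    return levels;
}
```

For `factor = {7, 3, 7, 10, 3}`, this sketch yields levels `{3, 7, 10}` and `cleaned = {1, 0, 1, 2, 0}`, so that `levels[cleaned[i]] == factor[i]` for every observation.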
Vector of unique levels in factor. For any observation i, it is guaranteed that output[cleaned[i]] == factor[i].
std::vector< std::vector< Factor_ > > scran_aggregate::combine_factors | ( | size_t | n, |
const std::vector< const Factor_ * > & | factors, | ||
Combined_ * | combined | ||
) |
Factor_ | Factor type. Any type may be used here as long as it implements the comparison operators. |
Combined_ | Integer type for the combined factor. This should be large enough to hold the number of unique combinations. |
n | Number of observations (i.e., cells). | |
[in] | factors | Vector of pointers to arrays of length n, each containing a different factor. |
[out] | combined | Pointer to an array of length n in which the combined factor is to be stored. On output, each entry determines the corresponding observation's combination of levels by indexing into the inner vectors of the returned object, i.e., j := combined[i] represents the combination (output[0][j], output[1][j], ...). |
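The relationship between combined and the returned vectors can be illustrated with a standalone sketch. The helper `combine_factors_sketch` is hypothetical and fixes the factor and index types to `int`. It is a straightforward sort-and-search approach chosen for clarity, not the library's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: finds the sorted unique combinations of levels
// across several factors and records each observation's combination.
std::vector<std::vector<int> > combine_factors_sketch(
    std::size_t n,
    const std::vector<const int*>& factors,
    int* combined)
{
    assert(n == 0 || combined != nullptr);
    const std::size_t num_factors = factors.size();

    // Gather each observation's combination of levels.
    std::vector<std::vector<int> > observed(n, std::vector<int>(num_factors));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t f = 0; f < num_factors; ++f) {
            observed[i][f] = factors[f][i];
        }
    }

    // std::vector compares lexicographically, so sorting orders the
    // combinations by the first factor, then the second, and so on.
    std::vector<std::vector<int> > unique_combos(observed);
    std::sort(unique_combos.begin(), unique_combos.end());
    unique_combos.erase(
        std::unique(unique_combos.begin(), unique_combos.end()),
        unique_combos.end());

    // Transpose so that output[f][j] is the level of factor f in combination j.
    std::vector<std::vector<int> > output(
        num_factors, std::vector<int>(unique_combos.size()));
    for (std::size_t j = 0; j < unique_combos.size(); ++j) {
        for (std::size_t f = 0; f < num_factors; ++f) {
            output[f][j] = unique_combos[j][f];
        }
    }

    // Each observation's entry is the index of its combination.
    for (std::size_t i = 0; i < n; ++i) {
        const auto it = std::lower_bound(
            unique_combos.begin(), unique_combos.end(), observed[i]);
        combined[i] = static_cast<int>(it - unique_combos.begin());
    }
    return output;
}
```

The resulting combined array has integer values in \([0, N)\) for \(N\) unique combinations, which is the form that aggregate_across_cells() expects for its factor argument.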
Vector of vectors in which each inner vector corresponds to a factor in factors, and all inner vectors have the same length. Corresponding entries of the inner vectors define a particular combination of levels, i.e., the first combination is defined as (output[0][0], output[1][0], ...), the second combination is defined as (output[0][1], output[1][1], ...), and so on. Combinations are guaranteed to be sorted by the first factor, then the second, etc.
std::vector< std::vector< Factor_ > > scran_aggregate::combine_factors_unused | ( | size_t | n, |
const std::vector< std::pair< const Factor_ *, Number_ > > & | factors, | ||
Combined_ * | combined | ||
) |
This function is a variation of combine_factors() that considers unobserved combinations of factor levels.
Factor_ | Factor type. Any type may be used here as long as it is comparable. |
Number_ | Integer type for the number of levels in each factor. |
Combined_ | Integer type for the combined factor. This should be large enough to hold the number of unique (possibly unused) combinations. |
n | Number of observations (i.e., cells). | |
[in] | factors | Vector of pairs, each of which corresponds to a factor. The first element of the pair is a pointer to an array of length n, containing the factor level for each observation. The second element is the total number of levels for this factor, which may be greater than the largest observed level. |
[out] | combined | Pointer to an array of length n in which the combined factor is to be stored. On output, each entry determines the corresponding observation's combination of levels by indexing into the inner vectors of the returned object; see the argument of the same name in combine_factors() for more details. |
Vector of vectors in the same format as the return value of combine_factors(), with the only difference being that unobserved combinations are also reported.
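The effect of reporting unobserved combinations can be illustrated with a standalone sketch. The helper `combine_factors_unused_sketch` is hypothetical and fixes all types to `int`. It assumes that each factor's levels are integers in [0, number of levels) and that combinations are enumerated in sorted order with the first factor varying slowest; both are assumptions made for this example rather than statements about the library's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: enumerates every combination of levels, observed
// or not, and records each observation's index into that enumeration.
std::vector<std::vector<int> > combine_factors_unused_sketch(
    std::size_t n,
    const std::vector<std::pair<const int*, int> >& factors,
    int* combined)
{
    // Total number of combinations is the product of the level counts.
    std::size_t total = 1;
    for (const auto& f : factors) {
        assert(f.second > 0);
        total *= static_cast<std::size_t>(f.second);
    }

    // output[f][j] is the level of factor f in combination j.
    std::vector<std::vector<int> > output(factors.size(), std::vector<int>(total));
    std::vector<std::size_t> strides(factors.size());
    std::size_t stride = total;
    for (std::size_t f = 0; f < factors.size(); ++f) {
        const std::size_t num_levels = static_cast<std::size_t>(factors[f].second);
        stride /= num_levels;
        strides[f] = stride;
        for (std::size_t j = 0; j < total; ++j) {
            output[f][j] = static_cast<int>((j / stride) % num_levels);
        }
    }

    // An observation's combination index is a mixed-radix number
    // formed from its levels.
    for (std::size_t i = 0; i < n; ++i) {
        std::size_t index = 0;
        for (std::size_t f = 0; f < factors.size(); ++f) {
            const int level = factors[f].first[i];
            assert(level >= 0 && level < factors[f].second);
            index += static_cast<std::size_t>(level) * strides[f];
        }
        combined[i] = static_cast<int>(index);
    }
    return output;
}
```

With two levels for the first factor and three for the second, the sketch reports all six combinations even if only some of them occur among the observations, so downstream results have a fixed shape regardless of which combinations are present.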