scran_graph_cluster
Graph-based clustering of cells
|
Graph-based clustering of single-cell data. More...
Classes | |
struct | BuildSnnGraphOptions |
Options for SNN graph construction. More... | |
struct | BuildSnnGraphResults |
Results of SNN graph construction. More... | |
struct | ClusterLeidenOptions |
Options for cluster_leiden() . More... | |
struct | ClusterLeidenResults |
Result of cluster_leiden() . More... | |
struct | ClusterMultilevelOptions |
Options for cluster_multilevel() . More... | |
struct | ClusterMultilevelResults |
Result of cluster_multilevel() . More... | |
struct | ClusterWalktrapOptions |
Options for cluster_walktrap() . More... | |
struct | ClusterWalktrapResults |
Result of cluster_walktrap() . More... | |
Typedefs | |
typedef int | DefaultNode |
typedef double | DefaultWeight |
Enumerations | |
enum class | SnnWeightScheme : char { RANKED , NUMBER , JACCARD } |
Functions | |
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, class GetNeighbors_ , class GetIndex_ > | |
void | build_snn_graph (size_t num_cells, GetNeighbors_ get_neighbors, GetIndex_ get_index, const BuildSnnGraphOptions &options, BuildSnnGraphResults< Node_, Weight_ > &output) |
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Index_ , typename Distance_ > | |
BuildSnnGraphResults< Node_, Weight_ > | build_snn_graph (const knncolle::NeighborList< Index_, Distance_ > &neighbors, const BuildSnnGraphOptions &options) |
template<typename Node_ = int, typename Weight_ = double, typename Index_ > | |
BuildSnnGraphResults< Node_, Weight_ > | build_snn_graph (const std::vector< std::vector< Index_ > > &neighbors, const BuildSnnGraphOptions &options) |
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Float_ > | |
BuildSnnGraphResults< Node_, Weight_ > | build_snn_graph (const knncolle::Prebuilt< Dim_, Index_, Float_ > &prebuilt, const BuildSnnGraphOptions &options) |
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Value_ , typename Float_ > | |
BuildSnnGraphResults< Node_, Weight_ > | build_snn_graph (Dim_ num_dims, Index_ num_cells, const Value_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Value_ >, Float_ > &knn_method, const BuildSnnGraphOptions &options) |
void | cluster_leiden (const igraph_t *graph, const igraph_vector_t *weights, const ClusterLeidenOptions &options, ClusterLeidenResults &output) |
ClusterLeidenResults | cluster_leiden (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterLeidenOptions &options) |
void | cluster_multilevel (const igraph_t *graph, const igraph_vector_t *weights, const ClusterMultilevelOptions &options, ClusterMultilevelResults &output) |
ClusterMultilevelResults | cluster_multilevel (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterMultilevelOptions &options) |
void | cluster_walktrap (const igraph_t *graph, const igraph_vector_t *weights, const ClusterWalktrapOptions &options, ClusterWalktrapResults &output) |
ClusterWalktrapResults | cluster_walktrap (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterWalktrapOptions &options) |
template<typename Vertex_ > | |
raiigraph::Graph | edges_to_graph (size_t double_edges, const Vertex_ *edges, size_t num_vertices, igraph_bool_t directed) |
Graph-based clustering of single-cell data.
Default type for the node indices. Set to igraph_integer_t
if igraph is available.
Default type for the edge weights. Set to igraph_real_t
if igraph is available.
Choices for the edge weighting scheme during graph construction. Let \(k\) be the number of nearest neighbors for each node.
RANKED
defines the weight between two nodes as \(k - r/2\) where \(r\) is the smallest sum of ranks for any shared neighboring node (Xu and Su, 2015). For the purposes of this ranking, each node has a rank of zero in its own nearest-neighbor set. More shared neighbors, or shared neighbors that are close to both observations, will generally yield larger weights.NUMBER
defines the weight between two nodes as the number of shared nearest neighbors between them. The weight can range from zero to \(k + 1\), as the node itself is included in its own nearest-neighbor set. This is a simpler scheme that is also slightly faster but does not account for the ranking of neighbors within each set.JACCARD
defines the weight between two nodes as the Jaccard index of their neighbor sets, motivated by the algorithm used by the Seurat R package. This weight can range from zero to 1, and is a monotonic transformation of the weight used by NUMBER
.void scran_graph_cluster::build_snn_graph | ( | size_t | num_cells, |
GetNeighbors_ | get_neighbors, | ||
GetIndex_ | get_index, | ||
const BuildSnnGraphOptions & | options, | ||
BuildSnnGraphResults< Node_, Weight_ > & | output | ||
) |
In a shared nearest-neighbor graph, pairs of cells are connected to each other by an edge with weight determined from their shared nearest neighbors. If two cells are close together but have distinct sets of neighbors, the corresponding edge is downweighted as the two cells are unlikely to be part of the same neighborhood. In this manner, highly weighted edges will form within highly interconnected neighborhoods where many cells share the same neighbors. This provides a more sophisticated definition of similarity between cells compared to a simpler (unweighted) nearest neighbor graph that just focuses on immediate proximity.
Node_ | Integer type for the node indices. |
Weight_ | Floating-point type for the edge weights. |
GetNeighbors_ | Function that accepts a size_t cell index and returns a (const reference to) a container-like object. The container should be iterable in a range-based for loop, support the [] operator, and have a size() method. |
GetIndex_ | Function that accepts an element of the container type returned by GetNeighbors_ and returns Node_ . |
num_cells | Number of cells in the dataset. | |
get_neighbors | Function that accepts an integer cell index in [0, num_cells) and returns a container of that cell's neighbors. Each element of the container corresponds to a neighbor, and neighbors should be sorted by increasing distance from the cell. The same number of neighbors should be identified for each cell. | |
get_index | Function to return the index of each neighbor, given an element of the container returned by get_neighbors . In trivial cases, this is the identity function but it can be more complex depending on the contents of the inner container. | |
options | Further options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here. | |
[out] | output | On output, the edges and weights of the SNN graph. The input value is ignored so this can be re-used across multiple calls to build_snn_graph() . |
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph | ( | const knncolle::NeighborList< Index_, Distance_ > & | neighbors, |
const BuildSnnGraphOptions & | options | ||
) |
Overload to enable convenient usage with pre-computed neighbors from knncolle. Distances are ignored here; only the ordering of neighbor indices is used.
Node_ | Integer type for the node indices. |
Weight_ | Floating-point type for the edge weights. |
Index_ | Integer type for the neighbor indices. |
Distance_ | Floating-point type for the distances. |
neighbors | Vector of nearest-neighbor search results for each cell. Each entry is a pair containing a vector of neighbor indices and a vector of distances to those neighbors. Neighbors should be sorted by increasing distance. The same number of neighbors should be present for each cell. |
options | Further options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here. |
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph | ( | const std::vector< std::vector< Index_ > > & | neighbors, |
const BuildSnnGraphOptions & | options | ||
) |
Overload to enable convenient usage with pre-computed neighbors from knncolle.
Node_ | Integer type for the node indices. |
Weight_ | Floating-point type for the edge weights. |
Index_ | Integer type for the neighbor indices. |
neighbors | Vector of vectors of indices for the neighbors for each cell, sorted by increasing distance. It is generally expected that the same number of neighbors are present for each cell, though differences between cells are supported. |
options | Further options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here. |
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph | ( | const knncolle::Prebuilt< Dim_, Index_, Float_ > & | prebuilt, |
const BuildSnnGraphOptions & | options | ||
) |
Overload to enable convenient usage with a prebuilt nearest-neighbor search index from knncolle.
Node_ | Integer type for the node indices. |
Weight_ | Floating-point type for the edge weights. |
Dim_ | Integer type for the dimension index. |
Index_ | Integer type for the cell index. |
Float_ | Floating-point type for the distances. |
[in] | prebuilt | A prebuilt nearest-neighbor search index on the cells of interest. |
options | Further options for graph construction. |
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph | ( | Dim_ | num_dims, |
Index_ | num_cells, | ||
const Value_ * | data, | ||
const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Value_ >, Float_ > & | knn_method, | ||
const BuildSnnGraphOptions & | options | ||
) |
Overload to enable convenient usage with a column-major array of cell coordinates.
Node_ | Integer type for the node indices. |
Weight_ | Floating-point type for the edge weights. |
Dim_ | Integer type for the dimension index. |
Index_ | Integer type for the cell index. |
Value_ | Numeric type for the input data. |
Float_ | Floating-point type for the distances. |
num_dims | Number of dimensions for the cell coordinates. | |
num_cells | Number of cells in the dataset. | |
[in] | data | Pointer to a num_dims -by-num_cells column-major array of cell coordinates where rows are dimensions and columns are cells. |
knn_method | Specification of the nearest-neighbor search algorithm, e.g., knncolle::VptreeBuilder , knncolle::KmknnBuilder . | |
options | Further options for graph construction. |
|
inline |
Run the Leiden community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the Leiden algorithm.
graph | An existing graph. | |
weights | Pointer to an array of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . Alternatively NULL , if the graph is unweighted. | |
options | Further options. | |
[out] | output | On output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_leiden() . |
|
inline |
Overload of cluster_leiden()
that accepts C++ containers instead of the raw igraph pointers.
graph | An existing graph. |
weights | Vector of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . |
options | Further options. |
|
inline |
Run the multi-level community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the multi-level algorithm.
graph | An existing graph. | |
weights | Pointer to an array of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . Alternatively NULL , if the graph is unweighted. | |
options | Further options. | |
[out] | output | On output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_multilevel() . |
|
inline |
Overload of cluster_multilevel()
that accepts C++ containers instead of the raw igraph pointers.
graph | An existing graph. |
weights | Vector of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . |
options | Further options. |
|
inline |
Run the Walktrap community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the Walktrap algorithm.
graph | An existing graph. | |
weights | Pointer to an array of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . Alternatively NULL , if the graph is unweighted. | |
options | Further options. | |
[out] | output | On output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_walktrap() . |
|
inline |
Overload of cluster_walktrap()
that accepts C++ containers instead of the raw igraph pointers.
graph | An existing graph. |
weights | Vector of weights of length equal to the number of edges in graph . This should be in the same order as the edge list in graph . |
options | Further options. |
raiigraph::Graph scran_graph_cluster::edges_to_graph | ( | size_t | double_edges, |
const Vertex_ * | edges, | ||
size_t | num_vertices, | ||
igraph_bool_t | directed | ||
) |
Create an raiigraph:Graph
object from the edges.
Vertex_ | Integer type for the vertex IDs. |
double_edges | The number of edges times two. | |
[in] | edges | Pointer to an array of length double_edges . edges[2*i] and edges[2*i+1] define the vertices for edge i . For directed graphs, the edge goes from the first vertex to the second vertex. |
num_vertices | Number of vertices in the graph. | |
directed | Whether the graph is directed. This should be one of IGRAPH_DIRECTED or IGRAPH_UNDIRECTED . |
edges
.