scran_graph_cluster
Graph-based clustering of cells
Loading...
Searching...
No Matches
Classes | Typedefs | Enumerations | Functions
scran_graph_cluster Namespace Reference

Graph-based clustering of single-cell data. More...

Classes

struct  BuildSnnGraphOptions
 Options for SNN graph construction. More...
 
struct  BuildSnnGraphResults
 Results of SNN graph construction. More...
 
struct  ClusterLeidenOptions
 Options for cluster_leiden(). More...
 
struct  ClusterLeidenResults
 Result of cluster_leiden(). More...
 
struct  ClusterMultilevelOptions
 Options for cluster_multilevel(). More...
 
struct  ClusterMultilevelResults
 Result of cluster_multilevel(). More...
 
struct  ClusterWalktrapOptions
 Options for cluster_walktrap(). More...
 
struct  ClusterWalktrapResults
 Result of cluster_walktrap(). More...
 

Typedefs

typedef int DefaultNode
 
typedef double DefaultWeight
 

Enumerations

enum class  SnnWeightScheme : char { RANKED , NUMBER , JACCARD }
 

Functions

template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, class GetNeighbors_ , class GetIndex_ >
void build_snn_graph (size_t num_cells, GetNeighbors_ get_neighbors, GetIndex_ get_index, const BuildSnnGraphOptions &options, BuildSnnGraphResults< Node_, Weight_ > &output)
 
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Index_ , typename Distance_ >
BuildSnnGraphResults< Node_, Weight_ > build_snn_graph (const knncolle::NeighborList< Index_, Distance_ > &neighbors, const BuildSnnGraphOptions &options)
 
template<typename Node_ = int, typename Weight_ = double, typename Index_ >
BuildSnnGraphResults< Node_, Weight_ > build_snn_graph (const std::vector< std::vector< Index_ > > &neighbors, const BuildSnnGraphOptions &options)
 
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Float_ >
BuildSnnGraphResults< Node_, Weight_ > build_snn_graph (const knncolle::Prebuilt< Dim_, Index_, Float_ > &prebuilt, const BuildSnnGraphOptions &options)
 
template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Value_ , typename Float_ >
BuildSnnGraphResults< Node_, Weight_ > build_snn_graph (Dim_ num_dims, Index_ num_cells, const Value_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Value_ >, Float_ > &knn_method, const BuildSnnGraphOptions &options)
 
void cluster_leiden (const igraph_t *graph, const igraph_vector_t *weights, const ClusterLeidenOptions &options, ClusterLeidenResults &output)
 
ClusterLeidenResults cluster_leiden (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterLeidenOptions &options)
 
void cluster_multilevel (const igraph_t *graph, const igraph_vector_t *weights, const ClusterMultilevelOptions &options, ClusterMultilevelResults &output)
 
ClusterMultilevelResults cluster_multilevel (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterMultilevelOptions &options)
 
void cluster_walktrap (const igraph_t *graph, const igraph_vector_t *weights, const ClusterWalktrapOptions &options, ClusterWalktrapResults &output)
 
ClusterWalktrapResults cluster_walktrap (const raiigraph::Graph &graph, const std::vector< igraph_real_t > &weights, const ClusterWalktrapOptions &options)
 
template<typename Vertex_ >
raiigraph::Graph edges_to_graph (size_t double_edges, const Vertex_ *edges, size_t num_vertices, igraph_bool_t directed)
 

Detailed Description

Graph-based clustering of single-cell data.

Typedef Documentation

◆ DefaultNode

Default type for the node indices. Set to igraph_integer_t if igraph is available.

◆ DefaultWeight

Default type for the edge weights. Set to igraph_real_t if igraph is available.

Enumeration Type Documentation

◆ SnnWeightScheme

Choices for the edge weighting scheme during graph construction. Let \(k\) be the number of nearest neighbors for each node.

  • RANKED defines the weight between two nodes as \(k - r/2\) where \(r\) is the smallest sum of ranks for any shared neighboring node (Xu and Su, 2015). For the purposes of this ranking, each node has a rank of zero in its own nearest-neighbor set. More shared neighbors, or shared neighbors that are close to both observations, will generally yield larger weights.
  • NUMBER defines the weight between two nodes as the number of shared nearest neighbors between them. The weight can range from zero to \(k + 1\), as the node itself is included in its own nearest-neighbor set. This is a simpler scheme that is also slightly faster but does not account for the ranking of neighbors within each set.
  • JACCARD defines the weight between two nodes as the Jaccard index of their neighbor sets, motivated by the algorithm used by the Seurat R package. This weight can range from zero to 1, and is a monotonic transformation of the weight used by NUMBER.
See also
Xu C and Su Z (2015). Identification of cell types from single-cell transcriptomes using a novel clustering method. Bioinformatics 31, 1974-80

Function Documentation

◆ build_snn_graph() [1/5]

template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, class GetNeighbors_ , class GetIndex_ >
void scran_graph_cluster::build_snn_graph ( size_t  num_cells,
GetNeighbors_  get_neighbors,
GetIndex_  get_index,
const BuildSnnGraphOptions options,
BuildSnnGraphResults< Node_, Weight_ > &  output 
)

In a shared nearest-neighbor graph, pairs of cells are connected to each other by an edge with weight determined from their shared nearest neighbors. If two cells are close together but have distinct sets of neighbors, the corresponding edge is downweighted as the two cells are unlikely to be part of the same neighborhood. In this manner, highly weighted edges will form within highly interconnected neighborhoods where many cells share the same neighbors. This provides a more sophisticated definition of similarity between cells compared to a simpler (unweighted) nearest neighbor graph that just focuses on immediate proximity.

Template Parameters
Node_Integer type for the node indices.
Weight_Floating-point type for the edge weights.
GetNeighbors_Function that accepts a size_t cell index and returns a (const reference to) a container-like object. The container should be iterable in a range-based for loop, support the [] operator, and have a size() method.
GetIndex_Function that accepts an element of the container type returned by GetNeighbors_ and returns Node_.
Parameters
num_cellsNumber of cells in the dataset.
get_neighborsFunction that accepts an integer cell index in [0, num_cells) and returns a container of that cell's neighbors. Each element of the container corresponds to a neighbor, and neighbors should be sorted by increasing distance from the cell. The same number of neighbors should be identified for each cell.
get_indexFunction to return the index of each neighbor, given an element of the container returned by get_neighbors. In trivial cases, this is the identity function but it can be more complex depending on the contents of the inner container.
optionsFurther options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here.
[out]outputOn output, the edges and weights of the SNN graph. The input value is ignored so this can be re-used across multiple calls to build_snn_graph().

◆ build_snn_graph() [2/5]

template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Index_ , typename Distance_ >
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph ( const knncolle::NeighborList< Index_, Distance_ > &  neighbors,
const BuildSnnGraphOptions options 
)

Overload to enable convenient usage with pre-computed neighbors from knncolle. Distances are ignored here; only the ordering of neighbor indices is used.

Template Parameters
Node_Integer type for the node indices.
Weight_Floating-point type for the edge weights.
Index_Integer type for the neighbor indices.
Distance_Floating-point type for the distances.
Parameters
neighborsVector of nearest-neighbor search results for each cell. Each entry is a pair containing a vector of neighbor indices and a vector of distances to those neighbors. Neighbors should be sorted by increasing distance. The same number of neighbors should be present for each cell.
optionsFurther options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here.
Returns
The edges and weights of the SNN graph.

◆ build_snn_graph() [3/5]

template<typename Node_ = int, typename Weight_ = double, typename Index_ >
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph ( const std::vector< std::vector< Index_ > > &  neighbors,
const BuildSnnGraphOptions options 
)

Overload to enable convenient usage with pre-computed neighbors from knncolle.

Template Parameters
Node_Integer type for the node indices.
Weight_Floating-point type for the edge weights.
Index_Integer type for the neighbor indices.
Parameters
neighborsVector of vectors of indices for the neighbors for each cell, sorted by increasing distance. It is generally expected that the same number of neighbors are present for each cell, though differences between cells are supported.
optionsFurther options for graph construction. Note that BuildSnnGraphOptions::num_neighbors is ignored here.
Returns
The edges and weights of the SNN graph.

◆ build_snn_graph() [4/5]

template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Float_ >
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph ( const knncolle::Prebuilt< Dim_, Index_, Float_ > &  prebuilt,
const BuildSnnGraphOptions options 
)

Overload to enable convenient usage with a prebuilt nearest-neighbor search index from knncolle.

Template Parameters
Node_Integer type for the node indices.
Weight_Floating-point type for the edge weights.
Dim_Integer type for the dimension index.
Index_Integer type for the cell index.
Float_Floating-point type for the distances.
Parameters
[in]prebuiltA prebuilt nearest-neighbor search index on the cells of interest.
optionsFurther options for graph construction.
Returns
The edges and weights of the SNN graph.

◆ build_snn_graph() [5/5]

template<typename Node_ = DefaultNode, typename Weight_ = DefaultWeight, typename Dim_ , typename Index_ , typename Value_ , typename Float_ >
BuildSnnGraphResults< Node_, Weight_ > scran_graph_cluster::build_snn_graph ( Dim_  num_dims,
Index_  num_cells,
const Value_ *  data,
const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Value_ >, Float_ > &  knn_method,
const BuildSnnGraphOptions options 
)

Overload to enable convenient usage with a column-major array of cell coordinates.

Template Parameters
Node_Integer type for the node indices.
Weight_Floating-point type for the edge weights.
Dim_Integer type for the dimension index.
Index_Integer type for the cell index.
Value_Numeric type for the input data.
Float_Floating-point type for the distances.
Parameters
num_dimsNumber of dimensions for the cell coordinates.
num_cellsNumber of cells in the dataset.
[in]dataPointer to a num_dims-by-num_cells column-major array of cell coordinates where rows are dimensions and columns are cells.
knn_methodSpecification of the nearest-neighbor search algorithm, e.g., knncolle::VptreeBuilder, knncolle::KmknnBuilder.
optionsFurther options for graph construction.
Returns
The edges and weights of the SNN graph.

◆ cluster_leiden() [1/2]

void scran_graph_cluster::cluster_leiden ( const igraph_t graph,
const igraph_vector_t weights,
const ClusterLeidenOptions options,
ClusterLeidenResults output 
)
inline

Run the Leiden community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the Leiden algorithm.

Parameters
graphAn existing graph.
weightsPointer to an array of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
[out]outputOn output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_leiden().

◆ cluster_leiden() [2/2]

ClusterLeidenResults scran_graph_cluster::cluster_leiden ( const raiigraph::Graph &  graph,
const std::vector< igraph_real_t > &  weights,
const ClusterLeidenOptions options 
)
inline

Overload of cluster_leiden() that accepts C++ containers instead of the raw igraph pointers.

Parameters
graphAn existing graph.
weightsVector of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
Returns
Clustering results for the nodes of the graph.

◆ cluster_multilevel() [1/2]

void scran_graph_cluster::cluster_multilevel ( const igraph_t graph,
const igraph_vector_t weights,
const ClusterMultilevelOptions options,
ClusterMultilevelResults output 
)
inline

Run the multi-level community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the multi-level algorithm.

Parameters
graphAn existing graph.
weightsPointer to an array of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
[out]outputOn output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_multilevel().

◆ cluster_multilevel() [2/2]

ClusterMultilevelResults scran_graph_cluster::cluster_multilevel ( const raiigraph::Graph &  graph,
const std::vector< igraph_real_t > &  weights,
const ClusterMultilevelOptions options 
)
inline

Overload of cluster_multilevel() that accepts C++ containers instead of the raw igraph pointers.

Parameters
graphAn existing graph.
weightsVector of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
Returns
Clustering results for the nodes of the graph.

◆ cluster_walktrap() [1/2]

void scran_graph_cluster::cluster_walktrap ( const igraph_t graph,
const igraph_vector_t weights,
const ClusterWalktrapOptions options,
ClusterWalktrapResults output 
)
inline

Run the Walktrap community detection algorithm on a pre-constructed graph to obtain communities of highly inter-connected nodes. See here for more details on the Walktrap algorithm.

Parameters
graphAn existing graph.
weightsPointer to an array of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
[out]outputOn output, this is filtered with the clustering results. The input value is ignored, so this object can be re-used across multiple calls to cluster_walktrap().

◆ cluster_walktrap() [2/2]

ClusterWalktrapResults scran_graph_cluster::cluster_walktrap ( const raiigraph::Graph &  graph,
const std::vector< igraph_real_t > &  weights,
const ClusterWalktrapOptions options 
)
inline

Overload of cluster_walktrap() that accepts C++ containers instead of the raw igraph pointers.

Parameters
graphAn existing graph.
weightsVector of weights of length equal to the number of edges in graph. This should be in the same order as the edge list in graph.
optionsFurther options.
Returns
Clustering results for the nodes of the graph.

◆ edges_to_graph()

template<typename Vertex_ >
raiigraph::Graph scran_graph_cluster::edges_to_graph ( size_t  double_edges,
const Vertex_ edges,
size_t  num_vertices,
igraph_bool_t  directed 
)

Create an raiigraph:Graph object from the edges.

Template Parameters
Vertex_Integer type for the vertex IDs.
Parameters
double_edgesThe number of edges times two.
[in]edgesPointer to an array of length double_edges. edges[2*i] and edges[2*i+1] define the vertices for edge i. For directed graphs, the edge goes from the first vertex to the second vertex.
num_verticesNumber of vertices in the graph.
directedWhether the graph is directed. This should be one of IGRAPH_DIRECTED or IGRAPH_UNDIRECTED.
Returns
A graph created from edges.