umappp
A C++ library for UMAP
umappp Namespace Reference

Methods for UMAP.

Classes

struct  Options
 Options for initialize().
 
class  Status
 Status of the UMAP optimization iterations.
 

Typedefs

template<typename Index_ , typename Float_ >
using NeighborList = knncolle::NeighborList<Index_, Float_>
 Lists of neighbors for each observation.
 

Enumerations

enum  InitializeMethod : char { SPECTRAL , SPECTRAL_ONLY , RANDOM , NONE }
 

Functions

template<typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (NeighborList< Index_, Float_ > x, int num_dim, Float_ *embedding, Options options)
 
template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (const knncolle::Prebuilt< Dim_, Index_, Float_ > &prebuilt, int num_dim, Float_ *embedding, Options options)
 
template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (Dim_ data_dim, Index_ num_obs, const Float_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > &builder, int num_dim, Float_ *embedding, Options options)
 

Detailed Description

Methods for UMAP.

Typedef Documentation

◆ NeighborList

template<typename Index_ , typename Float_ >
using umappp::NeighborList = knncolle::NeighborList<Index_, Float_>

Lists of neighbors for each observation.

Template Parameters
Index_  Integer type of the neighbor indices.
Float_  Floating-point type for the distances.

This is a convenient alias for the knncolle::NeighborList class. Each inner vector corresponds to an observation and contains the list of nearest neighbors for that observation, sorted by increasing distance. Neighbors for each observation should be unique: each index should occur at most once in each inner vector. Also, the inner vector for observation i should not contain any neighbor with index i.
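The expectations above can be checked before calling initialize(). The following is a minimal sketch, not part of umappp: NeighborListSketch is a stand-in alias that assumes knncolle::NeighborList is a vector of per-observation vectors of (index, distance) pairs, and is_valid_neighbor_list() is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for knncolle::NeighborList (assumption: a vector of
// per-observation vectors of (index, distance) pairs).
template<typename Index_, typename Float_>
using NeighborListSketch = std::vector<std::vector<std::pair<Index_, Float_> > >;

// Hypothetical helper: returns true if every inner vector is sorted by
// increasing distance, has no duplicate indices, and omits the observation itself.
template<typename Index_, typename Float_>
bool is_valid_neighbor_list(const NeighborListSketch<Index_, Float_>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        const auto& neighbors = x[i];
        std::vector<Index_> seen;
        seen.reserve(neighbors.size());

        for (std::size_t j = 0; j < neighbors.size(); ++j) {
            // Distances must not decrease along the inner vector.
            if (j > 0 && neighbors[j].second < neighbors[j - 1].second) {
                return false;
            }
            // Observation i must not be listed as its own neighbor.
            if (static_cast<std::size_t>(neighbors[j].first) == i) {
                return false;
            }
            seen.push_back(neighbors[j].first);
        }

        // Each neighbor index may occur at most once.
        std::sort(seen.begin(), seen.end());
        if (std::adjacent_find(seen.begin(), seen.end()) != seen.end()) {
            return false;
        }
    }
    return true;
}
```

A check like this is linear in the total number of neighbors (plus a sort per observation), so it is cheap relative to the neighbor search itself.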

Enumeration Type Documentation

◆ InitializeMethod

How should the initial coordinates of the embedding be obtained?

  • SPECTRAL: attempts initialization based on spectral decomposition of the graph Laplacian. If that fails, we fall back to random draws from a normal distribution.
  • SPECTRAL_ONLY: attempts spectral initialization as for SPECTRAL, but if that fails, we use the existing values in the supplied embedding array.
  • RANDOM: fills the embedding with random draws from a normal distribution.
  • NONE: uses the existing values in the supplied embedding array.
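The fallback rules in this list can be summarized as a small decision function. This is only an illustrative sketch of the documented behavior, not umappp's implementation: InitializeMethodSketch mirrors the enumerators above, while CoordinateSource, resolve_source() and the spectral_succeeded flag are hypothetical names introduced here.

```cpp
// Mirror of the documented enumerators (stand-in for umappp::InitializeMethod).
enum class InitializeMethodSketch : char { SPECTRAL, SPECTRAL_ONLY, RANDOM, NONE };

// Hypothetical label for where the initial coordinates end up coming from.
enum class CoordinateSource { SPECTRAL_RESULT, RANDOM_NORMAL, EXISTING_VALUES };

// Hypothetical helper: maps a method and the outcome of the spectral
// decomposition to the source of the initial coordinates.
inline CoordinateSource resolve_source(InitializeMethodSketch method, bool spectral_succeeded) {
    switch (method) {
        case InitializeMethodSketch::SPECTRAL:
            // Falls back to random normal draws on failure.
            return spectral_succeeded ? CoordinateSource::SPECTRAL_RESULT
                                      : CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::SPECTRAL_ONLY:
            // Falls back to whatever is already in the embedding array.
            return spectral_succeeded ? CoordinateSource::SPECTRAL_RESULT
                                      : CoordinateSource::EXISTING_VALUES;
        case InitializeMethodSketch::RANDOM:
            return CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::NONE:
            return CoordinateSource::EXISTING_VALUES;
    }
    return CoordinateSource::EXISTING_VALUES; // not reached for valid enumerators
}
```

The practical consequence is that the embedding array must hold meaningful values before initialization whenever NONE is used, and should hold them with SPECTRAL_ONLY in case the decomposition fails.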

Function Documentation

◆ initialize() [1/3]

template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > umappp::initialize ( const knncolle::Prebuilt< Dim_, Index_, Float_ > & prebuilt,
int num_dim,
Float_ * embedding,
Options options )
Template Parameters
Dim_  Integer type for the dimensions of the input dataset.
Index_  Integer type of the neighbor indices.
Float_  Floating-point type for the distances.
Parameters
prebuilt  A knncolle::Prebuilt instance constructed from the input dataset.
num_dim  Number of dimensions of the UMAP embedding.
[in,out]  embedding  Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (the number of observations in prebuilt). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The array must remain valid until after the final call to Status::run().
options  Further options.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
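Because the embedding is a column-major matrix with dimensions as rows, the coordinates of each observation are contiguous in memory. The sketch below shows one way to allocate and index such a buffer; allocate_embedding() and embedding_at() are hypothetical helper names, not part of umappp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: allocates a zero-filled buffer large enough for a
// num_dim-by-num_obs column-major embedding.
template<typename Float_>
std::vector<Float_> allocate_embedding(int num_dim, std::size_t num_obs) {
    assert(num_dim > 0);
    return std::vector<Float_>(static_cast<std::size_t>(num_dim) * num_obs);
}

// Hypothetical helper: returns a reference to coordinate 'dim' of observation
// 'obs'. Rows are dimensions and columns are observations, so each
// observation occupies num_dim consecutive elements.
template<typename Float_>
Float_& embedding_at(Float_* embedding, int num_dim, std::size_t obs, int dim) {
    assert(dim >= 0 && dim < num_dim);
    return embedding[obs * static_cast<std::size_t>(num_dim) + static_cast<std::size_t>(dim)];
}
```

Owning the buffer in a std::vector that outlives the Status object is one simple way to satisfy the lifetime requirement on embedding.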

◆ initialize() [2/3]

template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > umappp::initialize ( Dim_ data_dim,
Index_ num_obs,
const Float_ * data,
const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > & builder,
int num_dim,
Float_ * embedding,
Options options )
Template Parameters
Dim_  Integer type for the dimensions of the input dataset.
Index_  Integer type of the neighbor indices.
Float_  Floating-point type for the distances.
Parameters
data_dim  Number of dimensions of the input dataset.
num_obs  Number of observations in the input dataset.
[in]  data  Pointer to an array containing the input high-dimensional data as a column-major matrix. Each row corresponds to a dimension (data_dim) and each column corresponds to an observation (num_obs).
builder  Algorithm to use for the neighbor search.
num_dim  Number of dimensions of the embedding.
[in,out]  embedding  Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (num_obs). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The array must remain valid until after the final call to Status::run().
options  Further options.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
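The data argument expects the observations laid out one after another, each as data_dim consecutive values. If the input is held as one vector per observation, it can be flattened as in the following sketch; pack_column_major() is a hypothetical helper name, not part of umappp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: flattens per-observation vectors into a single
// column-major array where each column (observation) holds data_dim values.
template<typename Float_>
std::vector<Float_> pack_column_major(
    const std::vector<std::vector<Float_> >& observations,
    std::size_t data_dim)
{
    std::vector<Float_> data;
    data.reserve(observations.size() * data_dim);
    for (const auto& obs : observations) {
        // Every observation must have the same number of dimensions.
        assert(obs.size() == data_dim);
        data.insert(data.end(), obs.begin(), obs.end());
    }
    return data;
}
```

The value for dimension d of observation o is then at data[o * data_dim + d], and data.data() can be passed as the data pointer along with data_dim and the number of observations.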

◆ initialize() [3/3]

template<typename Index_ , typename Float_ >
Status< Index_, Float_ > umappp::initialize ( NeighborList< Index_, Float_ > x,
int num_dim,
Float_ * embedding,
Options options )
Template Parameters
Index_  Integer type of the neighbor indices.
Float_  Floating-point type for the distances.
Parameters
x  Indices and distances to the nearest neighbors for each observation. Note the expectations in the NeighborList documentation.
num_dim  Number of dimensions of the embedding.
[in,out]  embedding  Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (x.size()). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The array must remain valid until after the final call to Status::run().
options  Further options. Note that Options::num_neighbors is ignored here.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.