umappp
A C++ library for UMAP
umappp Namespace Reference

Methods for UMAP.

Classes

struct  Options
 Options for initialize().
 
class  Status
 Status of the UMAP optimization iterations.
 

Typedefs

template<typename Index_ , typename Float_ >
using NeighborList = knncolle::NeighborList< Index_, Float_ >
 Lists of neighbors for each observation.
 

Enumerations

enum  InitializeMethod : char { SPECTRAL , SPECTRAL_ONLY , RANDOM , NONE }
 

Functions

template<typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (NeighborList< Index_, Float_ > x, int num_dim, Float_ *embedding, Options options)
 
template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (const knncolle::Prebuilt< Dim_, Index_, Float_ > &prebuilt, int num_dim, Float_ *embedding, Options options)
 
template<typename Dim_ , typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (Dim_ data_dim, Index_ num_obs, const Float_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > &builder, int num_dim, Float_ *embedding, Options options)
 

Detailed Description

Methods for UMAP.

Typedef Documentation

◆ NeighborList

Lists of neighbors for each observation.

Template Parameters
Index_: Integer type of the neighbor indices.
Float_: Floating-point type for the distances.

This is a convenient alias for the knncolle::NeighborList class. Each inner vector corresponds to an observation and contains the list of nearest neighbors for that observation, sorted by increasing distance. Neighbors for each observation should be unique: each index should occur at most once in each inner vector. Also, the inner vector for observation i should not contain a neighbor with index i.
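The three expectations above (sorted distances, unique indices, no self-neighbors) can be checked before a list is passed on. The following is a minimal sketch, not part of umappp: `NeighborListSketch` is a local stand-in that assumes knncolle::NeighborList is a vector of vectors of (index, distance) pairs, and `is_valid_neighbor_list` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Local stand-in for knncolle::NeighborList. ASSUMPTION: the real alias is a
// vector of vectors of (index, distance) pairs; this is not the library's definition.
template<typename Index_, typename Float_>
using NeighborListSketch = std::vector<std::vector<std::pair<Index_, Float_> > >;

// Hypothetical helper: returns true only if every inner vector is sorted by
// increasing distance, has no repeated index, and omits the observation itself.
template<typename Index_, typename Float_>
bool is_valid_neighbor_list(const NeighborListSketch<Index_, Float_>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::unordered_set<Index_> seen;
        for (std::size_t k = 0; k < x[i].size(); ++k) {
            const std::pair<Index_, Float_>& current = x[i][k];
            if (static_cast<std::size_t>(current.first) == i) {
                return false; // observation i lists itself as a neighbor
            }
            if (!seen.insert(current.first).second) {
                return false; // the same index appears twice
            }
            if (k > 0 && current.second < x[i][k - 1].second) {
                return false; // distances are not in increasing order
            }
        }
    }
    return true;
}
```

The check does not verify that indices are within range or that every observation has the same number of neighbors; those would be separate checks.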

Enumeration Type Documentation

◆ InitializeMethod

How should the initial coordinates of the embedding be obtained?

  • SPECTRAL: attempts initialization based on spectral decomposition of the graph Laplacian. If that fails, we fall back to random draws from a normal distribution.
  • SPECTRAL_ONLY: attempts spectral initialization as before, but if that fails, we use the existing values in the supplied embedding array.
  • RANDOM: fills the embedding with random draws from a normal distribution.
  • NONE: uses the existing values in the supplied embedding array.
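The fallback rules in this list can be summarized as a small decision function. This is only an illustration of the documented behavior, not the library's code: `InitializeMethodSketch` is a local mirror of the enumeration (written as a scoped enum for the sketch), and `CoordinateSource` and `resolve_source` are hypothetical names.

```cpp
// Local mirror of the documented enumeration so the sketch compiles on its own.
enum class InitializeMethodSketch : char { SPECTRAL, SPECTRAL_ONLY, RANDOM, NONE };

// Hypothetical label for where the starting coordinates end up coming from.
enum class CoordinateSource { SPECTRAL_DECOMPOSITION, RANDOM_NORMAL, EXISTING_VALUES };

// Applies the documented rules. 'spectral_succeeded' is only consulted for
// the two spectral methods; RANDOM and NONE never attempt a decomposition.
inline CoordinateSource resolve_source(InitializeMethodSketch method, bool spectral_succeeded) {
    switch (method) {
        case InitializeMethodSketch::SPECTRAL:
            return spectral_succeeded ? CoordinateSource::SPECTRAL_DECOMPOSITION
                                      : CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::SPECTRAL_ONLY:
            return spectral_succeeded ? CoordinateSource::SPECTRAL_DECOMPOSITION
                                      : CoordinateSource::EXISTING_VALUES;
        case InitializeMethodSketch::RANDOM:
            return CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::NONE:
            return CoordinateSource::EXISTING_VALUES;
    }
    return CoordinateSource::EXISTING_VALUES; // not reached for valid enumerators
}
```

One practical consequence: with SPECTRAL_ONLY or NONE, the supplied embedding array may be read, so it should hold meaningful values before initialization.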

Function Documentation

◆ initialize() [1/3]

Status< Index_, Float_ > umappp::initialize ( const knncolle::Prebuilt< Dim_, Index_, Float_ > &  prebuilt,
int  num_dim,
Float_ *  embedding,
Options  options 
)
Template Parameters
Dim_: Integer type for the dimensions of the input dataset.
Index_: Integer type of the neighbor indices.
Float_: Floating-point type for the distances.
Parameters
prebuilt: A knncolle::Prebuilt instance constructed from the input dataset.
num_dim: Number of dimensions of the UMAP embedding.
[in,out] embedding: Pointer to an array in which to store the embedding, where rows are dimensions (num_dim) and columns are observations (as many as were used to construct prebuilt). This is only used as input if Options::init == InitializeMethod::NONE, or if it is InitializeMethod::SPECTRAL_ONLY and spectral initialization fails; otherwise it is only used as output. The array must remain valid until the final call to Status::run().
options: Further options.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
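To make the embedding layout concrete, here is a minimal sketch of allocating and indexing such an array. It assumes the embedding uses the same column-major storage that the data parameter documents, so all coordinates of one observation are contiguous. The helper names `embedding_offset`, `make_embedding_buffer`, and `coordinate` are hypothetical and not part of umappp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Offset of (dimension 'dim', observation 'obs') in a column-major array
// whose rows are dimensions and whose columns are observations.
inline std::size_t embedding_offset(std::size_t dim, std::size_t obs, std::size_t num_dim) {
    assert(dim < num_dim);
    return obs * num_dim + dim;
}

// Allocates a zero-filled buffer with room for num_dim * num_obs coordinates.
inline std::vector<double> make_embedding_buffer(std::size_t num_dim, std::size_t num_obs) {
    return std::vector<double>(num_dim * num_obs);
}

// Reads one coordinate of one observation from the buffer.
inline double coordinate(const std::vector<double>& embedding, std::size_t num_dim,
                         std::size_t dim, std::size_t obs) {
    const std::size_t offset = embedding_offset(dim, obs, num_dim);
    assert(offset < embedding.size());
    return embedding[offset];
}
```

A buffer created this way exposes its storage through `data()`, which is the kind of pointer the embedding parameter expects.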

◆ initialize() [2/3]

Status< Index_, Float_ > umappp::initialize ( Dim_  data_dim,
Index_  num_obs,
const Float_ *  data,
const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > &  builder,
int  num_dim,
Float_ *  embedding,
Options  options 
)
Template Parameters
Dim_: Integer type for the dimensions of the input dataset.
Index_: Integer type of the neighbor indices.
Float_: Floating-point type for the distances.
Parameters
data_dim: Number of dimensions of the input dataset.
num_obs: Number of observations in the input dataset.
[in] data: Pointer to an array containing the input high-dimensional data as a column-major matrix. Each row corresponds to a dimension (data_dim) and each column corresponds to an observation (num_obs).
builder: Algorithm to use for the neighbor search.
num_dim: Number of dimensions of the embedding.
[in,out] embedding: Pointer to an array in which to store the embedding, where rows are dimensions (num_dim) and columns are observations (num_obs). This is only used as input if Options::init == InitializeMethod::NONE, or if it is InitializeMethod::SPECTRAL_ONLY and spectral initialization fails; otherwise it is only used as output. The array must remain valid until the final call to Status::run().
options: Further options.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
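Because data is column-major with one column per observation, the coordinates of each observation sit next to each other in memory. The sketch below packs a list of points into that layout; `pack_observations` is a hypothetical helper, and the nested-vector input is just one convenient starting representation, not something the library requires.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flattens a list of points into a column-major matrix where each column is
// one observation: the result has data_dim * num_obs entries, and the
// coordinates of observation 'o' start at offset o * data_dim.
inline std::vector<double> pack_observations(const std::vector<std::vector<double> >& points) {
    const std::size_t num_obs = points.size();
    const std::size_t data_dim = (num_obs > 0) ? points.front().size() : 0;

    std::vector<double> data;
    data.reserve(data_dim * num_obs);
    for (const std::vector<double>& point : points) {
        assert(point.size() == data_dim); // every observation needs the same dimensionality
        data.insert(data.end(), point.begin(), point.end());
    }
    return data;
}
```

This is also why a row-major matrix with observations in rows and dimensions in columns can be passed without copying: its memory layout is identical to the column-major layout described here.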

◆ initialize() [3/3]

Status< Index_, Float_ > umappp::initialize ( NeighborList< Index_, Float_ >  x,
int  num_dim,
Float_ *  embedding,
Options  options 
)
Template Parameters
Index_: Integer type of the neighbor indices.
Float_: Floating-point type for the distances.
Parameters
x: Indices and distances to the nearest neighbors for each observation. Note the expectations in the NeighborList documentation.
num_dim: Number of dimensions of the embedding.
[in,out] embedding: Pointer to an array in which to store the embedding, where rows are dimensions (num_dim) and columns are observations (x.size()). This is only used as input if Options::init == InitializeMethod::NONE, or if it is InitializeMethod::SPECTRAL_ONLY and spectral initialization fails; otherwise it is only used as output. The array must remain valid until the final call to Status::run().
options: Further options. Note that Options::num_neighbors is ignored here.
Returns
A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
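The lifetime requirement on embedding follows from the returned object writing through the raw pointer it was given. The sketch below illustrates the ownership pattern with a stand-in class: `MockStatus` and `run_mock_updates` are hypothetical, and the mock's run() only performs a placeholder update; it does not reproduce the behavior or signature of the real Status::run().

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a status object that keeps a non-owning pointer to the
// caller's embedding array. NOT the umappp::Status class.
class MockStatus {
public:
    MockStatus(double* embedding, std::size_t length) : embedding_(embedding), length_(length) {}

    // Placeholder update: writes through the stored pointer, which is why
    // the array must still be alive whenever this is called.
    void run() {
        for (std::size_t i = 0; i < length_; ++i) {
            embedding_[i] += 1.0;
        }
    }

private:
    double* embedding_;
    std::size_t length_;
};

// The owning vector is declared before the status object and lives in the
// same scope, so it outlives every call to run().
inline std::vector<double> run_mock_updates(std::size_t num_dim, std::size_t num_obs, int num_calls) {
    std::vector<double> embedding(num_dim * num_obs);
    MockStatus status(embedding.data(), embedding.size());
    for (int c = 0; c < num_calls; ++c) {
        status.run();
    }
    return embedding;
}
```

The same reasoning suggests avoiding anything that reallocates the buffer (for example resizing the vector) between initialization and the last run() call, since that would leave the stored pointer dangling.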