Methods for UMAP.
template<typename Index_ , typename Float_ >
Status< Index_, Float_ > initialize (NeighborList< Index_, Float_ > x, std::size_t num_dim, Float_ *embedding, Options options)
template<typename Index_ , typename Input_ , typename Float_ >
Status< Index_, Float_ > initialize (const knncolle::Prebuilt< Index_, Input_, Float_ > &prebuilt, std::size_t num_dim, Float_ *embedding, Options options)
template<typename Index_ , typename Float_ , class Matrix_ = knncolle::Matrix<Index_, Float_>>
Status< Index_, Float_ > initialize (std::size_t data_dim, std::size_t num_obs, const Float_ *data, const knncolle::Builder< Index_, Float_, Float_, Matrix_ > &builder, std::size_t num_dim, Float_ *embedding, Options options)
◆ NeighborList
template<typename Index_ , typename Float_ >
Lists of neighbors for each observation.
- Template Parameters
-
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
This is a convenient alias for the knncolle::NeighborList class. Each inner vector corresponds to an observation and contains the list of nearest neighbors for that observation, sorted by increasing distance. Neighbors for each observation should be unique: there should be no more than one occurrence of each index in each inner vector. Also, the inner vector for observation i should not contain any neighbor with index i.
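The expectations above can be checked before calling initialize(). The following is a minimal sketch, assuming that the neighbor list is a vector of per-observation vectors of (index, distance) pairs. The alias NeighborListSketch and the helper is_valid_neighbor_list() are hypothetical names introduced for illustration and are not part of umappp or knncolle.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Stand-in for the neighbor list type (an assumption for this sketch):
// one inner vector of (index, distance) pairs per observation.
template<typename Index_, typename Float_>
using NeighborListSketch = std::vector<std::vector<std::pair<Index_, Float_> > >;

// Hypothetical helper, not part of umappp. Returns true if every inner vector
// is sorted by increasing distance, has no repeated index, and does not
// contain the observation itself.
template<typename Index_, typename Float_>
bool is_valid_neighbor_list(const NeighborListSketch<Index_, Float_>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        std::unordered_set<Index_> seen;
        for (std::size_t j = 0; j < x[i].size(); ++j) {
            const auto& nb = x[i][j];
            if (static_cast<std::size_t>(nb.first) == i) {
                return false; // observation i lists itself as a neighbor
            }
            if (!seen.insert(nb.first).second) {
                return false; // the same index occurs more than once
            }
            if (j > 0 && nb.second < x[i][j - 1].second) {
                return false; // distances are not in increasing order
            }
        }
    }
    return true;
}
```

A caller might run such a check once on externally supplied neighbor search results and report an error if it returns false.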
◆ InitializeMethod
How should the initial coordinates of the embedding be obtained?
SPECTRAL: attempts initialization based on spectral decomposition of the graph Laplacian. If that fails, we fall back to random draws from a normal distribution.
SPECTRAL_ONLY: attempts spectral initialization as for SPECTRAL, but if that fails, we use the existing values in the supplied embedding array.
RANDOM: fills the embedding with random draws from a normal distribution.
NONE: uses the existing values in the supplied embedding array.
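The fallback rules for the four methods can be summarized as a small decision function. This is only a sketch of the behavior described above, not the library's implementation: the enums InitializeMethodSketch and CoordinateSource, the function resolve_source(), and the spectral_ok flag are all hypothetical names introduced for illustration.

```cpp
// Local mirror of the enumeration described above (illustration only).
enum class InitializeMethodSketch { SPECTRAL, SPECTRAL_ONLY, RANDOM, NONE };

// Hypothetical labels for where the initial coordinates end up coming from.
enum class CoordinateSource { SPECTRAL_RESULT, RANDOM_NORMAL, EXISTING_VALUES };

// Hypothetical helper: given the requested method and whether the spectral
// decomposition succeeded, report the source of the initial coordinates.
inline CoordinateSource resolve_source(InitializeMethodSketch method, bool spectral_ok) {
    switch (method) {
        case InitializeMethodSketch::SPECTRAL:
            // Falls back to random draws from a normal distribution.
            return spectral_ok ? CoordinateSource::SPECTRAL_RESULT
                               : CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::SPECTRAL_ONLY:
            // Falls back to whatever is already in the embedding array.
            return spectral_ok ? CoordinateSource::SPECTRAL_RESULT
                               : CoordinateSource::EXISTING_VALUES;
        case InitializeMethodSketch::RANDOM:
            return CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::NONE:
            return CoordinateSource::EXISTING_VALUES;
    }
    return CoordinateSource::EXISTING_VALUES; // not reached for valid inputs
}
```

One practical consequence is that the embedding array should hold meaningful values before initialization whenever NONE or SPECTRAL_ONLY is requested, since those are the only two cases where its existing contents may be read.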
◆ initialize() [1/3]
template<typename Index_ , typename Input_ , typename Float_ >
Status< Index_, Float_ > umappp::initialize (const knncolle::Prebuilt< Index_, Input_, Float_ > &prebuilt, std::size_t num_dim, Float_ *embedding, Options options)
- Template Parameters
-
Index_ | Integer type of the observation indices. |
Input_ | Floating-point type of the input data for the neighbor search. This is not used other than to define the knncolle::Prebuilt type. |
Float_ | Floating-point type of the neighbor distances and output embedding. |
- Parameters
-
| prebuilt | A neighbor search index built on the dataset of interest. |
| num_dim | Number of dimensions of the UMAP embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (the number of observations in prebuilt). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
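The column-major layout of the embedding array means that the coordinates of each observation are stored contiguously. A minimal sketch of the indexing is shown below; allocate_embedding() and embedding_offset() are hypothetical helpers introduced for illustration and are not part of umappp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: allocate storage for num_dim rows (dimensions) and
// num_obs columns (observations). The caller keeps this vector alive for as
// long as the embedding pointer is in use.
template<typename Float_>
std::vector<Float_> allocate_embedding(std::size_t num_dim, std::size_t num_obs) {
    return std::vector<Float_>(num_dim * num_obs);
}

// Hypothetical helper: position of dimension `dim` of observation `obs` in a
// column-major array with num_dim rows.
inline std::size_t embedding_offset(std::size_t num_dim, std::size_t dim, std::size_t obs) {
    assert(dim < num_dim);
    return obs * num_dim + dim;
}
```

With num_dim = 2, for example, the two coordinates of observation 3 sit at offsets 6 and 7. The pointer passed as embedding would be the data() of the allocated vector, which is why that vector should not be destroyed or resized before the final call to Status::run().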
◆ initialize() [2/3]
template<typename Index_ , typename Float_ >
Status< Index_, Float_ > umappp::initialize (NeighborList< Index_, Float_ > x, std::size_t num_dim, Float_ *embedding, Options options)
- Template Parameters
-
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
- Parameters
-
| x | Indices and distances to the nearest neighbors for each observation. Note the expectations in the NeighborList documentation. |
| num_dim | Number of dimensions of the embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (x.size()). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. Note that Options::num_neighbors is ignored here. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
◆ initialize() [3/3]
template<typename Index_ , typename Float_ , class Matrix_ = knncolle::Matrix<Index_, Float_>>
Status< Index_, Float_ > umappp::initialize (std::size_t data_dim, std::size_t num_obs, const Float_ *data, const knncolle::Builder< Index_, Float_, Float_, Matrix_ > &builder, std::size_t num_dim, Float_ *embedding, Options options)
- Template Parameters
-
Index_ | Integer type of the observation indices. |
Float_ | Floating-point type of the input data, neighbor distances and output embedding. |
Matrix_ | Class of the input matrix for the neighbor search. This should be a knncolle::SimpleMatrix or its base class (i.e., knncolle::Matrix ). |
- Parameters
-
| data_dim | Number of dimensions of the input dataset. |
| num_obs | Number of observations in the input dataset. |
[in] | data | Pointer to an array containing the input high-dimensional data as a column-major matrix. Each row corresponds to a dimension (data_dim ) and each column corresponds to an observation (num_obs ). |
| builder | Algorithm to use for the neighbor search. |
| num_dim | Number of dimensions of the embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (num_obs). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
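When the input is held as one vector per observation, it has to be flattened into the column-major data array before this overload can be used. The sketch below shows one way to do that; pack_column_major() is a hypothetical helper introduced for illustration and is not part of umappp.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: concatenate per-observation vectors into a single
// column-major buffer with data_dim rows and observations.size() columns.
template<typename Float_>
std::vector<Float_> pack_column_major(
    const std::vector<std::vector<Float_> >& observations,
    std::size_t data_dim)
{
    std::vector<Float_> data;
    data.reserve(observations.size() * data_dim);
    for (const auto& obs : observations) {
        if (obs.size() != data_dim) {
            throw std::invalid_argument("observation has an unexpected number of dimensions");
        }
        // Each observation becomes one contiguous column.
        data.insert(data.end(), obs.begin(), obs.end());
    }
    return data;
}
```

The resulting buffer has data_dim * num_obs elements, and its data() pointer would be supplied as data with num_obs equal to the number of packed observations.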