Methods for UMAP.
template<typename Index_, typename Float_>
Status< Index_, Float_ > initialize (NeighborList< Index_, Float_ > x, int num_dim, Float_ *embedding, Options options)

template<typename Dim_, typename Index_, typename Float_>
Status< Index_, Float_ > initialize (const knncolle::Prebuilt< Dim_, Index_, Float_ > &prebuilt, int num_dim, Float_ *embedding, Options options)

template<typename Dim_, typename Index_, typename Float_>
Status< Index_, Float_ > initialize (Dim_ data_dim, Index_ num_obs, const Float_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > &builder, int num_dim, Float_ *embedding, Options options)

◆ NeighborList
Lists of neighbors for each observation.
- Template Parameters
-
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
This is a convenient alias for the knncolle::NeighborList class. Each inner vector corresponds to an observation and contains the list of nearest neighbors for that observation, sorted by increasing distance. Neighbors for each observation should be unique; there should be no more than one occurrence of each index in each inner vector. Also, the inner vector for observation i should not contain any Neighbor with index i.
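The expectations above can be checked before calling initialize(). The following is a minimal sketch, not part of umappp: NeighborListSketch and is_valid_neighbor_list are hypothetical names, and the sketch assumes that the neighbor list is a vector (one entry per observation) of vectors of (index, distance) pairs.

```cpp
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Stand-in for the neighbor list type. Assumption: one inner vector per
// observation, each holding (neighbor index, distance) pairs.
template<typename Index_, typename Float_>
using NeighborListSketch = std::vector<std::vector<std::pair<Index_, Float_> > >;

// Hypothetical helper (not a umappp function). Returns true only if every
// inner vector is sorted by increasing distance, contains each index at most
// once, and does not list the observation as its own neighbor.
template<typename Index_, typename Float_>
bool is_valid_neighbor_list(const NeighborListSketch<Index_, Float_>& x) {
    for (std::size_t i = 0; i < x.size(); ++i) {
        const auto& neighbors = x[i];
        std::unordered_set<Index_> seen;
        for (std::size_t j = 0; j < neighbors.size(); ++j) {
            const auto& current = neighbors[j];
            if (static_cast<std::size_t>(current.first) == i) {
                return false; // observation i listed as its own neighbor
            }
            if (!seen.insert(current.first).second) {
                return false; // duplicated neighbor index
            }
            if (j > 0 && current.second < neighbors[j - 1].second) {
                return false; // distances are not in increasing order
            }
        }
    }
    return true;
}
```

A check like this costs time linear in the total number of neighbors, so it is cheap relative to the neighbor search itself.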
◆ InitializeMethod
How should the initial coordinates of the embedding be obtained?
SPECTRAL: attempts initialization based on spectral decomposition of the graph Laplacian. If that fails, we fall back to random draws from a normal distribution.
SPECTRAL_ONLY: attempts spectral initialization as before, but if that fails, we use the existing values in the supplied embedding array.
RANDOM: fills the embedding with random draws from a normal distribution.
NONE: uses the existing values in the supplied embedding array.
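The fallback rules above determine whether the caller needs to fill the embedding array before initialization. The following sketch encodes them as a small decision function. It only mirrors the descriptions above and is not the library's implementation: InitializeMethodSketch, CoordinateSource and resolve_source are hypothetical names, and spectral_succeeded stands for whatever internal condition decides that the spectral decomposition worked.

```cpp
// Hypothetical mirror of the four documented choices.
enum class InitializeMethodSketch { SPECTRAL, SPECTRAL_ONLY, RANDOM, NONE };

// Where the initial coordinates end up coming from.
enum class CoordinateSource { SPECTRAL_DECOMPOSITION, RANDOM_NORMAL, EXISTING_VALUES };

// Hypothetical helper: maps a method and the outcome of the spectral step
// to the source of the initial coordinates, following the text above.
inline CoordinateSource resolve_source(InitializeMethodSketch method, bool spectral_succeeded) {
    switch (method) {
        case InitializeMethodSketch::SPECTRAL:
            // Falls back to random draws if the spectral step fails.
            return spectral_succeeded ? CoordinateSource::SPECTRAL_DECOMPOSITION
                                      : CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::SPECTRAL_ONLY:
            // Falls back to whatever is already in the embedding array.
            return spectral_succeeded ? CoordinateSource::SPECTRAL_DECOMPOSITION
                                      : CoordinateSource::EXISTING_VALUES;
        case InitializeMethodSketch::RANDOM:
            return CoordinateSource::RANDOM_NORMAL;
        case InitializeMethodSketch::NONE:
            return CoordinateSource::EXISTING_VALUES;
    }
    return CoordinateSource::EXISTING_VALUES; // not reached for valid enum values
}
```

The practical consequence is that the embedding array should hold meaningful values on input for NONE, and should also do so for SPECTRAL_ONLY in case the spectral step fails.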
◆ initialize() [1/3]
- Template Parameters
-
Dim_ | Integer type for the dimensions of the input dataset. |
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
- Parameters
-
| prebuilt | A knncolle::Prebuilt instance constructed from the input dataset. |
| num_dim | Number of dimensions of the UMAP embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (the number of observations in prebuilt). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
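Because the embedding is a flat column-major array with one column per observation, the num_dim coordinates of each observation are stored contiguously. The following sketch shows the index arithmetic that this layout implies; embedding_offset and coordinates_of are hypothetical helpers for illustration and are not part of umappp.

```cpp
#include <cstddef>
#include <vector>

// Offset of dimension `dim` of observation `obs` in a column-major array
// with `num_dim` rows (dimensions) and one column per observation.
inline std::size_t embedding_offset(int num_dim, std::size_t obs, int dim) {
    return obs * static_cast<std::size_t>(num_dim) + static_cast<std::size_t>(dim);
}

// Hypothetical helper: copies the coordinates of a single observation
// out of the flat embedding array.
template<typename Float_>
std::vector<Float_> coordinates_of(const Float_* embedding, int num_dim, std::size_t obs) {
    const Float_* start = embedding + embedding_offset(num_dim, obs, 0);
    return std::vector<Float_>(start, start + num_dim);
}
```

Under this layout, a caller would allocate num_dim times the number of observations elements (for example in a std::vector) and keep that storage alive for as long as Status::run() may still be called.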
◆ initialize() [2/3]
template<typename Dim_, typename Index_, typename Float_>
Status< Index_, Float_ > umappp::initialize (Dim_ data_dim, Index_ num_obs, const Float_ *data, const knncolle::Builder< knncolle::SimpleMatrix< Dim_, Index_, Float_ >, Float_ > &builder, int num_dim, Float_ *embedding, Options options)
- Template Parameters
-
Dim_ | Integer type for the dimensions of the input dataset. |
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
- Parameters
-
| data_dim | Number of dimensions of the input dataset. |
| num_obs | Number of observations in the input dataset. |
[in] | data | Pointer to an array containing the input high-dimensional data as a column-major matrix. Each row corresponds to a dimension (data_dim ) and each column corresponds to an observation (num_obs ). |
| builder | Algorithm to use for the neighbor search. |
| num_dim | Number of dimensions of the embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (num_obs). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.
◆ initialize() [3/3]
- Template Parameters
-
Index_ | Integer type of the neighbor indices. |
Float_ | Floating-point type for the distances. |
- Parameters
-
| x | Indices and distances to the nearest neighbors for each observation. Note the expectations in the NeighborList documentation. |
| num_dim | Number of dimensions of the embedding. |
[in,out] | embedding | Pointer to an array in which to store the embedding. This is treated as a column-major matrix where rows are dimensions (num_dim) and columns are observations (x.size()). Existing values in this array will be used as input if Options::initialize = InitializeMethod::NONE, and may be used as input if Options::initialize = InitializeMethod::SPECTRAL_ONLY; otherwise it is only used as output. The lifetime of the array should be no shorter than the final call to Status::run(). |
| options | Further options. Note that Options::num_neighbors is ignored here. |
- Returns
- A Status object containing the initial state of the UMAP algorithm. Further calls to Status::run() will update the embeddings in embedding.