topicks
Pick top genes for downstream analyses
The topicks library implements a pick_top_genes() function to pick the top genes based on some statistic. It is intended for choosing highly variable genes based on their variances (e.g., from scran_variances) or for picking the best markers based on a differential expression statistic (e.g., from scran_markers). This functionality is surprisingly complex when we need to consider ties, absolute bounds, and whether to return a boolean filter or an array of indices. We also implement the TopQueue class, a tie- and bound-aware priority queue for retaining the top genes.
We can obtain an array of booleans indicating whether each gene was picked based on its stats:
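To illustrate what such a boolean filter contains, here is a minimal standalone sketch. It does not use the topicks headers: `pick_top_mask()` is a hypothetical helper written for this example, and it assumes that larger statistics are better and that the input has no NaNs.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper (not the topicks API): flag the `top` largest statistics.
// Genes tied with the last selected value are also flagged, so more than
// `top` entries may be true. Assumes larger is better and no NaNs.
std::vector<bool> pick_top_mask(const std::vector<double>& stats, std::size_t top) {
    std::vector<bool> keep(stats.size(), false);
    if (top == 0 || stats.empty()) {
        return keep;
    }
    if (top >= stats.size()) {
        keep.assign(stats.size(), true);
        return keep;
    }

    // Partially order a copy so that position top - 1 holds the smallest
    // statistic that still makes it into the top set.
    std::vector<double> sorted(stats);
    std::nth_element(sorted.begin(), sorted.begin() + (top - 1), sorted.end(), std::greater<double>());
    const double threshold = sorted[top - 1];

    for (std::size_t i = 0; i < stats.size(); ++i) {
        keep[i] = (stats[i] >= threshold);
    }
    return keep;
}
```

The mask has one entry per gene, in the same order as the input statistics, so it can be used directly to subset any other per-gene array.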
Alternatively we can obtain an array of integer indices:
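The index form carries the same information in a more compact layout. The sketch below is a standalone illustration rather than the library's implementation: `pick_top_indices()` is a hypothetical name, larger statistics are assumed to be better, and the indices are returned in increasing order (an assumption made here for readability).

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not the topicks API): indices of the `top` largest
// statistics, sorted by index. Ties at the selection boundary are retained.
std::vector<std::size_t> pick_top_indices(const std::vector<double>& stats, std::size_t top) {
    std::vector<std::size_t> order(stats.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    if (top == 0) {
        return {};
    }
    if (top >= order.size()) {
        return order;
    }

    // Order gene indices by decreasing statistic.
    std::stable_sort(order.begin(), order.end(),
        [&stats](std::size_t l, std::size_t r) { return stats[l] > stats[r]; });

    // Extend the selection while the next gene ties with the last chosen one.
    std::size_t end = top;
    while (end < order.size() && stats[order[end]] == stats[order[top - 1]]) {
        ++end;
    }
    order.resize(end);

    std::sort(order.begin(), order.end());
    return order;
}
```

Indices are usually the better choice when only a small fraction of genes is selected, as the output scales with the number of chosen genes rather than the total.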
By default, ties at the selection boundary are retained so the actual number of chosen genes may be greater than what was requested. This can be disabled via the PickTopGenesOptions options:
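The effect of such a switch can be sketched without the library. In the standalone example below, `TieOptions` and `pick_top_with_tie_option()` are hypothetical stand-ins, not the fields or functions of PickTopGenesOptions. Breaking ties in favor of the lower index when ties are not kept is also an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical options struct for this sketch only.
struct TieOptions {
    bool keep_ties = true; // retain all genes tied at the selection boundary
};

// Hypothetical helper: indices of the top genes, larger statistics first.
// With keep_ties = false, exactly min(top, ngenes) indices are returned.
std::vector<std::size_t> pick_top_with_tie_option(
    const std::vector<double>& stats, std::size_t top, const TieOptions& opt)
{
    std::vector<std::size_t> order(stats.size());
    std::iota(order.begin(), order.end(), std::size_t(0));
    top = std::min(top, order.size());
    if (top == 0) {
        return {};
    }

    // Decreasing statistic; equal statistics are ordered by index so that
    // the result is deterministic when ties are cut off.
    std::sort(order.begin(), order.end(), [&stats](std::size_t l, std::size_t r) {
        if (stats[l] != stats[r]) {
            return stats[l] > stats[r];
        }
        return l < r;
    });

    std::size_t end = top;
    if (opt.keep_ties) {
        while (end < order.size() && stats[order[end]] == stats[order[top - 1]]) {
            ++end;
        }
    }
    order.resize(end);
    std::sort(order.begin(), order.end());
    return order;
}
```

For statistics {1, 3, 2, 2, 0} and a request for the top 2 genes, this sketch returns indices {1, 2, 3} when ties are kept and {1, 2} when they are not.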
We can also set an absolute bound on the statistic, e.g., to ensure that we never select marker genes with log-fold changes below some threshold:
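One way to think about the bound is as a filter applied before the ranking. The following standalone sketch takes that approach. `pick_top_with_bound()` is a hypothetical helper rather than the library's function, and treating the bound as closed (values equal to it remain eligible) is an assumption of this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the topicks API): pick up to `top` genes with the
// largest statistics, never selecting a gene whose statistic is below `bound`.
// Ties at the selection boundary are retained.
std::vector<std::size_t> pick_top_with_bound(
    const std::vector<double>& stats, std::size_t top, double bound)
{
    if (top == 0) {
        return {};
    }

    // Only genes at or above the bound are candidates at all.
    std::vector<std::size_t> candidates;
    for (std::size_t i = 0; i < stats.size(); ++i) {
        if (stats[i] >= bound) {
            candidates.push_back(i);
        }
    }

    // If more candidates pass the bound than requested, rank them.
    if (candidates.size() > top) {
        std::stable_sort(candidates.begin(), candidates.end(),
            [&stats](std::size_t l, std::size_t r) { return stats[l] > stats[r]; });
        std::size_t end = top;
        while (end < candidates.size() && stats[candidates[end]] == stats[candidates[top - 1]]) {
            ++end;
        }
        candidates.resize(end);
        std::sort(candidates.begin(), candidates.end());
    }
    return candidates;
}
```

Note that the bound can cause fewer genes to be returned than requested: with log-fold changes {0.2, 1.5, 0.8, 2.0, -1.0}, a request for 3 genes and a bound of 1.0, only the genes at indices 1 and 3 are reported.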
Perhaps we don't have an entire array of statistics, and we are only computing each gene's statistics as needed. We can use the TopQueue class to choose the top genes in a running manner:
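The running approach can be sketched as a small container that only ever holds the current top set. The `RunningTop` class below is a hypothetical illustration built on `std::multimap`. It is not the TopQueue interface, it omits bounds, and it assumes that larger statistics are better and that no NaNs are pushed.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical tie-aware running selector (not the TopQueue interface).
class RunningTop {
public:
    explicit RunningTop(std::size_t top) : my_top(top) {}

    // Offer one gene's statistic as soon as it has been computed.
    void push(double stat, std::size_t gene) {
        if (my_top == 0) {
            return;
        }
        // Already full and strictly worse than the current worst: skip.
        if (my_kept.size() >= my_top && stat < my_kept.begin()->first) {
            return;
        }
        my_kept.emplace(stat, gene);

        // Drop the lowest group of tied genes, but only if at least
        // `top` genes would remain without that group.
        while (my_kept.size() > my_top) {
            const double lowest = my_kept.begin()->first;
            const std::size_t ties = my_kept.count(lowest);
            if (my_kept.size() - ties < my_top) {
                break;
            }
            my_kept.erase(lowest);
        }
    }

    // Indices of the currently retained genes, in increasing order.
    std::vector<std::size_t> genes() const {
        std::vector<std::size_t> out;
        out.reserve(my_kept.size());
        for (const auto& kv : my_kept) {
            out.push_back(kv.second);
        }
        std::sort(out.begin(), out.end());
        return out;
    }

private:
    std::size_t my_top;
    std::multimap<double, std::size_t> my_kept; // statistic -> gene index
};
```

Memory usage is bounded by the requested number of genes plus any ties at the boundary, so the full array of statistics never needs to be held at once.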
Check out the reference documentation for more details.
FetchContent
If you're using CMake, you just need to add something like this to your CMakeLists.txt:
Then you can link to topicks to make the headers available during compilation:
find_package()
To install the library, use:
By default, this will use FetchContent to fetch all external dependencies. If you want to install them manually, use -DTOPICKER_FETCH_EXTERN=OFF. See the tags in extern/CMakeLists.txt to find compatible versions of each dependency.
If you're not using CMake, the simple approach is to just copy the files in include/ - either directly or with Git submodules - and include their path during compilation with, e.g., GCC's -I. This requires the external dependencies listed in extern/CMakeLists.txt.