Upgrading from beachmat to tatami #52
Comments
Hey Aaron,
Very kind of you to think of me as your guinea pig 😄
Best,
Great. The timeframe is fine, I'll probably need to sort out the hiccups on the BBS anyway.
Hey, I read through the documentation of tatami and rewrote the relevant code. It seems to work fine, but there are some bits where I am not sure that I did them optimally.
Nice.
Technically, it is possible to do this inside the C++ function - and in fact, previous versions of the tatami_r library did have a mechanism for doing so. From a package developer perspective, the amount of boilerplate should be equivalent to the old beachmat3 code with workspaces. I'm not sure if you were using workspaces previously, but the idea is to cache information between repeated calls. The new extractors are analogous to the old workspaces but with certain enhancements during their construction.
Technically you don't need to allocate a buffer, you can just call fetch() without supplying one. If it helps, I think of the extractors as being (kind of, philosophically) like iterators for the usual standard-library containers.
Do you just want to do something like this?

```cpp
counts_worker->fetch_copy(gene_idx, static_cast<double*>(counts_vec.begin()));
```

This will fetch and copy into counts_vec. Having said all that... the use of R data structures does make it a bit harder to parallelize, along with the fact that the loop involves possibly calling back into an R function. It can be done but does require some care - LMK if you are interested.
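(For orientation, here is a rough end-to-end sketch of that pattern, assuming a bound tatami matrix Y_bm and a reusable Rcpp::NumericVector as in the snippets elsewhere in this thread; the loop structure is illustrative rather than taken from glmGamPoi.)

```cpp
// Sketch only: assumes Y_bm is a shared pointer to a tatami matrix and
// n_samples equals Y_bm->ncol(); the per-gene work is left as a placeholder.
auto counts_worker = Y_bm->dense_row();      // extractor, analogous to an old workspace
Rcpp::NumericVector counts_vec(n_samples);   // reusable R-side buffer
for (int gene_idx = 0; gene_idx < Y_bm->nrow(); ++gene_idx) {
    counts_worker->fetch_copy(gene_idx, static_cast<double*>(counts_vec.begin()));
    // ... use counts_vec for this gene ...
}
```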
Thanks for the thorough explanation.
No, I don't think I used workspaces. Previously I did:

```cpp
auto Y_bm = beachmat::create_matrix<NumericType>(Y_SEXP);
typename NumericType::vector counts(n_samples);
Y_bm->get_row(gene_idx, counts.begin());
```

Now the equivalent code appears to be:

```cpp
Rtatami::BoundNumericPointer Y_parsed(Y);
const auto& Y_bm = Y_parsed->ptr;
auto counts_worker = Y_bm->dense_row();
std::vector<double> counts_buffer(n_samples);
const double* counts = counts_worker->fetch(gene_idx, counts_buffer.data());
```

Or was the original code actually problematic and copied the data from the matrix into counts?
True, that is pretty cool :)
Well, ideally I would want ...
The possibility to parallelize execution sounds really neat. How does the current implementation play with ...?
Technically, it's just one extra line if you do ...
I think it was fine, it was just kinda inefficient in certain cases, e.g., row-wise iteration through a sparse matrix.
Yeah, that's what the ... (Note that ...)
Not at all. The parallelization is done using C++11's std::thread. In theory, we could also use OpenMP, but it's not supported on older versions of Clang, and the R-from-C++ evaluation is not protected with OpenMP because I don't know how to do the relevant shenanigans.

I just realized that the documentation for this functionality is not that great, but in your case, it would be something like:

```cpp
// Do R-related allocations in a serial section OR inside mexec.run().
std::vector<Rcpp::NumericVector> buffers;
buffers.reserve(num_threads);
for (int t = 0; t < num_threads; ++t) {
    buffers.emplace_back(n_samples);
}

std::vector<double*> buffer_ptrs;
buffer_ptrs.reserve(num_threads);
for (int t = 0; t < num_threads; ++t) {
    buffer_ptrs.emplace_back(static_cast<double*>(buffers[t].begin()));
}

// The executor ensures that R-related operations run on the main thread.
auto& mexec = tatami_r::executor();

tatami_r::parallelize([&](size_t thread, int start, int length) -> void {
    auto ext = mat->dense_row();
    // OR, for the advanced: request extraction of [start, start + length) dense rows in this thread.
    // auto ext = tatami::consecutive_extractor</* row = */ true, /* sparse = */ false>(mat.get(), start, length);

    auto& buffer = buffers[thread];
    auto bptr = buffer_ptrs[thread];

    for (int r = start, end = start + length; r < end; ++r) {
        auto ptr = ext->fetch_copy(r, bptr); // do something non-R-related with ptr/bptr; they're the same here.
        mexec.run([&]() -> void { // now for something potentially R-related.
            some_kind_of_R_operation(buffer);
        });
    }
}, mat->nrow(), num_threads);
```

Ooof. I'll just say that parallel tatami code is usually much cleaner; it's just that your use case needs to interact with the R API, and that requires some gymnastics to protect R from simultaneous activity on multiple threads.
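(As a point of comparison - and this is a rough sketch under my own assumptions, not code from this thread - if the per-row work never touches the R API, the Rcpp buffers and the mexec.run() section go away and the loop collapses to something like:)

```cpp
// Sketch only: same parallelize call as above, but with plain thread-local
// std::vector buffers and no executor section, since nothing here calls into R.
tatami_r::parallelize([&](size_t thread, int start, int length) -> void {
    auto ext = mat->dense_row();
    std::vector<double> buffer(mat->ncol());
    for (int r = start, end = start + length; r < end; ++r) {
        auto ptr = ext->fetch_copy(r, buffer.data());
        // ... purely C++ work with ptr, e.g. accumulating a per-row statistic ...
    }
}, mat->nrow(), num_threads);
```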
You'd have to call that from a serial section, so the code would be a little more complicated. For example, you'd need to parallelize 100 rows at a time in each thread, then return to the serial section to check the interrupt, and then parallelize the next block of rows.
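(For what it's worth, a minimal sketch of that block-wise pattern might look like the following - assuming the same mat/num_threads setup as above, a hypothetical process_row() helper standing in for the per-row work, and Rcpp::checkUserInterrupt() as the interrupt check; adjust to taste.)

```cpp
// Sketch only: run the parallel section in blocks of 100 rows, returning to the
// main thread between blocks so that the interrupt check happens serially.
const int block_size = 100;
const int total_rows = mat->nrow();

for (int first = 0; first < total_rows; first += block_size) {
    const int block_len = std::min(block_size, total_rows - first);

    tatami_r::parallelize([&](size_t thread, int start, int length) -> void {
        // 'start' and 'length' are positions within the current block.
        for (int i = start, end = start + length; i < end; ++i) {
            process_row(thread, first + i); // hypothetical per-row work
        }
    }, block_len, num_threads);

    // Back in the serial section: safe to touch the R API.
    Rcpp::checkUserInterrupt();
}
```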
Nudge - any progress? Need help?
Hey, sorry for the silence. I was traveling last week and am currently at ICML. I didn't manage to spend any more time on it. Maybe we can discuss next week in person :)
Just a heads up that, during the next release, I will be slowly removing some of the tests related to the old beachmat API (not any functionality, just the tests) in order to reduce beachmat's build time on the BioC machines. This may be a good opportunity to switch to the new tatami API, which will be much more comprehensively tested than beachmat's old API, especially once I start trimming back the test suite.
I think you're still using the old-school beachmat interface, which is fine and will still be around.
But I wonder if you'll consider experimenting with the new tatami interface, which will hopefully struggle its way onto BioC-devel via beachmat (v2.17.3) in the next few days.
This does away with the need for integer/numeric templates, it makes it easier to parallelize at the C++ level, and some delayed operations are captured natively (i.e., some DelayedArrays are translated to their C++ equivalents). Other R matrices are still supported, of course, but are handled more slowly by calling back into DelayedArray::extract_array(). It also greatly simplifies compilation.

In the coolest and most experimental part of this update, beachmat will generate external pointers to matrix instances that can be passed to glmGamPoi's shared libraries. This means that glmGamPoi doesn't actually need to compile support for the various matrix implementations in its own *.so file, which makes it easier to extend tatami's C++ support to, e.g., HDF5-, TileDB-, or Zarr-backed matrices.

I've already switched SingleR to this latest paradigm; hopefully it'll pass BioC-devel in the next few days 🤞 If it works, I'd like a non-Aaron package developer to test out this new approach, and you're the lucky one.
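(To make the external-pointer handoff concrete, here is a hedged sketch of what the receiving C++ function in a downstream package might look like, mirroring the Rtatami snippets earlier in this thread; the function name, the row-sum payload, and the exact header path are my own illustrative assumptions.)

```cpp
// Sketch only: accept the external pointer produced by beachmat on the R side
// and treat it as a read-only tatami matrix. Assumes LinkingTo: Rcpp, beachmat, etc.
#include "Rtatami.h"
#include <numeric>
#include <vector>

// [[Rcpp::export]]
Rcpp::NumericVector row_sums_from_pointer(SEXP bound) {
    Rtatami::BoundNumericPointer parsed(bound); // wrap the external pointer
    const auto& mat = parsed->ptr;              // shared pointer to a tatami matrix

    Rcpp::NumericVector output(mat->nrow());
    std::vector<double> buffer(mat->ncol());
    auto ext = mat->dense_row();
    for (int r = 0; r < mat->nrow(); ++r) {
        auto ptr = ext->fetch(r, buffer.data());
        output[r] = std::accumulate(ptr, ptr + mat->ncol(), 0.0);
    }
    return output;
}
```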