Parallelize Ranked Bray-Curtis Calculation in ExtractCore Function #24

Open
4 tasks
jibarozzo opened this issue Mar 11, 2025 · 0 comments
Labels: enhancement (New feature or request), Normal Priority (When can we get this started/done?)

Comments

@jibarozzo
Collaborator

Issue: Parallelize Ranked Bray-Curtis Calculation in ExtractCore Function

Description:
The ranked Bray-Curtis (BC) portion of the ExtractCore function is computationally intensive and does not utilize all available cores on the HPCC. This issue proposes parallelizing the BC calculations to improve performance.


Problem

  • The current implementation processes OTUs sequentially, leading to long runtimes for datasets with many OTUs.
  • The HPCC's available cores are underutilized during execution.

Proposed Solution

  1. Parallelize calculate_bc():

    • Use the parallel or doParallel package to distribute BC calculations across multiple cores.
    • Focus on the loop where ranked OTUs are added iteratively to the BC matrix.
  2. Refactor Ranked BC Logic:

    • Introduce a new helper function, rank_bc(), to handle the ranked OTU calculations.
    • Optimize the loop to stop early if BC contributions flatline (e.g., after the first 1000 OTUs).
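
One way to picture items 1 and 2 together, sketched in Python purely for illustration (the issue itself targets R's parallel or doParallel): if every sample is rarefied to the same depth and BC is taken in the fixed-denominator form sum(|x_i - x_j|) / (2 * depth), then each OTU's contribution is independent of the others. The contributions can be mapped across workers and then summed cumulatively in rank order. The names otu_contribution, rank_bc_parallel, depth, and map_fn are hypothetical, and the formula is an assumption about what calculate_bc() computes, not a description of the current code.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from itertools import combinations


def otu_contribution(row, depth):
    """Summed |x_i - x_j| / (2 * depth) over all sample pairs for one OTU.

    ASSUMPTION: all samples are rarefied to the same `depth`, so the BC
    denominator is constant and per-OTU terms are additive.
    """
    return sum(abs(a - b) for a, b in combinations(row, 2)) / (2 * depth)


def rank_bc(ranked_counts, depth, map_fn=map):
    """Mean pairwise BC using the top 1, 2, ..., n ranked OTUs.

    ranked_counts: one row per OTU (already sorted by rank), one column per sample.
    map_fn: any map-like callable (built-in map, or a pool's .map method).
    """
    n_samples = len(ranked_counts[0])
    n_pairs = n_samples * (n_samples - 1) // 2
    # Independent per-OTU work: this is the part that can be distributed.
    per_otu = list(map_fn(partial(otu_contribution, depth=depth), ranked_counts))
    # The cumulative sum is cheap and stays sequential.
    curve, running = [], 0.0
    for value in per_otu:
        running += value
        curve.append(running / n_pairs)
    return curve


def rank_bc_parallel(ranked_counts, depth, n_workers=4, chunksize=64):
    """Same result as rank_bc, with per-OTU work spread over worker processes."""
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return rank_bc(ranked_counts, depth,
                       map_fn=partial(pool.map, chunksize=chunksize))
```

Calling rank_bc(ranked_counts, depth) runs serially. rank_bc_parallel is best called from a script's main entry point, so that worker processes can import the helper functions. Under these assumptions the OTU loop itself is the unit of parallel work, and the BC function does not need to be parallel internally.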

Discussion Points

  • Should parallelization focus on:
    • The calculate_bc() function itself?
    • The loop that iterates over ranked OTUs?
    • Both?
  • How should early termination be handled when BC contributions plateau?
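
As a starting point for the plateau question, here is a minimal Python sketch of one possible stopping rule: stop once the relative gain in cumulative BC stays below a tolerance for several consecutive ranks. The function name truncate_at_plateau and the defaults tol=0.01 and patience=3 are hypothetical placeholders, not values taken from ExtractCore.

```python
def truncate_at_plateau(cumulative_bc, tol=0.01, patience=3):
    """Consume cumulative BC values lazily and stop at the first plateau.

    A plateau is `patience` consecutive ranks whose relative gain over the
    previous rank is below `tol`. Both thresholds are illustrative assumptions.
    Returns the values consumed, including the rank where the plateau was detected.
    """
    kept = []
    flat = 0
    for value in cumulative_bc:
        if kept:
            prev = kept[-1]
            gain = (value - prev) / prev if prev > 0 else float("inf")
            # Reset the counter whenever a rank still adds a meaningful amount.
            flat = flat + 1 if gain < tol else 0
        kept.append(value)
        if flat >= patience:
            break
    return kept
```

Because the input is consumed lazily, passing a generator means ranks beyond the plateau are never computed. If the per-OTU work runs in parallel, one option is to process ranks in chunks and apply this check after each chunk, at the cost of some wasted work in the final chunk.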

Next Steps

  • Implement parallelization using parallel or doParallel.
  • Refactor ranked BC logic into rank_bc().
  • Add early termination logic for flatlining BC contributions.
  • Benchmark performance improvements on HPCC.

Collaborators:

jibarozzo added the enhancement and Normal Priority labels on Mar 11, 2025
jibarozzo added this to the All-BRC-Analysis milestone on Mar 11, 2025
2 participants