Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Matrix completion for robust CLR in decostand #619

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

antagomir
Copy link
Contributor

@antagomir antagomir commented Feb 12, 2024

This is intended to solve #570

I am opening the PR in order to request feedback before final checks & possible polishing.

Do the maintainers see any shortcomings / issues to be addressed on this?

The matrix completion step is straighforward. The code has been modified from the ROptSpace package. The code was moved to vegan in order to avoid new dependencies, although using the actual original package directly would simplify this and I will happy to change into that if this idea receives support. The ROptSpace package has been cited and its license (MIT) is compatible with vegan's license (GPL) and the code from ROptSpace can be included in vegan.

Unit tests and examples have been updated, too.

Comparisons with an independent Python implementation (deicode) are ongoing, I will check those as well to make sure that this works as expected.

Ah, should the PRs be done againsta vegandevs::master, or another branch?

@gavinsimpson
Copy link
Contributor

One thing that jumped out at me is that we are now depending on {testthat} with this PR. I personally don't mind, but @jarioksa might? We've typically avoid running too many tests in the package, and I don't see testthat used anywhere in the tests.

@antagomir
Copy link
Contributor Author

Ah, true. Accidental. I now removed that testthat dependency. DECOID validations still to do but open for other feedback meanwhile.

Copy link
Contributor

@jarioksa jarioksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a first go-through and only concerns style. I'll test the working of the function later. However, looks fine at the first look, and I'll head for a CRAN release soon after this is merged.

DESCRIPTION Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please no version upgrades except with releases (that is looming after this is merged, vegan3d released + vegan upgraded for this, and some other pending fixes). So still 2.6-5, and the CRAN release would be 2.6-6.




# .OptSpace : an algorithm for matrix reconstruction from a partially revealed set
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not necessary, but I'd prefer doubled ## for comments on their own line.

# Matrix Completion From a Few Entries.
# IEEE Transactions on Information Theory 56(6):2980--2998.
#
.OptSpace <- function(A, ropt=NA, niter=50, tol=1e-6, showprogress=FALSE){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other vegan functions uses trace = FALSE instead of showprogress. Wouldn't hurt here, although not crucial. These functions also usually cat() tracing instead of message().

Vegan usually has opening { on a new line in function definition (again: not crucial).

R-ext manual says that stop() (and warning()) messages should be seen as a part of a sentence and should not have period / full stop (.) at the end.

}
niter = round(niter)

m0 = 10000
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again nitpicking: I have followed C convention where arbitrary constants are written uppercase (M0, RHO).


# initialize
dist = array(0,c(1,(niter+1)))
dist[1] = norm((M_E - (X%*%S%*%t(Y)))*E,'f')/sqrt(nnZ.E)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formatting again: use spaces around %*% for readability (still quite OK here, but becomes awkward when you have list elements with $-signs.

@@ -34,6 +35,10 @@ a1 <- vegan::vegdist(testdata, method = "aitchison", pseudocount=1)
a2 <- vegan::vegdist(testdata+1, method = "aitchison")
max(abs(a1-a2)) < 1e-6 # Tolerance

# Robust rclr+Euclid should give same result than robust.aitchison
d1 <- vegan::vegdist(vegan::decostand(testdata, "rclr"), "euclidean")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In vegan, do not use vegan::function(): Ripley can catch this and complain (that has happened to me).

@antagomir
Copy link
Contributor Author

Thanks! We will go through these and update.

I would still compare this with results from an independent Python implementation. That currently seems to have broader installation problems. There seem to be some differences at the moment and we are troubleshooting. This will be good to complete before merging.

@jarioksa
Copy link
Contributor

I'm heading for a CRAN release. It will still take some time: tests are ahead. In good case could be before May Day. Would this suit your plans with CLR?

@antagomir
Copy link
Contributor Author

The main bottleneck currently is that we cannot compare with external Python implementation to make sure the results are comparable. We were waiting for that (external issue) to become fixed. It would be good to get this fixed, I can try to see this week if this could be doable nevertheless. We will report back this week how it looks. Otherwise, it can wait until the next CRAN release.

@antagomir
Copy link
Contributor Author

This PR will be updated & reconsidered after #667 is ready.

If we add the matrix completion step in vegan this would be require a new dependency, the "filling" package, which has additional dependencies (CVXR, Rcpp, Rdpack, ROptSpace, RSpectra, nabor, stats, utils).

Then the question is whether such new dependency would be acceptable?

Entirely reimplementing the filling operation in vegan might not be feasible but we can have a look if it is not possible add new dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants