Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error with Alpha filtration #14

Open
kiranvad opened this issue Jan 23, 2020 · 6 comments
Open

Error with Alpha filtration #14

kiranvad opened this issue Jan 23, 2020 · 6 comments

Comments

@kiranvad
Copy link

Hey,
First of all, thanks for a great package. I find all of the scikit-tda packages really useful.
I'm trying to compute Alpha filtration of a 3-dimensional data (attached as a text file) and I get the following error(but return an empty persistence diagram)
I couldn't find enough resources online as to why this can be the case.
Any help on this would be greatly appreciated.

Doing spatial.Delaunay triangulation...
Finished spatial.Delaunay triangulation (Elapsed Time 0.00938)
Building alpha filtration...
Finished building alpha filtration (Elapsed Time 4.01)
Constructing boundary matrix...
Error: Not a proper filtration: [107 111 207] added before (107, 111)
Computing persistence pairs...
Finished computing persistence pairs (Elapsed Time 9.16e-05)

points.txt

@ctralie
Copy link
Member

ctralie commented Jan 24, 2020 via email

@ctralie
Copy link
Member

ctralie commented Jan 24, 2020 via email

@ctralie
Copy link
Member

ctralie commented Jul 14, 2020

Hello Kiran,
My apologies for the incredibly long lag on this. I suspect you have moved onto other things now, but I still want to fix this, and also to add the weighted capabilities that you opened in the other pull request (the code there is good overall, but I want to try to address some potential numerical stability issues with your implementation). I finally have a little bit of time, so I'm going to try to knock this out over the next couple of weeks.

First, starting with this issue, I noticed that the points are not in general position; they must have been sampled from some sort of grid.

Points

So this is causing numerical problems both with our library and with GUDHI actually (which you may have already noticed).
One way to deal with this problem is to perturb the points slightly. Here's some code I wrote to do this:

import numpy as np
import matplotlib.pyplot as plt
import cechmate as cm
import gudhi
from persim import plot_diagrams

def convertGUDHIPD(pers):
    Is = [[], [], []]
    for i in range(len(pers)):
        (dim, (b, d)) = pers[i]
        Is[dim].append([b, d])
    return [np.sqrt(np.array(I)) for I in Is]

def alphaBug():
    np.random.seed(0)
    X = np.loadtxt("points.txt")
    XP = X + 0.01*np.random.randn(X.shape[0], X.shape[1])

    alpha = cm.Alpha()
    alpha_filtration = alpha.build(XP)
    dgmsalpha = alpha.diagrams(alpha_filtration)

    alpha_complex = gudhi.AlphaComplex(points=XP)
    simplex_tree = alpha_complex.create_simplex_tree()
    dgmsalpha2 = simplex_tree.persistence()
    dgmsalpha2 = convertGUDHIPD(dgmsalpha2)

    plt.figure(figsize=(8, 4))
    plt.subplot(121)
    if len(dgmsalpha) > 0:
        plot_diagrams(dgmsalpha)
    plt.title("Cechmate")
    plt.subplot(122)
    plot_diagrams(dgmsalpha2)
    plt.xlim([-1, 6])
    plt.ylim([-1, 6])
    plt.title("GUDHI")
    plt.tight_layout()
    plt.savefig("Perturbed.png", bbox_inches='tight')

alphaBug()

When I run this code, I get a completely satisfactory answer that agrees with GUDHI
Perturbed

If I go back and run both cechmat and GUDHI on the original point cloud, GUDHI also has some issues. In particular, it has some classes that are born in H1 and H2 very late but which die almost instantly
Original

if I zoom into the different range, though, the rest of the diagram looks fine:
Original_Zoom
So overall, GUDHI definitely handles this more gracefully, but it still has some strange numerical issues.

Anyway, this problem could be pretty difficult to resolve, but I would like to figure it out. I noticed that this example also throws Cech into an infinite loop!

For now, if you want to use alpha, you should just add some random noise to this dataset. But that's clearly not a satisfactory answer in general

ctralie added a commit that referenced this issue Jul 15, 2020
@ctralie
Copy link
Member

ctralie commented Jul 15, 2020

Hello @kiranvad,
Here's an update after a lot of work today.

First of all, I found a bug with the simplex condition (which was the first problem you noticed)/

But also, it's more and more clear that the points you provided are not in general position. I verified that a lot of the tetrahedra it tries to add into the filtration are numerically flat. Here's one, for example:
DegenerateTetrahedron

Bearing this in mind, I did a few things to improve numerical precision. The biggest help was to normalize the simplices before computing the circumcenter, and then to scale back at the end

scaleSqr = np.max(np.sum(Y**2, 1))

I also avoid adding a simplex if it's numerically flat after this normalization.

Here's what happens for slightly perturbed points (same as before)
Bugfix_Perturbed

Here's the result for both of the libraries now (notice how cechmate is not crashing anymore
Bugfix

Zooming in, though, there's a problem I didn't notice before with GUDHI, and there's a different problem with cechmate
Bugfix_Zoom
Cechmate has two extra points of non-negligible persistence in H1, while GUDHI has three extra points of non-negligible persistence in H2. So it seems we're on par with GUDHI now in the ways that we both mess up.

I still want to leave this issue open, though, because there are definitely more elegant ways of dealing with this. I want to see if I can use "Simulations of Simplicity" when computing circumradii, which is a more systematic way of perturbing points with theoretical guarantees
https://arxiv.org/pdf/math/9410209.pdf
For now, if you want to use either cechmate or GUDHI, you should be aware of this issue, and you should perturb your points slightly manually.

@kiranvad
Copy link
Author

Hi Chris,
Thanks a lot for your response. This is really helpful.
It is interesting that it adds flat tetrahedron simplices although the point cloud I had this issue with has a nice geometric structure to it.
Perturbing points makes sense for the data I was using for this work. That's a physically meaningful solution.

@ctralie
Copy link
Member

ctralie commented Jul 26, 2020 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants