Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implemented majority of MOMP [#1031] #1046

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

joehiggi1758
Copy link
Contributor

Pull Request Checklist

Below is a simple checklist but please do not hesitate to ask for assistance!

  • Fork, clone, and checkout the newest version of the code
  • Create a new branch
  • Make necessary code changes
  • Install black (i.e., python -m pip install black or conda install -c conda-forge black)
  • Install flake8 (i.e., python -m pip install flake8 or conda install -c conda-forge flake8)
  • Install pytest-cov (i.e., python -m pip install pytest-cov or conda install -c conda-forge pytest-cov)
  • Run black --exclude=".*\.ipynb" --extend-exclude=".venv" --diff ./ in the root stumpy directory
  • Run flake8 --extend-exclude=.venv ./ in the root stumpy directory
  • Run ./setup.sh dev && ./test.sh in the root stumpy directory
  • Reference a Github issue (and create one if one doesn't already exist)

Hey @seanlaw - hope you had a great Saturday!

  • Just realized I pushed the wrong code on the other PR
  • Major apologies - I prefer to keep things clean for the team and myself (but am still learning a great deal and need to sharpen my skills)
  • I've closed it to try and clean this effort up, please find the correct code ready for your review within this PR
  • It's 85-90% complete, but could use feedback and direction

Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

Copy link

codecov bot commented Nov 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.33%. Comparing base (9616686) to head (da0739a).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1046   +/-   ##
=======================================
  Coverage   97.33%   97.33%           
=======================================
  Files          89       89           
  Lines       15027    15027           
=======================================
  Hits        14626    14626           
  Misses        401      401           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -0,0 +1,839 @@
{
Copy link
Contributor

@seanlaw seanlaw Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this example from the paper? If not, then instead of using a random input, please try to reproduce something (i.e., a figure) from the paper where possible. This is the key deliverable in a "notebook reproducer" as it allows us to visually see if we have successfully reproduced the work without needing to dive into the code (yet).


Reply via ReviewNB

@@ -0,0 +1,839 @@
{
Copy link
Contributor

@seanlaw seanlaw Nov 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when we make the time series of length 1 million and a motif length of 50? How long does momp take?


Reply via ReviewNB

@seanlaw
Copy link
Contributor

seanlaw commented Nov 19, 2024

@joehiggi1758 I've left some comments and suggestions for you to consider. Thank you for all of your hard work!

@@ -0,0 +1,839 @@
{
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #24.                dist = np.sqrt(np.sum((T[rr:rr + m] - T[cc:cc + m]) ** 2))

The authors used the name "ED" for the distance function in the table 3. Did they mean Euclidean Distance or z-normalized Euclidean Distance? If it is the latter, then I think this line of code needs to be revised.


Reply via ReviewNB

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @NimaSarajpoor!

@@ -0,0 +1,839 @@
{
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #27.                    dist = amp[i] - ip[i] - ip[j]

Since amp contains z-normalized distances here, IMO it makes sense to say that ip[i] and ip[j] should be z-normalized distances. Therefore, I think the distance computed in computeKTIP should be the z-normalized distance. Better to check paper to make sure of it.


Reply via ReviewNB

@@ -0,0 +1,839 @@
{
Copy link
Collaborator

@NimaSarajpoor NimaSarajpoor Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line #29.        mp, _ = stumpy.mstump(stacked_segments, m)  # Compute the matrix profile for the stacked segments

mstump is multi-dimensional stump. Does the paper try to analyze the two segments in multi-dimensional way? (I do not know what that means here since mstump is complicated)

Or, does it try to compute the mp for the subsequences in those two segments, something like stumpy.stump(np.r_[segA, np.nan, segB], m) ?


Reply via ReviewNB

Copy link
Contributor

@seanlaw seanlaw Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @NimaSarajpoor. I totally missed this. I'm guessing that mstump is NOT what we want here!

@NimaSarajpoor
Copy link
Collaborator

@joehiggi1758
Great work on implementing many algorithms including the ones referenced and used in the paper!! A few things got my attention as I was going through your notebook. I provided some comments to bring them to your attention.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants