Normalization is inefficient in memory and time #5219

Closed
1 task
markotoplak opened this issue Jan 28, 2021 · 2 comments · Fixed by #6202
Labels
bug A bug confirmed by the core team meal This will take a day or two

Comments

@markotoplak
Member

[Image: myplot]

Orange's normalization, which by default needs only the mean and standard deviation, computes them very inefficiently. As an intermediate result, it builds a distribution for each variable, which for continuous variables is mostly a sorted list of values.

Because normalization is the default for some learners and unsupervised methods, we should make it faster. This should be a huge speedup for some of Orange's learners, k-means, and PCA.
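
To illustrate the difference, here is a minimal sketch in plain Python. It is not Orange's actual code: the function names are hypothetical, the column is assumed to have no missing values and not to be constant, and the population standard deviation is an assumption. The first function goes through a sorted distribution of values, as described above; the second computes the same two statistics in a single pass.

```python
import math
from collections import Counter


def mean_std_via_distribution(values):
    """Build a sorted (value, count) distribution first, then derive the moments."""
    # Sorting costs O(n log n) time and the distribution needs O(n) extra memory
    # when most values of a continuous variable are distinct.
    dist = sorted(Counter(values).items())
    n = sum(count for _, count in dist)
    mean = sum(value * count for value, count in dist) / n
    var = sum(count * (value - mean) ** 2 for value, count in dist) / n
    return mean, math.sqrt(var)


def mean_std_direct(values):
    """Single pass with Welford's update: O(n) time, O(1) extra memory."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)
    return mean, math.sqrt(m2 / n)


def normalize(values):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mean, std = mean_std_direct(values)
    return [(x - mean) / std for x in values]


column = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(mean_std_via_distribution(column))
print(mean_std_direct(column))
```

Both functions return the same mean and standard deviation up to floating-point rounding; only the direct version avoids sorting and the intermediate distribution.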

@markotoplak markotoplak added bug report Bug is reported by user, not yet confirmed by the core team bug A bug confirmed by the core team and removed bug report Bug is reported by user, not yet confirmed by the core team labels Jan 28, 2021
@janezd janezd closed this as completed Apr 2, 2021
@janezd janezd reopened this Apr 2, 2021
@janezd janezd added feast This may require a few weeks of work needs discussion Core developers need to discuss the issue and removed needs discussion Core developers need to discuss the issue labels Oct 22, 2021
@janezd
Contributor

janezd commented Oct 22, 2021

Check whether we still compute the mean via a contingency table; fix if necessary.

@janezd janezd added meal This will take a day or two and removed feast This may require a few weeks of work labels Oct 22, 2021
@markotoplak
Member Author

Here is how the same graph looks now. Green is current master, red is the old version.

[Image: myplot]


2 participants