
will it work for sparse one-hot data - only 0s and 1s in data #1

Open
Sandy4321 opened this issue Jan 27, 2022 · 3 comments

@Sandy4321

Hello Dr. Roberts,

Great code and talk:
https://www.youtube.com/watch?v=RvEZURqfaC4

Thank you very much.

But will it work for big, very sparse one-hot data (only 0s and 1s in the data)?

https://machinelearningmastery.com/why-one-hot-encode-data-in-machine-learning/
https://en.wikipedia.org/wiki/One-hot
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html

By the way, do you have a print-friendly version of your presentation
"Derivative-free optimisation for least-squares problems"?

https://lindonroberts.github.io/talk/unsw_202004/roberts_unsw.pdf

For example, in Word format? Or simpler slides just to understand the idea, or another introductory video?

Thanks in advance.

@lindonroberts
Collaborator

Hi, and thanks!

This code should work in general for any loss function that can be written as a sum of squares, so it should probably be fine with one-hot data: your loss function is then $\min_{w} \sum_{i} (\mathrm{model}(w, x_i) - y_i)^2$, where the targets $y_i$ are one-hot encoded (or you use some other sensible measure of discrepancy). If you can write your problem in this format, then DFBGN should be suitable.
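
To make that concrete, here is a toy sketch (my own illustration, not code from this repository) of writing a one-hot target problem as a residual vector whose sum of squares is the loss; the sigmoid model and toy data are placeholder assumptions:

```python
import numpy as np

def model(W, X):
    # hypothetical choice: linear scores passed through a sigmoid, one column per class
    return 1.0 / (1.0 + np.exp(-(X @ W)))

def residuals(w_flat, X, Y):
    # Y is the one-hot (0/1) target matrix; the solver works with a flat parameter vector
    W = w_flat.reshape(X.shape[1], Y.shape[1])
    return (model(W, X) - Y).ravel()

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))            # 8 rows, 3 features (toy sizes)
Y = np.eye(4)[rng.integers(0, 4, size=8)]  # one-hot targets with 4 classes
r = residuals(np.zeros(3 * 4), X, Y)
loss = np.sum(r**2)                        # the sum-of-squares objective above
```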

If your problem is not large scale (e.g. <= 100 unknowns you want to optimize), then I would recommend DFO-LS.
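
For that small-scale case, the call would look roughly like this (a hedged sketch with toy linear residuals; dfols.solve takes the residual function and a starting point, per the DFO-LS documentation):

```python
import numpy as np
import dfols

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 10))              # 50 rows, 10 unknowns (well under 100)
y = rng.integers(0, 2, size=50).astype(float)  # 0/1 targets

soln = dfols.solve(lambda w: X @ w - y, np.zeros(10))
print(soln.x)  # estimated parameters
```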

Unfortunately there are not a lot of accessible resources on the topic, but depending on your background I would recommend:

  • The book Numerical Optimization by Nocedal and Wright - an accessible introduction to least-squares problems and some basics of general derivative-free optimization methods.
  • The book Derivative-Free and Blackbox Optimization by Audet and Hare - a more modern but still reasonably accessible introduction to derivative-free optimization (in particular the section on model-based methods).
  • A newer talk of mine which specifically covers the DFBGN method for large-scale problems; this was designed for a general numerical analysis audience.
  • The first two chapters of my PhD thesis, which have some introductory material on derivative-free optimization and least-squares problems (more technical than the above).
  • For full technical details, the paper associated with DFBGN is here.

Unfortunately I don't have a print-friendly version of the presentation you mention. That talk mostly covered the DFO-LS software, so you could look at the papers mentioned in the readme (and the online documentation) for more details; these would be print-friendly.

@Sandy4321
Author

Great, thanks for the quick answer. The thing is:

> If your problem is not large scale (e.g. <= 100 unknowns you want to optimize), then I would recommend DFO-LS.

Usually one-hot tabular data has huge scale and huge sparsity (90% of the data are zeros and 10% are ones), e.g. 20000 features (unknowns) and 100000 rows.

Would your code work in such a case?

@lindonroberts
Collaborator

No, I don't think DFO-LS would be the right choice for problems that large (it isn't able to make use of sparsity). However, you should be able to use this code (DFBGN) fine; it would just be a matter of picking the fixed_block input to be small enough.

Note that there is usually a tradeoff: larger fixed_block values will optimize more quickly (i.e. fewer iterations/evaluations of the objective function), but each iteration will take longer to run. You should pick a value that seems to provide a good balance for your problem (I can't give precise advice on that, but I have tried values of fixed_block as small as n/100, where n is the number of unknowns).
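
As a concrete (hedged) sketch of that, with scaled-down toy data standing in for a large sparse 0/1 problem, and with the solve call following the interface shown in the DFBGN README:

```python
import numpy as np
import dfbgn

rng = np.random.default_rng(2)
n_rows, n = 1000, 200                              # stand-ins for 100000 rows / 20000 unknowns
X = (rng.random((n_rows, n)) < 0.1).astype(float)  # ~90% zeros, 10% ones
y = rng.integers(0, 2, size=n_rows).astype(float)

def residuals(w):
    return X @ w - y  # toy linear residuals (an assumption for illustration)

block = max(1, n // 100)  # fixed_block as small as n/100, per the comment above
soln = dfbgn.solve(residuals, np.zeros(n), fixed_block=block)
print(soln)
```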
