We want to explore Gaussian Process regression models on hybrid labs data. For that, we need a Python module that can be called to fit and predict. The module should:
be similar to PyCaret for comparing different models (see examples)
have a notebook in the docs folder showing a simple example
have tests
have an easy interface and run fast (to be discussed)
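To make the requested interface more concrete, here is a minimal sketch of a PyCaret-style comparison built on scikit-learn. The helper name `compare_gp_kernels`, the choice of kernels, and the synthetic data are all assumptions for illustration; none of them come from the existing code base or from the experiment data.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF,
    Matern,
    RationalQuadratic,
    WhiteKernel,
)
from sklearn.model_selection import cross_val_score


def compare_gp_kernels(X, y, kernels, cv=3):
    """Hypothetical helper: rank candidate kernels by mean cross-validated R^2."""
    results = []
    for name, kernel in kernels.items():
        model = GaussianProcessRegressor(
            kernel=kernel, normalize_y=True, random_state=0
        )
        scores = cross_val_score(model, X, y, cv=cv, scoring="r2")
        results.append((name, float(scores.mean())))
    # Highest score first, similar in spirit to a leaderboard.
    return sorted(results, key=lambda item: item[1], reverse=True)


# Synthetic stand-in data (the real experiment data is not public).
rng = np.random.default_rng(0)
X = rng.uniform(-3.0, 3.0, size=(60, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] + 0.05 * rng.standard_normal(60)

kernels = {
    "rbf": RBF() + WhiteKernel(),
    "matern52": Matern(nu=2.5) + WhiteKernel(),
    "rational_quadratic": RationalQuadratic() + WhiteKernel(),
}
leaderboard = compare_gp_kernels(X, y, kernels)
for name, score in leaderboard:
    print(f"{name}: mean R^2 = {score:.3f}")

# Refit the best-ranked kernel on all data, then predict with uncertainty.
best_name = leaderboard[0][0]
best_model = GaussianProcessRegressor(
    kernel=kernels[best_name], normalize_y=True, random_state=0
)
best_model.fit(X, y)
mean, std = best_model.predict(X[:5], return_std=True)
```

Whether the comparison should run over kernels, over whole model families, or both is part of the "to be discussed" item above; this sketch only covers kernels.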
Data:
The data folder exp699_032024_TUDelft is available on Atlas SharePoint at Documents > HybridLabs > Example_data. NOTE: The data folder exp699_032024_TUDelft is not public and cannot be shared with others. A model trained on this data cannot be shared either. For now, there is no need to store the trained model.
The data folder exp699_032024_TUDelft includes:
readme.md: it contains the experiment details and ML input/output
channel.csv: it contains the variable names and units
exp699.mat: it contains the data, stored in MATLAB format.
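For reference, a MATLAB file can usually be read in Python with `scipy.io.loadmat`. The sketch below writes a tiny stand-in file first so that it runs without the private data; the variable names `time` and `surge` are invented for the example and are not taken from exp699.mat.

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Write a small stand-in .mat file. The variable names are made up
# for this sketch and do NOT reflect the contents of exp699.mat.
tmp_dir = tempfile.mkdtemp()
mat_path = os.path.join(tmp_dir, "example.mat")
savemat(mat_path, {"time": np.arange(5.0), "surge": np.linspace(0.0, 1.0, 5)})

# loadmat returns a dict mapping variable names to NumPy arrays,
# plus metadata keys that start with "__".
contents = loadmat(mat_path, squeeze_me=True)
variables = {k: v for k, v in contents.items() if not k.startswith("__")}
for name, value in variables.items():
    print(name, value.shape, value.dtype)
```

Note that `loadmat` does not read files saved in MATLAB's v7.3 format, which is HDF5-based; a package such as `h5py` is needed for those. Which format exp699.mat uses is not stated here.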
As found by #17, Gaussian process regression is computationally expensive with large data sets (many samples). Techniques like PCA can help a bit by reducing the number of features, but they are not enough on their own. There are other approaches to tackle the issue, e.g. sparse Gaussian processes, but scikit-learn does not natively support them. I suggest checking out other techniques and packages such as GPyTorch; see the GPyTorch Regression Tutorial.
Yeah. In fact it seems that PCA has no effect whatsoever on this particular issue, because the issue isn't caused by the size of the data (observations x features), but only by the number of observations. PCA does not reduce the number of observations.
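A small NumPy sketch of why this is the case: an exact GP has to build and factorize an n × n kernel matrix, where n is the number of observations, which costs O(n²) memory and O(n³) time. Reducing the number of features with PCA leaves that matrix the same size. The RBF kernel, the random data, and the sizes below are arbitrary choices for illustration.

```python
import numpy as np


def rbf_kernel_matrix(X, length_scale=1.0):
    """Dense RBF kernel matrix between all pairs of rows of X."""
    sq_norms = np.sum(X**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / length_scale**2)


rng = np.random.default_rng(0)
n_obs, n_feat, n_comp = 500, 20, 3
X = rng.standard_normal((n_obs, n_feat))

# PCA via SVD of the centered data: keep the first n_comp components.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
X_reduced = X_centered @ Vt[:n_comp].T

K_full = rbf_kernel_matrix(X)
K_reduced = rbf_kernel_matrix(X_reduced)

print(X.shape, X_reduced.shape)       # (500, 20) (500, 3)
print(K_full.shape, K_reduced.shape)  # (500, 500) (500, 500)
```

PCA only makes each kernel evaluation cheaper; the matrix that has to be factorized stays 500 × 500.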
I did not explore any implementations outside of mlflow, which uses sklearn, because that was the objective for this sprint. I agree that other packages may provide out-of-the-box solutions for this issue.
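If we want to stay within sklearn for now, one crude workaround is to fit the GP on a random subset of the observations ("subset of data"). To be clear, this is not a sparse GP and it is not something that was tried in this sprint; it is only a baseline sketch on synthetic data, and the subset size `m` is an assumed value that would need tuning.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Synthetic stand-in data with many observations.
rng = np.random.default_rng(0)
n_obs = 5000
X = rng.uniform(-3.0, 3.0, size=(n_obs, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n_obs)

# Fit on a random subset of m << n observations, so the kernel matrix
# that is factorized during fitting is m x m instead of n x n.
m = 200  # assumed value, to be tuned
subset = rng.choice(n_obs, size=m, replace=False)
gp = GaussianProcessRegressor(
    kernel=RBF() + WhiteKernel(), normalize_y=True, random_state=0
)
gp.fit(X[subset], y[subset])

# Predict on all observations and check the error against the full data.
mean, std = gp.predict(X, return_std=True)
rmse = float(np.sqrt(np.mean((mean - y) ** 2)))
print(f"RMSE on all {n_obs} points: {rmse:.3f}")
```

This throws away most of the data, so it is best seen as a reference point for comparing proper sparse or variational approaches in other packages.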
For reading data in Python, see this notebook.
Literature:
Some literature is available on Atlas SharePoint at Documents > HybridLabs > Literature. The two most related to FOWT are: