-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Suggestions for project #6
Conversation
Hi, first of all thanks for contributing to this project. I'm writing a few reviews for these commits and I'll merge once we settle on a good pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would actually like to keep the output of the training notebook, to give an example of training and visualization upon loading the repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes this should be linted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to keep the benchmark repo and this transformer repo independent. I am currently working on a slightly different dataset than the one used for the benchmark, this is why this file is different. I may consider creating a "datachallenge" branch on this repo with the original labels.json
file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All benchmark models can be found in the benchmark.py
file. You fill find the LSTM among them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To sum up the main modifications that are still needed:
- Removing empty docstring
- Removing in-code pylint settings
- Reverting changes to the
tst
package, unless there is a substancial modification
Here is an example that interacts with the main changes I am proposing: import numpy as np
from matplotlib import pyplot as plt
from src.transformer_tsp import TransformerTimeSeriesPredictor
from tests.flights_dataset import FlightsDataset
if __name__ == "__main__":
plot_config = {}
plot_config['training progress'] = False
plot_config['prediction on training data'] = False
plot_config['forecast'] = True
forecast_config = {}
forecast_config['include history'] = True
forecast_config['months ahead'] = 24
predictor_config = {}
config = {}
config['plot'] = plot_config
config['forecast'] = forecast_config
config['predictor'] = predictor_config
config['predict enabled'] = False
config['forecast enabled'] = True
tsp = TransformerTimeSeriesPredictor()
hist_loss = tsp.fit(FlightsDataset())
# training_dataframe = tsp.get_training_dataframe()
if config['plot']['training progress']:
plt.figure()
plt.plot(hist_loss, 'o-', label='train')
plt.show()
if config['predict enabled']:
# Select training example
idx = np.random.randint(0, len(tsp.dataloader.dataset))
x, y = tsp.dataloader.dataset[idx]
# Run predictions
netout = tsp.predict(x)
if config['plot']['prediction on training data']:
plt.figure()
d_output = netout.shape[2]
for idx_output_var in range(d_output):
# Select real passengers data
y_true = y[:, idx_output_var]
y_pred = netout[0, :, idx_output_var]
if config['plot']['prediction on training data']:
plt.subplot(d_output, 1, idx_output_var+1)
plt.plot(y_true, label="Truth")
plt.plot(y_pred, label="Prediction")
plt.title(tsp.dataloader.dataset.labels['y'][idx_output_var])
plt.legend()
if config['plot']['prediction on training data']:
plt.show()
# Run forecast
if config['forecast enabled']:
netout = tsp.forecast(config['forecast']['months ahead'],
include_history=config['forecast']['include history'])
if config['plot']['forecast']:
plt.figure()
d_output = netout.shape[2]
# Select any training example just for comparison
idx = np.random.randint(0, len(tsp.dataloader.dataset))
x, y = tsp.dataloader.dataset[idx]
for idx_output_var in range(d_output):
# Select real passengers data
y_true = y[:, idx_output_var]
y_pred = netout[0, :, idx_output_var]
if config['plot']['forecast']:
plt.subplot(d_output, 1, idx_output_var+1)
if config['forecast']['include history']:
plot_args = [y_pred]
else:
plot_args = [
[i+tsp.dataset.get_x_shape()[1]+1 for i in range(len(y_pred))], y_pred]
plt.plot(*plot_args, label="Prediction")
plt.plot(y_true, label="Truth")
plt.title(tsp.dataloader.dataset.labels['y'][idx_output_var])
plt.legend()
if config['plot']['forecast']:
plt.show() |
I created a package than can reduce the complexity. I'll be working on integrating it in the pull request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, I made some remarks about a few lines bothering me, but the main point is the change of the package name: I'd like to keep it as is ! This way, I can enjoy my commit history not being shambled. The fact that the name tst
is taken is not really a limitation, I never had intention to upload it to pip. Instead, the tst package can be installed via pip install git+https://github.com/maxjcohen/transformer/
.
Thanks for you contributions !
language: python | ||
cache: | ||
pip: true | ||
directories: | ||
# - datasets | ||
before_install: | ||
- python --version | ||
- pip install -U pip | ||
- pip install codecov | ||
install: | ||
- pip install -e .[test] # install package + test dependencies | ||
script: | ||
- pytest --cov=time_series_transformer # run tests | ||
after_success: | ||
- codecov # submit coverage |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather avoid using Travis here, this is not so much a development project that it requires CI/CD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a good practice to whenever there is a code change, to run a checklist of all the desired functionality covering all corners of the code base. If you look at codecov with the preliminary tests I came up with 65% of the code was already covered. Now that number can reach 100% with the addition of a few more tests, so that you can see it is always working as intended when there is a code change. And if the build breaks for any reason you are immediatelly informed by email. Then you can go there and fix it fast. Sometimes a dependency breaks the code base, as was the case with numpy 1.19.4 in Windows. The only solution as of now was to stick with a previous numpy version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I understand, but we currently don't have any kind of test, and dependency problems are left to the user by not mentioning the precise version of each package in the requirements.
If at some point we develop some meaningful tests, we could consider using Travis, but wright now it's just a waste of CPU time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -1,67 +1,34 @@ | |||
# Transformers for Time Series |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we removing such big chunks of the Readme ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The purpose was to simplify the package documentation so that only essential info is there for SW developers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes but this content is important, it show how and why I developed that repo. The README is not the documentation, I think we can get away with having a detailed file here.
I secured time_series_transformer package name, I can hand that over to you. Just send me an email at daniel(at)kryptonunite(dot)com. We can better discuss development and academic cooperation by email if you will. |
Closed in favor of #44 |
The following measures were taken: