Suggestions PR #44

Draft: wants to merge 65 commits into base: master
65 commits
8d87d82
:art: Lint, format, remove not used imports, sort imports.
May 3, 2020
1d9f542
:bug: Fix search.py import bug.
May 3, 2020
4ece89a
:shirt: Clear training.ipynb output.
May 3, 2020
71c9edb
:shirt: lint make_npz.py
May 3, 2020
bce27ba
:shirt: lint datasets/README.md
May 3, 2020
f1fd606
:shirt: lint test/multiHeadAttention.py
May 3, 2020
8ab9035
:construction: Update labels.json to reflect the challenge.
May 3, 2020
de9274c
:construction: Add model.py.
May 3, 2020
6c35365
:construction: Add dataset2.py.
May 3, 2020
b1fb712
:construction: Improve pull request accordingly to feedback.
May 3, 2020
c780755
:shirt: dopout should actually be dropout.
May 3, 2020
2a8ba6d
Restore training.ipynb
May 3, 2020
de92c1a
:heavy_minus_sign: Remove csv2npz and make_npz.py
May 3, 2020
5a7fd4e
:heavy_plus_sign: Add pylint to requirements.txt.
May 4, 2020
6359785
:shirt: lint src/utils/search.py
May 4, 2020
6e86025
:shirt: lint src/utils/search.py
May 4, 2020
6de0f16
:construction: Add to ignore list *.csv
May 4, 2020
2fe0b18
:construction: Development path
May 6, 2020
d4fe40c
:construction: Development path.
May 28, 2020
5285fd4
:construction: Development path.
May 28, 2020
0af294d
:art: Resolution of conflicts.
May 28, 2020
31e8d4a
:shirt: lint code.
May 29, 2020
906d883
:rocket: Speed up transformer pytest.
May 29, 2020
3304457
:rocket: Speed up pytest with seaborn passsengers dataset.
May 30, 2020
d7c1be3
:bug: Fix FlightsDataset shape convention bug.
May 30, 2020
d1019b3
:heavy_plus_sign: Add FlightsDataset labels.
May 30, 2020
ad80d93
:heavy_plus_sign: Add FlightsDataset labels.
May 30, 2020
1384707
:art: Improve flights_dataset.
May 30, 2020
1329e82
:art: Start using the specialized MinMaxScaler object to scale the da…
Jun 1, 2020
ad0479c
:heavy_plus_sign: Implement forecast functionality.
Jun 2, 2020
e032fbb
:bug: Fix make_future_dataframe method of TransformerTimeSeriesPredic…
Jun 2, 2020
2d42ffe
:construction: Development path.
Dec 20, 2020
e096b00
:fire: Clean code base.
Dec 21, 2020
2a6d5aa
:fire: Clean code base.
Dec 21, 2020
0aac45e
:construction: Update version number in bumpversion.cfg
Dec 21, 2020
339fe18
:construction: Update version number in deploy.ps1
Dec 21, 2020
d3417a1
:fire: Remove doc conf leftover in bumpversion.cfg
Dec 21, 2020
8e88a81
Bump version: 0.3.0 → 0.4.0
Dec 21, 2020
7db1943
Add to git ignore list build and dist folders.
Dec 21, 2020
f6188f7
Bump version: 0.4.0 → 0.4.1
Dec 21, 2020
5cd8133
:heavy_plus_sign: Add travis.
Dec 21, 2020
4e6f58e
:memo: Add badges to README.md.
Dec 21, 2020
a3d4795
:bug: Should fix codecov.
Dec 21, 2020
c8de445
Bump version: 0.4.1 → 0.4.2
Dec 21, 2020
78e00c4
:white_check_mark: Update main_test.
Dec 21, 2020
9e7f016
:construction: Development path.
Jan 4, 2021
9837bbc
:rocket: Speed up fitting.
Jan 4, 2021
50e7ca9
Bump version: 0.4.2 → 0.4.3
Jan 4, 2021
2df8280
:construction: Development path in order to progressively improve all…
Jan 5, 2021
8ced2a8
:construction: Alignment path.
Jan 5, 2021
eb52a00
:construction: Alignment path.
Jan 5, 2021
4bfb3d1
Merge branch 'master' of github.com:maxjcohen/transformer into sugges…
Jan 5, 2021
33a289f
:construction: d_model, output_model and input_model are back instead…
Jan 5, 2021
27a9c3b
:construction: d_model, output_model and input_model are back instead…
Jan 5, 2021
8bc90dc
:fire: No need to move to device positional_encoding.
Jan 5, 2021
668d430
:memo: Update README.md
Jan 5, 2021
c85d2ac
:memo: Update README.md
Jan 5, 2021
5351d21
:fire: Remove bumpversion cfg file.
Jan 5, 2021
0102cc9
rebase B
Jan 5, 2021
09066c2
:fire: Remove bumpversion config file and deploy script.
Jan 5, 2021
0064064
Rebase setup.py
Jan 5, 2021
6e46a9b
:art: Simplified test code with pytest parametrize.
Jan 5, 2021
0d399b0
:bug: Fix main test to really set the device properly.
Jan 6, 2021
d6ee0f0
:white_check_mark: Add test_transformer_tsp_multisamples.
Jan 6, 2021
141bdb8
:white_check_mark: Add test_transformer_tsp_multisamples.
Jan 6, 2021
16 changes: 14 additions & 2 deletions .gitignore
@@ -6,7 +6,7 @@ build
*.egg-info

# Dataset
*.npz
datasets/

# Models
*.pth
@@ -25,5 +25,17 @@ _build
*.png
*.jpg

# Outputs
*.csv

# Credentials
.env.test.local

# Pytest cache
.pytest_cache/
**/__pycache__

# Logs
logs
log

dist
9 changes: 9 additions & 0 deletions .pylintrc
@@ -0,0 +1,9 @@
[MASTER]
extension-pkg-whitelist=numpy,torch

[TYPECHECK]

# List of members which are set dynamically and missed by pylint inference
# system, and so shouldn't trigger E1101 when accessed. Python regular
# expressions are accepted.
generated-members=numpy.*,torch.*
17 changes: 17 additions & 0 deletions .travis.yml
@@ -0,0 +1,17 @@
language: python
python:
  - "3.7.9"
cache:
  pip: true
  directories:
    # - datasets
before_install:
  - python --version
  - pip install -U pip
  - pip install codecov
install:
  - pip install -e .[test] # install package + test dependencies
script:
  - pytest --cov=time_series_transformer # run tests
after_success:
  - codecov # submit coverage
65 changes: 17 additions & 48 deletions README.md
@@ -1,67 +1,36 @@
# Transformers for Time Series
[![PyPI version](https://badge.fury.io/py/time-series-transformer.svg)](https://badge.fury.io/py/time-series-transformer) [![travis](https://travis-ci.org/DanielAtKrypton/time_series_transformer.svg?branch=master)](https://travis-ci.org/github/DanielAtKrypton/time_series_transformer) [![codecov](https://codecov.io/gh/DanielAtKrypton/time_series_transformer/branch/master/graph/badge.svg)](https://codecov.io/gh/DanielAtKrypton/time_series_transformer) [![GitHub license](https://img.shields.io/github/license/DanielAtKrypton/time_series_transformer)](https://github.com/DanielAtKrypton/time_series_transformer) [![Requirements Status](https://requires.io/github/DanielAtKrypton/time_series_transformer/requirements.svg?branch=master)](https://requires.io/github/DanielAtKrypton/time_series_transformer/requirements/?branch=master)

[![Documentation Status](https://readthedocs.org/projects/timeseriestransformer/badge/?version=latest)](https://timeseriestransformer.readthedocs.io/en/latest/?badge=latest) [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) [![Latest release](https://img.shields.io/github/release/maxjcohen/transformer.svg)](https://github.com/maxjcohen/transformer/releases/latest)

Implementation of Transformer model (originally from [Attention is All You Need](https://arxiv.org/abs/1706.03762)) applied to Time Series (Powered by [PyTorch](https://pytorch.org/)).

## Transformer model

Transformer are attention based neural networks designed to solve NLP tasks. Their key features are:

- linear complexity in the dimension of the feature vector ;
- paralellisation of computing of a sequence, as opposed to sequential computing ;
- long term memory, as we can look at any input time sequence step directly.

This repo will focus on their application to times series.

## Dataset and application as metamodel

Our use-case is modeling a numerical simulator for building consumption prediction. To this end, we created a dataset by sampling random inputs (building characteristics and usage, weather, ...) and got simulated outputs. We then convert these variables in time series format, and feed it to the transformer.

## Adaptations for time series

In order to perform well on time series, a few adjustments had to be made:

- The embedding layer is replaced by a generic linear layer ;
- Original positional encoding are removed. A "regular" version, better matching the input sequence day/night patterns, can be used instead ;
- A window is applied on the attention map to limit backward attention, and focus on short term patterns.
## Documentation
- [Read The Docs](https://readthedocs.org/projects/timeseriestransformer/badge/?version=latest).

## Installation

All required packages can be found in `requirements.txt`, and expect to be run with `python3.7`. Note that you may have to install pytorch manually if you are not using pip with a Debian distribution : head on to [PyTorch installation page](https://pytorch.org/get-started/locally/). Here are a few lines to get started with pip and virtualenv:

```bash
$ apt-get install python3.7
$ pip3 install --upgrade --user pip virtualenv
$ virtualenv -p python3.7 .env
$ . .env/bin/activate
(.env) $ pip install -r requirements.txt
```terminal
.\scripts\init-env.ps1
```

## Usage

### Downloading the dataset

The dataset is not included in this repo, and must be downloaded manually. It is comprised of two files, `dataset.npz` contains all input and outputs value, `labels.json` is a detailed list of the variables. Please refer to [#2](https://github.com/maxjcohen/transformer/issues/2) for more information.

### Running training script

Using jupyter, run the default `training.ipynb` notebook. All adjustable parameters can be found in the second cell. Careful with the `BATCH_SIZE`, as we are using it to parallelize head and time chunk calculations.

### Outside usage

The `Transformer` class can be used out of the box, see the [docs](https://timeseriestransformer.readthedocs.io/en/latest/?badge=latest) for more info.

```python
from flights_time_series_dataset import FlightsDataset
from time_series_predictor import TimeSeriesPredictor
from tst import Transformer

net = Transformer(d_input, d_model, d_output, q, v, h, N, TIME_CHUNK, pe)
tsp = TimeSeriesPredictor(
    Transformer(),
    max_epochs=50,
    train_split=None,
)

tsp.fit(FlightsDataset())
```

### Building the docs
### Test

To build the doc:
To test the package simply run the following command from project's root folder.

```bash
(.env) $ cd docs && make html
pytest -s
```
398 changes: 0 additions & 398 deletions benchmark.ipynb

This file was deleted.

70 changes: 0 additions & 70 deletions cross_validation.py

This file was deleted.

20 changes: 0 additions & 20 deletions docs/Makefile

This file was deleted.

7 changes: 0 additions & 7 deletions docs/requirements.txt

This file was deleted.

1 change: 0 additions & 1 deletion docs/source/README.md

This file was deleted.

61 changes: 0 additions & 61 deletions docs/source/_static/css/theme_modifs.css

This file was deleted.

79 changes: 0 additions & 79 deletions docs/source/conf.py

This file was deleted.

6 changes: 0 additions & 6 deletions docs/source/decoder.rst

This file was deleted.
