RAMP starting kit on the bike counters dataset

Getting started

Download the data

Download the data files and put them into the data folder.

Install

To run a submission and the notebook you will need the dependencies listed in requirements.txt.

It is recommended to create a new environment for this project and to install those dependencies inside this new environment.

You can create a new conda environment named bikes-ramp using:

conda create --name bikes-ramp python=3.9

and then activate this new environment and install the dependencies in it using pip:

conda activate bikes-ramp
pip install -r requirements.txt

Later on, when you work on your project, you need to activate the bikes-ramp environment in every new terminal session. This is once again done with:

conda activate bikes-ramp
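To confirm that the active environment really contains the dependencies, a small check along the following lines may help. This is a minimal sketch, not part of the starting kit: the helper names `requirement_names` and `missing_distributions` are made up for illustration, and the parsing only handles simple `name` or `name>=version` lines, skipping pip options and URL requirements.

```python
import re
from importlib import metadata
from pathlib import Path


def requirement_names(lines):
    """Extract bare distribution names from requirements-style lines (hypothetical helper)."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        # Skip blank lines, pip options such as "-r other.txt", and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        # Keep only the leading name; extras, version specifiers and markers are dropped.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_distributions(names):
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    req_file = Path("requirements.txt")  # assumes the script runs from the repository root
    if req_file.exists():
        names = requirement_names(req_file.read_text().splitlines())
        missing = missing_distributions(names)
        print("Missing:", ", ".join(missing) if missing else "none")
    else:
        print("requirements.txt not found; run this from the repository root.")
```

If the script reports missing packages, the most likely cause is that bikes-ramp was not activated in the current terminal session.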

Challenge description

Get started on this RAMP with the dedicated notebook.

First install Jupyter:

pip install jupyter

then launch the notebook using:

jupyter notebook ./bike_counters_starting_kit.ipynb

Test a submission

Submissions need to be located in the submissions folder. For instance, a submission named my_submission should be located in submissions/my_submission.
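The folder convention can be sketched as follows. The helper `list_submissions` is hypothetical (it is not provided by ramp-workflow); it only assumes that each sub-folder of submissions is one submission, and it ignores hidden folders and folders starting with an underscore.

```python
from pathlib import Path


def list_submissions(root="submissions"):
    """Return the sorted names of the sub-folders of `root` (hypothetical helper)."""
    root = Path(root)
    if not root.is_dir():
        return []
    return sorted(
        p.name
        for p in root.iterdir()
        if p.is_dir() and not p.name.startswith((".", "_"))
    )


if __name__ == "__main__":
    # Print the command that would test each submission found on disk.
    for name in list_submissions():
        print(f"ramp-test --submission {name}")
```

Run from the repository root, this prints one ramp-test command per submission folder, which makes it easy to spot a submission placed in the wrong location.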

To run a specific submission, you can use the ramp-test command line:

ramp-test --submission my_submission

For instance, you can run the provided starting_kit submission example with:

ramp-test --submission starting_kit

You should get an output similar to the following one:

Example output
Testing Bike count prediction
Reading train and test files from ./data/ ...
Reading cv ...
Training submissions/starting_kit ...
CV fold 0
        score   rmse      time
        train  0.610  0.084952
        valid  0.983  0.408040
        test   0.703  0.033141
CV fold 1
        score   rmse      time
        train  0.663  0.106090
        valid  0.852  0.399937
        test   0.759  0.032243
CV fold 2
        score   rmse      time
        train  0.682  0.170388
        valid  0.891  0.324898
        test   0.771  0.025760
CV fold 3
        score   rmse      time
        train  0.705  0.208704
        valid  0.844  0.324345
        test   0.875  0.024143
CV fold 4
        score   rmse      time
        train  0.728  0.233596
        valid  0.804  0.319224
        test   0.872  0.024262
CV fold 5
        score   rmse      time
        train  0.737  0.280230
        valid  0.939  0.320182
        test   0.863  0.024391
CV fold 6
        score   rmse      time
        train  0.763  0.327653
        valid  1.131  0.316819
        test   0.843  0.025528
CV fold 7
        score   rmse      time
        train  0.793  0.376762
        valid  0.896  0.324821
        test   0.767  0.024473
----------------------------
Mean CV scores
----------------------------
        score             rmse         time
        train   0.71 +- 0.0546   0.2 +- 0.1
        valid  0.917 +- 0.0962  0.3 +- 0.04
        test   0.807 +- 0.0607   0.0 +- 0.0
----------------------------
Bagged scores
----------------------------
        score   rmse
        valid  0.923
        test   0.765
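To read the "Mean CV scores" table, it can help to recompute one row from the per-fold values. The sketch below uses the validation RMSE of the eight folds shown in the example output. It assumes the spread after "+-" is a population standard deviation, which is consistent with the numbers shown but is an observation, not a statement about ramp-workflow internals. The per-fold values are rounded to three decimals, so small differences in the last digit are expected.

```python
from statistics import mean, pstdev

# Per-fold validation RMSE values copied from the example output (folds 0 to 7).
valid_rmse = [0.983, 0.852, 0.891, 0.844, 0.804, 0.939, 1.131, 0.896]


def summarize(scores):
    """Return (mean, population standard deviation) of per-fold scores."""
    return mean(scores), pstdev(scores)


valid_mean, valid_std = summarize(valid_rmse)
print(f"valid  {valid_mean:.3f} +- {valid_std:.4f}")
```

The same computation applied to the train and test columns should land close to the corresponding rows of the table.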

You can get more information about this command with:

ramp-test --help

To go further

You can find more information regarding ramp-workflow in the dedicated documentation.

You can find the description of the columns in external_data.csv in parameter-description-weather-external-data.pdf. For more information about this dataset, see the Meteo France website (in French).
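To cross-reference the PDF with the file itself, the column names can be listed with the standard library. This is a minimal sketch: the helper `csv_columns` is made up for illustration, and the path used in the usage part is an assumption, so adjust it to wherever external_data.csv lives in your checkout.

```python
import csv
from pathlib import Path


def csv_columns(path):
    """Return the header row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f), [])  # empty list if the file has no rows


if __name__ == "__main__":
    path = Path("external_data.csv")  # assumed location, not confirmed by the README
    if path.exists():
        for column in csv_columns(path):
            print(column)
    else:
        print(f"{path} not found; pass the actual location of the file.")
```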