This repository has been archived by the owner on Jun 22, 2022. It is now read-only.

have to clean cache manually #39

Open
mromaniukcdl opened this issue May 8, 2018 · 2 comments

Comments

@mromaniukcdl

Having to call step.clean_cache() is error-prone. Ideally, we should have automatic cache invalidation.

@kamil-kaczmarek kamil-kaczmarek changed the title Problem: having to clean cache manually havie to clean cache manually Jun 3, 2018
@kamil-kaczmarek kamil-kaczmarek changed the title havie to clean cache manually have to clean cache manually Jun 3, 2018
@thomasjpfan

thomasjpfan commented Jun 23, 2018

Since caching depends on the input, the steps that set cache_output=True would need the input object as well. With this in mind, I propose the following API:

# Each data dictionary carries an 'id' that identifies the dataset.
data_fit = {'input': ...,
            'id': 'data_fit'}

data_val = {'input': ...,
            'id': 'data_val'}

new_tfidf_step = Step(name='TF-IDF',
                      transformer=StepsTfidfTransformer(),
                      input_steps=[new_count_vec_step],
                      input_data=['input'],
                      experiment_directory=EXPERIMENT_DIR_B,
                      cache_output=True)

The new_tfidf_step step will use the value of id as a key for caching. How do you feel about this API?
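
To make the proposal concrete, here is a minimal sketch of caching keyed by the data's id. It is not steppy's implementation: the CachedStep class, its run method, and the cache file naming are all hypothetical and exist only to illustrate the idea of using 'id' as the cache key.

```python
import pickle
import tempfile
from pathlib import Path


class CachedStep:
    """Hypothetical stand-in for a step; NOT steppy's actual Step class."""

    def __init__(self, name, transform, cache_dir):
        self.name = name
        self.transform = transform      # any callable applied to data['input']
        self.cache_dir = Path(cache_dir)
        self.calls = 0                  # counts real (non-cached) computations

    def _cache_path(self, data_id):
        # One cache file per (step name, data id) pair.
        return self.cache_dir / f"{self.name}__{data_id}.pkl"

    def run(self, data):
        path = self._cache_path(data['id'])
        if path.exists():
            # Cache hit: reuse the stored output for this id.
            with path.open('rb') as f:
                return pickle.load(f)
        # Cache miss: compute, then store the output under this id.
        result = self.transform(data['input'])
        self.calls += 1
        with path.open('wb') as f:
            pickle.dump(result, f)
        return result


cache_dir = tempfile.mkdtemp()
step = CachedStep('upper', lambda texts: [t.upper() for t in texts], cache_dir)

data_fit = {'input': ['a', 'b'], 'id': 'data_fit'}
data_val = {'input': ['c'], 'id': 'data_val'}

fit_first = step.run(data_fit)   # computed and written to the cache
fit_again = step.run(data_fit)   # same id, so served from the cache
val_first = step.run(data_val)   # different id, so computed again
```

In this sketch, switching from data_fit to data_val no longer needs a manual cache clean, because the two ids map to different cache entries. One caveat of keying on id alone: if the contents of 'input' change while the id stays the same, the stale cached output would still be returned.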

@kamil-kaczmarek
Member

@thomasjpfan thank you for your idea and PR (and sorry for the late reply)!

I will take a closer look at it next week and let you know how we will proceed with it.

kant added a commit to kant/steppy that referenced this issue Sep 8, 2018
@kant kant mentioned this issue Sep 8, 2018
kamil-kaczmarek pushed a commit that referenced this issue Sep 8, 2018
@kamil-kaczmarek kamil-kaczmarek added this to the v0.2 milestone Sep 14, 2018
@bcottman bcottman mentioned this issue Oct 4, 2018