Facilitate TDD of components #570

Open
Tracked by #563
PhilippeMoussalli opened this issue Oct 30, 2023 · 2 comments

Comments

@PhilippeMoussalli
Contributor

TDD can make component development faster (no need to run within a pipeline) and more robust (unit tests). We could have a command that generates generic boilerplate for a test script based on the component spec:

  • Generate expected input and output df (or at least the structure)
  • Add a placeholder for the transform method, assertion test, component definition, ... similar to this
@mrchtr
Contributor

mrchtr commented Oct 31, 2023

I like this idea. I think it would be better not to have an extra command for generating test boilerplate code. Instead, I would include the generation of the test code within the component boilerplate generation. That way, implementing unit tests becomes somewhat mandatory.

Before we tackle this, it would probably be good to note down some guidelines for good test design. Components that apply non-ML-related transformations to the dataframes are quite easy to test. Components that use ML models, or that load or write data, need more abstraction in the test design, though.

In my opinion, TDD can significantly speed up development when the test design is not too complicated and the tests run quickly both locally and in the CI/CD pipeline. I think we should propose which parts of the component we are going to mock. In general, I would suggest mocking the output of machine learning models to ensure fast test execution.
Mocking specific services adds some complexity that we probably can't cover in the boilerplate generation.
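
To make the suggestion about mocking model output concrete, here is a minimal sketch using `unittest.mock` from the standard library. `CaptionComponent` and its `model.predict` interface are hypothetical and only stand in for a component that wraps an expensive model; plain lists of dicts are used instead of dataframes to keep the example dependency-free.

```python
from unittest.mock import Mock


class CaptionComponent:
    """Hypothetical component that wraps an expensive ML model."""

    def __init__(self, model):
        self.model = model

    def transform(self, rows: list[dict]) -> list[dict]:
        # Add a caption to every row using the (possibly mocked) model.
        return [{**row, "caption": self.model.predict(row["image"])} for row in rows]


def test_transform_with_mocked_model():
    # Replace the model with a mock that returns a fixed caption instantly.
    fake_model = Mock()
    fake_model.predict.return_value = "a cat"

    component = CaptionComponent(model=fake_model)
    result = component.transform([{"image": b"\x00"}, {"image": b"\x01"}])

    # The component logic is checked without loading or running a real model.
    assert [row["caption"] for row in result] == ["a cat", "a cat"]
    assert fake_model.predict.call_count == 2
```

Passing the model in through the constructor is one possible design choice that keeps the mock easy to inject; a component that loads its model internally would need patching instead.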

We already had initial discussions about general component test design some weeks ago in #301.

@RobbeSneyders
Member

Since this is mostly focused on iterative development, we could also generate a notebook with example data. We could use some notebook magics so the user can actually develop their component code in the notebook and it gets automatically written to file.
