Improve notebook #44

Merged
merged 19 commits into main from improve-notebook
Jan 8, 2024

Conversation

PhilippeMoussalli
Contributor

No description provided.

@@ -70,5 +70,5 @@ fondant --help

There are two options to run the pipeline:

- [Via python files and the Fondant CLI](./src/README.md): how you should run Fondant in production
Contributor Author

This did not point to anything

Member

Seems like I deleted it in this PR.

It might make sense to re-add it just for the indexing pipeline. WDYT?
If not, I would still add a link for the CLI to the documentation, and keep the link to the notebook.

Contributor Author

It might be best to go with the second approach, since we don't have a file ready to launch the pipeline from (the code is currently organized as a function that creates the pipeline). Updated.

Member

RobbeSneyders left a comment

Thanks @PhilippeMoussalli

Could you remove the notebook outputs from git by running this command?

git config filter.strip-notebook-output.clean 'jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR' 

GitHub won't even show me the diffs because they are too large 😅
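
For readers unfamiliar with what the filter does: the command above registers a Git clean filter that pipes each notebook through nbconvert's `ClearOutputPreprocessor`. The filter only takes effect for paths that a `.gitattributes` entry assigns to it, which this thread does not show. The sketch below is a rough, standard-library-only approximation of the effect on a notebook's JSON; it is not nbconvert's implementation, and the function name `strip_outputs` is hypothetical.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    # Round-tripping through JSON is a cheap deep copy for JSON-only data.
    cleaned = json.loads(json.dumps(notebook))
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []             # drop rendered output (text, images)
            cell["execution_count"] = None   # reset the In[n] counter
    return cleaned


if __name__ == "__main__":
    # A tiny hand-written nbformat 4 notebook, used only for illustration.
    example = {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
            {
                "cell_type": "code",
                "execution_count": 3,
                "metadata": {},
                "outputs": [
                    {"output_type": "stream", "name": "stdout", "text": ["hi\n"]}
                ],
                "source": ["print('hi')"],
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    print(json.dumps(strip_outputs(example), indent=1))
```

Embedded images are stored inside the `outputs` list as base64 data, which is presumably why the diffs here grew too large to display.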

@@ -3,7 +3,7 @@ services:
   weaviate:
     image: semitechnologies/weaviate:1.20.5
     ports:
-      - 8080:8080
+      - 8081:8080
Member

Is there a reason for this change?

Contributor Author

Port 8080 is already occupied when running Jupyter on Vertex AI Workbench.
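
A quick way to confirm this kind of conflict before choosing a host port is to probe the candidates. The following is a minimal sketch that uses only the standard library; the helper names `port_in_use` and `first_free_port` and the candidate list are hypothetical, and it only detects listeners reachable on the given host.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


def first_free_port(candidates, host: str = "127.0.0.1"):
    """Return the first candidate port with no listener, or None."""
    for port in candidates:
        if not port_in_use(port, host):
            return port
    return None


if __name__ == "__main__":
    # Assumed candidates: the default Weaviate port and two fallbacks.
    print(first_free_port([8080, 8081, 8082]))
```

The container-side port stays 8080 in the mapping above, so only clients on the host need to switch to 8081.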

Member

I can't see the diff on GitHub, but from inspecting it locally, I think you inserted some images inline, which doesn't work well. Can you add them to the repo as separate images and reference them by link in the notebook, as we do for the other images?

Contributor Author

PhilippeMoussalli, Jan 3, 2024

I think I embedded them because, for some reason, they are not rendered properly either in the IDE visualizer or when you run them with the local Jupyter notebook.

This is an example from the parameter search notebook. I launched the notebook command from the src directory.

[image]

Member

I see. It only works when starting from the root directory indeed.

Collaborator

I can't add the comment on the line itself.
I think this line:
"evaluation_llm_kwargs": {"openai_api_key": os.environ["OPENAI_KEY"], model_name : "gpt-3.5-turbo"}
should be changed to:
"evaluation_llm_kwargs": {"openai_api_key": os.environ["OPENAI_KEY"], "model_name": "gpt-3.5-turbo"}
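
To illustrate why the quotes matter: in a Python dict literal, a bare `model_name` is evaluated as a variable, so the original line raises `NameError` unless a variable with that name happens to exist. The sketch below reproduces both variants. The helper names are hypothetical, and the `"<unset>"` fallback is an assumption added so the example runs without the `OPENAI_KEY` environment variable.

```python
import os


def build_llm_kwargs(model_name: str = "gpt-3.5-turbo") -> dict:
    """Build the keyword arguments using string keys throughout."""
    return {
        # .get avoids a KeyError when OPENAI_KEY is unset; the line under
        # review uses os.environ["OPENAI_KEY"], which fails fast instead.
        "openai_api_key": os.environ.get("OPENAI_KEY", "<unset>"),
        "model_name": model_name,
    }


def build_with_bare_key() -> dict:
    """Reproduce the bug: the bare name is looked up as a variable."""
    return {model_name: "gpt-3.5-turbo"}  # noqa: F821 (undefined on purpose)


if __name__ == "__main__":
    print(sorted(build_llm_kwargs()))
    try:
        build_with_bare_key()
    except NameError as err:
        print(f"NameError: {err}")
```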

Contributor Author

Nice catch, updated

PhilippeMoussalli merged commit 02259e5 into main on Jan 8, 2024
1 check passed
PhilippeMoussalli deleted the improve-notebook branch on January 8, 2024, 09:32

3 participants