feat: Dynamically scrape Ollama model names #14

ericmjl · 2023-10-31T13:07:38Z

Added a function to dynamically scrape Ollama model names from the Ollama website.
The function uses BeautifulSoup and requests to parse the HTML and find the model names.
If the website cannot be reached, a static list of model names is used as a fallback.
The static list of model names is stored in a new file, ollama_model_names.txt.
The function is cached using lru_cache to improve performance.
Updated the create_model function to use the new dynamic function.
Added a new Jupyter notebook to test the scraping function.
Included the new text file in the MANIFEST.in file for distribution.
This change improves the flexibility and maintainability of the code by
allowing it to adapt to changes in the Ollama model library.
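
For reviewers, the scrape-with-fallback pattern described above could look roughly like the sketch below. It is not the code in this PR: it swaps in the standard library's `urllib` and `html.parser` for requests and BeautifulSoup so it runs without extra dependencies. The URL, the assumption that model links look like `/library/<name>`, and the helper names `parse_model_names`, `read_fallback_names`, and `ollama_model_names` are all hypothetical.

```python
from functools import lru_cache
from html.parser import HTMLParser
from pathlib import Path
from urllib.request import urlopen

# Assumed values for illustration; the real URL and file location may differ.
LIBRARY_URL = "https://ollama.ai/library"
FALLBACK_FILE = Path("ollama_model_names.txt")


class _LibraryLinkParser(HTMLParser):
    """Collect names from anchors whose href looks like /library/<name> (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.names = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        prefix = "/library/"
        if href.startswith(prefix):
            name = href[len(prefix):].strip("/")
            # Skip empty names, nested paths, and duplicates.
            if name and "/" not in name and name not in self.names:
                self.names.append(name)


def parse_model_names(html: str) -> list[str]:
    """Extract model names from the library page HTML, in page order."""
    parser = _LibraryLinkParser()
    parser.feed(html)
    return parser.names


def read_fallback_names(path: Path = FALLBACK_FILE) -> list[str]:
    """Read the static list, one model name per line; empty if the file is missing."""
    if not path.exists():
        return []
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


@lru_cache(maxsize=1)
def ollama_model_names() -> tuple[str, ...]:
    """Scrape once per process; fall back to the static list on any network error."""
    try:
        with urlopen(LIBRARY_URL, timeout=10) as response:
            html = response.read().decode("utf-8", errors="replace")
        names = parse_model_names(html)
    except OSError:  # URLError and socket timeouts are OSError subclasses
        names = []
    return tuple(names or read_fallback_names())
```

Returning a tuple keeps the cached value immutable, so callers cannot accidentally modify what `lru_cache` hands back on later calls.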

…for model names

The method ollama_model_keywords() in model_dispatcher.py has been refactored. The dynamic scraping of model names from the Ollama website has been removed. Instead, the model names are now read from a static text file distributed with the package. This change simplifies the code and removes the dependency on the BeautifulSoup and requests libraries.
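
A minimal sketch of the static-file approach, assuming the text file holds one model name per line and sits next to the module that reads it. The helper `read_model_names` is hypothetical, and the body of `ollama_model_keywords()` here is a guess at the shape, not the code from model_dispatcher.py.

```python
from pathlib import Path


def read_model_names(path: Path) -> list[str]:
    """Return the non-empty, stripped lines of a model-names file (hypothetical helper)."""
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def ollama_model_keywords() -> list[str]:
    """Read the names from the text file shipped alongside this module (assumed layout)."""
    return read_model_names(Path(__file__).parent / "ollama_model_names.txt")
```

Because the file is read at call time with no network access, this only works for installed packages if the text file is actually shipped in the distribution.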

This commit introduces a new feature that automatically updates the list of Ollama models. A new Python script has been added to the hooks in the pre-commit configuration file. This script scrapes the Ollama AI library webpage to get the latest model names and writes them to a text file. The dependencies required for this script are now specified in the pre-commit configuration file instead of the environment file.
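
The file-writing half of such a hook script might look like the sketch below. It is an illustration only: the function name `update_names_file` is hypothetical, the scraping step is left out, and sorting and deduplicating the names is a choice made here to keep diffs stable, not something the commit states.

```python
from pathlib import Path
from typing import Iterable


def update_names_file(names: Iterable[str], path: Path) -> bool:
    """Write the names one per line; return True only if the file content changed."""
    new_text = "\n".join(sorted(set(names))) + "\n"
    old_text = path.read_text() if path.exists() else ""
    if new_text == old_text:
        return False  # nothing to do, leave the file untouched
    path.write_text(new_text)
    return True
```

Skipping the write when nothing changed matters in a pre-commit context, since pre-commit treats a hook that modifies files as failed and expects the changes to be staged before committing again.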

ericmjl · 2023-10-31T13:57:16Z

GitBot Summary of Changes

The pull request includes several changes:

- In the GitHub workflow file, the installation of pre-commit has been moved to a new step named "Commit release notes".
- A new pre-commit hook named "Autoupdate Ollama Models" has been added to the .pre-commit-config.yaml file. This hook runs a Python script that automatically updates the list of Ollama models.
- The file llamabot/bot/ollama_model_names.txt has been added to the MANIFEST.in file to be included in the distribution.
- The list of Ollama model keywords in the model_dispatcher.py file has been replaced with a function that reads the model names from the newly added file llamabot/bot/ollama_model_names.txt.
- A new feature "content.code.copy" has been added to the mkdocs.yaml file.
- A new Jupyter notebook file named scrape_ollama_models.ipynb has been added. This notebook contains code to scrape the Ollama model names from the Ollama website.
- A new Python script named autoupdate_ollama_models.py has been added. This script is similar to the Jupyter notebook but is intended to be run as a pre-commit hook. It updates the list of Ollama models in the llamabot/bot/ollama_model_names.txt file.

cc: @ericmjl, please check for correctness!

ericmjl added 7 commits October 31, 2023 09:06

docs(releases): add newline at end of v0.0.86 release notes

c22810a

This commit adds a newline at the end of the v0.0.86 release notes file. This change is in line with the standard file formatting conventions.

chore(release-workflow): separate commit release notes step

4613ac6

This commit separates the 'Commit release notes' step from the 'Write release notes' step in the release-python-package workflow. The 'pre-commit' package installation has been moved to the 'Commit release notes' step.

feat(environment): add dependencies for ollama model scraping

2737a98

- Added beautifulsoup4, lxml, and requests to the environment.yml file. These packages are necessary for the automatic scraping of ollama models.

feat(mkdocs.yaml): add content.code.copy feature to theme

594d16a

This commit adds the content.code.copy feature to the theme configuration in mkdocs.yaml. This feature allows users to easily copy code snippets from the documentation.

ericmjl merged commit a69eb73 into main Oct 31, 2023

ericmjl deleted the ollama-model-names branch October 31, 2023 13:58
