[FIX] Minor fixes + Stylistic enhancements for TQDM and Multiprocessing #141

Merged: 7 commits, Jan 7, 2025

bhavnicksm
Collaborator

This pull request makes several changes to the chonkie package: it updates progress bar descriptions, improves error messages, and adds docstrings for better code clarity. The most important changes are the progress bar description updates in multiple files, the error message updates in semantic.py, and the new docstrings in auto.py and base.py.

Progress bar description updates:

  • src/chonkie/chunker/base.py: Updated progress bar descriptions in _process_batch_sequential and _process_batch_multiprocessing methods to use "doc" instead of "texts" and changed the bar format and ASCII style.
  • src/chonkie/chunker/token.py: Updated progress bar description in chunk_batch method to use "batch" instead of "batches" and changed the bar format and ASCII style.
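
As a rough illustration of the kind of change listed above, the following is a minimal sketch of a sequential batch loop wrapped in a tqdm progress bar with a custom description, unit, bar format, and ASCII style. The helper names (`chunk_text`, `process_batch_sequential`), the bar format string, and the ASCII characters are assumptions chosen for the example; the PR description does not show the actual values used in chonkie.

```python
try:
    from tqdm import tqdm
except ImportError:  # keep the sketch runnable even when tqdm is not installed
    tqdm = None


def chunk_text(text, chunk_size=5):
    """Hypothetical stand-in for a chunker: group words into fixed-size chunks."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def process_batch_sequential(texts, show_progress=True):
    """Chunk each text in order, optionally showing a per-document progress bar."""
    iterator = texts
    if show_progress and tqdm is not None:
        iterator = tqdm(
            texts,
            desc="chunking",  # illustrative description, not the one from the PR
            unit="doc",       # count documents rather than generic iterations
            # Illustrative format: description, percentage, 20-char bar, counts, timing.
            bar_format="{desc}: {percentage:3.0f}%|{bar:20}| {n_fmt}/{total_fmt} [{elapsed}<{remaining}]",
            ascii=" >=",      # illustrative ASCII fill characters (empty, partial, full)
        )
    return [chunk_text(text) for text in iterator]
```

Calling `process_batch_sequential(["some long document ...", "another one"])` would return one list of chunks per input text while the bar advances one step per document.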

Error message improvements:

  • src/chonkie/chunker/semantic.py: Improved error messages in the __init__ method to provide more specific information about invalid embedding models.
  • src/chonkie/embeddings/auto.py: Changed the error handling in get_embeddings method to raise a ValueError instead of issuing a warning when an embedding class fails to load.
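
The difference between warning and raising when an embedding class fails to load can be sketched as below. This is not chonkie's implementation: the registry, the `DummyEmbeddings` and `BrokenEmbeddings` classes, and the message wording are all hypothetical, and `get_embeddings` here is a plain function rather than the method named in the PR. The sketch only shows the general pattern of converting a load failure into a `ValueError` that names the offending model.

```python
class DummyEmbeddings:
    """Hypothetical embeddings class that always loads successfully."""

    def __init__(self, model_name):
        self.model_name = model_name


class BrokenEmbeddings:
    """Hypothetical embeddings class whose backend is unavailable."""

    def __init__(self, model_name):
        raise RuntimeError("backend not installed")


# Hypothetical registry mapping model names to embeddings classes.
_REGISTRY = {"dummy": DummyEmbeddings, "broken": BrokenEmbeddings}


def get_embeddings(model_name):
    """Return a loaded embeddings object, or raise ValueError with a specific message."""
    embeddings_cls = _REGISTRY.get(model_name)
    if embeddings_cls is None:
        raise ValueError(
            f"Unknown embedding model {model_name!r}; expected one of {sorted(_REGISTRY)}"
        )
    try:
        return embeddings_cls(model_name)
    except Exception as error:
        # Fail loudly instead of warning, and chain the original cause for debugging.
        raise ValueError(
            f"Failed to load {embeddings_cls.__name__} for {model_name!r}: {error}"
        ) from error
```

Raising rather than warning means the caller sees the failure at the point where the model is requested, instead of receiving an unusable result later.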

Code clarity enhancements:

codecov bot commented Jan 7, 2025

Codecov Report

Attention: Patch coverage is 66.66667% with 1 line in your changes missing coverage. Please review.

✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/chonkie/embeddings/auto.py | 0.00% | 1 Missing ⚠️ |

| Flag | Coverage Δ |
|---|---|
| python-3.10 | 67.37% <66.66%> (+0.02%) ⬆️ |
| python-3.11 | 67.37% <66.66%> (+0.02%) ⬆️ |
| python-3.12 | 67.37% <66.66%> (+0.02%) ⬆️ |
| python-3.13 | 67.37% <66.66%> (+0.02%) ⬆️ |
| python-3.9 | 67.29% <66.66%> (+0.02%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| src/chonkie/chunker/base.py | 58.55% <100.00%> (ø) |
| src/chonkie/chunker/semantic.py | 59.11% <ø> (ø) |
| src/chonkie/chunker/token.py | 70.73% <ø> (ø) |
| src/chonkie/embeddings/base.py | 59.45% <100.00%> (+1.12%) ⬆️ |
| src/chonkie/embeddings/auto.py | 62.96% <0.00%> (ø) |

@bhavnicksm merged commit 3042b0d into main on Jan 7, 2025
5 checks passed