v0.4.1

Latest
@bhavnicksm released this 07 Jan 13:22
· 11 commits to main since this release
e9440ac

Highlights

  • Batch chunking now shows a progress bar, which helps when chunking a large number of texts:
from chonkie import RecursiveChunker

chunker = RecursiveChunker()

chunks = chunker([...], show_progress_bar=True)    # progress bar is enabled by default

# 🦛 choooooooooooooooooooonk 100% • 200/200 docs chunked [00:00<00:00, 229.65doc/s] 🌱
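
For readers curious how a batch progress bar like this can be wired up, here is a minimal sketch. It is not Chonkie's implementation: `chunk_batch`, `split_words`, and the `chunk_fn` argument are hypothetical names used only for illustration, and the only assumption is that `tqdm` may or may not be installed.

```python
from typing import Callable, List

try:
    from tqdm import tqdm  # optional; the sketch still runs without it
except ImportError:
    tqdm = None


def chunk_batch(
    texts: List[str],
    chunk_fn: Callable[[str], List[str]],
    show_progress_bar: bool = True,
) -> List[List[str]]:
    """Apply chunk_fn to every text, optionally wrapping the loop in tqdm."""
    iterator = texts
    if show_progress_bar and tqdm is not None:
        # tqdm wraps any iterable and renders progress as it is consumed.
        iterator = tqdm(texts, desc="chunking", unit="doc")
    return [chunk_fn(text) for text in iterator]


def split_words(text: str, size: int = 3) -> List[str]:
    """Hypothetical toy chunker: groups of `size` whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


# Progress bar disabled here so the example produces no terminal output.
batches = chunk_batch(["a b c d", "e f"], split_words, show_progress_bar=False)
```

Keeping the flag on the batch call rather than on the chunker itself means a single chunker can be used both interactively and in quiet pipelines.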

What's Changed

  • Add CONTRIBUTING.md, update issue templates, CI, Codecov and more... by @bhavnicksm in #119
  • [FEAT] Add TQDM to default installs + CONTRIBUTING.md + other minor updates by @bhavnicksm in #120
  • [fix] CI: reports were not being uploaded to Codecov by @bhavnicksm in #121
  • Update CONTRIBUTING.md with first issue hyperlink by @shreyashnigam in #122
  • [FIX] Support class methods as token_counter objects for CustomEmbeddings (#92) by @bhavnicksm in #127
  • [Fix] Add fix for #92: Support class.method as a Tokenizer for CustomEmbedding + minor changes by @bhavnicksm in #128
  • [FIX] #116: Incorrect start_index when chunk_overlap is not 0 by @Udayk02 in #126
  • [FIX] start_index incorrect when chunk_overlap is not 0 (#116) by @bhavnicksm in #132
  • [FIX] Remove tests for Py3.8 - Incompatible for support by @bhavnicksm in #134
  • [fix] High chunk_overlap causes last chunk to be entirely redundant by @bhavnicksm in #136
  • [FIX] Handle edge case for RecursiveChunker (#131) by @bhavnicksm in #137
  • [DOCS] Update readme intro to match docs. by @shreyashnigam in #135
  • [FEAT] Add TQDM progress bars for chunk_batch + Update README.md by @bhavnicksm in #138
  • Replace dead discord link with infinite lifetime by @shreyashnigam in #140
  • [FIX] Minor fixes + Stylistic enhancements for TQDM and Multiprocessing by @bhavnicksm in #141
  • [chore] Bump up the package version to v0.4.1 by @bhavnicksm in #143
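
Several of the fixes above concern start_index with a non-zero chunk_overlap (#126, #132) and a fully redundant last chunk at high overlap (#136). The sketch below illustrates the general idea on a plain token list. It is an assumption-laden illustration, not the code from those PRs: `sliding_chunks` is a hypothetical helper, and it models chunks as simple `(start, end, tokens)` tuples.

```python
from typing import List, Sequence, Tuple


def sliding_chunks(
    tokens: Sequence[str], chunk_size: int, chunk_overlap: int
) -> List[Tuple[int, int, List[str]]]:
    """Return (start_index, end_index, tokens) for overlapping windows."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must satisfy 0 <= overlap < chunk_size")
    # Each window starts `step` tokens after the previous one, so the
    # start index must advance by chunk_size - chunk_overlap, not chunk_size.
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        end = min(start + chunk_size, len(tokens))
        chunks.append((start, end, list(tokens[start:end])))
        if end == len(tokens):
            # Stop once the input is covered; any further window would
            # contain only tokens already present in this chunk.
            break
    return chunks


result = sliding_chunks(list("abcdefghij"), chunk_size=4, chunk_overlap=2)
starts = [start for start, _, _ in result]
```

With 10 tokens, a chunk size of 4, and an overlap of 2, the windows start at 0, 2, 4, and 6. Without the early `break`, a fifth window starting at 8 would repeat only the last two tokens.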

New Contributors

Full Changelog: v0.4.0...v0.4.1