v0.10.0
What's changed
- Modified the gen_processing_models tokenizer model to output int64, unifying the output data type across all tokenizers (see the sketch after this list).
- Implemented support for post-processing of YOLO v8 within the Python extensions package.
- Introduced a 'fairseq' flag to improve compatibility with certain Hugging Face tokenizers.
- Incorporated the 'added_token' attribute into the BPE tokenizer to improve CodeGen tokenizer functionality.
- Enhanced the SentencePiece tokenizer by integrating token indices into the output.
- Added support for custom operators implemented with CUDA kernels, including two example operators.
- Added more tests for the Hugging Face tokenizers and fixed the bugs they uncovered.
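
A minimal sketch of the tokenizer change above: it converts a Hugging Face tokenizer with `gen_processing_models`, inspects the graph outputs to confirm they are INT64, and runs the graph with the extensions custom-op library registered. The "gpt2" tokenizer, the assumption that `gen_processing_models` returns a (pre-processing, post-processing) model pair, and the exact input/output names are illustrative, not part of this release's notes.

```python
import numpy as np
import onnx
import onnxruntime as ort
from transformers import AutoTokenizer
from onnxruntime_extensions import gen_processing_models, get_library_path

# Convert a Hugging Face tokenizer into an ONNX pre-processing (tokenizer) graph.
hf_tokenizer = AutoTokenizer.from_pretrained("gpt2")
pre_model, _ = gen_processing_models(hf_tokenizer, pre_kwargs={})

# Per this release, the tokenizer graph's outputs should all report INT64.
for output in pre_model.graph.output:
    elem_type = output.type.tensor_type.elem_type
    print(output.name, onnx.TensorProto.DataType.Name(elem_type))

# Running the graph requires registering the extensions custom-op library.
opts = ort.SessionOptions()
opts.register_custom_ops_library(get_library_path())
session = ort.InferenceSession(pre_model.SerializeToString(), opts)

input_name = session.get_inputs()[0].name
outputs = session.run(None, {input_name: np.array(["Hello, ONNX Runtime Extensions!"])})
print(outputs[0].dtype)  # expected: int64
```

The dtype check reflects the unification called out in the first item: before this release, the output type could differ from one tokenizer to another.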
Contributions
Contributors to ONNX Runtime Extensions include members across teams at Microsoft, along with our community members: @wenbingl @sayanshaw24 @skottmckay @mszhanyi @edgchen1 @YUNQIUGUO @RandySheriffH @samwebster @hyoshioka0128 @baijumeswani @dizcza @Craigacp @jslhcl