Evaluate performance of ONNX Runtime (Hugging Face Question Answering)

ONNX Runtime quantization is under active development. Please use version 1.6.0 or later for broader quantization support.

This example loads a question answering model and measures its accuracy and speed on the SQuAD task.

Environment

Please use the latest versions of onnx and onnxruntime.
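The 1.6.0 minimum mentioned above can be checked before running the example. The following is a minimal sketch; `meets_minimum` is a hypothetical helper, not part of this repository or of onnxruntime, and it only compares the leading numeric components of a dotted version string.

```python
import re


def meets_minimum(version, minimum=(1, 6, 0)):
    """Return True if a dotted version string is at least `minimum`.

    Hypothetical helper for illustration only.
    """
    parts = []
    # Keep only the leading digits of the first three components,
    # so a suffix such as "0+cu118" is read as 0.
    for piece in version.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    # Pad short versions such as "1.6" to three components.
    while len(parts) < 3:
        parts.append(0)
    # Compare as integer tuples so that 1.10.0 ranks above 1.6.0.
    return tuple(parts) >= tuple(minimum)
```

With onnxruntime installed, the check would be `meets_minimum(onnxruntime.__version__)`.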

Prepare dataset

Download the SQuAD dataset from the SQuAD dataset link.

Prepare model

Supported model identifiers from huggingface.co:

Model Identifier
mrm8488/spanbert-finetuned-squadv1
salti/bert-base-multilingual-cased-finetuned-squad

# Or use another supported model identifier.
python export.py --model_name_or_path=mrm8488/spanbert-finetuned-squadv1

Quantization

Dynamic quantization:

# --input_model: path to the exported *.onnx model
bash run_tuning.sh --input_model=/path/to/model \
                   --output_model=/path/to/model_tune \
                   --config=qa_dynamic.yaml
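As background on the term, dynamic quantization generally means that activation quantization parameters (scale and zero point) are computed at inference time from the values actually observed, rather than from a calibration dataset. The sketch below illustrates that general idea for 8-bit affine quantization in plain Python. It is not what run_tuning.sh or qa_dynamic.yaml does internally; `quantize_uint8` and `dequantize` are hypothetical names used only for illustration.

```python
def quantize_uint8(values):
    """Affine-quantize a list of floats to uint8 using the list's own range.

    Hypothetical helper for illustration; returns (quantized, scale, zero_point).
    """
    # Extend the range to include 0.0 so that zero is exactly representable.
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:
        scale = 1.0  # all-zero input: any positive scale works
    zero_point = round(-lo / scale)
    # Map each float to an integer and clamp it to the uint8 range.
    quantized = [max(0, min(255, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Map uint8 values back to approximate floats."""
    return [(q - zero_point) * scale for q in quantized]
```

Because the range is taken from the input itself, each tensor gets its own scale, and the round-trip error per element stays within about one quantization step.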

Benchmark

# --input_model: path to the *.onnx model; --mode: performance or accuracy
bash run_benchmark.sh --input_model=/path/to/model \
                      --config=qa_dynamic.yaml \
                      --mode=performance
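For readers who want a rough idea of what a performance measurement involves, the sketch below times repeated calls to an arbitrary function after a few warm-up runs. It is an illustration under stated assumptions, not the logic of run_benchmark.sh; `measure_latency` and its parameters are hypothetical, and `run_once` stands in for a single inference call.

```python
import time


def measure_latency(run_once, warmup=3, iterations=20):
    """Time repeated calls to `run_once` and summarize the results.

    Hypothetical helper for illustration only.
    """
    # Warm-up calls are executed but not timed, so one-off setup
    # costs do not skew the measurement.
    for _ in range(warmup):
        run_once()
    timings = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_once()
        timings.append(time.perf_counter() - start)
    average = sum(timings) / len(timings)
    return {
        "average_s": average,
        "min_s": min(timings),
        "max_s": max(timings),
        "throughput_per_s": 1.0 / average if average > 0 else float("inf"),
    }
```

Comparing these numbers for the original and the quantized model, measured on the same inputs and hardware, gives a first impression of the speedup.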
