You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
And this split causes difference in the way response is treated due to their handler code coming from different repositories.
Possible Solution
A solution is fixing https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_transformer.py#L207 so instead of raw result it should be return _worker.Response(response=result, mimetype=request.accept) which should force the response to have the correct response header instead of text/html.
However, this package is archived and in read only mode.
So I patched out the problematic file in my dockerfile. COPY inference/patches/_transformer.py /miniconda3/lib/python3.8/site-packages/sagemaker_containers/_transformer.py
And this seemed to yield the desired results.
Background
I have a custom inference container implementing the inference lifecycle methods described in https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html.
Functionally, I have no issues, but when it comes to using features such as https://docs.aws.amazon.com/sagemaker/latest/dg/clarify-online-explainability.html which requires correct response header I run into issues due to the response type always being text/html instead of one which I specify (tested via aws sagemaker-runtime invoke-endpoint)
After digging through the source and testing, I realized MME works as expected and returns proper response header.
This is because https://github.com/aws/sagemaker-inference-toolkit/blob/2ed6f3d19daffccab390d8e9142c9c39c9932271/src/sagemaker_inference/transformer.py#L154 handles the response header correctly.
So the issue is with the non MME case.
The reason for this behavior is due to the split on how different types of servers are ran
sagemaker-scikit-learn-container/src/sagemaker_sklearn_container/serving.py
Line 146 in a5015fe
And this split causes difference in the way response is treated due to their handler code coming from different repositories.
Possible Solution
A solution is fixing https://github.com/aws/sagemaker-containers/blob/master/src/sagemaker_containers/_transformer.py#L207 so instead of raw result it should be return _worker.Response(response=result, mimetype=request.accept) which should force the response to have the correct response header instead of text/html.
However, this package is archived and in read only mode.
So I patched out the problematic file in my dockerfile.
COPY inference/patches/_transformer.py /miniconda3/lib/python3.8/site-packages/sagemaker_containers/_transformer.py
And this seemed to yield the desired results.
I haven't dug deep enough to say this is the correct solution, but I don't think the current implementation is in an ideal state.
XGBoost also suffers from similar issue https://github.com/aws/sagemaker-xgboost-container
The change lines up with the existing documentation as well https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html
The text was updated successfully, but these errors were encountered: