
Adaptive batching leads to parameters being cut off #1541

Open
tobbber opened this issue Jan 17, 2024 · 5 comments

tobbber commented Jan 17, 2024

Hi, I observed some odd behavior when using the REST API with adaptive batching enabled.
When sending a single request to the v2 REST endpoint /v2/models/<MODEL>/infer, the parameters within the ResponseOutput are cut off. If a parameter is not an iterable, a TypeError is raised instead, e.g. TypeError: 'int' object is not iterable.

Note that this only happens when:

  1. Adaptive batching is enabled
  2. A single request is sent within the max_batch_time time window

How to Reproduce:

# model.py
from mlserver import MLModel
from mlserver.types import InferenceRequest, InferenceResponse, ResponseOutput


class EchoModel(MLModel):
    async def load(self):
        return True

    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        request_input = payload.inputs[0]
        # return the payload input as output
        output = ResponseOutput(**request_input.dict())
        return InferenceResponse(model_name=self.name, outputs=[output])

// model-settings.json
{
	"name": "echoModel",
	"max_batch_time": 2,
	"max_batch_size": 32,
	"implementation": "model.EchoModel"
}

Request Body:

// POST to localhost:8080/v2/models/echoModel/infer
{
	"inputs": [{
		"name": "docs",
		"shape": [2],
		"datatype": "INT32",
		"parameters": {
			"id": "123"
		},
		"data": [10,11]
	}]
}

Expected behavior: EchoModel returns the RequestInput unchanged as the output.

Actual behavior: Parameters in the output are cut off, or a TypeError is raised.

Examples:

  • input parameters: {"custom-param": "123"} --> output parameters: {"custom-param": "1"}
  • input parameters: {"custom-params": ["123", "456"]} --> output parameters: {"custom-params": "123"}
  • input parameters: {"custom-param": 123} --> TypeError: 'int' object is not iterable

It seems like the Parameters are unbatched even if they were never batched in the first place.
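To illustrate the suspicion, here is a minimal sketch of what such an unbatching step would look like (the function is hypothetical, not MLServer's actual code): if every parameter value is assumed to be a batch and the first element is taken per request, values that were never batched get sliced.

# Hypothetical sketch of the suspected unbatching step -- not MLServer's code.
def unbatch_parameters(params: dict) -> dict:
    # Assumes every value is a batch (one entry per request) and takes the
    # first element -- which slices values that were never batched at all.
    return {key: next(iter(value)) for key, value in params.items()}

print(unbatch_parameters({"custom-param": "123"}))            # {'custom-param': '1'}
print(unbatch_parameters({"custom-params": ["123", "456"]}))  # {'custom-params': '123'}
unbatch_parameters({"custom-param": 123})                     # TypeError: 'int' object is not iterable

This reproduces all three observed outcomes from a single request that skipped the batching step.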


yaliqin commented Feb 12, 2024

Hi @tobbber, can you share the Dockerfile you used? I wrapped my code up in a similar way and set up the batch settings, but then I hit a prometheus_client error:

  File "/opt/conda/lib/python3.8/site-packages/prometheus_client/metrics.py", line 121, in __init__
    registry.register(self)
  File "/opt/conda/lib/python3.8/site-packages/prometheus_client/registry.py", line 29, in register
    raise ValueError(
ValueError: Duplicated timeseries in CollectorRegistry: {'batch_request_queue_count', 'batch_request_queue_bucket', 'batch_request_queue_created', 'batch_request_queue_sum'}

I used mlserver build, and the generated Dockerfile uses seldonio/mlserver:1.3.5-slim.
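For context, this ValueError comes from prometheus_client refusing to register two collectors with the same metric name in one CollectorRegistry. A minimal, standalone reproduction of the error (unrelated to MLServer itself; the metric name and docstring are illustrative):

from prometheus_client import Histogram

# The second Histogram with the same name hits the default registry again
# and raises the same ValueError as in the traceback above.
Histogram("batch_request_queue", "Time spent in the batching queue")
Histogram("batch_request_queue", "Time spent in the batching queue")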


tobbber commented Feb 13, 2024

Hi @yaliqin, I used the MLServer CLI directly with mlserver start mlserver_example/, with the following structure:

mlserver_example/
├── model-settings.json
└── model.py

To install MLServer I used pip install mlserver==1.3.5.


yaliqin commented Feb 13, 2024 via email


tobbber commented Feb 14, 2024

I am using Python 3.11.6 on an arm64 machine (M1 Mac).


yaliqin commented Feb 14, 2024

Thanks @tobbber. mlserver start . worked, but the docker run failed. I will check the difference.
