
Using ChatMistralAI with structured output: Pydantic model with a datetime.date value using json_schema raises a 400 Bad Request #29604

Aaryia opened this issue Feb 5, 2025 · 1 comment
Labels
🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature

Aaryia commented Feb 5, 2025

Checked other resources

  • I added a very descriptive title to this issue.
  • I searched the LangChain documentation with the integrated search.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangChain rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangChain (or the specific integration package).

Example Code

from pydantic import BaseModel
from langchain_mistralai.chat_models import ChatMistralAI
from datetime import date

class DummyClass(BaseModel):
    date: date


llm = ChatMistralAI(model='mistral-small-latest', temperature=0).with_structured_output(DummyClass, method='json_schema')

result: DummyClass = llm.invoke('Answer me with a date. When was the first man on the moon ?')

Error Message and Stack Trace (if applicable)


HTTPStatusError Traceback (most recent call last)
Cell In[124], line 12
7 date: date
10 llm = ChatMistralAI(api_key=api_key, model='mistral-small-latest',temperature=0).with_structured_output(DummyClass, method='json_schema')
---> 12 result: DummyClass = llm.invoke('Answer me with a date. When was the first man on the moon ?')

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/runnables/base.py:3014, in RunnableSequence.invoke(self, input, config, **kwargs)
3012 context.run(_set_config_context, config)
3013 if i == 0:
-> 3014 input = context.run(step.invoke, input, config, **kwargs)
3015 else:
3016 input = context.run(step.invoke, input, config)

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/runnables/base.py:5352, in RunnableBindingBase.invoke(self, input, config, **kwargs)
5346 def invoke(
5347 self,
5348 input: Input,
5349 config: Optional[RunnableConfig] = None,
5350 **kwargs: Optional[Any],
5351 ) -> Output:
-> 5352 return self.bound.invoke(
5353 input,
5354 self._merge_configs(config),
5355 **{**self.kwargs, **kwargs},
5356 )

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:284, in BaseChatModel.invoke(self, input, config, stop, **kwargs)
273 def invoke(
274 self,
275 input: LanguageModelInput,
(...)
279 **kwargs: Any,
280 ) -> BaseMessage:
281 config = ensure_config(config)
282 return cast(
283 ChatGeneration,
--> 284 self.generate_prompt(
285 [self._convert_input(input)],
286 stop=stop,
287 callbacks=config.get("callbacks"),
288 tags=config.get("tags"),
289 metadata=config.get("metadata"),
290 run_name=config.get("run_name"),
291 run_id=config.pop("run_id", None),
292 **kwargs,
293 ).generations[0][0],
294 ).message

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:860, in BaseChatModel.generate_prompt(self, prompts, stop, callbacks, **kwargs)
852 def generate_prompt(
853 self,
854 prompts: list[PromptValue],
(...)
857 **kwargs: Any,
858 ) -> LLMResult:
859 prompt_messages = [p.to_messages() for p in prompts]
--> 860 return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:690, in BaseChatModel.generate(self, messages, stop, callbacks, tags, metadata, run_name, run_id, **kwargs)
687 for i, m in enumerate(messages):
688 try:
689 results.append(
--> 690 self._generate_with_cache(
691 m,
692 stop=stop,
693 run_manager=run_managers[i] if run_managers else None,
694 **kwargs,
695 )
696 )
697 except BaseException as e:
698 if run_managers:

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_core/language_models/chat_models.py:925, in BaseChatModel._generate_with_cache(self, messages, stop, run_manager, **kwargs)
923 else:
924 if inspect.signature(self._generate).parameters.get("run_manager"):
--> 925 result = self._generate(
926 messages, stop=stop, run_manager=run_manager, **kwargs
927 )
928 else:
929 result = self._generate(messages, stop=stop, **kwargs)

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_mistralai/chat_models.py:547, in ChatMistralAI._generate(self, messages, stop, run_manager, stream, **kwargs)
545 message_dicts, params = self._create_message_dicts(messages, stop)
546 params = {**params, **kwargs}
--> 547 response = self.completion_with_retry(
548 messages=message_dicts, run_manager=run_manager, **params
549 )
550 return self._create_chat_result(response)

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_mistralai/chat_models.py:466, in ChatMistralAI.completion_with_retry(self, run_manager, **kwargs)
463 _raise_on_error(response)
464 return response.json()
--> 466 rtn = _completion_with_retry(**kwargs)
467 return rtn

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_mistralai/chat_models.py:463, in ChatMistralAI.completion_with_retry.<locals>._completion_with_retry(**kwargs)
461 else:
462 response = self.client.post(url="/chat/completions", json=kwargs)
--> 463 _raise_on_error(response)
464 return response.json()

File ~/rag-project/rag-sandbox/.venv/lib/python3.12/site-packages/langchain_mistralai/chat_models.py:170, in _raise_on_error(response)
168 if httpx.codes.is_error(response.status_code):
169 error_message = response.read().decode("utf-8")
--> 170 raise httpx.HTTPStatusError(
171 f"Error response {response.status_code} "
172 f"while fetching {response.url}: {error_message}",
173 request=response.request,
174 response=response,
175 )

HTTPStatusError: Error response 400 while fetching https://api.mistral.ai/v1/chat/completions: {"object":"error","message":"Received unsupported keyword format in schema.","type":"invalid_request_error","param":null,"code":null}

Description

I am trying to use LangChain to identify dates for downstream filtering. I used with_structured_output and it seemed to work out of the box, but I ran into issues with the method='function_calling' approach (sometimes the model did not follow the pydantic schema), so I tried method='json_schema' to constrain the output another way.

I expected json_schema to work as well or better, but it did not: I got the stack trace above.
I traced the problem down to the _convert_pydantic_to_openai_function method.
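
The conversion step can be reproduced in isolation through the public wrapper convert_to_openai_function, which calls the same internal helper. A quick check (output shown as I understand it for langchain_core 0.3.x):

from datetime import date

from langchain_core.utils.function_calling import convert_to_openai_function
from pydantic import BaseModel


class DummyClass(BaseModel):
    date: date


converted = convert_to_openai_function(DummyClass)
# The "format": "date" keyword survives the conversion and is sent to the API:
print(converted["parameters"]["properties"]["date"])
# expected: {'format': 'date', 'type': 'string'} (title keys are stripped by _rm_titles)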

Using pydantic's model_json_schema returns the following:

{
  "properties": {
    "date": { "format": "date", "title": "Date", "type": "string" }
  },
  "required": ["date"],
  "title": "DummyClass",
  "type": "object"
}

The problem lies in the format key. This key is supported neither by Mistral nor by OpenAI, as documented here.

Deleting this format key and providing the following description instead:

schema['properties']['date']['description'] = 'format:date'

makes the call succeed.
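
For anyone hitting the same error in the meantime, here is a rough standalone sketch of that workaround. The helper name strip_format_keys is mine, not a LangChain API, and it assumes with_structured_output also accepts a plain dict schema (in which case the result comes back as a dict rather than a DummyClass instance):

from datetime import date
from typing import Any

from langchain_mistralai.chat_models import ChatMistralAI
from pydantic import BaseModel


class DummyClass(BaseModel):
    date: date


def strip_format_keys(schema: dict[str, Any]) -> dict[str, Any]:
    # Recursively drop 'format' keywords, folding each into 'description'
    # so the model still sees the intended format as a prose hint.
    cleaned: dict[str, Any] = {}
    for key, value in schema.items():
        if isinstance(value, dict):
            cleaned[key] = strip_format_keys(value)
        elif isinstance(value, list):
            cleaned[key] = [
                strip_format_keys(item) if isinstance(item, dict) else item
                for item in value
            ]
        else:
            cleaned[key] = value
    fmt = cleaned.pop("format", None)
    if isinstance(fmt, str):
        cleaned["description"] = (cleaned.get("description", "") + f" format:{fmt}").strip()
    return cleaned


schema = strip_format_keys(DummyClass.model_json_schema())
llm = ChatMistralAI(model="mistral-small-latest", temperature=0).with_structured_output(
    schema, method="json_schema"
)
result = llm.invoke("Answer me with a date. When was the first man on the moon?")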

I believe the _rm_titles function should be extended to remove all keys that are unsupported per the OpenAI documentation.
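
A minimal sketch of what that extension could look like. The keyword list here is illustrative, based on keywords OpenAI's documentation lists as unsupported; the real change would live next to _rm_titles in langchain_core.utils.function_calling:

from typing import Any

# Illustrative subset of the JSON Schema keywords documented as unsupported.
UNSUPPORTED_KEYWORDS = {"format", "pattern", "minLength", "maxLength", "minimum", "maximum"}


def _rm_unsupported(schema: Any) -> Any:
    # Walk the schema recursively, dropping unsupported keywords the same
    # way _rm_titles drops 'title' keys.
    if isinstance(schema, dict):
        return {
            key: _rm_unsupported(value)
            for key, value in schema.items()
            if key not in UNSUPPORTED_KEYWORDS
        }
    if isinstance(schema, list):
        return [_rm_unsupported(item) for item in schema]
    return schema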

System Info

System Information

OS: Linux
OS Version: #15-Ubuntu SMP PREEMPT_DYNAMIC Fri Jan 10 23:48:25 UTC 2025
Python Version: 3.12.7 (main, Jan 17 2025, 16:55:27) [GCC 14.2.0]

Package Information

langchain_core: 0.3.33
langchain: 0.3.2
langchain_community: 0.3.1
langsmith: 0.1.147
langchain_huggingface: 0.1.2
langchain_mistralai: 0.2.6
langchain_text_splitters: 0.3.5

Optional packages not installed

langserve

Other Dependencies

aiohttp: 3.11.11
async-timeout: Installed. No version info available.
dataclasses-json: 0.6.7
httpx: 0.28.1
httpx-sse: 0.4.0
huggingface-hub: 0.27.1
jsonpatch: 1.33
langsmith-pyo3: Installed. No version info available.
numpy: 1.26.4
orjson: 3.10.15
packaging: 24.2
pydantic: 2.10.5
pydantic-settings: 2.7.1
PyYAML: 6.0.2
requests: 2.32.3
requests-toolbelt: 1.0.0
sentence-transformers: 3.3.1
SQLAlchemy: 2.0.37
tenacity: 8.5.0
tokenizers: 0.21.0
transformers: 4.48.1
typing-extensions: 4.12.2

dosubot bot added the 🤖:bug label on Feb 5, 2025
Aaryia commented Feb 5, 2025

I made a short PR which fixes the issue. It removes unsupported keywords from the schema using the _rm_titles function. I did not venture so far as to modify function names or parameter names, not knowing what impact it could have.

The main problem I have with this fix is that it does not constrain the generation; it simply falls back to the basic JSON schema without the formatting information. A band-aid approach might be to append a string representation of the unsupported keywords to the description value. This still would not be a hard constraint, but it would help align the LLM's answer with the expected output.
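
As a caller-side stopgap, the lost constraint can at least be checked after the fact. A sketch, assuming the dict-schema workaround above (so the raw result is a plain dict):

from pydantic import ValidationError

raw = llm.invoke("Answer me with a date. When was the first man on the moon?")
try:
    # Re-validate with the original model so the 'format: date' rule is
    # enforced client-side even though the server no longer constrains it.
    parsed = DummyClass.model_validate(raw)
except ValidationError:
    # The model ignored the hint; retry, reprompt, or fall back here.
    parsed = None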
