README.md

Concept

This is a series of experiments on how effectively an LLM agent can analyze and improve an academic argument. The experiments focus on historical arguments, which make an excellent playground for non-technical reasoning grounded in verifiable facts. In particular, I noticed that LLM answers to historical questions are promising but too broad to be assessed effectively without significant further information. Moreover, the claims often break down when an LLM without access to tools is asked for clarification. I aim to see how far methods such as LLM self-examination, structured data generation, and access to external data can go toward improving a model's effective reasoning capabilities on complex questions.

Tech Stack

The project is a Jupyter Notebook that uses the LangChain API with Google's low-end model, gemini-1.5-flash.
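
As a rough illustration of the self-examination idea, the sketch below runs a critique-and-revise loop around a model call. It is not the notebook's actual code: `self_examine`, `ask`, `stub_model`, the prompt wording, and the sample question are all hypothetical names and values chosen for this example. The model is passed in as a plain function so the sketch runs without an API key.

```python
from typing import Callable, List


# Hypothetical helper: `ask` is any function that sends a prompt to a model
# and returns its text reply. It is left abstract on purpose so the loop can
# be tried without network access or credentials.
def self_examine(ask: Callable[[str], str], question: str, rounds: int = 2) -> List[str]:
    """Return the initial answer followed by one revised answer per round."""
    answers = [ask(question)]
    for _ in range(rounds):
        # Step 1: ask the model to criticize its own latest answer.
        critique = ask(
            "List unsupported or vague claims in this answer.\n"
            f"Question: {question}\nAnswer: {answers[-1]}"
        )
        # Step 2: ask the model to revise the answer using that critique.
        revised = ask(
            "Rewrite the answer to address the critique.\n"
            f"Question: {question}\nAnswer: {answers[-1]}\nCritique: {critique}"
        )
        answers.append(revised)
    return answers


# Stand-in model (an assumption for illustration only): it records every
# prompt it receives and returns a numbered placeholder reply.
calls: List[str] = []


def stub_model(prompt: str) -> str:
    calls.append(prompt)
    return f"reply {len(calls)}"


history = self_examine(stub_model, "Why did the Western Roman Empire fall?", rounds=2)
```

With the stub, each round costs two model calls (one critique, one revision), so two rounds produce five calls in total. To try this against the stack described above, `stub_model` would be replaced by a function that forwards the prompt to a LangChain chat model configured for gemini-1.5-flash and returns the reply text; the exact wiring depends on the notebook and is not shown here.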