This is a series of experiments on how effectively an LLM agent can analyze and improve an academic argument. The experiments focus on historical arguments, which make an excellent playground: they are non-technical yet grounded in verifiable facts. In particular, I noticed that LLM answers to historical questions are promising but broad enough that they cannot be assessed effectively without significant further information, and the claims often break down when the LLM is asked for clarification without access to tools. I aim to see how far methods such as LLM self-examination, structured data generation, and access to external data can go toward improving a model's effective reasoning on complex questions.
The project is a Jupyter Notebook built on the LangChain API, using Google's low-end model gemini-1.5-flash.
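As a rough sketch of the setup (not the notebook's exact code), the model can be instantiated through LangChain's Google GenAI integration along these lines; the prompt shown here is purely illustrative:

```python
# Minimal sketch of the model setup, assuming the langchain-google-genai
# package and a GOOGLE_API_KEY environment variable are available.
from langchain_google_genai import ChatGoogleGenerativeAI

# gemini-1.5-flash: a cheap, fast model suited to iterative experiments.
llm = ChatGoogleGenerativeAI(model="gemini-1.5-flash", temperature=0)

# Illustrative call: ask the model to critique a historical claim.
response = llm.invoke(
    "Identify the strongest and weakest points in the argument that "
    "the printing press was the primary cause of the Reformation."
)
print(response.content)
```

The notebook's actual chains (self-examination, structured output, tool access) build on this basic model handle.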