提示模板是指生成提示的可在现的方法。它包含一个文本字符串(”模板“),该字符串可以从最终用户中获取一组参数并生成提示。
所以,模板由三部分组成:
- 给语言模型的指令
- 一些例子
- 对语言模型的提问
下面给出一个例子代码:
from langchian import PromptTemplate
template = ""
I want you to act as a naming consultant for new companies.
Here are some examples of good company names:
- search engine,Google
- social media,Facebook
- video sharing,Youtube
The name should be short, catchy and easy to remember.
What is a good name for a company that make {product}?
prompt = PromptTemplate(
input_variables = ["product"],
template = template,
)
## 没有参数
no_input_prompt = PromptTemplate(
input_variables =[],
template = "Tell me a joke.")
print(no_input_prompt.format())
## 一个参数
one_input_prompt = PromptTemplate(
input_variables = ["adjective"],
template = "Tell me a {adjective} joke."
)
print(one_input_prompt.format(adjective="funny"))
## 多个参数
multiple_input_prompt = PromptTemplate(
input_variables = ["adjective","content"],
template = "Tell me a {adjective} joke about {content}."
)
print(multiple_input_prompt.format(adjective = "funny",content = "chickens"))
from langchain.prompts import load_prompt
prompt = load.prompt("lc://prompts/conversation/prompt.json")
prompt.format(history="", input="What is 1 + 1?")
## First, create the list of fewshot example
examples = [
{"word": "happy", "antonym": "sad"},
{"word": "tall", "antonym": "short"},
]
## Next, specify the template to format the example we have provided
## we can use the prompt Template class for this
example_formatter_template="""
Word:{word}
Antonym:{antonym}
"""
example_prompt = PromptTemplate(
input_variables= ["word","antonym"],
template = example_formatter_template,
)
## Finally, we create the fewShotPromptTemplate
few_shot_prompt = FewShotPromptTemplate(
examples = examples,
example_prompt = example_prompt,
# this prefix is some text that goes before the examples in the prompt.
prefix = "Give the antonym of every input",
# the suffix is some text that goes after the examples in the prompt.
suffix = "Word:{input} \n Antonym:",
input_variables = ["input"],
example_separator = "\n\n",
)
print(few_shot_prompt.format(input="big"))
from langchain.prompts.example_selector import LengthBasedExampleSelector
examples = [
{"word": "happy", "antonym": "sad"},
{"word": "tall", "antonym": "short"},
{"word": "energetic", "antonym": "lethargic"},
{"word": "sunny", "antonym": "gloomy"},
{"word": "windy", "antonym": "calm"},
]
example_formatter_template = """
Word: {word}
Antonym: {antonym}\n
"""
example_prompt = PromptTemplate(
input_variables=["word", "antonym"],
template=example_formatter_template,
)
example_selector = LengthBasedExampleSelector(
# These are the examples is has available to choose from.
examples=examples,
# This is the PromptTemplate being used to format the examples.
example_prompt=example_prompt,
# This is the maximum length that the formatted examples should be.
# Length is measured by the get_text_length function below.
max_length=25,
)
dynamic_prompt = FewShotPromptTemplate(
# We provide an ExampleSelector instead of examples.
example_selector=example_selector,
example_prompt=example_prompt,
prefix="Give the antonym of every input",
suffix="Word: {input}\nAntonym:",
input_variables=["input"],
example_separator="\n",
)
print(dynamic_prompt.format(input="big"))
为了自定义模板,有两条要求:
- 要有input_variables属性,暴露了提示模板所期望的输入变量
- 暴露一个格式化方法,该方法接收与预期输入变量对应的关键字参数,并返回格式化的提示。
下面,创建了一个自定义的提示模板,将函数名称作为输入,并将提示格式化以提供函数的源代码。为了达到这个目的,我们首先创建一个函数。它将返回给定的函数的源代码。
import inspect
from langchain.prompts import StringPromptTemplate
from pydantic import BaseModel, validator
def get_source_code(function_name):
return inspect.getsource(function_name)
class FunctionExplainerPromptTemplate(StringPromptTemplate,BaseModel):
@validator("input_variables")
def validate_input_variales(cls,v):
if len(v)!=1 or "function_name" not in v:
raise ValueError("function_name must be the only input_variable.")
return v
def format(self,**kwargs) -> str:
source_code = get_source_code(kwargs["function_name"])
## generate the prompt to be sent to the language model
prompt = f"""
Given the function name and source code, generate an English language explanation of the function.
Function Name: {kwargs["function_name"].__name__}
Source Code:
{source_code}
Explanation:
"""
return prompt
def _prompt_type(self):
return "function-explainer"
if __name__ == '__main__':
fn_explainer = FunctionExplainerPromptTemplate(
input_varibales = ["function_name"]
)
prompt = fn_explainer.format(function_name=get_source_code)
print(prompt)
一个例子选择器必须实现两个方法:
- 一个add_example方法,它接收一个例子并将其调价到ExampleSelector中
- 一个select_example方法,该方法接收输入变量,并返回一个例子列表
下面用一个例子说明:
## 创建
from langchain.prompts.example_selector.base import BaseExampleSelector
from typing import Dict,List
import numpy as np
class CustomExampleSelector(BaseExampleSelector):
def __init__(self,examples:List[Dict[str,str]]):
self.examples = examples
def add_example(self, example: Dict[str, str]) -> None:
self.examples.append(example)
def select_examples(self, input_variables: Dict[str, str]) -> List[dict]:
return np.random.choice(self.examples, size=2, replace=False)
## 使用
examples = [
{"foo": "1"},
{"foo": "2"},
{"foo": "3"}
]
example_selector = CustomExampleSelector(examples)
# Select examples
example_selector.select_examples({"foo": "foo"})
# Add new example to the set of examples
example_selector.add_example({"foo": "4"})
example_selector.examples
# -> [{'foo': '1'}, {'foo': '2'}, {'foo': '3'}, {'foo': '4'}]
# Select examples
example_selector.select_examples({"foo": "foo"})
# -> array([{'foo': '1'}, {'foo': '4'}], dtype=object)
- 定义shot的格式
- 定义example_prompt的格式
- 定义ExampleSelector
- 将exampleselector 传给 FewShotPromptTemplate
下面是一个例子:
from langchain.prompts.few_shot import FewShotPromptTemplate
from langchain.prompts.prompt import PromptTemplate
from langchain.prompts.example_selector import SemanticSimilarityExampleSelector
from langchain.vectorstores import Chroma
from langchain.embeddings import OpenAIEmbeddings
example = [
{
"question": "Who lived longer, Muhammad Ali or Alan Turing?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: How old was Muhammad Ali when he died?
Intermediate answer: Muhammad Ali was 74 years old when he died.
Follow up: How old was Alan Turing when he died?
Intermediate answer: Alan Turing was 41 years old when he died.
So the final answer is: Muhammad Ali
"""
},
{
"question": "When was the founder of craigslist born?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who was the founder of craigslist?
Intermediate answer: Craigslist was founded by Craig Newmark.
Follow up: When was Craig Newmark born?
Intermediate answer: Craig Newmark was born on December 6, 1952.
So the final answer is: December 6, 1952
"""
},
{
"question": "Who was the maternal grandfather of George Washington?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who was the mother of George Washington?
Intermediate answer: The mother of George Washington was Mary Ball Washington.
Follow up: Who was the father of Mary Ball Washington?
Intermediate answer: The father of Mary Ball Washington was Joseph Ball.
So the final answer is: Joseph Ball
"""
},
{
"question": "Are both the directors of Jaws and Casino Royale from the same country?",
"answer":
"""
Are follow up questions needed here: Yes.
Follow up: Who is the director of Jaws?
Intermediate Answer: The director of Jaws is Steven Spielberg.
Follow up: Where is Steven Spielberg from?
Intermediate Answer: The United States.
Follow up: Who is the director of Casino Royale?
Intermediate Answer: The director of Casino Royale is Martin Campbell.
Follow up: Where is Martin Campbell from?
Intermediate Answer: New Zealand.
So the final answer is: No
"""
}
]
example_prompt = PromptTemplate(
input_variables = ["question","answer"],
template = "Question:{question}\n{answer}"
)
example_selector = SemanticSimilarityExampleSelector.from_examples(
example,
OpenAIEmbeddings(),
Chroma,
k=1
)
prompt = FewShotPromptTemplate(
example_selector = example_selector,
example_prompt = example_prompt,
suffix = "Question:{input}",
input_variables = ["input"]
)
print(prompt.format(input="Who was the father of Mary Ball Washington?"))
通常情况下,最好不要把提示信息作为python代码来存储,而是作为文件来存储。这可以使分享、存储和版本提示变得容易。
在高层次上,以下设计原则适用于序列化:
- 同时支持JSON和YAML。
- 支持在一个文件中指定所有内容,或者将不同的组件存储在不同的文件中。
新建一个文件simple_prompt.yaml,放置提示模板的信息
_type: prompt
input_variables:
["adjective", "content"]
template:
Tell me a {adjective} joke about {content}.
然后python代码调用就行
from langchain.prompts import load_prompt
if __name__=="__main__":
prompt = load_prompt("simple_prompt.yaml")
print(prompt.format(adjective="funny"))
新建一个文件examples.json,放置shot
[
{"input": "happy", "output": "sad"},
{"input": "tall", "output": "short"}
]
新建一个文件few_shot_prompt.yaml,放置提示模板的配置信息
_type: few_shot
input_variables:
["adjective"]
prefix:
Write antonyms for the following words.
example_prompt:
_type: prompt
input_variables:
["input", "output"]
template:
"Input: {input}\nOutput: {output}"
examples:
examples.json
suffix:
"Input: {adjective}\nOutput:"
最后python代码调用就行
from langchain.prompts import load_prompt
if __name__=="__main__":
prompt = load_prompt("few_shot_prompt.yaml")
print(prompt.format(adjective="funny"))
如果有大量的例子,我们需要选择使用哪些例子来包括在提示中,ExampleSelector是负责做这的类。目前包含5种例子选择器。
如果有大量的例子,需要选择哪些例子来包括在现实中。ExampleSelector是负责这样做的类。其基本接口定义如下:
class BaseExampleSelector(ABC):
"""Interface for selecting examples to include in prompts"""
@abstractmethod
def select_examples(self,input_variables:Dict[str,str]) -> List[Dict]:
"""select with examples to use based on the inputs"""
它唯一需要公开的方法是一个select_examples的方法。这个方法接收输入的变量,然后返回一个例子的列表。至于如何选择这些例子,取决于每个具体的实现。
这个例子选择器是根据长度选择要使用的例子。当你担心构建一个超过上下文窗口长度的提示时,这很有用。对于长的输入,它将选择较少的例子,而对于短的输入,它将选择更多的例子。
这个例子选择其根据哪些例子与输入最相近来选择例子,它通过寻找与输入有最大余弦相似度的嵌入的例子来实现这一点。
这个例子选择器是根据与输入最相似的例子的组合来选择例子,同时也对多样性进行优化。它通过寻找与输入有最大余弦相似性的嵌入的例子,然后迭代添加他们,同时惩罚他们与已经选择的例子的接近程度。
这个例子选择器是根据ngram重叠分数,根据哪些例子与输入最相似,来选择和排序例子。ngram重叠得分是一个介于0.0和1.0之间的浮点数,包括在内。
该选择器件允许设置一个阈值分数。ngram重叠分数小于或等于阈值的例子被排除。
语言模型输出文本。但是很多时候,我们可能希望得到更多的结构化信息,而不仅仅是文本。这就是输出分析器的作用。
输出解析器是帮助结构语言模型相应的类。有两个主要的方法是输出解析器必须实现的:
- get_format_instructions() -> str:它返回一个字符串,其中包括语言模型的输出应该如何格式化的说明
- parser(str)->Any:它接收一个字符串(假定是来自语言模型的相应),并将其解析为某种结构。
还有一个可选的实现方法:
- parser_with_prompt(str)->Any:它接收一个字符串(假定是来自语言模型的相应)和一个提示(假设是产生这种相应的提示),并将其解析为一些结构。
这个输出解析器允许用户指定一个任意的JSON模式,并查询LLM的符合该模式的JSON输出。
请记住,大预言模型是有漏洞的抽象概念!我们必须使用有足够能力的LLM来生成良好的JSON。我们必须使用一个有足够能力的LLM来生成格式良好的JSON。在OpenAI家族中,Davinci可以做到,但是Curies的能力已经急剧嘉奖了。
使用Pydantic来生命你的数据模型。Pydantic的BaseModel就像Python的数据类,但是有实际的类型检查+强制力。
from langchain.prompts import PromptTemplate,ChatPromptTemplate,HumanMessagePromptTemplate
from langchain.llms import OpenAI
from langchain.chat_models import ChatOpenAI
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel,Field, validator
from typing import List
model_name = 'text-davinci-003'
temperture = 0.0
model = OpenAI(model_name=model_name,temperture=temperture)
# D定义我们期待的数据结构
class Joke(BaseModel):
setup:str = Field(description="question to set up a joke")
punchine:str = Field(description="answer to resolve the joke")
@validator('setup')
def question_ends_with_question_mark(cls,field):
if field[-1]!='?':
raise ValueError("Badly formed question~")
return field
joke_query = "Tell me a joke."
parser = PydanticOutputParser(pydantic_object=Joke)
prompt = PromptTemplate(
template="Answer the user query.\n{format_instructions}\n{query}\n",
input_variables=["query"],
partial_variables={"format_instructions": parser.get_format_instructions()}
)
_input = prompt.format_prompt(query=joke_query)
print(_input)
output = model(_input.to_string())
print(parser.parse(output))
上述只是试图解析LLM相应,如果不能正确解析,就会报错。但是除了抛出错误外,我们还可以做其他的事情。具体来说,我们可以把格式错误的输出,连同格式化的指令,传递给模型,并要求它修复它。
对于这个例子,我们时殷弘上面的OutputParser。下面是如果我们把一个不符号模型的结果传递给它的情况
from langchain.output_parsers impot OutputFixingParser
new _parser = OutputFixingParser.from_llm(parser=parser,llm= ChatOpenAI())
new_parser.parse(misformatted)
虽然在某些情况下,只看输出就可以修复任何解析错误,但是在其他情况下却不能。