Prompterator

License: MIT · Streamlit

Prompterator is a Streamlit-based prompt-iterating IDE. It runs locally but connects to external APIs exposed by various LLMs.


A screenshot of the Prompterator interface, with highlighted features / areas of interest: 1. Data Upload, 2. Compose Prompt, 3. Run Prompt, 4. Evaluate and 5. Prompt History.

Requirements

Create a virtual environment that uses Python 3.10. Then, install project requirements:

  1. pip install poetry==1.4.2
  2. poetry install --no-root

How to run

1. Set environment variables (optional)

If you use PyCharm, consider storing these in your run configuration.

  • OPENAI_API_KEY: Optional. Only if you want to use OpenAI models (ChatGPT, GPT-4, etc.) via OpenAI APIs.
  • AZURE_OPENAI_API_KEY and AZURE_OPENAI_API_BASE: Optional. Only if you want to use OpenAI models (ChatGPT, GPT-4, etc.) via Azure OpenAI APIs.
  • PROMPTERATOR_DATA_DIR: Optional. Where to store the files with your prompts and generated texts. Defaults to ~/prompterator-data. If you plan to work on prompts for different tasks or datasets, it's a good idea to use a separate directory for each one.

If you do not have access to an OpenAI API key, you can use the mock-gpt-3.5-turbo model, which is a mocked version of OpenAI's GPT-3.5 model. It is also very helpful when developing Prompterator itself.
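As a quick sanity check before launching the app, the variables listed above can be inspected with a few lines of Python. This is a minimal sketch and not part of Prompterator: the function names `configured_providers` and `data_dir` are hypothetical, and only the variable names and the default directory are taken from the list above.

```python
import os
from pathlib import Path
from typing import Mapping


def configured_providers(env: Mapping[str, str] = os.environ) -> dict[str, bool]:
    """Report which optional API settings are present in the environment."""
    return {
        "openai": bool(env.get("OPENAI_API_KEY")),
        # Azure needs both the key and the base endpoint URL.
        "azure_openai": bool(env.get("AZURE_OPENAI_API_KEY"))
        and bool(env.get("AZURE_OPENAI_API_BASE")),
    }


def data_dir(env: Mapping[str, str] = os.environ) -> Path:
    """Resolve the data directory, falling back to the documented default."""
    return Path(env.get("PROMPTERATOR_DATA_DIR", "~/prompterator-data")).expanduser()
```

Calling `configured_providers()` and `data_dir()` with no arguments reads the current process environment; passing a plain dict instead makes the helpers easy to try out without touching real settings.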

2. Run the Streamlit app

From the root of the repository, run:

make run

If you want to run the app directly from PyCharm, create a run configuration:

  1. Right-click prompterator/main.py -> More Run/Debug -> Modify Run Configuration
  2. Under "Interpreter options", enter -m poetry run streamlit run
  3. Optionally, configure environment variables as described above
  4. Save and use 🚀

Using model-specific configuration

To use the models Prompterator supports out of the box, you generally need to at least specify an API key and/or the endpoint Prompterator ought to use when contacting them.

The sections below specify how to do that for each supported model family.

OpenAI

  • To use OpenAI APIs, set the OPENAI_API_KEY environment variable as per the docs.
  • To use Azure OpenAI APIs, set:
    • AZURE_OPENAI_API_KEY
    • AZURE_OPENAI_API_BASE -- the base endpoint URL, excluding the /openai/deployments/... ending
    • AZURE_OPENAI_API_VERSION if your version differs from the default 2023-05-15; see the docs
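If you only have a full Azure OpenAI request URL at hand, the base endpoint can be derived by cutting off everything from /openai/deployments onward. The helper below is an illustrative sketch: `azure_api_base` is a hypothetical name, the resource and deployment names are made-up placeholders, and removing a trailing slash is an assumption rather than a documented requirement.

```python
def azure_api_base(endpoint_url: str) -> str:
    """Strip the /openai/deployments/... ending from a full Azure OpenAI URL."""
    base, _, _ = endpoint_url.partition("/openai/deployments")
    # Assumption: a trailing slash is not wanted in the base URL.
    return base.rstrip("/")


# Placeholder URL; substitute your own resource and deployment names.
full_url = (
    "https://example-resource.openai.azure.com"
    "/openai/deployments/example-deployment/chat/completions"
)
print(azure_api_base(full_url))
```

The printed value is what would go into AZURE_OPENAI_API_BASE.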

Google Vertex

  • Set the GOOGLE_VERTEX_AUTH_TOKEN environment variable to the output of gcloud auth print-access-token.
  • Set the TEXT_BISON_URL environment variable to the URL that belongs to your PROJECT_ID, as per the docs

AWS Bedrock

To use the AWS Bedrock-provided models, a version of boto3 that supports AWS Bedrock needs to be installed.

Cohere

  • Set the COHERE_API_KEY environment variable to your Cohere API key as per the docs.

Note that to use the Cohere models, the Cohere package needs to be installed as well.
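To see whether the optional packages mentioned above are importable in your virtual environment, a check along these lines may help. It is a sketch only: `missing_packages` is a hypothetical helper, and finding boto3 does not by itself confirm that the installed version supports AWS Bedrock.

```python
from importlib.util import find_spec


def missing_packages(import_names: list[str]) -> list[str]:
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if find_spec(name) is None]


# boto3 backs the AWS Bedrock models; cohere backs the Cohere models.
print(missing_packages(["boto3", "cohere"]))
```

An empty list means both packages were found.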

Using vision models and displaying images

Prompterator supports vision models (for now gpt-4-vision-preview) that take text and an image as input and produce text as output. To use them, upload a CSV file with the following columns:

  • text: required, just as for the other models
  • image: the full base64 encoding of an image (example: ...). The image is rendered inside the displayed dataframe and next to the "generated text" area

(Note: you also need an OPENAI_API_KEY environment variable to use gpt-4-vision-preview)
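A CSV with these two columns can be produced with the standard library. The sketch below is illustrative: `write_vision_csv` is a hypothetical helper, and because the example value above is elided, it is an assumption that the bare base64 payload (without any `data:image/...;base64,` prefix) is what the image column should hold.

```python
import base64
import csv
from pathlib import Path


def write_vision_csv(rows: list[tuple[str, bytes]], out_path: Path) -> None:
    """Write (text, raw image bytes) pairs to a CSV with text and image columns."""
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["text", "image"])
        for text, image_bytes in rows:
            # Assumption: the column holds the plain base64 string.
            encoded = base64.b64encode(image_bytes).decode("ascii")
            writer.writerow([text, encoded])
```

For example, `write_vision_csv([("Describe this image.", Path("photo.png").read_bytes())], Path("vision_input.csv"))` would write a one-row file; both file names are placeholders.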

Usage guide

Input format

Prompterator accepts CSV files as input. Additionally, the CSV data should follow these rules:

  • be parseable using a pd.read_csv call with the default argument values. This means e.g. having column names in the first row, using comma as the separator, and enclosing values (where needed) in double quotes (")
  • have a column named text
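The two rules above can be checked before uploading a file. This is a minimal sketch, assuming pandas is installed; `load_input_csv` is a hypothetical helper and not a Prompterator function.

```python
import io

import pandas as pd


def load_input_csv(source) -> pd.DataFrame:
    """Parse CSV data with pandas defaults and check for the required column."""
    df = pd.read_csv(source)  # default arguments only, as the rules require
    if "text" not in df.columns:
        raise ValueError("the CSV must contain a column named 'text'")
    return df


# Header in the first row, comma separator, double quotes around values
# that contain commas or quotes (inner quotes are doubled).
sample = io.StringIO('text,label\n"Hello, world",greeting\n"She said ""hi""",quote\n')
df = load_input_csv(sample)
print(df["text"].tolist())
```

The same function accepts a file path instead of the in-memory sample.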

Using input data in prompts

The user/system prompt textboxes support Jinja templates. Don't worry if you're new to Jinja -- Prompterator can show you a real-time "compiled" preview of your prompts to help you write the templates.

Given a column named text in your uploaded CSV data, you can include values from this column by writing the simple {{text}} template in your prompt.

If the values in your column represent more complex objects, you can still work with them but make sure they are either valid JSON strings or valid Python expressions accepted by ast.literal_eval.

To parse string representations of objects, use:

  • fromjson: for valid JSON strings, e.g. '["A", "B"]'
  • fromAstString: for Python expressions such as dicts/lists/tuples/... (see the accepted types of ast.literal_eval), e.g. "{'key': 'value'}"

For example, given a CSV column texts with a value "[""A"", ""B"", ""C""]", you can utilise this template to enumerate the individual list items in your prompt:

{% for item in fromjson(texts) -%}
- {{ item }}
{% endfor %}

which would lead to this in your prompt:

- A
- B
- C
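To experiment with such templates outside the app, the rendering can be approximated with the jinja2 package. This is a sketch under assumptions: how Prompterator itself registers `fromjson` and `fromAstString` is not shown here, so they are wired up as plain Jinja globals backed by `json.loads` and `ast.literal_eval`.

```python
import ast
import json

from jinja2 import Environment

env = Environment()
# Assumption: the helpers are callable globals, matching the
# fromjson(texts) call syntax used in the template above.
env.globals["fromjson"] = json.loads
env.globals["fromAstString"] = ast.literal_eval

template = env.from_string(
    "{% for item in fromjson(texts) -%}\n- {{ item }}\n{% endfor %}"
)
# The CSV value "[""A"", ""B"", ""C""]" is read as the string below.
rendered = template.render(texts='["A", "B", "C"]')
print(rendered)

# A Python-literal column value parsed with fromAstString.
value = env.from_string("{{ fromAstString(row)['key'] }}").render(
    row="{'key': 'value'}"
)
print(value)
```

The `-%}` at the end of the `for` tag strips the newline that follows it, which is why each item lands on its own line without blank lines in between.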

Using structured output (function calling and response format)

Structured Outputs is a feature that ensures the model will always generate responses that adhere to your supplied JSON Schema, so you don't need to worry about the model omitting a required key, or hallucinating an invalid enum value. - OpenAI website

How to get it to work

  1. Check with the model developer/provider whether the model supports some kind of structured output.
  2. Toggle the structured output switch.
  3. Select one of the supported structured output methods (a model might support all of them, or none):
    • None - no structured output is used (equivalent to the toggle being off)
    • Function calling - a workaround for obtaining structured outputs that predates the Response format option in the API
    • Response format - the newer way of obtaining structured outputs
  4. Provide a JSON schema in the JSON schema text input (it can be generated from a pydantic model, or from zod if you use Node.js). The title must satisfy '^[a-zA-Z0-9_-]+$':
    {
        "title": "get_delivery_date",
        "description": "Get the delivery date for a customer's order. Call this whenever you need to know the delivery date, for example when a customer asks 'Where is my package'",
        "type": "object",
        "properties": {
          "order_id": {
            "type": "string"
          }
        },
        "required": ["order_id"],
        "additionalProperties": false
    }
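Before pasting a schema into the text input, its title can be checked against the pattern from step 4. The following is a minimal sketch using only the standard library; `check_schema` is a hypothetical helper, not something Prompterator provides.

```python
import json
import re

# The pattern from step 4; fullmatch makes the ^ and $ anchors implicit.
TITLE_PATTERN = re.compile(r"[a-zA-Z0-9_-]+")


def check_schema(schema_text: str) -> dict:
    """Parse a JSON schema string and verify its title against the pattern."""
    schema = json.loads(schema_text)
    title = schema.get("title")
    if not isinstance(title, str) or not TITLE_PATTERN.fullmatch(title):
        raise ValueError(f"invalid schema title: {title!r}")
    return schema


schema = check_schema(
    '{"title": "get_delivery_date", "type": "object",'
    ' "properties": {"order_id": {"type": "string"}},'
    ' "required": ["order_id"], "additionalProperties": false}'
)
print(schema["title"])
```

A title containing spaces, such as "get delivery date", would raise a `ValueError` here.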

Postprocessing the model outputs

When working with LLMs, you often need to postprocess the raw generated text. Prompterator supports this use case so that you can iterate on your prompts based on inspecting/annotating postprocessed model outputs.

By default, no postprocessing is carried out. You can change this by rewriting the postprocess function in prompterator/postprocess_output.py. The function will receive one raw model-generated text at a time and should output its postprocessed version. Both the raw and the postprocessed text are kept and saved.
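As an illustration, a rewritten function might look like the sketch below. The exact signature in prompterator/postprocess_output.py is not shown here, so a single string in and a single string out is an assumption, and the "Answer:" marker is a made-up example of output worth cleaning.

```python
def postprocess(text: str) -> str:
    """Trim whitespace and drop a leading 'Answer:' marker, if present."""
    cleaned = text.strip()
    prefix = "Answer:"  # hypothetical marker emitted by the model
    if cleaned.startswith(prefix):
        cleaned = cleaned[len(prefix):].lstrip()
    return cleaned
```

With this version, a raw output such as `"  Answer: 42 \n"` would be shown and annotated as `"42"`, while the raw text is still saved alongside it.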

Reusing labels for repeatedly encountered examples

While iterating your prompt on a dataset, you may find yourself annotating a model output that you already annotated in an earlier round. You can choose to automatically reuse such previously assigned labels by toggling "reuse past labels". To speed up your annotation process even more, you can toggle "skip past label rows" so that you only go through the rows for which no previously assigned label was found.

How this feature works:

  • Existing labels are searched for in the current list of files in the sidebar, where a match requires both the response and all the input columns' values to match.
  • If multiple different labels are found for a given input+output combination (a sign of inconsistent past annotation work), the most recent label is re-used.
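The matching rule described above can be pictured as a lookup keyed by the input values plus the response, where later annotations overwrite earlier ones. This is a conceptual sketch, not Prompterator's actual implementation; the data layout and the names `build_label_index` and `find_label` are assumptions made for illustration.

```python
from typing import Optional

# One past annotation: (input column values, model response, label).
# The history is assumed to be ordered from oldest to most recent.
Annotation = tuple[dict[str, str], str, str]


def build_label_index(history: list[Annotation]) -> dict[tuple, str]:
    """Map each input+output combination to its most recent label."""
    index: dict[tuple, str] = {}
    for inputs, response, label in history:
        key = (tuple(sorted(inputs.items())), response)
        index[key] = label  # later entries overwrite earlier ones
    return index


def find_label(
    index: dict[tuple, str], inputs: dict[str, str], response: str
) -> Optional[str]:
    """Return the reusable label, or None when the row still needs annotating."""
    return index.get((tuple(sorted(inputs.items())), response))
```

Rows for which `find_label` returns `None` correspond to the ones you would still go through with "skip past label rows" enabled.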

Paper

You can find more information on Prompterator in the associated paper: https://aclanthology.org/2023.emnlp-demo.43/

If you found Prompterator helpful in your research, please consider citing it:

@inproceedings{sucik-etal-2023-prompterator,
    title = "Prompterator: Iterate Efficiently towards More Effective Prompts",
    author = "Su{\v{c}}ik, Samuel  and
      Skala, Daniel  and
      {\v{S}}vec, Andrej  and
      Hra{\v{s}}ka, Peter  and
      {\v{S}}uppa, Marek",
    editor = "Feng, Yansong  and
      Lefever, Els",
    booktitle = "Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
    month = dec,
    year = "2023",
    address = "Singapore",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.emnlp-demo.43",
    doi = "10.18653/v1/2023.emnlp-demo.43",
    pages = "471--478",
    abstract = "With the advent of Large Language Models (LLMs) the process known as prompting, which entices the LLM to solve an arbitrary language processing task without the need for finetuning, has risen to prominence. Finding well-performing prompts, however, is a non-trivial task which requires experimentation in order to arrive at a prompt that solves a specific task. When a given task does not readily reduce to one that can be easily measured with well established metrics, human evaluation of the results obtained by prompting is often necessary. In this work we present prompterator, a tool that helps the user interactively iterate over various potential prompts and choose the best performing one based on human feedback. It is distributed as an open source package with out-of-the-box support for various LLM providers and was designed to be easily extensible.",
}