
Commit

Merge pull request NVIDIA#370 from botitai/feature/gotitai-truthchecker
Add Got It AI's Truthchecking service for RAG applications
drazvan authored Apr 26, 2024
2 parents 84977a0 + fcb9d5c commit 9e65111
Showing 10 changed files with 287 additions and 2 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -158,6 +158,7 @@ rails:
- self check facts
- self check hallucination
- activefence moderation
- gotitai rag truthcheck

config:
# Configure the types of entities that should be masked on user input.
@@ -208,7 +209,7 @@ NeMo Guardrails comes with a set of [built-in guardrails](docs/user_guides/guard

> **NOTE**: The built-in guardrails are only intended to enable you to get started quickly with NeMo Guardrails. For production use cases, further development and testing of the rails are needed.

Currently, the guardrails library includes guardrails for: [jailbreak detection](docs/user_guides/guardrails-library.md#jailbreak-detection), [output moderation](docs/user_guides/guardrails-library.md#output-moderation), [fact-checking](docs/user_guides/guardrails-library.md#fact-checking), [sensitive data detection](docs/user_guides/guardrails-library.md#sensitive-data-detection), [hallucination detection](docs/user_guides/guardrails-library.md#hallucination-detection) and [input moderation using ActiveFence](docs/user_guides/guardrails-library.md#active-fence).
Currently, the guardrails library includes guardrails for: [jailbreak detection](docs/user_guides/guardrails-library.md#jailbreak-detection), [output moderation](docs/user_guides/guardrails-library.md#output-moderation), [fact-checking](docs/user_guides/guardrails-library.md#fact-checking), [sensitive data detection](docs/user_guides/guardrails-library.md#sensitive-data-detection), [hallucination detection](docs/user_guides/guardrails-library.md#hallucination-detection), [input moderation using ActiveFence](docs/user_guides/guardrails-library.md#active-fence) and [hallucination detection for RAG applications using Got It AI's TruthChecker API](docs/user_guides/guardrails-library.md#got-it-ai).

## CLI

@@ -271,7 +272,7 @@ Evaluating the safety of a LLM-based conversational application is a complex tas

## How is this different?

There are many ways guardrails can be added to an LLM-based conversational application. For example: explicit moderation endpoints (e.g., OpenAI, ActiveFence), critique chains (e.g. constitutional chain), parsing the output (e.g. guardrails.ai), individual guardrails (e.g., LLM-Guard).
There are many ways guardrails can be added to an LLM-based conversational application. For example: explicit moderation endpoints (e.g., OpenAI, ActiveFence), critique chains (e.g. constitutional chain), parsing the output (e.g. guardrails.ai), individual guardrails (e.g., LLM-Guard), and hallucination detection for RAG applications (e.g., Got It AI).

NeMo Guardrails aims to provide a flexible toolkit that can integrate all these complementary approaches into a cohesive LLM guardrails layer. For example, the toolkit provides out-of-the-box integration with ActiveFence, AlignScore and LangChain chains.

32 changes: 32 additions & 0 deletions docs/user_guides/guardrails-library.md
@@ -16,6 +16,7 @@ NeMo Guardrails comes with a library of built-in guardrails that you can easily

3. Third-Party APIs
- [ActiveFence Moderation](#activefence)
- [Got It AI RAG TruthChecker](#got-it-ai)
- OpenAI Moderation API - *[COMING SOON]*

4. Other
@@ -685,6 +686,37 @@ define bot inform cannot engage in abusive or harmful behavior
"I will not engage in any abusive or harmful behavior."
```

### Got It AI

Got It AI's Hallucination Manager helps you detect and manage hallucinations in your AI models.
The [TruthChecker API for RAG applications](https://www.app.got-it.ai/hallucination-manager) is part of the Hallucination Manager suite of APIs.

Existing fact-checking methods are not sufficient to detect hallucinations in AI models for real-world RAG applications. The TruthChecker API therefore performs a dual check to determine whether a response is a `hallucination` or not (a sketch of the request and response payloads is shown after this list):
1. Check the faithfulness of the generated response to the retrieved knowledge chunks.
2. Check the relevance of the response to the user query and the conversation history.
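
For illustration, a minimal sketch of the request and response payloads is shown below. The field names mirror the `call gotitai truthchecker api` action added in this commit; the concrete values are hypothetical.

```python
# Hypothetical payloads; field names mirror the action added in this commit.
request_body = {
    "knowledge": [{"text": "Shipping takes at least 3 days."}],  # retrieved chunks
    "prompt": "Do you ship within 2 days?",  # user query
    "generated_text": "Yes, shipping can be done in 2 days.",  # bot response to verify
    "messages": [],  # conversation history (currently unused)
}

# The TruthChecker endpoint responds with a verdict of the form:
response_body = {"hallucination": "yes"}  # or "no"
```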

The TruthChecker API can be configured to work for an open-domain use case or for a specific domain or knowledge base. By default, the TruthChecker API is configured for open-domain use, and we expect it to deliver strong performance on specific domains as well. However, for an enhanced experience on a specific domain or knowledge base, you can fine-tune the model on that knowledge base and unlock benefits such as secure on-premise model deployments.

Please [contact the Got It AI team](https://www.app.got-it.ai/) for more information on how to fine-tune the TruthChecker API for your specific domain or knowledge base.

[Got It AI's TruthChecker API for RAG applications](https://www.app.got-it.ai/hallucination-manager) can be used in NeMo Guardrails as an output rail out-of-the-box (you need to have the `GOTITAI_API_KEY` environment variable set).

```yaml
rails:
  output:
    flows:
      - gotitai rag truthcheck
```

To trigger the truthchecking rail, you have to set the `$check_facts` context variable to `True` before a bot message that requires checking, e.g.:

```colang
define flow
  user ask about report
  $check_facts = True
  bot provide report answer
```
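
A minimal end-to-end sketch, assuming the YAML and Colang snippets above are saved in a local `./config` directory (a hypothetical path) and `GOTITAI_API_KEY` is set in the environment:

```python
from nemoguardrails import LLMRails, RailsConfig

# Load a guardrails configuration that includes the `gotitai rag truthcheck` output rail.
config = RailsConfig.from_path("./config")
rails = LLMRails(config)

# If the TruthChecker API flags the generated answer as a hallucination,
# the rail replaces it with the `bot inform answer unknown` message.
response = rails.generate(
    messages=[{"role": "user", "content": "Do you ship within 2 days?"}]
)
print(response["content"])
```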

## Other

### Jailbreak Detection Heuristics
1 change: 1 addition & 0 deletions docs/user_guides/llm-support.md
@@ -34,6 +34,7 @@ If you want to use an LLM and you cannot see a prompt in the [prompts folder](ht
| AlignScore fact-checking _(LLM independent)_ | :heavy_check_mark: (0.89) | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| ActiveFence moderation _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Llama Guard moderation _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
| Got It AI RAG TruthChecker _(LLM independent)_ | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |

Table legend:
- :heavy_check_mark: - Supported (_The feature is fully supported by the LLM based on our experiments and tests_)
1 change: 1 addition & 0 deletions examples/sample_config.yml
@@ -37,6 +37,7 @@ rails:
- check hallucination
- activefence moderation
- check sensitive data
- gotitai rag truthcheck

# Execution rails are triggered before and after an action is invoked
# TODO
14 changes: 14 additions & 0 deletions nemoguardrails/library/gotitai/__init__.py
@@ -0,0 +1,14 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
93 changes: 93 additions & 0 deletions nemoguardrails/library/gotitai/actions.py
@@ -0,0 +1,93 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import json
import logging
import os
from typing import Optional

import aiohttp

from nemoguardrails.actions import action

log = logging.getLogger(__name__)


@action(name="call gotitai truthchecker api", is_system_action=True)
async def call_gotitai_truthchecker_api(context: Optional[dict] = None):
api_key = os.environ.get("GOTITAI_API_KEY")

if api_key is None:
raise ValueError("GOTITAI_API_KEY environment variable not set.")

if context is None:
raise ValueError(
"Context is empty. `user_message`, `bot_response` and `relevant_chunks` keys are required to call the GotIt AI Truthchecker api."
)

user_message = context.get("user_message", "")
response = context.get("bot_message", "")
knowledge = context.get("relevant_chunks_sep", [])

retval = {"hallucination": None} # in case the api call is skipped

if not isinstance(knowledge, list):
log.error(
"Could not run Got It AI Truthchecker. `relevant_chunks_sep` must be a list of knowledge."
)
return retval

if not knowledge:
log.error(
"Could not run Got It AI Truthchecker. At least 1 relevant chunk is required."
)
return retval

url = "https://api.got-it.ai/api/v1/hallucination-manager/truthchecker"
headers = {
"Content-Type": "application/json",
"Authorization": "Bearer " + api_key,
}
data = {
"knowledge": [
{
"text": chunk,
}
for chunk in knowledge
],
"prompt": user_message,
"generated_text": response,
# Messages is empty for now since there is no standard way to get them.
# This should be updated once 0.8.0 is released.
# Reference: https://github.com/NVIDIA/NeMo-Guardrails/issues/246
"messages": [],
}

async with aiohttp.ClientSession() as session:
async with session.post(
url=url,
headers=headers,
json=data,
) as response:
if response.status != 200:
log.error(
f"GotItAI TruthChecking call failed with status code {response.status}.\n"
f"Details: {await response.json()}"
)
response_json = await response.json()
log.info(json.dumps(response_json, indent=True))
hallucination = response_json["hallucination"]
retval = {"hallucination": hallucination}

return retval
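
For reference, the action above can also be exercised directly, outside of a guardrails flow. A minimal sketch, assuming `GOTITAI_API_KEY` is exported and using hypothetical context values:

```python
import asyncio

from nemoguardrails.library.gotitai.actions import call_gotitai_truthchecker_api

# Hypothetical context mirroring what the guardrails runtime would normally supply.
context = {
    "user_message": "Do you ship within 2 days?",
    "bot_message": "Yes, shipping can be done in 2 days.",
    "relevant_chunks_sep": ["Shipping takes at least 3 days."],
}

result = asyncio.run(call_gotitai_truthchecker_api(context=context))
print(result)  # e.g. {"hallucination": "yes"} if the response is flagged
```
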
10 changes: 10 additions & 0 deletions nemoguardrails/library/gotitai/flows.co
@@ -0,0 +1,10 @@
define subflow gotitai rag truthcheck
  """Output guardrail based on Got It AI's TruthChecker hallucination verdict."""
  if $check_facts == True
    $check_facts = False

    $result = execute call gotitai truthchecker api

    if $result.hallucination == "yes"
      bot inform answer unknown
      stop
8 changes: 8 additions & 0 deletions tests/test_configs/gotitai_truthchecker/config.yml
@@ -0,0 +1,8 @@
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
rails:
  output:
    flows:
      - gotitai rag truthcheck
10 changes: 10 additions & 0 deletions tests/test_configs/gotitai_truthchecker/truthcheck.co
@@ -0,0 +1,10 @@
define user ask general question
  "Do you ship within 2 days?"

define flow
  user ask general question
  $check_facts = True
  bot provide answer

define bot inform answer unknown
  "I don't know the answer to that."
115 changes: 115 additions & 0 deletions tests/test_gotitai_output_rail.py
@@ -0,0 +1,115 @@
# SPDX-FileCopyrightText: Copyright (c) 2023 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import os

import pytest
from aioresponses import aioresponses

from nemoguardrails import RailsConfig
from nemoguardrails.actions.actions import ActionResult, action
from tests.utils import TestChat

CONFIGS_FOLDER = os.path.join(os.path.dirname(__file__), ".", "test_configs")

GOTITAI_API_URL = "https://api.got-it.ai/api/v1/hallucination-manager/truthchecker"


@action(is_system_action=True)
async def retrieve_relevant_chunks():
    """Retrieve relevant chunks from the knowledge base and add them to the context."""
    context_updates = {}
    context_updates["relevant_chunks_sep"] = ["Shipping takes at least 3 days."]

    return ActionResult(
        return_value=context_updates["relevant_chunks_sep"],
        context_updates=context_updates,
    )


@pytest.mark.asyncio
async def test_hallucination(monkeypatch):
    monkeypatch.setenv("GOTITAI_API_KEY", "xxx")
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "gotitai_truthchecker"))
    chat = TestChat(
        config,
        llm_completions=[
            "user ask general question",  # user intent
            "Yes, shipping can be done in 2 days.",  # bot response that will be intercepted
        ],
    )

    with aioresponses() as m:
        chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
        m.post(
            GOTITAI_API_URL,
            payload={
                "hallucination": "yes",
            },
        )

        chat >> "Do you ship within 2 days?"
        await chat.bot_async("I don't know the answer to that.")


@pytest.mark.asyncio
async def test_not_hallucination(monkeypatch):
    monkeypatch.setenv("GOTITAI_API_KEY", "xxx")
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "gotitai_truthchecker"))
    chat = TestChat(
        config,
        llm_completions=[
            # " express greeting",
            "user ask general question",  # user intent
            "No, shipping takes at least 3 days.",  # bot response that will not be intercepted
        ],
    )

    with aioresponses() as m:
        chat.app.register_action(retrieve_relevant_chunks, "retrieve_relevant_chunks")
        m.post(
            GOTITAI_API_URL,
            payload={
                "hallucination": "no",
            },
        )

        chat >> "Do you ship within 2 days?"
        await chat.bot_async("No, shipping takes at least 3 days.")


@pytest.mark.asyncio
async def test_no_context(monkeypatch):
    monkeypatch.setenv("GOTITAI_API_KEY", "xxx")
    config = RailsConfig.from_path(os.path.join(CONFIGS_FOLDER, "gotitai_truthchecker"))
    chat = TestChat(
        config,
        llm_completions=[
            # " express greeting",
            "user ask general question",  # user intent
            "Yes, shipping can be done in 2 days.",  # bot response that will not be intercepted
        ],
    )

    with aioresponses() as m:
        m.post(
            GOTITAI_API_URL,
            payload={
                "hallucination": None,
            },
        )

        chat >> "Do you ship within 2 days?"
        await chat.bot_async("Yes, shipping can be done in 2 days.")
