fix: Allow mounting tool registry in OSS to load private tools too (#…
…554)

* Allow mounting tool registry in OSS to load private tools too

* Minor README update for tool registry related env

* Minor log update to include tool name and version loaded
chandrasekharan-zipstack authored Aug 6, 2024
1 parent 1fb2d4e commit 1ebb15a
Showing 14 changed files with 71 additions and 31 deletions.
1 change: 1 addition & 0 deletions .github/workflows/ci-container-build.yaml
@@ -58,6 +58,7 @@ jobs:
cp ../worker/sample.env ../worker/.env
cp ../x2text-service/sample.env ../x2text-service/.env
cp sample.essentials.env essentials.env
cp sample.env .env
docker compose -f docker-compose.yaml up -d
sleep 10
8 changes: 4 additions & 4 deletions .gitignore
@@ -616,17 +616,17 @@ backend/plugins/processor/*
frontend/src/plugins/*

# Tool registry
unstract/tool-registry/src/unstract/tool_registry/*.json
unstract/tool-registry/tool_registry_config/*.json
unstract/tool-registry/tests/*.yaml
!unstract/tool-registry/src/unstract/tool_registry/public_tools.json
unstract/tool-registry/src/unstract/tool_registry/config/registry.yaml
!unstract/tool-registry/tool_registry_config/public_tools.json
unstract/tool-registry/tool_registry_config/registry.yaml

# Docker related
# End of https://www.toptal.com/developers/gitignore/api/windows,macos,linux,pycharm,pycharm+all,pycharm+iml,python,visualstudiocode,react,django
docker/temp/*
docker/init.sql/*
docker/*.env
!docker/sample.*.env
!docker/sample*.env
docker/public_tools.json
docker/proxy_overrides.yaml
docker/workflow_data/
4 changes: 4 additions & 0 deletions backend/sample.env
@@ -143,3 +143,7 @@ CELERY_BROKER_URL = "redis://unstract-redis:6379"

# Indexing flag to prevent re-index
INDEXING_FLAG_TTL=1800

# Path where public and private tools are registered
# with a YAML and JSONs
TOOL_REGISTRY_CONFIG_PATH="/data/tool_registry_config"
2 changes: 2 additions & 0 deletions docker/docker-compose.yaml
@@ -23,6 +23,7 @@ services:
volumes:
- prompt_studio_data:/app/prompt-studio-data
- ./workflow_data:/data
- ${TOOL_REGISTRY_CONFIG_SRC_PATH}:/data/tool_registry_config
environment:
- ENVIRONMENT=development
labels:
@@ -49,6 +50,7 @@ services:
- traefik.enable=false
volumes:
- ./workflow_data:/data
- ${TOOL_REGISTRY_CONFIG_SRC_PATH}:/data/tool_registry_config

# Celery Flower
celery-flower:
3 changes: 3 additions & 0 deletions docker/sample.env
@@ -0,0 +1,3 @@
# Path where public and private tools are registered
# with a YAML and JSONs
TOOL_REGISTRY_CONFIG_SRC_PATH="${PWD}/../unstract/tool-registry/tool_registry_config"
31 changes: 21 additions & 10 deletions run-platform.sh
@@ -162,6 +162,24 @@ do_git_pull() {
git pull --quiet $(git remote) $branch
}

copy_or_merge_envs() {

local src_file="$1"
local dest_file="$2"
local displayed_reason="$3"

if [ ! -e "$dest_file" ]; then
cp "$src_file" "$dest_file"
echo -e "Created env for ""$blue_text""$displayed_reason""$default_text"" at ""$blue_text""$dest_file""$default_text""."
elif [ "$opt_only_env" = true ] || [ "$opt_update" = true ]; then
python3 $script_dir/docker/scripts/merge_env.py "$src_file" "$dest_file"
if [ $? -ne 0 ]; then
exit 1
fi
echo -e "Merged env for ""$blue_text""$displayed_reason""$default_text"" at ""$blue_text""$dest_file""$default_text""."
fi
}

setup_env() {
# Generate Fernet Key. Refer https://pypi.org/project/cryptography/. for both backend and platform-service.
ENCRYPTION_KEY=$(python3 -c "import secrets, base64; print(base64.urlsafe_b64encode(secrets.token_bytes(32)).decode())")
@@ -209,16 +227,9 @@ setup_env() {
fi
done

if [ ! -e "$script_dir/docker/essentials.env" ]; then
cp "$script_dir/docker/sample.essentials.env" "$script_dir/docker/essentials.env"
echo -e "Created env for ""$blue_text""essential services""$default_text"" at ""$blue_text""$script_dir/docker/essentials.env""$default_text""."
elif [ "$opt_only_env" = true ] || [ "$opt_update" = true ]; then
python3 $script_dir/docker/scripts/merge_env.py "$script_dir/docker/sample.essentials.env" "$script_dir/docker/essentials.env"
if [ $? -ne 0 ]; then
exit 1
fi
echo -e "Merged env for ""$blue_text""essential services""$default_text"" at ""$blue_text""$script_dir/docker/essentials.env""$default_text""."
fi
copy_or_merge_envs "$script_dir/docker/sample.essentials.env" "$script_dir/docker/essentials.env" "essential services"
copy_or_merge_envs "$script_dir/docker/sample.env" "$script_dir/docker/.env" "docker compose"


if [ "$opt_only_env" = true ]; then
echo -e "$green_text""Done.""$default_text" && exit 0
2 changes: 1 addition & 1 deletion unstract/tool-registry/README.md
@@ -7,7 +7,7 @@ This document explains the structure of the Tool Registry and provides instructi

## Registry Configuration

The Tool Registry relies on a `registry.yaml` file to maintain a comprehensive list of registered tools. Tools can be made public or private
The Tool Registry relies on a `registry.yaml` file to maintain a comprehensive list of registered tools. Tools can be made public or private. In order for the tool registry configuration to be used, set the env `TOOL_REGISTRY_CONFIG_PATH` wherever this library is imported and used.

### Registry

3 changes: 3 additions & 0 deletions unstract/tool-registry/sample.env
@@ -0,0 +1,3 @@
# Path where public and private tools are registered
# with a YAML and JSONs
TOOL_REGISTRY_CONFIG_PATH="${PWD}/tool_registry_config"
18 changes: 12 additions & 6 deletions unstract/tool-registry/src/unstract/tool_registry/helper.py
@@ -33,6 +33,8 @@ def __init__(
self.private_tools_file = private_tools_file
self.public_tools_file = public_tools_file
self.tools = self._load_tools_from_registry_file()
if self.tools:
logger.info(f"Loaded tools from registry YAML: {self.tools}")

def _load_tools_from_registry_file(self) -> list[str]:
"""Load all tools from the registry YAML.
@@ -197,12 +199,10 @@ def get_registry(self) -> dict[str, Any]:
Returns:
dict[str, Any]: _description_
"""
try:
yml_data: dict[str, Any] = ToolUtils.get_registry(self.registry_file)
return yml_data
except FileNotFoundError:
logger.error(f"File not found: {self.registry_file}")
raise RegistryNotFound()
yml_data: dict[str, Any] = ToolUtils.get_registry(
self.registry_file, raise_exc=True
)
return yml_data

def save_registry(self, data: dict[str, Any]) -> None:
"""save_registry Save the updated YAML back to the file.
@@ -301,8 +301,14 @@ def get_all_tools_from_disk(self) -> dict[str, dict[str, Any]]:
data = ToolUtils.get_all_tools_from_disk(file_path=tool_file)
if not data:
logger.info(f"No data from {tool_file}")
tool_version_list = [
f"tool: {k}, version: {v['properties']['toolVersion']}"
for k, v in data.items()
]
logger.info(f"Loading tools from {tool_file}: {tool_version_list}")
tools.update(data)
except FileNotFoundError:
logger.warning(f"Unable to find tool file to load tools: {tool_file}")
pass
return tools
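
The log line added in this hunk indexes `v['properties']['toolVersion']` directly, and the only exception handled in the visible code is `FileNotFoundError`, so a tool entry missing either key would presumably surface as a `KeyError`. A minimal sketch of a more tolerant summary follows; `summarize_tools` and the sample data are hypothetical and not part of the commit.

```python
from typing import Any


def summarize_tools(data: dict[str, dict[str, Any]]) -> list[str]:
    """Build 'tool: <name>, version: <version>' strings for logging.

    Hypothetical helper: falls back to 'unknown' instead of raising
    KeyError when 'properties' or 'toolVersion' is absent.
    """
    summary = []
    for name, spec in data.items():
        version = spec.get("properties", {}).get("toolVersion", "unknown")
        summary.append(f"tool: {name}, version: {version}")
    return summary


# Invented sample data, shaped like the dict the hunk iterates over.
sample = {
    "classifier": {"properties": {"toolVersion": "0.0.1"}},
    "fileops": {},  # entry without 'properties'
}
lines = summarize_tools(sample)
print(lines)
```

Whether a missing version should be logged as `unknown` or treated as a configuration error is a design choice the commit does not address.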

@@ -13,7 +13,7 @@


class ToolRegistry:
REGISTRY_FILE = "config/registry.yaml"
REGISTRY_FILE = "registry.yaml"
PRIVATE_TOOL_CONFIG_FILE = "private_tools.json"
PUBLIC_TOOL_CONFIG_FILE = "public_tools.json"

@@ -39,7 +39,12 @@ def __init__(
- get_tool_properties_by_tool_id(): Get properties of a tool.
- get_tool_icon__by_tool_id(): Get icon of a tool.
"""
directory = os.path.dirname(os.path.abspath(__file__))
directory = os.getenv("TOOL_REGISTRY_CONFIG_PATH")
if not directory:
raise ValueError(
"Env 'TOOL_REGISTRY_CONFIG_PATH' is not set, please add the tool "
"registry JSONs and YAML to a directory and set the env."
)
self.helper = ToolRegistryHelper(
registry=os.path.join(directory, registry_file),
private_tools_file=os.path.join(directory, private_tools),
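
The lookup introduced above can be sketched in isolation as follows. This is an illustration only: `resolve_registry_paths` is a hypothetical helper, the environment is passed in as a plain dict so the example does not depend on the real process environment, and the file names are the class defaults shown in the hunk.

```python
import os


def resolve_registry_paths(
    env: dict[str, str],
    registry_file: str = "registry.yaml",
    private_tools: str = "private_tools.json",
    public_tools: str = "public_tools.json",
) -> dict[str, str]:
    """Resolve registry file paths from TOOL_REGISTRY_CONFIG_PATH.

    Hypothetical helper mirroring the check in the diff: an unset or
    empty variable is an error rather than a silent fallback.
    """
    directory = env.get("TOOL_REGISTRY_CONFIG_PATH")
    if not directory:
        raise ValueError("Env 'TOOL_REGISTRY_CONFIG_PATH' is not set")
    return {
        "registry": os.path.join(directory, registry_file),
        "private_tools": os.path.join(directory, private_tools),
        "public_tools": os.path.join(directory, public_tools),
    }


# The container-side path used in backend/sample.env.
paths = resolve_registry_paths(
    {"TOOL_REGISTRY_CONFIG_PATH": "/data/tool_registry_config"}
)
print(paths["registry"])
```

Because the directory now comes from the environment instead of the package location, the same code can read a host directory mounted into the container, which is what the `docker-compose.yaml` volume entries in this commit provide.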
15 changes: 10 additions & 5 deletions unstract/tool-registry/src/unstract/tool_registry/tool_utils.py
@@ -7,7 +7,7 @@
from unstract.sdk.adapters.enums import AdapterTypes
from unstract.tool_registry.constants import AdapterPropertyKey, Tools
from unstract.tool_registry.dto import AdapterProperties, Spec, Tool, ToolMeta
from unstract.tool_registry.exceptions import InvalidToolURLException
from unstract.tool_registry.exceptions import InvalidToolURLException, RegistryNotFound

logger = logging.getLogger(__name__)

@@ -65,8 +65,8 @@ def get_all_tools_from_disk(file_path: str) -> dict[str, Any]:
with open(file_path) as json_file:
data: dict[str, Any] = json.load(json_file)
return data
except json.JSONDecodeError:
logger.error("Tools from Disk: JSON decode error")
except json.JSONDecodeError as e:
logger.warning(f"Error loading tools from {file_path}: {e}")
return {}

@staticmethod
@@ -75,7 +75,7 @@ def save_registry(file_path: str, data: dict[str, Any]) -> None:
yaml.dump(data, file, default_flow_style=False)

@staticmethod
def get_registry(file_path: str) -> dict[str, Any]:
def get_registry(file_path: str, raise_exc: bool = False) -> dict[str, Any]:
"""Get Registry File.
Args:
@@ -89,9 +89,14 @@ def get_registry(file_path: str) -> dict[str, Any]:
with open(file_path) as file:
yml_data = yaml.safe_load(file)
except FileNotFoundError:
logger.warning(f"Could not find tool registry YAML: {str(file_path)}")
if raise_exc:
raise RegistryNotFound()
pass
except Exception as error:
logger.error(f"Error While loading {str(file_path)} Error: {error}")
logger.error(f"Error while loading {str(file_path)}: {error}")
if raise_exc:
raise error
return yml_data
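
The `raise_exc` flag lets one loader serve both lenient callers (log and continue) and strict callers (fail fast). A minimal sketch of that pattern, under stated assumptions: `load_registry` is a hypothetical name, `RegistryNotFound` is redefined locally as a stand-in for the library's exception, `json.load` replaces `yaml.safe_load` so the example runs on the standard library alone, and the empty-dict default is assumed because the original initialisation lies outside the hunk.

```python
import json
import logging
import os
import tempfile
from typing import Any

logger = logging.getLogger(__name__)


class RegistryNotFound(Exception):
    """Local stand-in for unstract.tool_registry.exceptions.RegistryNotFound."""


def load_registry(file_path: str, raise_exc: bool = False) -> dict[str, Any]:
    """Load a registry file, optionally re-raising failures.

    Hypothetical sketch: json.load stands in for yaml.safe_load.
    """
    data: dict[str, Any] = {}  # assumed default, not shown in the hunk
    try:
        with open(file_path) as file:
            data = json.load(file)
    except FileNotFoundError:
        logger.warning(f"Could not find tool registry file: {file_path}")
        if raise_exc:
            raise RegistryNotFound()
    except Exception as error:
        logger.error(f"Error while loading {file_path}: {error}")
        if raise_exc:
            raise
    return data


# A path inside a fresh temporary directory is guaranteed not to exist.
missing = os.path.join(tempfile.mkdtemp(), "registry.json")

lenient_result = load_registry(missing)  # logs a warning, returns the default
try:
    load_registry(missing, raise_exc=True)
    strict_raised = False
except RegistryNotFound:
    strict_raised = True
```

In the commit, `ToolRegistryHelper.get_registry` is the strict caller: it passes `raise_exc=True` and drops its own `try`/`except`.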

@staticmethod
@@ -9,7 +9,6 @@ auth:
username: <username>
document: unstract-tool-registry
tools:
# - docker:unstract/tool-resume-parser
- local:fileops
# - docker:unstract/tool-classifier
- local:classifier
version: 1.0.0
3 changes: 2 additions & 1 deletion worker/src/unstract/worker/worker.py
@@ -222,7 +222,8 @@ def run_container(
)
except ToolRunException as te:
self.logger.error(
f"Error while running docker container: {te}",
"Error while running docker container"
f" {container_config.get('name')}: {te}",
stack_info=True,
exc_info=True,
)