This project provides the Docker images for running Unity Catalog Server and CLI.
Before you can build the Docker image, install the following tools:
build-uc-server-docker builds the Docker image of Unity Catalog Localhost Reference Server.
./docker/bin/build-uc-server-docker
Note
build-uc-server-docker
runs the entire sbt build while creating the Docker image.
build-uc-server-docker
creates an image named unitycatalog
with the version from version.sbt.
docker images unitycatalog
REPOSITORY TAG IMAGE ID CREATED SIZE
unitycatalog 0.2.0-SNAPSHOT 8b68b233813b About a minute ago 427MB
Once the Docker image of Unity Catalog's Localhost Reference Server is built, you can start it up using start-uc-server-docker script.
./docker/bin/start-uc-server-docker
Container unitycatalog does not exist. Creating it...
fbf8a0d2fc6a82f81134c4c50fb4c777399e7095706cb65ff6fe0c158ec43ef4
The container unitycatalog is running with the following parameters:
{
"Command": "\"/bin/bash bin/start…\"",
"CreatedAt": "2024-08-26 18:10:48 +0200 CEST",
"ID": "fbf8a0d2fc6a",
"Image": "unitycatalog:0.2.0-SNAPSHOT",
"Labels": "",
"LocalVolumes": "1",
"Mounts": "3e4bcb63d68a91…",
"Names": "unitycatalog",
"Networks": "bridge",
"Ports": "0.0.0.0:8081->8081/tcp",
"RunningFor": "Less than a second ago",
"Size": "32.8kB (virtual 427MB)",
"State": "running",
"Status": "Up Less than a second"
}
start-uc-server-docker
starts the Unity Catalog server to listen to 8081
.
docker container ls --filter name=unitycatalog --no-trunc --format 'table {{.ID}}\t{{.Image}}\t{{.Command}}\t{{.Ports}}'
CONTAINER ID IMAGE COMMAND PORTS
fbf8a0d2fc6a82f81134c4c50fb4c777399e7095706cb65ff6fe0c158ec43ef4 unitycatalog:0.2.0-SNAPSHOT "/bin/bash bin/start-uc-server" 0.0.0.0:8081->8081/tcp
Use the regular non-dockerized Unity Catalog CLI to access the server and list the catalogs.
./bin/uc catalog list --server http://localhost:8081
┌─────┬────────────┬──────────┬─────────────┬──────────┬────────────────────────────────────┐
│NAME │ COMMENT │PROPERTIES│ CREATED_AT │UPDATED_AT│ ID │
├─────┼────────────┼──────────┼─────────────┼──────────┼────────────────────────────────────┤
│unity│Main catalog│{} │1721241605334│null │f029b870-9468-4f10-badc-630b41e5690d│
└─────┴────────────┴──────────┴─────────────┴──────────┴────────────────────────────────────┘
build-uc-cli-docker builds the Docker image of Unity Catalog CLI.
./docker/bin/build-uc-cli-docker
build-uc-cli-docker
creates an image named unitycatalog-cli
with the version from version.sbt.
docker images unitycatalog-cli
REPOSITORY TAG IMAGE ID CREATED SIZE
unitycatalog-cli 0.2.0-SNAPSHOT 52502d16934f About a minute ago 1.48GB
Note
In case you run into the following build exception, restart the CLI docker image build.
❯ ./docker/bin/build-uc-cli-docker
...
[error] java.lang.RuntimeException: Failed to find name hashes for io.unitycatalog.cli.utils.CliUtils
...
[error] (cli / Compile / compileIncremental) Failed to find name hashes for io.unitycatalog.cli.utils.CliUtils
start-uc-cli-docker uses the unitycatalog-cli
image to run Unity Catalog CLI in a Docker container.
Note
You've already started the Unity Catalog server in a Docker container in Running Unity Catalog Server Container step.
Note
localhost
inside a Docker container is different from the local machine's localhost
.
./docker/bin/start-uc-cli-docker catalog list --server http://host.docker.internal:8081
┌─────┬────────────┬──────────┬─────────────┬──────────┬────────────────────────────────────┐
│NAME │ COMMENT │PROPERTIES│ CREATED_AT │UPDATED_AT│ ID │
├─────┼────────────┼──────────┼─────────────┼──────────┼────────────────────────────────────┤
│unity│Main catalog│{} │1721241605334│null │f029b870-9468-4f10-badc-630b41e5690d│
└─────┴────────────┴──────────┴─────────────┴──────────┴────────────────────────────────────┘
This project provides several examples using curl
commands to interact with the Unity Catalog API and demonstrate basic functionalities.
Note
If you wish to use the CLI to test the catalog, run a docker CLI instance and follow the instructions from the repositories main readme.
The examples use a variable unitycatalog_endpoint
to store the catalog's URL. Update this variable with the actual address based on your network configuration.
unitycatalog_endpoint="http://unitycatalog:8080/api/2.1/unity-catalog"
Caution
Make sure you run the above command in your bash session before proceeding to the examples. Otherwise your examples will not run correctly.
Important
Because the default run script runs the script on a network called unitycatalog_network
,
if you want to test the catalog you need to make sure you application is running in the same network.
For instance, you will not be able to test the API from your machine directly using Postman,
wget or cUrl unless you run them with the catalog on the same network, or you change the catalog's
network to run on the Host network.
This is OS specific, therefore we opted for the most generic option, which is to create
a docker network, but feel free to change those configurations to suit your needs.
This example demonstrates creating a new catalog named "MyCatalog" with a description and properties:
create_catalog_request_body='{
"name": "MyCatalog",
"comment": "An example of how to create your first catalog",
"properties": {
"Project": "Unity Catalog Demo",
"Environment": "Development",
"Access": "Public"
}
}'
create_catalog_request=$(printf "curl -s -X POST \
-H 'Content-Type: application/json' \
-H 'Accept: application/json' \
-d '%s' \
%s/catalogs" "$create_catalog_request_body" "$unitycatalog_endpoint")
docker run --rm --network unitycatalog_network alpine/curl sh -c "$create_catalog_request" | jq .
This example lists all available catalogs:
list_catalog_request_body="curl -s --location '$unitycatalog_endpoint/catalogs' \
--header 'Accept: application/json'"
docker run --rm --network unitycatalog_network alpine/curl sh -c "$list_catalog_request_body" | jq .
This example creates a new schema named "Schema_A" within the "MyCatalog" catalog:
create_schema_a_request_body='{
"name": "Schema_A",
"catalog_name": "MyCatalog",
"comment": "Schema A in catalog MyCatalog",
"properties": {
"Project": "Unity Catalog Demo",
"Environment": "Development",
"Access": "Public",
"Type": "Schema"
}
}'
create_schema_a_request=$(printf "curl -s \
--location '%s/schemas' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '%s'" "$unitycatalog_endpoint" "$create_schema_a_request_body")
docker run --rm --network unitycatalog_network alpine/curl sh -c "$create_schema_a_request" | jq .
This example lists all schemas within the "MyCatalog" catalog:
list_schemas_request="curl -s --location '$unitycatalog_endpoint/schemas?catalog_name=MyCatalog' \
--header 'Accept: application/json'"
echo $list_schemas_request
docker run --rm --network unitycatalog_network alpine/curl sh -c "$list_schemas_request" | jq .
This example demonstrates creating a managed table named "Table_A" within the "Schema_A" schema of the "MyCatalog" catalog. A managed table lets Unity Catalog manage the data location.
create_table_a_request_body='{
"name": "Table_A",
"catalog_name": "MyCatalog",
"schema_name": "Schema_A",
"table_type": "MANAGED",
"data_source_format": "DELTA",
"columns": [
{
"name": "ID",
"type_name": "LONG",
"comment": "The unique ID of the person",
"nullable": "false"
},
{
"name": "FirstName",
"type_name": "STRING",
"comment": "The persons official first name",
"nullable": "false"
},
{
"name": "LastName",
"type_name": "STRING",
"comment": "The persons official last name",
"nullable": "false"
}
],
"comment": "A managed table. Leaving it to Unity Catalog to pick the location.",
"properties": {
"Project": "Unity Catalog Demo",
"Environment": "Development",
"Access": "Public",
"Type": "Table",
"Stage": "Gold"
}
}'
create_table_a_request=$(printf "curl -s \
--location '%s/tables' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '%s'" "$unitycatalog_endpoint" "$create_table_a_request_body")
docker run --rm --network unitycatalog_network alpine/curl sh -c "$create_table_a_request" | jq .
This example demonstrates creating an external table named "Table_B" within the "Schema_A" schema of the "MyCatalog" catalog. An external table points to existing data stored outside of Unity Catalog.
First, you'll need to create a CSV file containing the data for the table. This example creates a file named test.csv
within the container's /opt/unitycatalog/external_data
directory:
docker exec -it unitycatalog /bin/bash
cd /opt/unitycatalog
mkdir external_data
cd external_data
printf '"ID", "FirstName", "LastName"
"1", "Jean", "Boutros"
"2", "Eddy", "Stone"
"3", "Paul", "Clark"\n' > test.csv
Now, define the external table using a curl
command. The storage_location
property points to the CSV file location within the container:
create_table_b_request_body='{
"name": "Table_B",
"catalog_name": "MyCatalog",
"schema_name": "Schema_A",
"table_type": "EXTERNAL",
"data_source_format": "CSV",
"columns": [
{
"name": "ID",
"type_name": "LONG",
"comment": "The unique ID of the person",
"nullable": "false"
},
{
"name": "FirstName",
"type_name": "STRING",
"comment": "The persons official first name",
"nullable": "false"
},
{
"name": "LastName",
"type_name": "STRING",
"comment": "The persons official last name",
"nullable": "false"
}
],
"storage_location": "/opt/unitycatalog/external_data/test.csv",
"comment": "No comments",
"properties": {
"Project": "Unity Catalog Demo",
"Environment": "Development",
"Access": "Public",
"Type": "External Table",
"Stage": "RAW"
}
}'
create_table_b_request=$(printf "curl -s \
--location '%s/tables' \
--header 'Content-Type: application/json' \
--header 'Accept: application/json' \
--data '%s'" "$unitycatalog_endpoint" "$create_table_b_request_body")
docker run --rm --network unitycatalog_network alpine/curl sh -c "$create_table_b_request" | jq .