From ea3012e3729b934d51f2e5be7905fd6faf1ed451 Mon Sep 17 00:00:00 2001
From: Nir Ozery
Date: Tue, 5 Mar 2024 15:25:18 +0200
Subject: [PATCH 1/2] Add README content

---
 README.md | 114 +++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 113 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f0a3a87..50ff9a4 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,113 @@
-# lakefs-iceberg-catalog
\ No newline at end of file
+[lakeFS logo] [Apache Iceberg logo]
+
+## lakeFS Iceberg Catalog
+
+lakeFS enriches your Iceberg tables with Git capabilities: create a branch and make your changes in isolation, without affecting other team members.
+
+See the instructions below for build, configuration, and usage.
+
+## Build
+
+From the repository root, run the following Maven command:
+
+```sh
+mvn clean install -U -DskipTests
+```
+
+Under the `target` directory you will find the jar:
+
+`lakefs-iceberg-catalog-<version>.jar`
+
+Load this jar into your environment.
+
+## Configuration
+
+lakeFS Catalog uses the lakeFS HadoopFileSystem under the hood to interact with lakeFS.
+In addition, for better performance, we configure the S3A FS to interact directly with the underlying storage:
+
+```scala
+conf.set("spark.hadoop.fs.lakefs.impl", "io.lakefs.LakeFSFileSystem")
+conf.set("spark.hadoop.fs.lakefs.access.key", "AKIAIOSFDNN7EXAMPLEQ")
+conf.set("spark.hadoop.fs.lakefs.secret.key", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")
+conf.set("spark.hadoop.fs.lakefs.endpoint", "http://localhost:8000/api/v1")
+conf.set("spark.hadoop.fs.s3a.access.key", "")
+conf.set("spark.hadoop.fs.s3a.secret.key", "")
+```
+
+In the catalog configuration, pass the lakefs FS scheme configured previously as the warehouse location:
+
+```scala
+conf.set("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog")
+conf.set("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog")
+conf.set("spark.sql.catalog.lakefs.warehouse", "lakefs://")
+```
+
+## Usage
+
+For our examples, assume a lakeFS repository called `myrepo`.
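+
+The snippets below assume a Spark session that already carries the configuration from the previous section. The following is a minimal sketch of such a session, not part of this repository; the credential entries are placeholders:
+
+```scala
+import org.apache.spark.SparkConf
+import org.apache.spark.sql.SparkSession
+
+// Placeholder wiring; reuse the conf entries from the Configuration section,
+// including the fs.lakefs.* and fs.s3a.* credentials.
+val conf = new SparkConf()
+  .set("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog")
+  .set("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog")
+  .set("spark.sql.catalog.lakefs.warehouse", "lakefs://")
+  .set("spark.hadoop.fs.lakefs.impl", "io.lakefs.LakeFSFileSystem")
+
+val spark = SparkSession.builder().config(conf).getOrCreate()
+spark.sql("SHOW TABLES IN lakefs.myrepo.main") // sanity check against the `main` branch
+```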
+
+### Create a table
+
+Let's create a table called `table1` under the `main` branch and namespace `name.space.`
+To create the table and seed it with some data, use the following syntax:
+
+```sql
+CREATE TABLE lakefs.myrepo.main.name.space.table1 (id int, data string);
+INSERT INTO lakefs.myrepo.main.name.space.table1 VALUES (1, 'data1'), (2, 'data2');
+```
+
+### Create a branch
+
+We will create a new branch `dev` from `main`, but first let's commit the creation of the table to the `main` branch:
+
+```
+lakectl commit lakefs://myrepo/main -m "my first iceberg commit"
+```
+
+To create a new branch:
+
+```
+lakectl branch create lakefs://myrepo/dev -s lakefs://myrepo/main
+```
+
+### Make changes on the branch
+
+We can now make changes on the `dev` branch:
+
+```sql
+INSERT INTO lakefs.myrepo.dev.name.space.table1 VALUES (3, 'data3');
+```
+
+### Query the table
+
+If we query the table on the `dev` branch, we will see the data we inserted:
+
+```sql
+SELECT * FROM lakefs.myrepo.dev.name.space.table1;
+```
+
+Results in:
+```
++----+------+
+| id | data |
++----+------+
+| 1 | data1|
+| 2 | data2|
+| 3 | data3|
++----+------+
+```
+
+However, data on the `main` branch remains unaffected:
+
+```sql
+SELECT * FROM lakefs.myrepo.main.name.space.table1;
+```
+
+Results in:
+```
++----+------+
+| id | data |
++----+------+
+| 1 | data1|
+| 2 | data2|
++----+------+
+```

From a12b8500877c623462d36c1ac80855c7baa3e616 Mon Sep 17 00:00:00 2001
From: Nir Ozery
Date: Tue, 5 Mar 2024 17:43:19 +0200
Subject: [PATCH 2/2] CR Fixes

---
 README.md | 18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/README.md b/README.md
index 50ff9a4..18b77da 100644
--- a/README.md
+++ b/README.md
@@ -22,24 +22,25 @@ Load this jar into your environment.
 
 ## Configuration
 
-lakeFS Catalog uses the lakeFS HadoopFileSystem under the hood to interact with lakeFS.
+lakeFS Catalog uses the [lakeFS Hadoop FileSystem](https://docs.lakefs.io/integrations/spark.html#lakefs-hadoop-filesystem) under the hood to interact with lakeFS.
 In addition, for better performance, we configure the S3A FS to interact directly with the underlying storage:
 
 ```scala
 conf.set("spark.hadoop.fs.lakefs.impl", "io.lakefs.LakeFSFileSystem")
 conf.set("spark.hadoop.fs.lakefs.access.key", "AKIAIOSFDNN7EXAMPLEQ")
 conf.set("spark.hadoop.fs.lakefs.secret.key", "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY")
-conf.set("spark.hadoop.fs.lakefs.endpoint", "http://localhost:8000/api/v1")
+conf.set("spark.hadoop.fs.lakefs.endpoint", "<lakefs-endpoint>/api/v1")
 conf.set("spark.hadoop.fs.s3a.access.key", "")
 conf.set("spark.hadoop.fs.s3a.secret.key", "")
 ```
 
-In the catalog configuration, pass the lakefs FS scheme configured previously as the warehouse location:
+To configure a custom lakeFS catalog using Spark,
+pass the lakefs FS scheme configured previously as the warehouse location:
 
 ```scala
 conf.set("spark.sql.catalog.lakefs", "org.apache.iceberg.spark.SparkCatalog")
 conf.set("spark.sql.catalog.lakefs.catalog-impl", "io.lakefs.iceberg.LakeFSCatalog")
-conf.set("spark.sql.catalog.lakefs.warehouse", "lakefs://")
+conf.set("spark.sql.catalog.lakefs.warehouse", "lakefs://") // Should match the lakefs FS scheme configured above
 ```
 
 ## Usage
@@ -48,7 +49,7 @@ For our examples, assume a lakeFS repository called `myrepo`.
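+A full table identifier breaks down as `<catalog>.<repository>.<branch>.<namespace>.<table>`. As a small illustration (standard Spark SQL, assuming the `lakefs` catalog configured above), the table created below can be addressed on any branch by swapping the branch segment:
+
+```sql
+-- lakefs = catalog, myrepo = repository, main = branch,
+-- name.space = namespace, table1 = table
+DESCRIBE TABLE lakefs.myrepo.main.name.space.table1;
+```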
 
 ### Create a table
 
-Let's create a table called `table1` under the `main` branch and namespace `name.space.`
+Let's create a table called `table1` under the `main` branch and namespace `name.space`.
 To create the table and seed it with some data, use the following syntax:
 
 ```sql
 CREATE TABLE lakefs.myrepo.main.name.space.table1 (id int, data string);
 INSERT INTO lakefs.myrepo.main.name.space.table1 VALUES (1, 'data1'), (2, 'data2');
 ```
@@ -111,3 +112,10 @@ Results in:
 | 2 | data2|
 +----+------+
 ```
+
+### Merge changes
+
+After changing the data on the `dev` branch, you can merge it back to `main` using the lakeFS UI, lakectl, or
+any of our various clients.
+Note that currently only fast-forward merges are supported for Iceberg tables. To ensure the validity of the table history,
+the table on the `main` branch must not be altered before merging from `dev`.
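+
+For example, with `lakectl` (a sketch following the commands used earlier; commit the branch before merging):
+
+```
+lakectl commit lakefs://myrepo/dev -m "insert data on dev"
+lakectl merge lakefs://myrepo/dev lakefs://myrepo/main
+```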