diff --git a/docs/administration-guide/gaffer-config/change-accumulo-passwords.md b/docs/administration-guide/gaffer-config/change-accumulo-passwords.md index 55299b95e6..dac9e659ab 100644 --- a/docs/administration-guide/gaffer-config/change-accumulo-passwords.md +++ b/docs/administration-guide/gaffer-config/change-accumulo-passwords.md @@ -1,51 +1,19 @@ # Changing the Accumulo Passwords -When deploying Accumulo - either as part of a Gaffer stack or as a standalone, the passwords for all the users and the instance.secret are set to default values and should be changed. The instance.secret cannot be changed once deployed as it is used in initalisation. +When deploying Accumulo - either as part of a Gaffer stack or as a standalone, +the passwords for all the users and the instance.secret are set to default +values and should be changed. The instance.secret cannot be changed once +deployed as it is used in initialisation. -## Standard Deployment +The passwords can be configured in a standard deployment via the +[`accumulo.properties`](https://accumulo.apache.org/docs/2.x/configuration/files#accumuloproperties) +file. -The passwords can be configured in a standard deployment via the `accumulo.properties` file. - -The following table outlines the values and defaults if using the container images: +The following table outlines the values and defaults if using the container +images: | Name | value | default value | | -------------------- | ------------------------------- | ------------- | | Instance Secret | `instance.secret` | "DEFAULT" | | Tracer user | `trace.user` | "root" | | Tracer user password | `trace.token.property.password` | "secret" | - - -## Helm Deployment - -When deploying the Accumulo helm chart, the following values are set. 
If you are using the Gaffer helm chart with the Accumulo integration, the values will be prefixed with "accumulo": - -| Name | value | default value | -| -------------------- | --------------------------------------------- | ------------- | -| Instance Secret | `config.accumuloSite."instance.secret"` | "DEFAULT" | -| Root password | `config.userManagement.rootPassword` | "root" | -| Tracer user password | `config.userManagement.users.tracer.password` | "tracer" | - -When you deploy the Gaffer Helm chart with Accumulo, a "gaffer" user with a password of "gaffer" is used by default following the same pattern as the tracer user. - -So to install a new Gaffer with Accumulo store, create an `accumulo-passwords.yaml` with the following contents: - -```yaml -accumulo: - enabled: true - config: - accumuloSite: - instance.secret: "changeme" - userManagement: - rootPassword: "changeme" - users: - tracer: - password: "changme" - gaffer: - password: "changeme" -``` - -You can install the graph with: - -```bash -helm install my-graph gaffer-docker/gaffer -f accumulo-passwords.yaml -``` diff --git a/docs/administration-guide/gaffer-config/graph-metadata.md b/docs/administration-guide/gaffer-config/graph-metadata.md index 42c5f7f848..723b15aeae 100644 --- a/docs/administration-guide/gaffer-config/graph-metadata.md +++ b/docs/administration-guide/gaffer-config/graph-metadata.md @@ -13,85 +13,20 @@ like: } ``` -## Configuring a Standard Deployment +## Changing Values + +The standard file location in the gaffer images for the file is `/gaffer/graph/graphConfig.json` To change any of the values for a standard Gaffer deployment all thats needed is to configure the JSON file for the `graphConfig`. The key value pairs in -the file can then be configured as you wished and upon restarting the graph +the file can then be configured as you wish and upon restarting the graph the values will be updated (assuming the file is loaded correctly). 
-The standard file location in the gaffer images for the file is `/gaffer/graph/graphConfig.json` - -## Configuring a Helm Deployment - -Configuring the graph metadata via Helm follows a similar principal to the JSON -files however, you must use the YAML format instead for the key value pairs. The -following gives an example of how the description value can be updated via Helm. - -Create a file called `graph-meta.yaml`. We will use this file to add our -description and graph ID. Changing the description is as easy as changing the -`graph.config.description` value. - -```yaml -graph: - config: - description: "My graph description" -``` - -Upgrade your deployment using Helm to load the new file: - -```bash -helm upgrade my-graph gaffer-docker/gaffer -f graph-metadata.yaml --reuse-values -``` - -The `--reuse-values` argument means we do not override any passwords that we set -in the initial construction. +However be aware, if you are using the Accumulo store, updating the `graphId` is +a little more complicated since the `graphId` corresponds to an Accumulo table. +This means that if you change the ID then a new Accumulo table will be used and +any existing data would be lost. !!! tip You can see you new description if you to the Swagger UI and call the `/graph/config/description` endpoint. - -## Updating the Graph ID - -This may be simple or complicated depending on your store type. If you are using -a Map or Federated store, you can just set the `graph.config.graphId` value in -the same way. Though if you are using a Map Store, the graph will be emptied as -a result. - -However, if you are using the Accumulo store, updating the `graphId` is a little -more complicated since the Graph Id corresponds to an Accumulo table. We have to -change the gaffer users permissions to read and write to that table.To do that -update the graph-meta.yaml file with the following contents: - -=== "JSON" - Configure the `graphConfig.json` file. 
- - ```json - { - "graphId": "MyGraph", - "description": "My Graph description" - } - ``` - -=== "YAML" - Add to a `graph-meta.yaml` or similar file and load via Helm. - - ```yaml - graph: - config: - graphId: "MyGraph" - description: "My Graph description" - - accumulo: - config: - userManagement: - users: - gaffer: - permissions: - table: - MyGraph: - - READ - - WRITE - - BULK_IMPORT - - ALTER_TABLE - ``` diff --git a/docs/administration-guide/gaffer-config/schema.md b/docs/administration-guide/gaffer-config/schema.md index b3a3ac0f44..d16b4d4e12 100644 --- a/docs/administration-guide/gaffer-config/schema.md +++ b/docs/administration-guide/gaffer-config/schema.md @@ -392,17 +392,6 @@ Once the schema has been loaded into a graph the parent elements are merged into } ``` -## Helm Deployment - -The easiest way to deploy a schema file is to use helms `--set-file` option which lets you set a value from the contents of a file. -For a Helm deployment to pick up changes to a Schema, you need to run a helm upgrade: - -```bash -helm upgrade my-graph gaffer-docker/gaffer --set-file graph.schema."schema\.json"=./schema.json --reuse-values -``` - -The `--reuse-values` argument tells helm to re-use the passwords. - ## Java API Schemas can be loaded from a JSON file directly using the [`fromJSON()` method of the `Schema` class](https://gchq.github.io/Gaffer/uk/gov/gchq/gaffer/store/schema/Schema.html#fromJson(java.io.InputStream...)). 
This accepts `byte[]`, `InputStream` or `Path` types, for example: diff --git a/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md b/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md index 2cd56fea9e..91ee18770e 100644 --- a/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md +++ b/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md @@ -101,11 +101,15 @@ FROM gchq/gaffer-rest:latest COPY ./custom-lib:1.0-SNAPSHOT.jar /gaffer/jars/lib/ ``` -To add any libraries to the `gchq/gaffer` image in order to push down any extra -value objects and filters to Accumulo you have to add the jars to the -`/opt/accumulo/lib/ext` directory: +For an Accumulo deployment, you may wish to add additional libraries to the +classpath to enable the use of new iterators. To do this you need to update the +`gchq/gaffer` image and add the JARs to the `/opt/accumulo/lib/ext` directory: ```dockerfile FROM gchq/gaffer:latest COPY ./my-library-1.0-SNAPSHOT.jar /opt/accumulo/lib/ext ``` + +!!! note + This path is different in Accumulo v1; please see the [migration page](../../../change-notes/migrating-from-v1-to-v2/accumulo-migration.md) + for more detail. diff --git a/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md b/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md index 30ce25bf35..c9e7d97219 100644 --- a/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md +++ b/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md @@ -11,21 +11,9 @@ is contained locally to the container (hence the name). For larger scale graphs this less desireable as we will usually want to be able to scale and load balance the storage based on the volume of data; this is where Hadoop comes in. -## What is Hadoop/Accumulo? 
- -[Apache Hadoop](https://hadoop.apache.org/) is an open-source software framework -used for distributed storage and processing of large datasets. Hadoop is -designed to handle various types of data, including structured, semi-structured, -and unstructured data. It is a highly scalable framework that allows users to -add nodes to the cluster as needed. - -Hadoop has two main components: Hadoop Distributed File System (HDFS) and -MapReduce. HDFS is a distributed file system that provides high-throughput -access to data. MapReduce is a programming model used for processing large -datasets in parallel. - -Accumulo is built on top of the HDFS to provide a key-value store with all the -same scalability and robustness of Hadoop. +!!! tip + Please see the [Accumulo Store page](../../gaffer-stores/accumulo-store.md) + for more information on Accumulo and Hadoop. ## Running a Cluster via Docker for Gaffer @@ -49,6 +37,10 @@ to run is the following: **`gchq/gaffer-rest`** +The guide here will walk through the set up of each container using standard Docker +but it may be more practical to use a tool such as docker compose. An example docker +compose file can be found in the [gaffer-docker repository](https://github.com/gchq/gaffer-docker/blob/develop/docker/gaffer/docker-compose.yaml). + Before starting any containers we need to create a network so all the containers can talk to each other. To do this we simply run the following command to make a network and name it appropriately: diff --git a/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md b/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md new file mode 100644 index 0000000000..2751fded81 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md @@ -0,0 +1,114 @@ +# Configuring Gaffer with Helm + +!!! 
warning
+    Configuration via Helm is under development; the information here is subject
+    to change in future releases.
+
+The general overview of what you can configure in a Gaffer graph is outlined
+under the [configuring Gaffer pages](../../gaffer-config/config.md). However,
+under a Helm based Kubernetes deployment the configuration needs to be applied
+slightly differently. This page captures how you can currently configure a
+Gaffer deployment using Helm.
+
+!!! tip
+    Use the `--reuse-values` argument on a Helm upgrade to re-use passwords
+    from the initial construction.
+
+## Graph Metadata
+
+Create a file called `graph-meta.yaml`. We will use this file to add our
+description and graph ID. Changing the description is as easy as changing the
+`graph.config.description` value.
+
+```yaml
+graph:
+  config:
+    description: "My graph description"
+```
+
+Upgrade your deployment using Helm to load the new file:
+
+```bash
+helm upgrade my-graph gaffer-docker/gaffer -f graph-meta.yaml --reuse-values
+```
+
+### Graph ID
+
+Updating the ID may be simple or complicated depending on your store type. If
+you are using a Map or Federated store, you can just set the
+`graph.config.graphId` value as with the graph description. However, if you are
+using a Map Store, the graph will be emptied as a result.
+
+To safely update the Graph ID of an Accumulo instance you must change the gaffer
+user's permissions to read and write to that table. To do that, update the
+`graph-meta.yaml` file with the following contents:
+
+```yaml
+graph:
+  config:
+    graphId: "MyGraph"
+    description: "My Graph description"
+
+accumulo:
+  config:
+    userManagement:
+      users:
+        gaffer:
+          permissions:
+            table:
+              MyGraph:
+                - READ
+                - WRITE
+                - BULK_IMPORT
+                - ALTER_TABLE
+```
+
+## Loading new Graph Schema
+
+The easiest way to deploy a schema file is to use Helm's `--set-file` option
+which lets you set a value from the contents of a file.
For a Helm deployment to
+pick up changes to a Schema, you need to run a Helm upgrade:
+
+```bash
+helm upgrade my-graph gaffer-docker/gaffer --set-file graph.schema."schema\.json"=./schema.json --reuse-values
+```
+
+## Change Accumulo Passwords
+
+When deploying the Accumulo Helm chart, the following values are set. If you are
+using the Gaffer Helm chart with the Accumulo integration, the values will be
+prefixed with "accumulo":
+
+| Name                 | value                                         | default value |
+| -------------------- | --------------------------------------------- | ------------- |
+| Instance Secret      | `config.accumuloSite."instance.secret"`       | "DEFAULT"     |
+| Root password        | `config.userManagement.rootPassword`          | "root"        |
+| Tracer user password | `config.userManagement.users.tracer.password` | "tracer"      |
+
+When you deploy the Gaffer Helm chart with Accumulo, a "gaffer" user with a
+password of "gaffer" is used by default, following the same pattern as the tracer
+user.
+
+To install a new Gaffer graph with an Accumulo store, create an
+`accumulo-passwords.yaml` with the following contents:
+
+```yaml
+accumulo:
+  enabled: true
+  config:
+    accumuloSite:
+      instance.secret: "changeme"
+    userManagement:
+      rootPassword: "changeme"
+      users:
+        tracer:
+          password: "changeme"
+        gaffer:
+          password: "changeme"
+```
+
+You can install the graph with:
+
+```bash
+helm install my-graph gaffer-docker/gaffer -f accumulo-passwords.yaml
+```
diff --git a/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md b/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md
index 15a8b2276e..8914c2e860 100644
--- a/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md
+++ b/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md
@@ -12,16 +12,9 @@ this you can skip this step.
helm repo add gaffer-docker https://gchq.github.io/gaffer-docker ``` -## Choose the Gaffer Store - -Gaffer can be backed with a number of different technologies to back its store. -Which one you want depends on the use case but as a rule of thumb: - -- If you want just something to spin up quickly at small scale and are not - worried about persistence, use the [Map Store](../../gaffer-stores/map-store.md). -- If you want to back it with a key value datastore, you can deploy the [Accumulo Store](../../gaffer-stores/accumulo-store.md). -- If you want to join two or more graphs together to query them as one, you will - want to use the [Federated Store](../../gaffer-stores/federated-store.md). +The next step is to choose a Store type for the deployment; there is a handy +overview of each type in the [quickstart](../quickstart.md) to help you decide +on this. ## Deploy using a Map Store @@ -52,7 +45,7 @@ is coupled to the `graphId`. !!! warning All the default Accumulo passwords are in place so if you were to deploy this - in production, you should consider changing the [default Accumulo passwords](../../gaffer-config/change-accumulo-passwords.md). + in production, you should consider changing the [default Accumulo passwords](./helm-configuration.md#change-accumulo-passwords). You can stand up an Accumulo store by running: diff --git a/docs/administration-guide/gaffer-stores/accumulo-store.md b/docs/administration-guide/gaffer-stores/accumulo-store.md index 7628ffd46f..de23b0c6ac 100644 --- a/docs/administration-guide/gaffer-stores/accumulo-store.md +++ b/docs/administration-guide/gaffer-stores/accumulo-store.md @@ -11,6 +11,15 @@ Gaffer contains a store implemented using Apache Accumulo. This offers the follo - Flexibly query-time filtering, aggregation and transformation - Integration with Apache Spark to allow Gaffer data stored in Accumulo to be analysed as either an RDD or a Dataframe +## What is Hadoop/Accumulo? 
+ +[Apache Hadoop](https://hadoop.apache.org/) is an open-source software framework used for distributed storage and processing of large datasets. +Hadoop is designed to handle various types of data, including structured, semi-structured, and unstructured data. It is a highly scalable framework that allows users to add nodes to the cluster as needed. + +Hadoop has two main components: Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed file system that provides high-throughput access to data. MapReduce is a programming model used for processing large datasets in parallel. + +Accumulo is built on top of the HDFS to provide a key-value store with all the same scalability and robustness of Hadoop. + ## Use cases Gaffer's `AccumuloStore` is particularly well-suited to graphs where the properties on vertices and edges are formed by aggregating interactions over time windows. @@ -25,7 +34,7 @@ Gaffer can also be used with a `MiniAccumuloCluster`. This is an Accumulo cluste All real applications of Gaffer's `AccumuloStore` will use an Accumulo cluster running on a real Hadoop cluster consisting of multiple servers. Instructions on setting up an Accumulo cluster can be found in [Accumulo's User Manual](https://accumulo.apache.org/docs/2.x/getting-started/quickstart). -To use Gaffer's Accumulo store, it is necessary to add a jar file to the class path of all of Accumulo's tablet servers. This jar contains Gaffer code that runs inside Accumulo's tablet servers to provide functionality such as aggregation and filtering at ingest and query time. +To use Gaffer's Accumulo store, it is necessary to add a jar file to the class path of all of Accumulo's tablet servers. This jar contains Gaffer code that runs inside Accumulo's tablet servers to provide functionality such as aggregation and filtering at ingest and query time. 
The Accumulo store iterators.jar required can be downloaded from [maven central](https://central.sonatype.com/search?namespace=uk.gov.gchq.gaffer&name=accumulo-store). It follows the naming scheme `accumulo-store-{version}-iterators.jar`, e.g. `accumulo-store-2.0.0-iterators.jar`. This jar file will then need to be installed on Accumulo's tablet servers by adding it to the classpath. For Accumulo 1.x.x it can be placed in the `lib/ext` folder within the Accumulo distribution on each tablet server, Accumulo should load this jar file without needing to be restarted. For Accumulo 2.x.x [this dynamic reloading classpath directory functionality has been deprecated](https://accumulo.apache.org/release/accumulo-2.0.0/#removed-default-dynamic-reloading-classpath-directory-libext). The jar can instead be put into the `lib` directory and Accumulo restarted. The `lib` directory can also be used with Accumulo 1.x.x and may be useful if you see error messages due to classes not being found and restarting Accumulo doesn't fix the problem. @@ -231,8 +240,8 @@ To carry out the migration you will need the following: - your schema in 1 or more json files. - `store.properties` file contain your accumulo store properties -- a jar-with-dependencies containing the Accumulo Store classes and any of your custom classes. -If you don't have any custom classes then you can just use the `accumulo-store-[version]-utility.jar`. +- a jar-with-dependencies containing the Accumulo Store classes and any of your custom classes. +If you don't have any custom classes then you can just use the `accumulo-store-[version]-utility.jar`. 
Otherwise you can create one by adding a build profile to your maven pom: ```xml diff --git a/mkdocs.yml b/mkdocs.yml index b0ca2d63b2..6a1924207a 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -138,6 +138,7 @@ nav: - Kubernetes: - 'Running Gaffer on Kubernetes': 'administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md' - 'Creating a Simple Deployment': 'administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md' + - 'Configuring Gaffer with Helm': 'administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md' - Store Types: - 'Store Guide': 'administration-guide/gaffer-stores/store-guide.md' - 'Accumulo Store': 'administration-guide/gaffer-stores/accumulo-store.md'