diff --git a/docs/administration-guide/gaffer-config/change-accumulo-passwords.md b/docs/administration-guide/gaffer-config/change-accumulo-passwords.md new file mode 100644 index 0000000000..dac9e659ab --- /dev/null +++ b/docs/administration-guide/gaffer-config/change-accumulo-passwords.md @@ -0,0 +1,19 @@ +# Changing the Accumulo Passwords + +When deploying Accumulo, either as part of a Gaffer stack or standalone, +the passwords for all the users and the `instance.secret` are set to default +values and should be changed. The `instance.secret` cannot be changed once +deployed as it is used in initialisation. + +The passwords can be configured in a standard deployment via the +[`accumulo.properties`](https://accumulo.apache.org/docs/2.x/configuration/files#accumuloproperties) +file. + +The following table outlines the values and defaults if using the container +images: + +| Name | Property | Default value | +| -------------------- | ------------------------------- | ------------- | +| Instance Secret | `instance.secret` | "DEFAULT" | +| Tracer user | `trace.user` | "root" | +| Tracer user password | `trace.token.property.password` | "secret" | diff --git a/docs/administration-guide/gaffer-config/graph-metadata.md b/docs/administration-guide/gaffer-config/graph-metadata.md new file mode 100644 index 0000000000..17cf5a9420 --- /dev/null +++ b/docs/administration-guide/gaffer-config/graph-metadata.md @@ -0,0 +1,31 @@ +# Graph Metadata Configuration + +The graph configuration file is a JSON file that configures a few bits of the +Gaffer graph. Primarily it is used to set the name and description along with +any additional hooks to run before an operation chain e.g. to impose limits on +max results etc. 
For example, a simple graph configuration file may look like: + +```json title="graphConfig.json" +{ + "graphId": "ExampleGraph", + "description": "An example graph" +} +``` + +## Changing Values + +The standard file location in the Gaffer images for the file is `/gaffer/graph/graphConfig.json`. + +To change any of the values for a standard Gaffer deployment all that's needed +is to configure the JSON file for the `graphConfig`. The key value pairs in +the file can then be configured as you wish and upon restarting the graph +the values will be updated (assuming the file is loaded correctly). + +However, be aware that if you are using the Accumulo store, updating the `graphId` is +a little more complicated since the `graphId` corresponds to an Accumulo table. +This means that if you change the ID then a new Accumulo table will be used and +any existing data would be lost. + +!!! tip + You can see your new description if you go to the Swagger UI and call the + `/graph/config/description` endpoint. diff --git a/docs/administration-guide/schema.md b/docs/administration-guide/gaffer-config/schema.md similarity index 99% rename from docs/administration-guide/schema.md rename to docs/administration-guide/gaffer-config/schema.md index 9adb120d00..d16b4d4e12 100644 --- a/docs/administration-guide/schema.md +++ b/docs/administration-guide/gaffer-config/schema.md @@ -14,8 +14,8 @@ The sections below walkthrough the features of Schemas in detail and explain how ## Elements schema The Elements schema is designed to be a high level document describing what information your Graph contains, i.e. the different kinds of edges and entities and the list of properties associated with each. -Essentially this part of the schema should just be a list of all the entities and edges in the graph. -Edges describe the relationship between a source vertex and a destination vertex. +Essentially this part of the schema should just be a list of all the entities and edges in the graph. 
+Edges describe the relationship between a source vertex and a destination vertex. Entities describe a vertex. Edges describe the relationship between a source vertex and a destination vertex. We use the term "element" to mean either an edge or an entity. @@ -47,7 +47,7 @@ Edges and Entities can optionally have the following fields: - `properties` - Properties are defined by a map of key-value pairs of property names to property types. Property types are described in the Types schema. - `groupBy` - Allows you to specify extra properties (in addition to the element group and vertices) to use for controlling when similar elements should be grouped together and summarised. By default Gaffer uses the element group and its vertices to group similar elements together when aggregating and summarising elements. - `visibilityProperty` - Used to specify the property to use as a visibility property when using visibility properties in your graph. If sensitive elements have a visibility property then set this field to that property name. This ensures Gaffer knows to restrict access to sensitive elements. -- `timestampProperty` - Used to specify timestamp property in your graph, so Gaffer Stores know to treat that property specially. Setting this is optional and does not affect the queries available to users. This property allows Store implementations like Accumulo to optimise the way the timestamp property is persisted. For these stores using it can have a very slight performance improvement due to the lazy loading of properties. For more information [see the timestamp section of the Accumulo Store Reference](gaffer-stores/accumulo-store.md#timestamp). +- `timestampProperty` - Used to specify timestamp property in your graph, so Gaffer Stores know to treat that property specially. Setting this is optional and does not affect the queries available to users. This property allows Store implementations like Accumulo to optimise the way the timestamp property is persisted. 
For these stores using it can have a very slight performance improvement due to the lazy loading of properties. For more information [see the timestamp section of the Accumulo Store Reference](../gaffer-stores/accumulo-store.md#timestamp). - `aggregate` - Specifies if aggregation is enabled for this element group. True by default. If you would like to disable aggregation, set this to false. These 2 optional fields are for advanced users. They can go in the Elements Schema, however we have split them out into separate Validation and Aggregation Schema files for this page, so the logic doesn't complicate the Elements schema. @@ -392,7 +392,6 @@ Once the schema has been loaded into a graph the parent elements are merged into } ``` - ## Java API Schemas can be loaded from a JSON file directly using the [`fromJSON()` method of the `Schema` class](https://gchq.github.io/Gaffer/uk/gov/gchq/gaffer/store/schema/Schema.html#fromJson(java.io.InputStream...)). This accepts `byte[]`, `InputStream` or `Path` types, for example: diff --git a/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md b/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md new file mode 100644 index 0000000000..8397d9ee9c --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md @@ -0,0 +1,115 @@ +# Gaffer Images + +As demonstrated in the [quickstart](../quickstart.md) it is very simple to start +up a basic in memory gaffer graph using the available Open Container Initiative +(OCI) images. + +For large scale graphs with persistent storage you will want to use a different +storage backend to a Map Store; the recommended one being Accumulo. To do this a +different deployment of containers are required. This guide will run through the +containers needed for a basic Accumulo cluster and how to configure and create +custom images of Gaffer. 
+ +## Available Images + +Currently there are a few different images that can be used to run a Gaffer +deployment. The main ones are outlined in the following table and are all +available on [Docker Hub](https://hub.docker.com/u/gchq). + +| Image | Description | +| ----- | ----------- | +| `gchq/accumulo` | This image is a containerised deployment of [Apache Accumulo](https://accumulo.apache.org/). This was created as historically there has not been an available official image from the maintainers of Accumulo. There has since been an [official image](https://github.com/apache/accumulo-docker) made available; however, it is not well supported so is not currently used in Gaffer. | +| `gchq/hdfs` | A custom image for running HDFS (Hadoop file system) via a container. Contains an official release of [Apache Hadoop](https://hadoop.apache.org/) which is used as the scalable data storage for Accumulo. | +| `gchq/gaffer` | This is the main container image for Gaffer that is built on top of the `gchq/accumulo` image so includes a release of `zookeeper`, `hdfs` and `accumulo` along with the Gaffer libraries. Running this image simply runs an Accumulo instance but with the Gaffer libraries loaded to allow Graph creation. | +| `gchq/gaffer-rest` | This is the REST API image containing the files that can be used to configure the graph to connect to the chosen store. By default there are some pre-configured config files which can be overridden by a [bind-mount](#volumes-and-bind-mount) of alternatives. | + +!!! note + There are a few other images available; however, they are less frequently + used or purely example images. Please see the [`gaffer-docker`](https://github.com/gchq/gaffer-docker/tree/develop/docker) + repository for more details. + +## Volumes and bind-mount + +To change and configure the graph that is deployed you will need to override +the default files in the images. 
You can of course create a custom +image with different config files; however, it can be more flexible to just +bind-mount over the current files. + +To do this you will need to know the location of the files in the image you +want to override, but in many cases you can mount over an entire directory, +for example: + +!!! example "" + The path `/custom/configs` is some path on the host system with different + config files in that can be mounted in when running the image. + + ```bash + docker run \ + -p 8080:8080 \ + -v /custom/configs/gaffer/graph:/gaffer/graph \ + -v /custom/configs/gaffer/schema:/gaffer/schema \ + -v /custom/configs/gaffer/store:/gaffer/store \ + gchq/gaffer-rest:2.0.0 + ``` + +## Custom Images + +To avoid managing files on the host and bind-mounting them, the configuration can be +baked into the image. This works well if the configuration itself is rather +static and the same across all environments. + +Creating a custom image can also be useful if you want to load custom extensions +to use with Gaffer (e.g. JARs) by default. + +To create a custom image simply make a new `Dockerfile` and use one of the Gaffer +images as the base image like the following: + +```dockerfile +FROM gchq/gaffer-rest:latest + +# Copy over the existing directory with store configs in +COPY ./custom/configs/gaffer/store /gaffer/store +``` + +Then build the new image using a suitable tool or just plain Docker from the +current directory like: + +```bash +docker build -t my-gaffer-rest . +``` + +### Adding Additional Libraries + +By default with the Gaffer deployment you get access to the: + +- Sketches library +- Time library +- Bitmap library +- JCS cache library + +If you want more libraries than this (either one of ours or one of your own) you +will need to customise the docker images and use them in place of the defaults. + +At the moment, the `gchq/gaffer-rest` image uses a runnable jar file located at +`/gaffer/jars`. 
When it runs it includes the `/gaffer/jars/lib` directory on the +classpath. This is empty by default because all the dependencies are +bundled into the JAR. However, if you want to add your own JARs, you can add +them to this directory like the following: + +```dockerfile +FROM gchq/gaffer-rest:latest +COPY ./custom-lib-1.0-SNAPSHOT.jar /gaffer/jars/lib/ +``` + +For an Accumulo deployment, you may wish to add additional libraries to the +classpath to enable the use of new iterators. To do this you need to update the +`gchq/gaffer` image and add the JARs to the `/opt/accumulo/lib/ext` directory: + +```dockerfile +FROM gchq/gaffer:latest +COPY ./my-library-1.0-SNAPSHOT.jar /opt/accumulo/lib/ext +``` + +!!! note + This path is different in Accumulo v1; please see the [migration page](../../../change-notes/migrating-from-v1-to-v2/accumulo-migration.md) + for more detail. diff --git a/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md b/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md new file mode 100644 index 0000000000..fe6c960797 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md @@ -0,0 +1,262 @@ +# How to Deploy with Accumulo on Docker + +After reading the [previous page](./gaffer-images.md) you should have a good +understanding of what images are available for Gaffer and how to configure them +to your needs. However, before running a deployment backed by Accumulo you will +need to know a bit of background on Hadoop to understand how the data will scale +and be distributed. + +Usually when deploying a container image you simply run the image and everything +is contained locally to the container (hence the name). For larger scale graphs +this is less desirable as we will usually want to be able to scale and load +balance the storage based on the volume of data; this is where Hadoop comes in. + +!!! 
tip + Please see the [Accumulo Store page](../../gaffer-stores/accumulo-store.md) + for more information on Accumulo and Hadoop. + +## Running a Cluster via Docker for Gaffer + +To run an Accumulo cluster on Docker containers, we will need a few different +containers that will work together. The general set of containers we will need +to run is the following: + +**`zookeeper`** + +**`gchq/hdfs`** + +- `namenode` +- `datanode` (one or more) + +**`gchq/gaffer`** (accumulo) + +- `master` +- `tserver` +- `monitor` +- `gc` + +**`gchq/gaffer-rest`** + +The guide here will walk through the setup of each container using standard Docker +but it may be more practical to use a tool such as Docker Compose. An example Docker +Compose file can be found in the [gaffer-docker repository](https://github.com/gchq/gaffer-docker/blob/develop/docker/gaffer/docker-compose.yaml). + +Before starting any containers we need to create a network so all the containers +can talk to each other. To do this we simply run the following command to make a +network and name it appropriately: + +```bash +docker network create gaffer-example +``` + +!!! note + If you are using something like Docker Compose or host networking (e.g. with + `--net=host`) you can skip the network creation step. + +### ZooKeeper + +We start with [ZooKeeper](https://zookeeper.apache.org/), which is used by +Accumulo to provide distributed synchronization, so it is useful to start it up first. + +```bash +docker run \ + --detach \ + --name zookeeper \ + --hostname zookeeper \ + --net gaffer-example \ + --env ZOO_SERVERS="server.1=zookeeper:2888:3888;2181" \ + --env ZOO_4LW_COMMANDS_WHITELIST="*" \ + --volume /data \ + --volume /datalog \ + zookeeper:3.7.1 +``` + +The above docker command will run a ZooKeeper container with a few bits of +configuration. The main part that is being set up is the hostname and ports. 
+With ZooKeeper you can also do this by providing a `zoo.cfg` file but, as there +is not much to configure for a small cluster, we can just pass the configuration in +via environment variables with `--env`. See the [official ZooKeeper docs](https://zookeeper.apache.org/) +for more information and configuration options. + +### Hadoop/HDFS + +Next we can launch the Hadoop cluster. We will use the custom distribution of +the HDFS image `gchq/hdfs`. As mentioned before this provides a single-node +Hadoop cluster which we can run multiple times to extend into a multi-node +cluster. + +To run a Hadoop cluster we first need the configuration files for Hadoop which +we can then add into the running containers. As a starting point you can use the +files from the +[`gaffer-docker`](https://github.com/gchq/gaffer-docker/tree/develop/docker/hdfs/conf) +repository, but you may wish to edit these for your deployment and can read more +in the [official Hadoop docs](https://hadoop.apache.org/docs/r1.2.1/cluster_setup.html#Configuration+Files). + +The first Hadoop container we need is a `namenode` container which runs the +Namenode service, essentially acting as a master node. We can run this using +docker like the following: + +```bash +docker run \ + --detach \ + --name hdfs-namenode \ + --hostname hdfs-namenode \ + --net gaffer-example \ + --publish 9870:9870 \ + --env HADOOP_CONF_DIR="/etc/hadoop/conf" \ + --volume /custom/configs/hdfs:/etc/hadoop/conf \ + --volume /var/log/hadoop \ + --volume /data1 \ + --volume /data2 \ + gchq/hdfs:3.3.3 namenode +``` + +The configuration of the above `docker` command is fairly straightforward: we +make sure the port we have configured in the `core-site.xml` file is made +available on the host (e.g. port 9870). We also set up the bind-mount and +environment variables so the custom configuration files are available and used. 
+Finally there are a few defined volumes for data; these can be changed to +wherever you might want to store the data. It is highly recommended to set these up +correctly for your available infrastructure. + +Once the `namenode` has been created we now need to add the `datanode`(s); these +are additional nodes in the Hadoop cluster to store data. You can add multiple +data nodes to distribute the data across volumes and even machines (with some +additional setup). + +To create a `datanode` container the steps are much the same as for the `namenode`; +however, we use a different command to run the `datanode` service. + +```bash +docker run \ + --detach \ + --name hdfs-datanode1 \ + --hostname hdfs-datanode1 \ + --net gaffer-example \ + --env HADOOP_CONF_DIR="/etc/hadoop/conf" \ + --volume /custom/configs/hdfs:/etc/hadoop/conf \ + --volume /var/log/hadoop \ + --volume /data1 \ + --volume /data2 \ + gchq/hdfs:3.3.3 datanode +``` + +!!! note + It is recommended you configure the volumes used by your data nodes to fit + your infrastructure and the amount of data you are expecting to store. You + can also run `datanode` containers across multiple machines and network them + together to have full data distribution. + +### Accumulo + +There are a few different containers that need to be started for an Accumulo +instance. They work on a similar principle to the Hadoop/HDFS containers where +the configuration is added via a bind-mount and the different services started +via the command passed to the container. + +For Accumulo we will use the `gchq/gaffer` image which includes the libraries +for both Gaffer and Accumulo. For a deployment of Accumulo generally the +following nodes/containers are needed: + +- `master` - This is the primary coordinating process. Must specify one node. + Can specify a few for fault tolerance (note as of Accumulo v2.1 this is + referred to as `manager`). +- `gc` - The Accumulo garbage collector. Must specify one node. 
Can specify a + few for fault tolerance. +- `monitor` - Node where Accumulo monitoring web server is run. +- `tserver` - An Accumulo worker process. + +=== "master node" + This node needs to be running before starting the others. + + ```bash + docker run \ + --detach \ + --name accumulo-master \ + --hostname accumulo-master \ + --net gaffer-example \ + --env ACCUMULO_CONF_DIR="/etc/accumulo/conf" \ + --env HADOOP_USER_NAME="hadoop" \ + --volume /custom/configs/accumulo:/etc/accumulo/conf \ + --volume /var/log/accumulo \ + gchq/gaffer:2.0.0-accumulo-2.0.1 master + ``` +=== "gc node" + + ```bash + docker run \ + --detach \ + --name accumulo-gc \ + --hostname accumulo-gc \ + --net gaffer-example \ + --env ACCUMULO_CONF_DIR="/etc/accumulo/conf" \ + --env HADOOP_USER_NAME="hadoop" \ + --volume /custom/configs/accumulo:/etc/accumulo/conf \ + --volume /var/log/accumulo \ + gchq/gaffer:2.0.0-accumulo-2.0.1 gc + ``` + +=== "monitor node" + + ```bash + docker run \ + --detach \ + --name accumulo-monitor \ + --hostname accumulo-monitor \ + --net gaffer-example \ + --publish 9995:9995 \ + --env ACCUMULO_CONF_DIR="/etc/accumulo/conf" \ + --env HADOOP_USER_NAME="hadoop" \ + --volume /custom/configs/accumulo:/etc/accumulo/conf \ + --volume /var/log/accumulo \ + gchq/gaffer:2.0.0-accumulo-2.0.1 monitor + ``` + +=== "tserver node" + + ```bash + docker run \ + --detach \ + --name accumulo-tserver \ + --hostname accumulo-tserver \ + --net gaffer-example \ + --env ACCUMULO_CONF_DIR="/etc/accumulo/conf" \ + --env HADOOP_USER_NAME="hadoop" \ + --volume /custom/configs/accumulo:/etc/accumulo/conf \ + --volume /var/log/accumulo \ + gchq/gaffer:2.0.0-accumulo-2.0.1 tserver + ``` + +!!! note + Please see the [official Accumulo docs](https://accumulo.apache.org/docs/2.x/configuration/overview) + for more information on configuring the deployment. 
+ +### REST API + +The final container we need to start up is the REST API. This essentially provides +the front end so we can use the containers together as a Gaffer cluster. The REST +API container is also where the configuration for the graph is applied, such as +the schema files and store properties. + +To start up the REST API it is a similar process to the other containers; +however, there are a few more bind-mounts that need defining to configure the +graph (you can also build a custom image with files baked in). + +```bash +docker run \ + --detach \ + --name gaffer-rest \ + --net gaffer-example \ + --publish 8080:8080 \ + --volume /custom/configs/application.properties:/gaffer/config/application.properties \ + --volume /custom/configs/graph:/gaffer/graph \ + --volume /custom/configs/schema:/gaffer/schema \ + --volume /custom/configs/store:/gaffer/store \ + gchq/gaffer-rest:2.0.0-accumulo-2.0.1 +``` + +!!! note + The `gaffer-rest` image comes with some default configuration files and + graph schema. You'll likely want to configure these for your project, so + please see the pages on [gaffer configs](../../gaffer-config/config.md) and + [graph schema](../../gaffer-config/schema.md) for more information. diff --git a/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md b/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md new file mode 100644 index 0000000000..2751fded81 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md @@ -0,0 +1,114 @@ +# Configuring Gaffer with Helm + +!!! warning + Configuration via Helm is under development; the information here is subject + to change in future releases. + +The general overview of what you can configure in a Gaffer graph is outlined +under the [configuring Gaffer pages](../../gaffer-config/config.md). 
However, +under a Helm-based Kubernetes deployment the configuration needs to be applied +slightly differently. This page captures how you can currently configure a +Gaffer deployment using Helm. + +!!! tip + Use the `--reuse-values` argument on a Helm upgrade to re-use passwords + from the initial construction. + +## Graph Metadata + +Create a file called `graph-meta.yaml`. We will use this file to add our +description and graph ID. Changing the description is as easy as changing the +`graph.config.description` value. + +```yaml +graph: + config: + description: "My graph description" +``` + +Upgrade your deployment using Helm to load the new file: + +```bash +helm upgrade my-graph gaffer-docker/gaffer -f graph-meta.yaml --reuse-values +``` + +### Graph ID + +Updating the ID may be simple or complicated depending on your store type. If +you are using a Map or Federated store, you can just set the +`graph.config.graphId` value like with the graph description. Though if you are +using a Map Store, the graph will be emptied as a result. + +To safely update the Graph ID of an Accumulo instance you must change the `gaffer` +user's permissions so it can read and write to that table. To do that, update the +`graph-meta.yaml` file with the following contents: + +```yaml +graph: + config: + graphId: "MyGraph" + description: "My Graph description" + +accumulo: + config: + userManagement: + users: + gaffer: + permissions: + table: + MyGraph: + - READ + - WRITE + - BULK_IMPORT + - ALTER_TABLE +``` + +## Loading new Graph Schema + +The easiest way to deploy a schema file is to use Helm's `--set-file` option +which lets you set a value from the contents of a file. For a Helm deployment to +pick up changes to a Schema, you need to run a Helm upgrade: + +```bash +helm upgrade my-graph gaffer-docker/gaffer --set-file graph.schema."schema\.json"=./schema.json --reuse-values +``` + +## Change Accumulo Passwords + +When deploying the Accumulo Helm chart, the following values are set. 
If you are +using the Gaffer Helm chart with the Accumulo integration, the values will be +prefixed with "accumulo": + +| Name | Value | Default value | +| -------------------- | --------------------------------------------- | ------------- | +| Instance Secret | `config.accumuloSite."instance.secret"` | "DEFAULT" | +| Root password | `config.userManagement.rootPassword` | "root" | +| Tracer user password | `config.userManagement.users.tracer.password` | "tracer" | + +When you deploy the Gaffer Helm chart with Accumulo, a "gaffer" user with a +password of "gaffer" is used by default following the same pattern as the tracer +user. + +So to install a new Gaffer graph with an Accumulo store, create an +`accumulo-passwords.yaml` with the following contents: + +```yaml +accumulo: + enabled: true + config: + accumuloSite: + instance.secret: "changeme" + userManagement: + rootPassword: "changeme" + users: + tracer: + password: "changeme" + gaffer: + password: "changeme" +``` + +You can install the graph with: + +```bash +helm install my-graph gaffer-docker/gaffer -f accumulo-passwords.yaml +``` diff --git a/docs/administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md b/docs/administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md new file mode 100644 index 0000000000..de00c49c39 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md @@ -0,0 +1,78 @@ +# Running Gaffer on Kubernetes + +Gaffer's Open Container Initiative (OCI) images mean it is also possible to +deploy via Kubernetes to give an alternative scalable deployment. This guide +will assume the reader is familiar with general usage of Kubernetes; further +reading is available in the [official documentation](https://kubernetes.io/docs/home/). + +!!! 
note + All the files needed to get started using Gaffer in Kubernetes are contained + in the [`kubernetes`](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes) + sub-folder of the [`gaffer-docker`](https://github.com/gchq/gaffer-docker) + repository. + +## Pre-requisites + +To deploy container images on a kubernetes cluster, you'll need the following: + +- A Kubernetes cluster (local or remote) +- [kubectl](https://kubernetes.io/docs/reference/kubectl/) +- [helm](https://helm.sh/docs/intro/install/) +- An ingress controller running (e.g. NGINX) + +You will also need to install a container management engine such as, `containerd` +via Docker or Podman, to run and manage your containers. + +## Adding the Gaffer Helm Charts + +Helm is a package manager for Kubernetes which uses a format called *charts*. +A chart is a collection of files that describe a set of Kubernetes resources, +essentially what images to run where and how much resources they can access. + +The Helm charts for Gaffer can be found in the following places in the +`gaffer-docker` repository: + +- [Gaffer](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer) +- [Accumulo](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/accumulo) +- [HDFS](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/hdfs) +- [JupyterHub with Gaffer Integrations](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer-jhub) +- [Example Gaffer Graph of Road Traffic Data](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer-road-traffic) + +These charts can be accessed by cloning our repository or by using Helm to add +the `gaffer-docker` repo: + +```bash +helm repo add gaffer-docker https://gchq.github.io/gaffer-docker +``` + +## Using Custom Images + +You may wish to create custom images that have configuration or additional +libraries baked in. 
+ +The [Docker deployment guide](../gaffer-docker/gaffer-images.md#custom-images) +has information on how to create new images but you will need a way of making +the custom images visible to the Kubernetes cluster. Once visible you can switch +them out. + +Create a `custom-images.yaml` file with the following contents: + +```yaml +# Add custom REST API image +api: + image: + repository: custom-rest + tag: latest + +# Add custom Accumulo image +accumulo: + image: + repository: custom-gaffer-accumulo + tag: latest +``` + +To switch them, run: + +```bash +helm upgrade my-graph gaffer-docker/gaffer -f custom-images.yaml --reuse-values +``` diff --git a/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md b/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md new file mode 100644 index 0000000000..8914c2e860 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md @@ -0,0 +1,81 @@ +# Creating a Simple Deployment + +This guide will describe how to deploy a simple graph on a Kubernetes cluster +with the minimum configuration. It is assumed you have read the [previous page](./running-on-kubernetes.md) +to get an overview of running Gaffer on Kubernetes. + +To start with, you should add the Gaffer Docker repo to your Helm repos. This +will save the need for cloning the `gaffer-docker` Git repository. If you have already done +this you can skip this step. + +```bash +helm repo add gaffer-docker https://gchq.github.io/gaffer-docker +``` + +The next step is to choose a Store type for the deployment; there is a handy +overview of each type in the [quickstart](../quickstart.md) to help you decide +on this. + +## Deploy using a Map Store + +A Map Store is just an in-memory store that can be used for demos or if you need +something small scale short-term. It is our default store so there is no need +for any extra configuration. 
+ +You can install a Map Store by just running: + +```bash +helm install my-graph gaffer-docker/gaffer +``` + +## Deploy using an Accumulo Store + +If you want to deploy an Accumulo Store with your graph, it is relatively easy +to do so with some small additional configuration. Create a file called +`accumulo.yaml` and add the following: + +```yaml +accumulo: + enabled: true +``` + +By default, the Gaffer user is created with a password of "gaffer", the +`CREATE_TABLE` system permission and full access to the `simpleGraph` table, which +is coupled to the `graphId`. + +!!! warning + All the default Accumulo passwords are in place, so if you deploy this + in production you should change the [default Accumulo passwords](./helm-configuration.md#change-accumulo-passwords). + +You can stand up an Accumulo store by running: + +```bash +helm install my-graph gaffer-docker/gaffer -f accumulo.yaml +``` + +## Deploy using a Federated Store + +If you want to deploy a Federated Store, all that you really need to do is set +the `store.properties`. To do this add the following to a `federated.yaml` file: + +```yaml +graph: + storeProperties: + gaffer.store.class: uk.gov.gchq.gaffer.federatedstore.FederatedStore + gaffer.store.properties.class: uk.gov.gchq.gaffer.federatedstore.FederatedStoreProperties + gaffer.serialiser.json.modules: uk.gov.gchq.gaffer.sketches.serialisation.json.SketchesJsonModules +``` + +The addition of the `SketchesJsonModules` is just to ensure that if the +Federated Store was connecting to a store which used sketches, they could be +rendered nicely in JSON. + +We can create the graph with: + +```bash +helm install federated gaffer-docker/gaffer -f federated.yaml +``` + +!!! note + For information on how to configure the deployed graph further please + see the [Gaffer configuration guides](../../gaffer-config/config.md). 
diff --git a/docs/administration-guide/gaffer-deployment/quickstart.md b/docs/administration-guide/gaffer-deployment/quickstart.md new file mode 100644 index 0000000000..25ab7c3438 --- /dev/null +++ b/docs/administration-guide/gaffer-deployment/quickstart.md @@ -0,0 +1,70 @@ +# Deployment Quickstart + +The quickest way to get up and running with Gaffer is through its container +images. To start up a simple map store based instance with some default schemas +simply pull and run the `gaffer-rest` image. + +```bash +docker pull gchq/gaffer-rest:2.0.0 +``` + +```bash +docker run -p 8080:8080 gchq/gaffer-rest:2.0.0 +``` + +The Swagger REST API should be available at +[http://127.0.0.1:8080/rest](http://127.0.0.1:8080/rest) to try out. + +Be aware that as the image uses the map store backend by default, all graph +data is held in memory, so killing the container means you will lose +any data added to the graph. Take a look at the [possible storage options](#possible-storage-options) +section for an overview of the different store types Gaffer supports. + +If you wish to add custom schemas to try out you can mount them into the +container at start up to configure the graph. By default the `gaffer-rest` image +looks under `/gaffer/schema` meaning you can bind-mount over this directory with +a directory containing your custom schemas. + +```bash +docker run -p 8080:8080 -v /path/to/your/schema:/gaffer/schema gchq/gaffer-rest:2.0.0 +``` + +!!! info + A simple map store based deployment is usually only useful for small scale + graphs and rapid prototyping; please see the [subsequent pages](./gaffer-docker/how-to-run.md) + in this section for more scalable deployments. + +## Possible Storage Options + +As Gaffer essentially works as a framework to structure and save data into a +data store, the storage option is one of the largest considerations when +deploying a new graph.
A few technologies are supported by Gaffer; however, some +are more widely used than others. The main types you might want to use are: + +- **Accumulo Store** - The main recommended data store for Gaffer, implemented using + [Apache Accumulo](https://accumulo.apache.org/). +- **Map Store** - In-memory JVM store, useful for quick prototyping. +- **Proxy Store** - This provides a way to hook into an existing Gaffer store; + when used, all operations are delegated to the chosen Gaffer REST API. +- **Federated Store** - Similar to a proxy store; however, this will forward all + requests to a collection of sub graphs and merge the responses so they + appear as one graph. + +Once the storage option has been chosen, the deployment can be set up and started +using one or more of the available Gaffer container images. + +!!! info + Please see the [gaffer stores documentation](../gaffer-stores/store-guide.md) + for more information on the available store types. + +To change the storage backend for Gaffer the `store.properties` file can be +configured with the chosen type. Various other properties and configuration are +available and covered in the [Gaffer configuration section](../gaffer-config/config.md). + +!!! example "" + Example `store.properties` for MapStore + + ```properties + gaffer.store.class=uk.gov.gchq.gaffer.mapstore.MapStore + gaffer.store.properties.class=uk.gov.gchq.gaffer.mapstore.MapStoreProperties + ``` diff --git a/docs/administration-guide/gaffer-stores/accumulo-store.md b/docs/administration-guide/gaffer-stores/accumulo-store.md index 7628ffd46f..de23b0c6ac 100644 --- a/docs/administration-guide/gaffer-stores/accumulo-store.md +++ b/docs/administration-guide/gaffer-stores/accumulo-store.md @@ -11,6 +11,15 @@ Gaffer contains a store implemented using Apache Accumulo. This offers the follo
This offers the follo - Flexibly query-time filtering, aggregation and transformation - Integration with Apache Spark to allow Gaffer data stored in Accumulo to be analysed as either an RDD or a Dataframe +## What is Hadoop/Accumulo? + +[Apache Hadoop](https://hadoop.apache.org/) is an open-source software framework used for distributed storage and processing of large datasets. +Hadoop is designed to handle various types of data, including structured, semi-structured, and unstructured data. It is a highly scalable framework that allows users to add nodes to the cluster as needed. + +Hadoop has two main components: Hadoop Distributed File System (HDFS) and MapReduce. HDFS is a distributed file system that provides high-throughput access to data. MapReduce is a programming model used for processing large datasets in parallel. + +Accumulo is built on top of the HDFS to provide a key-value store with all the same scalability and robustness of Hadoop. + ## Use cases Gaffer's `AccumuloStore` is particularly well-suited to graphs where the properties on vertices and edges are formed by aggregating interactions over time windows. @@ -25,7 +34,7 @@ Gaffer can also be used with a `MiniAccumuloCluster`. This is an Accumulo cluste All real applications of Gaffer's `AccumuloStore` will use an Accumulo cluster running on a real Hadoop cluster consisting of multiple servers. Instructions on setting up an Accumulo cluster can be found in [Accumulo's User Manual](https://accumulo.apache.org/docs/2.x/getting-started/quickstart). -To use Gaffer's Accumulo store, it is necessary to add a jar file to the class path of all of Accumulo's tablet servers. This jar contains Gaffer code that runs inside Accumulo's tablet servers to provide functionality such as aggregation and filtering at ingest and query time. +To use Gaffer's Accumulo store, it is necessary to add a jar file to the class path of all of Accumulo's tablet servers. 
This jar contains Gaffer code that runs inside Accumulo's tablet servers to provide functionality such as aggregation and filtering at ingest and query time. The Accumulo store iterators.jar required can be downloaded from [maven central](https://central.sonatype.com/search?namespace=uk.gov.gchq.gaffer&name=accumulo-store). It follows the naming scheme `accumulo-store-{version}-iterators.jar`, e.g. `accumulo-store-2.0.0-iterators.jar`. This jar file will then need to be installed on Accumulo's tablet servers by adding it to the classpath. For Accumulo 1.x.x it can be placed in the `lib/ext` folder within the Accumulo distribution on each tablet server, Accumulo should load this jar file without needing to be restarted. For Accumulo 2.x.x [this dynamic reloading classpath directory functionality has been deprecated](https://accumulo.apache.org/release/accumulo-2.0.0/#removed-default-dynamic-reloading-classpath-directory-libext). The jar can instead be put into the `lib` directory and Accumulo restarted. The `lib` directory can also be used with Accumulo 1.x.x and may be useful if you see error messages due to classes not being found and restarting Accumulo doesn't fix the problem. @@ -231,8 +240,8 @@ To carry out the migration you will need the following: - your schema in 1 or more json files. - `store.properties` file contain your accumulo store properties -- a jar-with-dependencies containing the Accumulo Store classes and any of your custom classes. -If you don't have any custom classes then you can just use the `accumulo-store-[version]-utility.jar`. +- a jar-with-dependencies containing the Accumulo Store classes and any of your custom classes. +If you don't have any custom classes then you can just use the `accumulo-store-[version]-utility.jar`. 
Otherwise you can create one by adding a build profile to your maven pom: ```xml diff --git a/docs/administration-guide/import-export-data.md b/docs/administration-guide/import-export-data.md deleted file mode 100644 index e69de29bb2..0000000000 diff --git a/docs/administration-guide/introduction.md b/docs/administration-guide/introduction.md index 0489686420..66bf53a96b 100644 --- a/docs/administration-guide/introduction.md +++ b/docs/administration-guide/introduction.md @@ -11,8 +11,8 @@ There are detailed guides in this section on how to set up a Gaffer instance, covering both containerised deployments via standard Docker/Podman along with Kubernetes deployment via Helm. -- [Kubernetes Guide](./where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md) -- [Docker Guide](./where-to-run-gaffer/gaffer-docker.md) +- [Kubernetes Guide](./gaffer-deployment/kubernetes-guide/running-on-kubernetes.md) +- [Docker Guide](./gaffer-deployment/gaffer-docker/gaffer-images.md) ## Graph Configuration diff --git a/docs/administration-guide/where-to-run-gaffer/gaffer-docker.md b/docs/administration-guide/where-to-run-gaffer/gaffer-docker.md deleted file mode 100644 index 4a983c548b..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/gaffer-docker.md +++ /dev/null @@ -1,37 +0,0 @@ -# Gaffer Docker - -The [gaffer-docker](https://github.com/gchq/gaffer-docker) repository contains -all code needed to run Gaffer using Docker. - -All the files needed to get started using Gaffer in Docker are contained in the -['docker'](https://github.com/gchq/gaffer-docker/tree/develop/docker) -sub-folder. 
- -In this directory you can find the Dockerfiles and docker compose files for -building container images for: - -- [Gaffer](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer) -- [Gaffer's REST - API](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer-rest) -- [Gaffer's Road Traffic - Example](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer-road-traffic-loader) -- [HDFS](https://github.com/gchq/gaffer-docker/tree/develop/docker/hdfs) -- [Accumulo](https://github.com/gchq/gaffer-docker/tree/develop/docker/accumulo) -- [Gaffer's Integration - Tests](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer-integration-tests) -- [gafferpy Jupyter - Notebook](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer-pyspark-notebook) -- [Gaffer's JupyterHub Options - Server](https://github.com/gchq/gaffer-docker/tree/develop/docker/gaffer-jhub-options-server) -- [Spark](https://github.com/gchq/gaffer-docker/tree/develop/docker/spark-py) - -Each directory contains a README with more specific information on what these -images are for and how to build them. - -Please note that some of these containers will only be useful if utilised by the -Helm Charts under Kubernetes, and may not be possible to run on their own. - -## Requirements - -Before you can build and run these containers you will need to install Docker or -a compatible equivalent (e.g. Podman). 
diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/add-libraries.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/add-libraries.md deleted file mode 100644 index 77d4f934fb..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/add-libraries.md +++ /dev/null @@ -1,70 +0,0 @@ -# Adding your own libraries and functions - -By default with the Gaffer deployment you get access to the: - -- Sketches library -- Time library -- Bitmap Library -- JCS cache library - -If you want more libraries than this (either one of ours of one of your own) you will need to customise the docker images and use them in place of the defaults. - -You will need a [basic Gaffer instance deployed on Kubernetes](deploy-empty-graph.md). - -## Add Extra Libraries to Gaffer REST - -At the moment, Gaffer uses a runnable jar file located at `/gaffer/jars`. When it runs it includes the `/gaffer/jars/lib` on the classpath. There is nothing in there by default because all the dependencies are bundled in to the JAR. However, if you wanted to add your own jars, you can do it like this: - -```Dockerfile -FROM gchq/gaffer-rest:latest -COPY ./my-custom-lib:1.0-SNAPSHOT.jar /gaffer/jars/lib/ -``` - -Build the image using: - -```bash -docker build -t custom-rest:latest . -``` - -## Add the extra libraries to the Accumulo image - -Gaffer's Accumulo image includes support for the following Gaffer libraries: - -- The Bitmap Library -- The Sketches Library -- The Time Library - -In order to push down any extra value objects and filters to Accumulo that are not in those libraries, we have to add the jars to the accumulo `/lib/ext directory`. Here is an example `Dockerfile`: - -```Dockerfile -FROM gchq/gaffer:latest -COPY ./my-library-1.0-SNAPSHOT.jar /opt/accumulo/lib/ext -``` - -Then build the image - -```bash -docker build -t custom-gaffer-accumulo:latest . 
-``` - -# Switch the images in the deployment - -You will need a way of making the custom images visible to the kubernetes cluster. Once visible you can switch them out. Create a `custom-images.yaml` file with the following contents: - -```yaml -api: - image: - repository: custom-rest - tag: latest - -accumulo: - image: - repository: custom-gaffer-accumulo - tag: latest -``` - -To switch them run: - -```bash -helm upgrade my-graph gaffer-docker/gaffer -f custom-images.yaml --reuse-values -``` diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-accumulo-passwords.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-accumulo-passwords.md deleted file mode 100644 index a5e2316858..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-accumulo-passwords.md +++ /dev/null @@ -1,36 +0,0 @@ -# Changing the Accumulo Passwords - -When deploying Accumulo - either as part of a Gaffer stack or as a standalone, the passwords for all the users and the instance.secret are set to default values and should be changed. The instance.secret cannot be changed once deployed as it is used in initalisation. - -When deploying the Accumulo helm chart, the following values are set. If you are using the Gaffer helm chart with the Accumulo integration, the values will be prefixed with "accumulo": - -| Name | value | default value | -| -------------------- | --------------------------------------------- | ------------- | -| Instance Secret | `config.accumuloSite."instance.secret"` | "DEFAULT" | -| Root password | `config.userManagement.rootPassword` | "root" | -| Tracer user password | `config.userManagement.users.tracer.password` | "tracer" | - -When you deploy the Gaffer Helm chart with Accumulo, a "gaffer" user with a password of "gaffer" is used by default following the same pattern as the tracer user. 
- -So to install a new Gaffer with Accumulo store, create an `accumulo-passwords.yaml` with the following contents: - -```yaml -accumulo: - enabled: true - config: - accumuloSite: - instance.secret: "changeme" - userManagement: - rootPassword: "changeme" - users: - tracer: - password: "changme" - gaffer: - password: "changeme" -``` - -You can install the graph with: - -```bash -helm install my-graph gaffer-docker/gaffer -f accumulo-passwords.yaml -``` diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-graph-metadata.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-graph-metadata.md deleted file mode 100644 index 603db49c99..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/change-graph-metadata.md +++ /dev/null @@ -1,63 +0,0 @@ -# Changing the Graph ID and Description - -By default, the default Gaffer deployment ships with the Graph name "simpleGraph" and description "A graph for demo purposes" These are just placeholders and can be overwritten. This guide will show you how. - -The first thing you will need to do is [deploy an empty graph](deploy-empty-graph.md). - -## Changing the description - -Create a file called `graph-meta.yaml`. We will use this file to add our description and graph ID. Changing the description is as easy as changing the `graph.config.description` value. - -```yaml -graph: - config: - description: "My graph description" -``` - -## Deploy the new description - -Upgrade your deployment using helm: - -```bash -helm upgrade my-graph gaffer-docker/gaffer -f graph-metadata.yaml --reuse-values -``` - -The `--reuse-values` argument means we do not override any passwords that we set in the initial construction. - -You can see you new description if you to the Swagger UI and call the `/graph/config/description` endpoint. - -## Updating the Graph ID - -This may be simple or complicated depending on your store type. 
If you are using the Map or Federated store, you can just set the `graph.config.graphId` value in the same way. Though if you are using a MapStore, the graph will be emptied as a result. - -However, if you are using the Accumulo store, updating the graph Id is a little more complicated since the Graph Id corresponds to an Accumulo table. We have to change the gaffer users permissions to read and write to that table. To do that update the graph-meta.yaml file with the following contents: - -```yaml -graph: - config: - graphId: "MyGraph" - description: "My Graph description" - -accumulo: - config: - userManagement: - users: - gaffer: - permissions: - table: - MyGraph: - - READ - - WRITE - - BULK_IMPORT - - ALTER_TABLE -``` - -## Deploy your changes - -Upgrade your deployment using Helm. - -```bash -helm upgrade my-graph gaffer-docker/gaffer -f graph-metadata.yaml --reuse-values -``` - -If you take a look at Accumulo monitor, you will see your new Accumulo table. diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-empty-graph.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-empty-graph.md deleted file mode 100644 index 966a17f05c..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-empty-graph.md +++ /dev/null @@ -1,73 +0,0 @@ -# How to deploy a simple graph - -This guide will describe how to deploy a simple empty graph with the minimum configuration. - -You will need: - -- [kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) -- [helm](https://github.com/helm/helm/releases) -- A Kubernetes cluster (local or remote) -- An ingress controller running (for accessing UIs) - -## Add the Gaffer Docker repo - -To start with, you should add the Gaffer Docker repo to your helm repos. This will save the need for cloning this Git repository. If you have already done this you can skip this step. 
- -```bash -helm repo add gaffer-docker https://gchq.github.io/gaffer-docker -``` - -## Choose the store - -Gaffer can be backed with a number of different technologies to back its store. Which one you want depends on the use case but as a rule of thumb: - -- If you want just something to spin up quickly at small scale and are not worried about persistence, use the MapStore. -- If you want to back it with a key value datastore, you can deploy the Accumulo Store. -- If you want to join two or more graphs together to query them as one, you will want to use the Federated Store. - -### Deploy the MapStore - -The MapStore is just an in-memory store that can be used for demos or if you need something small scale short-term. It is our default store so there is no need for any extra configuration. - -You can install a MapStore by just running: - -``` -helm install my-graph gaffer-docker/gaffer -``` - -### Deploy the Accumulo Store - -If you want to deploy an Accumulo Store with your graph, it is relatively easy to do so with some small additional configuration. Create a file called `accumulo.yaml` and add the following: - -```yaml -accumulo: - enabled: true -``` - -By default, the Gaffer user is created with a password of "gaffer" the CREATE_TABLE system permission with full access to the simpleGraph table which is coupled to the graphId. All the default Accumulo passwords are in place so if you were to deploy this in production, you should consider changing the [default accumulo passwords](change-accumulo-passwords.md). - -You can stand up the accumulo store by running: - -```bash -helm install my-graph gaffer-docker/gaffer -f accumulo.yaml -``` - -### Deploy the Federated Store - -If you want to deploy the Federated Store, all that you really need to do is set the `store.properties`. 
To do this add the following to a `federated.yaml` file: - -```yaml -graph: - storeProperties: - gaffer.store.class: uk.gov.gchq.gaffer.federatedstore.FederatedStore - gaffer.store.properties.class: uk.gov.gchq.gaffer.federatedstore.FederatedStoreProperties - gaffer.serialiser.json.modules: uk.gov.gchq.gaffer.sketches.serialisation.json.SketchesJsonModules -``` - -The addition of the `SketchesJsonModules` is just to ensure that if the FederatedStore was connecting to a store which used sketches, they could be rendered nicely in json. - -We can create the graph with: - -```bash -helm install federated gaffer-docker/gaffer -f federated.yaml -``` diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-schema.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-schema.md deleted file mode 100644 index fa80719ef2..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-schema.md +++ /dev/null @@ -1,68 +0,0 @@ -# How to deploy your own schema - -Gaffer uses schema files to describe the data contained in a Graph. This guide will tell you how to deploy your own schemas with a Gaffer Graph. - -You will first need [a basic Gaffer instance deployed on Kubernetes] (deploy-empty-graph.md). - -Once you have that deployed we can change the schema. - -## Edit the schema - -If you run a GetSchema operation against the graph, you will notice that the count property is of type `java.lang.Integer` - change that property to be of type `java.lang.Long`. - -The easiest way to deploy a schema file is to use helms `--set-file` option which lets you set a value from the contents of a file. - -??? 
example "Example of a `schema.json` file" - - ```json - { - "edges": { - "BasicEdge": { - "source": "vertex", - "destination": "vertex", - "directed": "true", - "properties": { - "count": "count" - } - } - }, - "entities": { - "BasicEntity": { - "vertex": "vertex", - "properties": { - "count": "count" - } - } - }, - "types": { - "vertex": { - "class": "java.lang.String" - }, - "count": { - "class": "java.lang.Long", - "aggregateFunction": { - "class": "uk.gov.gchq.koryphe.impl.binaryoperator.Sum" - } - }, - "true": { - "description": "A simple boolean that must always be true.", - "class": "java.lang.Boolean", - "validateFunctions": [ - { "class": "uk.gov.gchq.koryphe.impl.predicate.IsTrue" } - ] - } - } - } - ``` - -## Update deployment with the new schema - -For our deployment to pick up the changes, we need to run a helm upgrade: - -```bash -helm upgrade my-graph gaffer-docker/gaffer --set-file graph.schema."schema\.json"=./schema.json --reuse-values -``` - -The `--reuse-values` argument tells helm to re-use the passwords that we defined earlier. - -Now if we inspect the schema, you will see that the `count` property has changed to a `Long`. diff --git a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md b/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md deleted file mode 100644 index 5fd7acce24..0000000000 --- a/docs/administration-guide/where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md +++ /dev/null @@ -1,40 +0,0 @@ -# Gaffer in Kubernetes - -The [gaffer-docker](https://github.com/gchq/gaffer-docker) repository contains all code needed to run Gaffer using Docker and Kubernetes. - -All the files needed to get started using Gaffer in Kubernetes are contained in the ['kubernetes'](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes) sub-folder of the [gaffer-docker](https://github.com/gchq/gaffer-docker) repository. 
-In this directory you can find the Helm charts required to deploy various applications onto Kubernetes clusters. - -The Helm charts and associated information for each application can be found in the following places: - -- [Gaffer](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer) -- [Example Gaffer Graph of Road Traffic Data](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer-road-traffic) -- [JupyterHub with Gaffer Integrations](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/gaffer-jhub) -- [HFDS](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/hdfs) -- [Accumulo](https://github.com/gchq/gaffer-docker/tree/develop/kubernetes/accumulo) - -These charts can be accessed by cloning our repository or by using our Helm repo hosted on our [Github Pages Site](https://gchq.github.io/gaffer-docker/). - -## Requirements - -To deploy these applications, you'll need access to a suitable Kubernetes distribution. - -You will also need to install a container management engine, for example Docker or Podman, to build, run and manage your containers. - -## Adding this repo to Helm - -To add the gaffer-docker repo to helm run: - -```bash -helm repo add gaffer-docker https://gchq.github.io/gaffer-docker -``` - -## How to Guides - -There are a number of guides to help you deploy Gaffer on Kubernetes. It is important you look at these before you get started as they provide the initial steps for running these applications. 
- -* [Deploy a simple empty graph](deploy-empty-graph.md) -* [Add your schema](deploy-schema.md) -* [Change the graph ID and description](change-graph-metadata.md) -* [Adding your own libraries and functions](add-libraries.md) -* [Changing passwords for the Accumulo store](change-accumulo-passwords.md) \ No newline at end of file diff --git a/docs/development-guide/example-deployment/writing-the-schema.md b/docs/development-guide/example-deployment/writing-the-schema.md index 19c190a82f..101b07feed 100644 --- a/docs/development-guide/example-deployment/writing-the-schema.md +++ b/docs/development-guide/example-deployment/writing-the-schema.md @@ -27,7 +27,7 @@ In Gaffer an element refers to any object in the graph, i.e. your nodes (vertexe up a graph we need to tell Gaffer what objects are in the graph and the properties they have. The standard way to do this is a JSON config file in the schema directory. The filename can just be called something like `elements.json`, the name is not special as all files under the `schema` -directory will be [merged into a master schema](../../administration-guide/schema.md), but we recommended +directory will be [merged into a master schema](../../administration-guide/gaffer-config/schema.md), but we recommended using an appropriate name. As covered in the [Getting Started Schema page](../../user-guide/schema.md), to write a schema you can see that there are some @@ -147,7 +147,7 @@ extended schema below. ## Types Schema -The other schema that now needs to be written is the types [schema](../../administration-guide/schema.md). As you have seen in the elements +The other schema that now needs to be written is the types [schema](../../administration-guide/gaffer-config/schema.md). As you have seen in the elements schema there are some placeholder types added as the values for many of the keys. 
These types work similarly to if you have ever programmed in a strongly typed language, they are essentially the wrapper for the value to encapsulate it. diff --git a/docs/development-guide/project-structure/components/graph.md b/docs/development-guide/project-structure/components/graph.md index 7d13e9d9e2..08c0f2a602 100644 --- a/docs/development-guide/project-structure/components/graph.md +++ b/docs/development-guide/project-structure/components/graph.md @@ -33,7 +33,7 @@ The store properties tells the graph the type of store to connect to along with ## Schema The schema is passed to the store to instruct the store how to store and process the data. -See [Schemas](../../../administration-guide/schema.md) for detailed information on schemas and the [Java API section of that page](../../../administration-guide/schema.md#java-api) for lower level info. +See [Schemas](../../../administration-guide/gaffer-config/schema.md) for detailed information on schemas and the [Java API section of that page](../../../administration-guide/gaffer-config/schema.md#java-api) for lower level info. ## Graph Configuration The graph configuration allows you to apply special customisations to the Graph instance. The only required field is the `graphId`. @@ -47,9 +47,9 @@ The `GraphConfig` can be configured with the following: - `view` - The `Graph View` allows a graph to be configured to only returned a subset of Elements when any Operation is executed. For example if you want your `Graph` to only show data that has a count more than 10 you could add a View to every operation you execute, or you can use this `Graph View` to apply the filter once and it would be merged into to all Operation Views so users only ever see this particular view of the data. - `library` - This contains information about the `Schema` and `StoreProperties` to be used. - `hooks` - A list of `GraphHook`s that will be triggered before, after and on failure when operations are executed on the `Graph`. 
See [GraphHooks](#graph-hooks) for more information. - + Here is an example of a `GraphConfig`: - + ```java new GraphConfig.Builder() .config(new GraphConfig.Builder() diff --git a/mkdocs.yml b/mkdocs.yml index 9f139330aa..0361e71f2c 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -44,8 +44,8 @@ plugins: - index.md - redirects: redirect_maps: - 'dev/docker.md': 'administration-guide/where-to-run-gaffer/gaffer-docker.md' - 'dev/kubernetes-guide/kubernetes.md': 'administration-guide/where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md' + 'dev/docker.md': 'administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md' + 'dev/kubernetes-guide/kubernetes.md': 'administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md' 'dev/ways-of-working.md': 'development-guide/ways-of-working.md' 'dev/components/operation.md': 'development-guide/project-structure/components/operation.md' 'dev/components/cache.md': 'development-guide/project-structure/components/cache.md' @@ -69,7 +69,7 @@ plugins: 'reference/stores-guide/accumulo.md': 'administration-guide/gaffer-stores/accumulo-store.md' 'getting-started/basics.md': 'user-guide/introduction.md' 'getting-started/guide/cardinality.md': 'user-guide/gaffer-basics/what-is-cardinality.md' - 'getting-started/quickstart.md': 'user-guide/introduction.md' + 'getting-started/quickstart.md': 'administration-guide/gaffer-deployment/quickstart.md' 'getting-started/guide/guide.md': 'user-guide/introduction.md' nav: @@ -130,29 +130,31 @@ nav: - 'Spring': 'development-guide/project-structure/maven-dependencies/spring.md' - Administration Guide: - 'Introduction': 'administration-guide/introduction.md' - - 'Docker': 'administration-guide/where-to-run-gaffer/gaffer-docker.md' - - 'Kubernetes': - - 'Kubernetes Guide': 'administration-guide/where-to-run-gaffer/kubernetes-guide/kubernetes-guide.md' - - 'Deploy an Empty Graph': 'administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-empty-graph.md' - - 'Add your 
Schema': 'administration-guide/where-to-run-gaffer/kubernetes-guide/deploy-schema.md' - - 'Change the Graph ID and Description': 'administration-guide/where-to-run-gaffer/kubernetes-guide/change-graph-metadata.md' - - 'Adding your Own Libraries and Functions': 'administration-guide/where-to-run-gaffer/kubernetes-guide/add-libraries.md' - - 'Changing Accumulo Passwords': 'administration-guide/where-to-run-gaffer/kubernetes-guide/change-accumulo-passwords.md' - - 'Store Setup': + - Deployment: + - 'Quickstart': 'administration-guide/gaffer-deployment/quickstart.md' + - Docker: + - 'Gaffer Images': 'administration-guide/gaffer-deployment/gaffer-docker/gaffer-images.md' + - 'How to Deploy with Accumulo on Docker': 'administration-guide/gaffer-deployment/gaffer-docker/how-to-run.md' + - Kubernetes: + - 'Running Gaffer on Kubernetes': 'administration-guide/gaffer-deployment/kubernetes-guide/running-on-kubernetes.md' + - 'Creating a Simple Deployment': 'administration-guide/gaffer-deployment/kubernetes-guide/simple-deployment.md' + - 'Configuring Gaffer with Helm': 'administration-guide/gaffer-deployment/kubernetes-guide/helm-configuration.md' + - Store Types: - 'Store Guide': 'administration-guide/gaffer-stores/store-guide.md' - 'Accumulo Store': 'administration-guide/gaffer-stores/accumulo-store.md' - 'Federated Store': 'administration-guide/gaffer-stores/federated-store.md' - 'Map Store': 'administration-guide/gaffer-stores/map-store.md' - 'Proxy Store': 'administration-guide/gaffer-stores/proxy-store.md' - - 'Gaffer Configuration': + - Graph Configuration: - 'Configuration': 'administration-guide/gaffer-config/config.md' + - 'Graph Metadata': 'administration-guide/gaffer-config/graph-metadata.md' + - 'Schema': 'administration-guide/gaffer-config/schema.md' + - 'Changing Accumulo Passwords': 'administration-guide/gaffer-config/change-accumulo-passwords.md' - 'Proxy': 'administration-guide/gaffer-config/proxy.md' - 'URL': 'administration-guide/gaffer-config/url.md' - - 
'Schema': 'administration-guide/schema.md' - - 'Import/Export Data': 'administration-guide/import-export-data.md' - 'Named Operations': 'administration-guide/named-operations.md' - 'Operation Score': 'administration-guide/operation-score.md' - - 'Security': + - Security: - 'Security Guide': 'administration-guide/security/security-guide.md' - 'User Control': 'administration-guide/security/user-control.md' - Change Notes: