
Commit

remove couple of gerunds + adjust code sections (copy tags) (#330)
* added markdown artifacts for the telegram bot livelab

* implemented the adjustments requested by the livelab team

* additional improvements suggested by Anoosha

* adjustment requested by Rahul Tasker

* minor improvement on the bot config step

* QUARTERLY QA ID 11418

* QUARTERLY QA ID 11418

* added springai artifacts for the oracle vector search livelab

* added adjustments after QA checks

* markdown adjustments

* markdown adjustments + database step

* markdown final adjustments

* final batch of improvements and adjustments

* diagram adjustment

* remove couple of gerunds + adjust code sections (copy tags)
juarezjuniorgithub authored Aug 23, 2024
1 parent f5fbe62 commit f006567
Showing 1 changed file with 26 additions and 2 deletions.
28 changes: 26 additions & 2 deletions springai-vector/model/model.md
@@ -16,14 +16,14 @@ Mac:

In this lab, you will:

- Look at deploying Cohere AI Command-R models with Ollama and Oracle Cloud Infrastructure (OCI).
- Deploy Cohere AI Command-R models with Ollama and Oracle Cloud Infrastructure (OCI).
- Run a basic test of your model's endpoint for Command-R.

### Prerequisites

* This lab requires the completion of the **Setup Dev Environment** tutorial.

## Task 1. Using Cohere AI's Command-R model to support chat and embeddings with private LLMs
## Task 1. Use Cohere AI's Command-R model to support chat and embeddings with private LLMs

Cohere Command-R is a family of LLMs optimized for conversational interaction and long context tasks. Command R delivers high precision on retrieval augmented generation (RAG) with low latency and high throughput. You can get more details about the Command-R models at the [Command-R product page](https://cohere.com/command), and the full technical details are available at the [Model Details](https://docs.cohere.com/docs/command-r) section of its technical documentation.

@@ -62,82 +62,102 @@ Cohere Command-R is a family of LLMs optimized for conversational interaction an
7. At the end of the creation process, obtain the **Public IPv4 address** and, with your private key (the one you generated or uploaded during creation), connect:

```
<copy>
ssh -i ./<your_private>.key opc@[GPU_SERVER_IP]
</copy>
```

8. Install and configure docker to use GPUs:

```
<copy>
sudo /usr/libexec/oci-growfs
curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo dnf install -y dnf-utils zip unzip
sudo dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
sudo dnf remove -y runc
sudo dnf install -y docker-ce --nobest
sudo useradd docker_user
</copy>
```
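The commands above install Docker but don't necessarily start the daemon. If `docker` commands fail later with a "cannot connect to the Docker daemon" error, enabling the service is one way to bring it up (the `docker` unit name is an assumption for a standard docker-ce install):

```
<copy>
sudo systemctl enable --now docker
sudo docker --version
</copy>
```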

9. Make sure that your operating system user has permission to run Docker containers. To do this, run the following command:

```
<copy>
sudo visudo
</copy>
```

And add this line at the end:

```
<copy>
docker_user ALL=(ALL) NOPASSWD: /usr/bin/docker
</copy>
```
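To confirm the sudoers entry took effect, you can list the privileges granted to the new user (a quick check, assuming the line above was saved in `visudo`):

```
<copy>
sudo -l -U docker_user
</copy>
```

The output should include the `NOPASSWD: /usr/bin/docker` rule.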

10. For convenience, switch to the new user:

```
<copy>
sudo su - docker_user
</copy>
```

11. Finally, add an alias so that every `docker` command in your shell runs with admin privileges. Depending on your OS, edit `.bash_profile` (macOS) or `.bashrc` (Linux) and insert this command at the end of the file:

```
<copy>
alias docker="sudo /usr/bin/docker"
exit
</copy>
```

12. Finalize the installation by executing:

```
<copy>
sudo yum install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
nvidia-ctk runtime configure --runtime=docker --config=$HOME/.config/docker/daemon.json
</copy>
```
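To verify that containers can actually see the GPU, a common smoke test is to run `nvidia-smi` inside a CUDA base image (the image tag below is an illustrative example, not one prescribed by this lab):

```
<copy>
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
</copy>
```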

13. If you're on Ubuntu instead, run:

```
<copy>
sudo apt-get install nvidia-container-toolkit=1.14.3-1 \
nvidia-container-toolkit-base=1.14.3-1 \
libnvidia-container-tools=1.14.3-1 \
libnvidia-container1=1.14.3-1
sudo apt-get install -y nvidia-docker2
</copy>
```

14. Reboot, reconnect to the VM, and switch to the Docker user again:

```
<copy>
sudo reboot now
# after restart, run:
sudo su - docker_user
</copy>
```

15. Run `docker` to check that everything is OK.

16. Run a Docker container with the Ollama server and pull the models for embeddings/completion:

```
<copy>
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama serve
docker exec -it ollama ollama pull command-r
docker exec -it ollama ollama pull llama3
docker logs -f --tail 10 ollama
</copy>
```
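Once the pulls finish, you can confirm that both models are available on the server with the standard Ollama CLI:

```
<copy>
docker exec -it ollama ollama list
</copy>
```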

Both models, for embeddings and for completion, run under the same server; the specific model to use is selected by naming it in the REST request.
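As a sketch of how one server addresses either model, the `model` field in the request body selects it. The endpoints and payload shape below follow Ollama's standard REST API rather than anything specific to this lab:

```
<copy>
# completion with Command-R
curl http://[GPU_SERVER_IP]:11434/api/generate -d '{
  "model": "command-r",
  "prompt": "Hello"
}'

# embeddings from the same server, using a different model
curl http://[GPU_SERVER_IP]:11434/api/embeddings -d '{
  "model": "llama3",
  "prompt": "Hello"
}'
</copy>
```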
@@ -173,19 +193,23 @@ Your configured ingress rule:
6. Configure the environment variables below directly, or update the `env.sh` file and run `source ./env.sh`:

```
<copy>
export OLLAMA_URL=http://[GPU_SERVER_IP]:11434
export OLLAMA_EMBEDDINGS=command-r
export OLLAMA_MODEL=command-r
</copy>
```


7. Test from a shell by running:

```
<copy>
curl ${OLLAMA_URL}/api/generate -d '{
"model": "command-r",
"prompt":"Who is Ayrton Senna?"
}'
</copy>
```
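By default Ollama streams the response. If you'd rather receive a single JSON object, the API also accepts a `stream` flag (a standard Ollama request parameter, shown here as an optional variant):

```
<copy>
curl ${OLLAMA_URL}/api/generate -d '{
  "model": "command-r",
  "prompt": "Who is Ayrton Senna?",
  "stream": false
}'
</copy>
```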

You'll receive the response as a stream of sequential chunks, so content is delivered incrementally instead of making API users wait for the whole response to be generated before it's displayed.

