From 0ff3e7b474e7d62777bdf9416eda46b8b73a6368 Mon Sep 17 00:00:00 2001
From: Yaron
Date: Sun, 22 Sep 2024 13:09:28 +0300
Subject: [PATCH] Merge pull request #1120 from run-ai/log-collection

logs-collection
---
 .../{log-collection.md => logs-collection.md} | 101 ++++++++----------
 docs/home/overview.md                         |   8 +-
 mkdocs.yml                                    |   1 +
 3 files changed, 46 insertions(+), 64 deletions(-)
 rename docs/admin/troubleshooting/{log-collection.md => logs-collection.md} (50%)

diff --git a/docs/admin/troubleshooting/log-collection.md b/docs/admin/troubleshooting/logs-collection.md
similarity index 50%
rename from docs/admin/troubleshooting/log-collection.md
rename to docs/admin/troubleshooting/logs-collection.md
index 1892253cf1..37dc91076a 100644
--- a/docs/admin/troubleshooting/log-collection.md
+++ b/docs/admin/troubleshooting/logs-collection.md
@@ -1,23 +1,23 @@
+# Logs Collection

 This article provides instructions for IT administrators on collecting Run:ai logs for support, including prerequisites, CLI commands, and log file retrieval. It also covers enabling verbose logging for Prometheus and the Run:ai Scheduler.

 ## Collect logs to send to support

-To collect Run:ai logs, follow these steps precisely:
+To collect Run:ai logs, follow these steps:

 ### Prerequisites

 * Ensure that you have administrator-level access to the Kubernetes cluster where Run:ai is installed.
-* The Run:ai Administrator Command-Line Interface (CLI) must be [installed](..//config/cli-admin-install.md).
-* You must be logged into the Run:ai CLI with the correct permissions.
+* The Run:ai [Administrator Command-Line Interface](../config/cli-admin-install.md) (CLI) must be installed.

-### Step-by-Step Instructions
+#### Step-by-Step Instructions

-1. Open a terminal on your local machine (or any machine that has network access to the Kubernetes cluster) where the Run:ai Administrator CLI is installed.
-2. Log in to the Run:ai CLI (if required)
-3. Collect the Logs:
-   Execute the command to collect the logs:
+1. Run the Command from your local machine or a Bastion Host (secure server)
+   Open a terminal on your local machine (or any machine that has network access to the Kubernetes cluster) where the Run:ai Administrator CLI is installed.
+2. Collect the Logs
+   Execute the following command to collect the logs:

 ``` bash
 runai-adm collect-logs
@@ -25,11 +25,11 @@ To collect Run:ai logs, follow these steps precisely:

 This command gathers all relevant Run:ai logs from the system and generate a compressed file.

-5. Locate the Generated File
+3. Locate the Generated File
    After running the command, note the location of the generated compressed log file. You can retrieve and send this file to Run:ai Support for further troubleshooting.

 !!! Note
-    The tar file packages the logs of Run:ai components only. It does __not__ include logs of researcher containers that may contain private information.
+    The tar file packages the logs of Run:ai components only. It does not include logs of researcher containers that may contain private information

 ## Logs verbosity

@@ -44,70 +44,57 @@ Before you begin, ensure you have the following:
 * kubectl installed and configured:
     * The Kubernetes command-line tool, `kubectl`, must be installed and configured to interact with the cluster.
 * Sufficient privileges to edit configurations and view logs.
-* Administrative access to Run:ai’s installation settings.
 * Monitoring Disk Space
     * When enabling verbose logging, ensure adequate disk space to handle the increased log output, especially when enabling debug or high verbosity levels.

 ### Adding verbosity

-#### Adding verbosity to Prometheus
+??? "Adding verbosity to Prometheus"
+    To increase the logging verbosity for Prometheus, follow these steps:

-To increase the logging verbosity for Prometheus, follow these steps:
+    1. Edit the `RunaiConfig` to adjust Prometheus log levels. Copy the following command to your terminal:

-1. Edit the `RunaiConfig` to adjust Prometheus log levels. Copy the following command to your terminal:
-2. Bash
-
-```
-kubectl edit runaiconfig runai -n runai
-```
+    ``` bash
+    kubectl edit runaiconfig runai -n runai
+    ```

-4.
-   In the configuration file that opens, add or modify the following section to set the log level to `debug`:
-5. Bash
+    2. In the configuration file that opens, add or modify the following section to set the log level to `debug`:

-```
-spec:
-  prometheus:
+    ``` yaml
     spec:
-      logLevel: debug
-```
-
-7.
-   Save the changes. To view the Prometheus logs with the new verbosity level, run:
-8. Bash
-
-```
-kubectl logs -n runai prometheus-runai-0
-```
+      prometheus:
+        spec:
+          logLevel: debug
+    ```
+
+    3. Save the changes. To view the Prometheus logs with the new verbosity level, run:

-10.
+    ``` bash
+    kubectl logs -n runai prometheus-runai-0
+    ```
+
     This command streams the last 100 lines of logs from Prometheus, providing detailed information useful for debugging.

-#### Adding verbosity to the scheduler
-
-To enable extended logging for the Run:ai scheduler:
-
-1. Edit the `RunaiConfig` to adjust scheduler verbosity:
-2. Bash
+??? "Adding verbosity to the scheduler"

-```
-kubectl edit runaiconfig runai -n runai
-```
+    To enable extended logging for the Run:ai scheduler:

-4.
-   Add or modify the following section under the scheduler settings:
-5. Bash
+    1. Edit the `RunaiConfig` to adjust scheduler verbosity:

-```
-runai-scheduler:
-  args:
-    verbosity: 6
-```
+    ``` bash
+    kubectl edit runaiconfig runai -n runai
+    ```
+
+    2 Add or modify the following section under the scheduler settings:

-7.
-   This increases the verbosity level of the scheduler logs to provide more detailed output.
+    ``` yaml
+    runai-scheduler:
+      args:
+        verbosity: 6
+    ```

-Warning
+    This increases the verbosity level of the scheduler logs to provide more detailed output.

-Enabling verbose logging can significantly increase disk space usage. Monitor your storage capacity and adjust the verbosity level as necessary.
+!!! Warning
+    Enabling verbose logging can significantly increase disk space usage. Monitor your storage capacity and adjust the verbosity level as necessary.

diff --git a/docs/home/overview.md b/docs/home/overview.md
index 3f14cd38dc..41c1f67d3d 100644
--- a/docs/home/overview.md
+++ b/docs/home/overview.md
@@ -41,13 +41,7 @@ Run:ai cloud availability is monitored at [status.run.ai](https://status.run.ai)

 ## Collect Logs to Send to Support

-As an IT Administrator, you can collect Run:ai logs to send to support:
-
-* Install the [Run:ai Administrator command-line interface](../admin//config/cli-admin-install.md).
-* Run `runai-adm collect-logs`. The command will generate a compressed file containing all of the existing Run:ai log files.
-
-!!! Note
-    The tar file packages the logs of Run:ai components only. It does __not__ include logs of researcher containers that may contain private information.
+As an IT Administrator, you can collect Run:ai logs to send to support. For more information see [logs collection](../admin/troubleshooting/logs-collection.md).

 ## Example Code

diff --git a/mkdocs.yml b/mkdocs.yml
index 45c91cacf1..a69cb0fb4b 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -251,6 +251,7 @@ nav:
       - 'User Identity in Container' : 'admin/authentication/non-root-containers.md'
     - 'Troubleshooting' :
 #      - 'Cluster Health' : 'admin/troubleshooting/cluster-health-check.md'
+      - 'Logs Collection' : 'admin/troubleshooting/logs-collection.md'
       - 'Troubleshooting' : 'admin/troubleshooting/troubleshooting.md'
       - 'Diagnostics' : 'admin/troubleshooting/diagnostics.md'