Commit

Merge pull request #1121 from run-ai/log-collection-218
Merge pull request #1120 from run-ai/log-collection
yarongol committed Sep 22, 2024
2 parents e3e73c4 + 0ff3e7b commit 3a1c8fc
Showing 3 changed files with 46 additions and 64 deletions.
@@ -1,35 +1,35 @@

# Logs Collection

This article provides instructions for IT administrators on collecting Run:ai logs for support, including prerequisites, CLI commands, and log file retrieval. It also covers enabling verbose logging for Prometheus and the Run:ai Scheduler.

## Collect logs to send to support

To collect Run:ai logs, follow these steps precisely:
To collect Run:ai logs, follow these steps:

### Prerequisites

* Ensure that you have administrator-level access to the Kubernetes cluster where Run:ai is installed.
* The Run:ai Administrator Command-Line Interface (CLI) must be [installed](..//config/cli-admin-install.md).
* You must be logged into the Run:ai CLI with the correct permissions.
* The Run:ai [Administrator Command-Line Interface](../config/cli-admin-install.md) (CLI) must be installed.

### Step-by-Step Instructions
#### Step-by-Step Instructions

1. Open a terminal on your local machine (or any machine that has network access to the Kubernetes cluster) where the Run:ai Administrator CLI is installed.
2. Log in to the Run:ai CLI (if required)
3. Collect the Logs:
Execute the command to collect the logs:
1. Run the Command from your local machine or a Bastion Host (secure server)
Open a terminal on your local machine (or any machine that has network access to the Kubernetes cluster) where the Run:ai Administrator CLI is installed.
2. Collect the Logs
Execute the following command to collect the logs:

``` bash
runai-adm collect-logs
```

This command gathers all relevant Run:ai logs from the system and generates a compressed file.

5. Locate the Generated File
3. Locate the Generated File
After running the command, note the location of the generated compressed log file. You can retrieve and send this file to Run:ai Support for further troubleshooting.

!!! Note
The tar file packages the logs of Run:ai components only. It does __not__ include logs of researcher containers that may contain private information.
The tar file packages the logs of Run:ai components only. It does not include logs of researcher containers that may contain private information.
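The steps above can be sketched as a small shell helper that finds the archive after the collector has run. This is a minimal sketch, not part of the Run:ai CLI: `newest_archive` is a hypothetical helper, and it assumes the archive is written to the directory you pass in with a `.tar.gz` or `.tgz` suffix. Check the output of `runai-adm collect-logs` for the actual location.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the most recently modified tar archive in a
# directory (default: current directory). Assumes a .tar.gz or .tgz suffix.
newest_archive() {
  local dir="${1:-.}"
  # ls -t sorts by modification time, newest first; errors for a glob
  # that matches nothing are discarded.
  ls -t "$dir"/*.tar.gz "$dir"/*.tgz 2>/dev/null | head -n 1
}

# Example usage (commented out because it needs cluster access):
#   runai-adm collect-logs
#   archive="$(newest_archive .)"
#   tar -tzf "$archive"   # list the archive contents before sending it
```

Listing the contents with `tar -tzf` before sending the file is an optional check that only Run:ai component logs are included.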

## Logs verbosity

@@ -44,70 +44,57 @@ Before you begin, ensure you have the following:
* kubectl installed and configured:
* The Kubernetes command-line tool, `kubectl`, must be installed and configured to interact with the cluster.
* Sufficient privileges to edit configurations and view logs.
* Administrative access to Run:ai’s installation settings.
* Monitoring Disk Space
* When enabling verbose logging, ensure adequate disk space to handle the increased log output, especially when enabling debug or high verbosity levels.

### Adding verbosity

#### Adding verbosity to Prometheus
??? "Adding verbosity to Prometheus"
To increase the logging verbosity for Prometheus, follow these steps:

To increase the logging verbosity for Prometheus, follow these steps:
1. Edit the `RunaiConfig` to adjust Prometheus log levels. Copy the following command to your terminal:

1. Edit the `RunaiConfig` to adjust Prometheus log levels. Copy the following command to your terminal:
2. Bash

```
kubectl edit runaiconfig runai -n runai
```
``` bash
kubectl edit runaiconfig runai -n runai
```

4.
In the configuration file that opens, add or modify the following section to set the log level to `debug`:
5. Bash
2. In the configuration file that opens, add or modify the following section to set the log level to `debug`:

```
spec:
prometheus:
``` yaml
spec:
logLevel: debug
```
7.
Save the changes. To view the Prometheus logs with the new verbosity level, run:
8. Bash
```
kubectl logs -n runai prometheus-runai-0
```
prometheus:
spec:
logLevel: debug
```

3. Save the changes. To view the Prometheus logs with the new verbosity level, run:

10.
``` bash
kubectl logs -n runai prometheus-runai-0
```

This command prints the logs of the Prometheus pod, providing detailed information useful for debugging.
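To limit the output to the most recent lines and keep following new entries, `kubectl logs` accepts the `--tail` and `--follow` flags. The sketch below wraps that in a hypothetical function, `tail_prometheus_logs`; the namespace and pod name are taken from the command above, and the default of 100 lines is an illustrative choice.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: show the last N lines (default 100) of the Prometheus
# pod's log and keep streaming new output until interrupted with Ctrl-C.
tail_prometheus_logs() {
  local lines="${1:-100}"
  kubectl logs -n runai prometheus-runai-0 --tail="$lines" --follow
}

# Example usage (commented out because it needs cluster access):
#   tail_prometheus_logs        # last 100 lines, then follow
#   tail_prometheus_logs 500    # last 500 lines, then follow
```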

#### Adding verbosity to the scheduler
To enable extended logging for the Run:ai scheduler:
1. Edit the `RunaiConfig` to adjust scheduler verbosity:
2. Bash
??? "Adding verbosity to the scheduler"

```
kubectl edit runaiconfig runai -n runai
```
To enable extended logging for the Run:ai scheduler:

4.
Add or modify the following section under the scheduler settings:
5. Bash
1. Edit the `RunaiConfig` to adjust scheduler verbosity:

```
runai-scheduler:
args:
verbosity: 6
```
``` bash
kubectl edit runaiconfig runai -n runai
```

2. Add or modify the following section under the scheduler settings:

7.
This increases the verbosity level of the scheduler logs to provide more detailed output.
``` yaml
runai-scheduler:
args:
verbosity: 6
```

Warning
This increases the verbosity level of the scheduler logs to provide more detailed output.

Enabling verbose logging can significantly increase disk space usage. Monitor your storage capacity and adjust the verbosity level as necessary.
!!! Warning
Enabling verbose logging can significantly increase disk space usage. Monitor your storage capacity and adjust the verbosity level as necessary.
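One way to act on this warning is a periodic disk-usage check on the machine where the logs accumulate. The following is a minimal sketch under stated assumptions: `disk_used_percent` and `warn_if_disk_full` are hypothetical helpers, the path to check depends on your environment, and the 80% default threshold is an arbitrary illustrative value.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the used-space percentage (without the % sign)
# of the filesystem that holds the given path.
disk_used_percent() {
  local path="${1:-/}"
  # df -P prints one line per filesystem; column 5 is the capacity, e.g. "42%".
  df -P "$path" | awk 'NR == 2 { sub("%", "", $5); print $5 }'
}

# Hypothetical helper: report usage and return non-zero at or above the threshold.
warn_if_disk_full() {
  local path="${1:-/}" threshold="${2:-80}"
  local used
  used="$(disk_used_percent "$path")"
  if [ "$used" -ge "$threshold" ]; then
    echo "WARNING: $path is ${used}% full"
    return 1
  fi
  echo "OK: $path is ${used}% full"
}

# Example usage: warn_if_disk_full /var/log 80
```

If the check reports a warning, lowering the verbosity level again is the simplest remedy.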

8 changes: 1 addition & 7 deletions docs/home/overview.md
@@ -41,13 +41,7 @@ Run:ai cloud availability is monitored at [status.run.ai](https://status.run.ai)

## Collect Logs to Send to Support

As an IT Administrator, you can collect Run:ai logs to send to support:

* Install the [Run:ai Administrator command-line interface](../admin//config/cli-admin-install.md).
* Run `runai-adm collect-logs`. The command will generate a compressed file containing all of the existing Run:ai log files.

!!! Note
The tar file packages the logs of Run:ai components only. It does __not__ include logs of researcher containers that may contain private information.
As an IT Administrator, you can collect Run:ai logs to send to support. For more information, see [logs collection](../admin/troubleshooting/logs-collection.md).

## Example Code

1 change: 1 addition & 0 deletions mkdocs.yml
@@ -251,6 +251,7 @@ nav:
- 'User Identity in Container' : 'admin/authentication/non-root-containers.md'
- 'Troubleshooting' :
# - 'Cluster Health' : 'admin/troubleshooting/cluster-health-check.md'
- 'Logs Collection' : 'admin/troubleshooting/logs-collection.md'
- 'Troubleshooting' : 'admin/troubleshooting/troubleshooting.md'
- 'Diagnostics' : 'admin/troubleshooting/diagnostics.md'
