Install script (#160)
* added install script, updated readme

* updated readme

* formatting

* add careless test subprogram

* ensure latest release

* add example careless test output

* bump version

* reformat code block

* reformat code

* test reformat

* fix formatting

* add newline

---------

Co-authored-by: Kevin Dalton <[email protected]>
kmdalton and Kevin Dalton authored Jun 13, 2024
1 parent 2930d2e commit 14c4a38
Showing 5 changed files with 66 additions and 30 deletions.
76 changes: 48 additions & 28 deletions README.md
@@ -19,38 +19,58 @@ pip install careless
```

## Installation with GPU Support
Careless supports GPU acceleration on NVIDIA GPUs. For users who would like to run `careless` in a cluster computing environment, requesting an interactive GPU node for the installation might be helpful. You may want to first follow the latest [Tensorflow installation instructions](https://www.tensorflow.org/install/pip#step-by-step_instructions) and then install `careless`. Specifically,
1) Create and activate a new environment with the appropriate Python version
```bash
conda create -yn careless python=checktheversion
conda activate careless
pip install --upgrade pip
```
2) Install dependencies for GPU support (see the following paragraph)
3) Install TensorFlow (and verify its GPU support)
```bash
pip install tensorflow==checktheversion
#check that tensorflow sees the correct number of GPUs
python3 -c "import tensorflow as tf; print(len(tf.config.list_physical_devices('GPU')))"
```
4) Install `careless`
```bash
pip install careless
```

The following dependencies are required for GPU support
- NVIDIA driver,
- CUDA Toolkit, and
- cuDNN.

You can determine the versions required by the latest TensorFlow release from the [TensorFlow docs](https://www.tensorflow.org/install/pip#software_requirements). The driver is usually installed through the system package manager and will require root privileges. In a cluster computing environment, a suitable version of the NVIDIA driver will usually be provided by your system administrators. The two libraries, CUDA toolkit and cuDNN, may either be installed through the system package manager or using the Anaconda python distribution as described in the [TensorFlow docs](https://www.tensorflow.org/install/pip#step-by-step_instructions).

You may confirm GPU acceleration is active using the `nvidia-smi` command to monitor GPU usage during model training. If you are having trouble enabling GPU support, you may want to use the `--tf-debug` flag during training for verbose logging of TensorFlow issues.
Careless supports GPU acceleration on NVIDIA GPUs through the CUDA library. We strongly encourage users to take advantage of this feature. To streamline installation, we maintain a script which installs careless with CUDA support. The following section will guide you through installing careless for the GPU.

1) **Install the NVIDIA driver** for your accelerator card. In most high-performance computing environments, this driver should be pre-installed. If it is not, we suggest you contact your system administrator, as installation requires elevated privileges.

You may check whether the driver is functional by typing `nvidia-smi`. If it is working properly, you will see output like the following:

Thu Jun 13 13:01:32 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  Tesla V100S-PCIE-32GB          On  |   00000000:86:00.0 Off |                    0 |
| N/A   32C    P0             25W /  250W |       0MiB /  32768MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|  No running processes found                                                             |
+-----------------------------------------------------------------------------------------+
A faulty driver will give an error message:

NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
If the driver isn't installed, you will see:

nvidia-smi: command not found
2) **Install Anaconda**. Once you have confirmed the NVIDIA driver is available, proceed to install the Anaconda Python distribution by following the instructions [here](https://docs.anaconda.com/free/anaconda/install/) or as directed by your cluster documentation. Before proceeding, make sure you activate your conda base environment. For typical installations, this happens automatically when you open a new login shell. Alternatively, you may directly source the `conda.sh` script in your Anaconda install directory.
3) **Install careless** and associated dependencies including CUDA by running:

source <(curl -s https://raw.githubusercontent.com/rs-station/careless/main/install-cuda.sh)
This will automatically create a new conda environment named careless.
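
The driver check from step 1 can also be scripted before running the installer. The following is a minimal sketch and is not part of careless: the function name `check_nvidia_driver` is hypothetical, and treating any non-zero exit status from `nvidia-smi` as a faulty driver is an assumption.

```python
import shutil
import subprocess


def check_nvidia_driver(command="nvidia-smi"):
    """Classify the driver state as 'missing', 'faulty', or 'ok' (hypothetical helper)."""
    # No executable on PATH corresponds to the "command not found" case.
    if shutil.which(command) is None:
        return "missing"
    # Assumption: a non-zero exit status means the tool could not reach the driver.
    result = subprocess.run([command], capture_output=True, text=True)
    return "ok" if result.returncode == 0 else "faulty"


if __name__ == "__main__":
    print(f"NVIDIA driver status: {check_nvidia_driver()}")
```

If the status is anything other than `ok`, it is probably best to resolve the driver problem before installing.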

Careless is now installed in its own environment. Whenever you want to run careless, you must first activate the careless conda environment by issuing `conda activate careless`. You can test CUDA support by running the `careless test` subprogram. If your installation was successful, you should see GPU devices listed in the output of `careless test` as in this example:

(careless) user@computer:~$ careless test
Careless version 0.4.2
###############################################
# TensorFlow can access the following devices #
###############################################
- CPU: /physical_device:CPU:0
- GPU: /physical_device:GPU:0


## Dependencies

`careless` is likely to run on any operating system and python version which is compatible with TensorFlow.
Pip will handle installation of all dependencies.
`careless` uses mostly tools from the conventional scientific python stack plus
- optimization routines from [TensorFlow](https://www.tensorflow.org/)
- statistical distributions from [Tensorflow-Probability](https://www.tensorflow.org/probability)
2 changes: 1 addition & 1 deletion careless/VERSION
@@ -1 +1 @@
0.4.2
0.4.3
8 changes: 8 additions & 0 deletions careless/careless.py
@@ -25,6 +25,14 @@ def run_careless(parser):
        df = LaueFormatter.from_parser(parser)
    elif parser.type == 'mono':
        df = MonoFormatter.from_parser(parser)
    elif parser.type == 'test':
        print("###############################################")
        print("# TensorFlow can access the following devices #")
        print("###############################################")
        for dev in tf.config.list_physical_devices():
            print(f" - {dev.device_type}: {dev.name}")
        from sys import exit
        exit()


    inputs,rac = df.format_files(parser.reflection_files)
7 changes: 7 additions & 0 deletions careless/parser.py
@@ -48,6 +48,8 @@ class CustomParser(EnvironmentSettingsMixin):
- Detect conflicting arguments and raise an informative error
"""
    def _validate_input_files(self, parser):
        if parser.type == 'test':
            return
        for inFN in parser.reflection_files:
            if not exists(inFN):
                self.error(f"Unmerged reflection file {inFN} does not exist")
@@ -88,6 +90,7 @@ def _fill_text(self, text, width, indent):
subs = parser.add_subparsers(title="Experiment Type", required=True, dest="type")
mono_sub = subs.add_parser("mono", help="Process monochromatic diffraction data.", formatter_class=CustomFormatter)
poly_sub = subs.add_parser("poly", help="Process polychromatic, 'Laue', diffraction data.", formatter_class=CustomFormatter)
test_sub = subs.add_parser("test", help="Print available physical devices", formatter_class=CustomFormatter)

from careless.args import required,poly,groups

@@ -112,3 +115,7 @@ def _fill_text(self, text, width, indent):
mono_group.add_argument(*args, **kwargs)
poly_group.add_argument(*args, **kwargs)

# Test needs environment settings options
from careless.args import tf_options
for args,kwargs in tf_options.args_and_kwargs:
    test_sub.add_argument(*args, **kwargs)
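
The parser changes above follow a common `argparse` pattern: a diagnostic subcommand is registered next to the data-processing subcommands, and validation that depends on input files returns early for it. The following is a simplified, self-contained sketch of that pattern. `build_parser` and `missing_input_files` are hypothetical names, and the single `reflection_files` positional argument is a stand-in for the real argument set in `careless.args`.

```python
import argparse
from os.path import exists


def build_parser():
    """Build a toy parser with two data subcommands and one diagnostic subcommand."""
    parser = argparse.ArgumentParser(prog="demo")
    subs = parser.add_subparsers(title="Experiment Type", required=True, dest="type")
    for name in ("mono", "poly"):
        sub = subs.add_parser(name)
        sub.add_argument("reflection_files", nargs="+")
    # The diagnostic subcommand takes no input files.
    subs.add_parser("test", help="Print available physical devices")
    return parser


def missing_input_files(args):
    """Return the input files that do not exist, skipping the check for 'test'."""
    if args.type == "test":
        return []
    return [fn for fn in args.reflection_files if not exists(fn)]


if __name__ == "__main__":
    args = build_parser().parse_args(["test"])
    print(args.type, missing_input_files(args))
```

Without the early return, the check would raise an `AttributeError` for `test`, because that subcommand never defines `reflection_files`.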
3 changes: 2 additions & 1 deletion install-cuda.sh
@@ -44,4 +44,5 @@ export LD_LIBRARY_PATH="${ORIGINAL_LD_LIBRARY_PATH}"
unset CUDNN_DIR
unset PTXAS_DIR' >> $CONDA_PREFIX/etc/conda/deactivate.d/env_vars.sh

pip install careless
# Install careless
pip install --upgrade careless
