Refactor DPU Tools into a uniform interface #27

SamD2021 · 2024-11-08T19:19:02Z

All scripts have either been turned into modules or moved into existing modules.
Merged the Dockerfiles together.
Added a new DPU Types enum class.
Added DPU hardware detection.
Refactored the DPU Tools script into a class, since many variables should be shared between functions.
Transformed the pxeboot script into a module and refactored it into a class.
Added a firmware class for BlueField DPUs to the fwutils module.
Made capturing output the default for the run function in the common module.
Most tools can be used with automatic hardware detection.
Added a console check for the DPU hardware type.

This is in preparation for making a unified interface for dpu tools that is able to detect the hardware it is running on and call the respective tools.

ipu: Remove IPU dir

Remove redundant fwversion

Remove redundant fwup

Remove redundant fwdefaults

Remove listbf script

The listbf script has been integrated into the list dpus command and function. Now we can use this information to auto-detect the hardware and have the tool behave accordingly.

Make the imports match the new structure

Many of these functions aren't necessarily tied to the IPU, so moving them to the common module makes more sense.

Matching imports to new structure

Since we want to unify the scripts behind dpu-tools we should also build the tools using the same Dockerfile

This enum gives us a structured way to track different DPU types. Previously we just used hardcoded strings, but with this class we can also validate whether a DPU type is among our tracked types.
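A minimal sketch of that validation, not the code from this PR: only `DPUType.IPU` is visible in the diff, so the `BF` member, the member values, and the `parse_dpu_type` helper are assumptions made for illustration.

```python
from enum import Enum


class DPUType(Enum):
    # IPU appears in the PR diff; BF and the string values are assumed here.
    IPU = "ipu"
    BF = "bf"


def parse_dpu_type(name: str) -> DPUType:
    """Return the matching DPUType, or raise ValueError for untracked names."""
    try:
        # Lookup by member name replaces comparisons against hardcoded strings.
        return DPUType[name.strip().upper()]
    except KeyError:
        tracked = ", ".join(t.name for t in DPUType)
        raise ValueError(
            f"Unknown DPU type {name!r}; tracked types: {tracked}"
        ) from None
```

With this shape, a typo in a DPU type fails at the boundary with a clear message instead of silently falling through a chain of string comparisons.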

Since DPU tools often relies on passing the args to each function, it would be nice to have a data structure to bundle all of this in. Likewise, it would be good to keep track of which DPU type we are working with.

This script is useful because it lets rshim run in the background, which is needed to open a console on the BlueField DPUs.

By making this class we can keep together related logic to reset, upgrade and version firmware across dpu types

Many of the BlueField scripts require capturing the output. It would be better to capture output by default and have an option to turn it off when we think it's not useful.
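As a rough sketch of what a capture-by-default `run` could look like (the real function in the common module is not shown in this PR page, and the `Result` field names below are assumed):

```python
import dataclasses
import shlex
import subprocess


@dataclasses.dataclass(frozen=True)
class Result:
    # Field names are assumptions; the PR only shows the class declaration.
    out: str
    err: str
    returncode: int


def run(cmd: str, capture_output: bool = True) -> Result:
    """Run a command; output is captured unless the caller opts out."""
    proc = subprocess.run(
        shlex.split(cmd), capture_output=capture_output, text=True
    )
    # When output is not captured, stdout/stderr are None, so fall back to "".
    return Result(proc.stdout or "", proc.stderr or "", proc.returncode)
```

Callers that want output streamed straight to the terminal would pass `capture_output=False`.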

Here we use a single mode subcommand that retrieves the mode by default and sets it when passed the set-mode flag.
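A small argparse sketch of this get-or-set pattern. The `mode` subcommand, the `--set-mode` flag, and the `dpu`/`nic` values come from the README diff; the `build_parser` and `describe` helpers are hypothetical and only illustrate the dispatch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dpu-tools")
    subparsers = parser.add_subparsers(dest="subcommand", required=True)
    mode_parser = subparsers.add_parser("mode", help="Get or set the BF mode")
    # Without --set-mode the subcommand only reports the current mode.
    mode_parser.add_argument("--set-mode", choices=["dpu", "nic"], default=None)
    return parser


def describe(args: argparse.Namespace) -> str:
    """Report which action the parsed arguments select (illustration only)."""
    if args.set_mode is None:
        return "get"
    return f"set:{args.set_mode}"
```

Keeping both actions under one subcommand means a single help page covers reading and changing the mode.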

SalDaniele

Overall, I am happy to merge if @bn222 approves.

From a user-perspective, I feel everything is working and intuitive (i.e. help for each subcommand provides good output).

From a maintainer's perspective, this could use a little more work to make it easy to extend for whatever DPUs we add support for in the future.

SalDaniele · 2024-11-12T19:44:59Z

README.md

 ```

 ## Tools

-All the tools can directly be ran inside the container. All the tools automatically find and act on the first BF in the system.
+All the tools can directly be interacted with through the dpu-tools interface. All the tools automatically find and act on the first DPU in the system. One small difference is that using the console subcommand requires specifying what DPU type you are working with. Check out its help page by passing the `-h` flag.


"one small difference" awkward wording, maybe something like "Note* specific subcommands require knowledge of the DPU type in question. For example, DPU-type is required when running the console command..."

SalDaniele · 2024-11-12T19:46:02Z

README.md

-| `get_mode`   | Gets the BF mode.                                                                        |
-| `cx_fwup`    | Upgrades the firmware on a CX                                                            |
+| `mode`       | Gets the BF mode. Use `--set-mode` to change the mode to either dpu or nic               | 
+| `utils`      | Access common or non-dpu specific utilities. {cw_fwup, bfb}                              |


I like it; we just need to make the corresponding update in https://github.com/bn222/cluster-deployment-automation/blob/main/host.py#L428.

This applies to all the commands called from CDA.

SalDaniele · 2024-11-12T19:50:36Z

dpu-tools

+
+    def reset(self) -> None:
+        if self.dpu_type == DPUType.IPU.name:
+            run("ssh [email protected] sudo reboot")


We should update this to use the IMC address. It should be a required argument for any IPU command that needs a connection to the IMC.
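One possible shape for this, as a sketch only: the `--imc-address` flag name and both helper functions are hypothetical, and the parser below covers just the IPU path, where the address is always needed.

```python
import argparse
import shlex


def build_reset_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="dpu-tools reset")
    # --imc-address is a hypothetical flag name, not one defined in this PR.
    parser.add_argument(
        "--imc-address", required=True, help="Address of the IPU's IMC"
    )
    return parser


def reboot_command(user: str, imc_address: str) -> str:
    """Build the ssh command from the supplied address, not a hardcoded one."""
    target = f"{user}@{imc_address}"
    return f"ssh {shlex.quote(target)} sudo reboot"
```

The resulting string could then be handed to the existing `run` helper in place of the hardcoded command.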

SalDaniele · 2024-11-12T19:53:35Z

dpu-tools

+logger = logging.getLogger(__name__)
+
+
+class DPUTools:


It might make sense to make this an abstract base class that each DPU derives from. Imagine you are adding a new DPU. You don't want to have to go through every single line and add a bunch of new "if DPUType.my_new_dpu"... statements.

Might be nicer to do this in one place, and create a class object that inherits from DPUTools for each of these.

That said, this PR works and has been tested, so fine with adding this in a future PR.
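A sketch of what that could look like. The subclass names, the registry, and the placeholder return strings are all hypothetical; the real methods would call the existing IPU and BF helpers.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class DPUTools(ABC):
    """Operations every supported DPU must provide."""

    @abstractmethod
    def reset(self) -> str:
        """Reset the device and return a short status string."""


class IPUTools(DPUTools):
    def reset(self) -> str:
        return "reset ipu"  # placeholder for the real IPU reset logic


class BFTools(DPUTools):
    def reset(self) -> str:
        return "reset bf"  # placeholder for the real BF reset logic


# Adding a new DPU means one new subclass and one new entry here.
TOOLS: Dict[str, Type[DPUTools]] = {"IPU": IPUTools, "BF": BFTools}


def make_tools(dpu_type: str) -> DPUTools:
    """Pick the implementation once, instead of branching in every method."""
    return TOOLS[dpu_type]()
```

A subclass that forgets to implement `reset` cannot be instantiated, so a missing operation shows up immediately rather than at the first call.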

SalDaniele · 2024-11-12T20:02:14Z

dpu-tools

+
+
+def main() -> None:
+    detected = detect_dpu_type()


We should take the DPU type from the user if they provide it, otherwise we use the auto-detected dpu type.

We could even throw a warning if the user's DPU type and the auto-detected one do not match.
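A minimal sketch of that precedence rule. The `resolve_dpu_type` name is hypothetical, and DPU types are plain strings here for brevity.

```python
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_dpu_type(user_type: Optional[str], detected: Optional[str]) -> str:
    """Prefer the user's choice; fall back to the auto-detected type."""
    if user_type is None:
        if detected is None:
            raise ValueError("No DPU type given and none could be detected")
        return detected
    if detected is not None and detected != user_type:
        # Warn but still honor the explicit choice.
        logger.warning(
            "User-specified DPU type %s differs from detected type %s",
            user_type,
            detected,
        )
    return user_type
```

This also covers hardware that cannot be detected at all, since an explicit type is accepted even when detection returns nothing.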

SalDaniele · 2024-11-12T20:05:32Z

dpu-tools

+    )
+    bfb_parser.set_defaults(subcommand="bfb")
+    # Parse arguments and initialize DPUTools
+    args = parser.parse_args()


It might be nice to have this main() function call out to another function to set up all the above arguments. The way it is now makes the code a bit harder to read / harder to track where subsequent changes need to be added.

So main would essentially do:

set up arguments

set up logging

create dpuTools

dispatch dpuTools

And then the more granular argument organization could go in the "set up arguments" function.
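The four steps above could be sketched roughly as follows. Everything here is a stand-in: the `--verbose` flag, the subcommand names, the `dispatch` method, and the string return value are assumptions used to keep the example self-contained.

```python
import argparse
import logging
from typing import Optional, Sequence


def setup_arguments(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    """All parser and subparser wiring lives here, away from main()."""
    parser = argparse.ArgumentParser(prog="dpu-tools")
    parser.add_argument("-v", "--verbose", action="store_true")  # assumed flag
    subparsers = parser.add_subparsers(dest="subcommand", required=True)
    subparsers.add_parser("reset")
    subparsers.add_parser("mode")
    return parser.parse_args(argv)


def setup_logging(verbose: bool) -> None:
    # Stand-in for the project's own setup_logging helper.
    logging.basicConfig(level=logging.DEBUG if verbose else logging.INFO)


class DPUTools:
    def __init__(self, args: argparse.Namespace) -> None:
        self.args = args

    def dispatch(self) -> str:
        # The real class would call the method matching the subcommand.
        return f"would run {self.args.subcommand}"


def main(argv: Optional[Sequence[str]] = None) -> str:
    args = setup_arguments(argv)   # set up arguments
    setup_logging(args.verbose)    # set up logging
    tools = DPUTools(args)         # create dpuTools
    return tools.dispatch()        # dispatch dpuTools
```

With this split, a new subcommand touches only `setup_arguments` and the class, and `main` stays four lines long.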

SalDaniele · 2024-11-12T20:11:34Z

utils/common_bf.py

+
+
+@dataclasses.dataclass(frozen=True)
+class Result:


Can't we just take this from common.py?

bn222 · 2024-11-12T21:55:31Z

I'm also happy with the changes. We will need to specify which dpu type we're using since we can't detect it.

SamD2021 added 30 commits November 5, 2024 16:00

Reorganize the tools for a unified interface

1b8d461

This is in preparation for making a unified interface for dpu tools that is able to detect the hardware it is running on and call the respective tools.

ipu: Remove IPU dir

dpu-tools: Modify ipu and common related imports

acfa23f

Make the imports match the new structure

Move common hardware agnostic logic from common_ipu.py into common.py

2243e79

Many of these functions aren't necessarily tied to the IPU, so moving them to the common module makes more sense.

dpu-tools: import setup_logging and run from common.py

31e94e6

ipu/fwutils.py: Update import to match new structure

427147f

Matching imports to new structure

Dockerfile: Merge the Dockerfiles into one

c229421

Since we want to unify the scripts behind dpu-tools we should also build the tools using the same Dockerfile

dpu-tools: Remove redundant code

421a76e

Move console function from dpu-tools script to common_ipu.py

d46d7ce

Move console function from bf/console script to common_bf.py

94da5a2

utils/common.py: Add DPU Types

e2fd01c

This enum gives us a structured way to track different DPU types. Previously we just used hardcoded strings, but with this class we can also validate whether a DPU type is among our tracked types.

utils/common.py: Add scan_for_dpus

26de962

utils/common.py: Add DPU Hardware detection

d48d01c

dpu-tools: Use Hardware detection for console

ff9fa58

dpu-tools: Refactor DPUTools into a class

c2cd274

Since DPU tools often relies on passing the args to each function, it would be nice to have a data structure to bundle all of this in. Likewise, it would be good to keep track of which DPU type we are working with.

common_ipu.py: Change console method to console_ipu

e0f4097

utils/fwutils.py: Adapt imports to match package structure

c8f77f1

entry.sh: Add a script to start rshim and run dpu-tools

029db7b

This script is useful because it lets rshim run in the background, which is needed to open a console on the BlueField DPUs.

Dockerfile: Consolidate rshim logic into an entry script

557188e

Move list_dpus into dpu-tools

933012d

utils/common_ipu.py: Fix imports

261f19b

utils/common_ipu.py: Use configure minicom for creating ipu console as well

09ae8c0

utils/fwutils.py: Add BFFirmware

b10178e

By making this class we can keep together related logic to reset, upgrade and version firmware across dpu types

utils/remote_api.py: Move remote api class into its own module

a60e53b

utils/fwutils.py: Add firmware version for BF

71220fd

utils/fwutils.py: Add Firmware Upgrade for BF

bc4be6c

utils/fwutils.py: Add Firmware Reset for BF

5b5f1a5

dpu-tools: Add Hardware Detection for Firmware Version

e631493

dpu-tools: Add Hardware detection for required flags

0852296

dpu-tools: Consume Firmware Reset for BF

a47b4b0

SamD2021 added 20 commits November 8, 2024 13:25

dpu-tools: Add BF option for Firmware Up

2acee81

utils/common.py: Make capture_output default to true

d5e470e

Many of the BlueField scripts require capturing the output. It would be better to capture output by default and have an option to turn it off when we think it's not useful.

utils/common_bf.py: Use the run function from the common module

bf7f054

utils/common_bf.py: Add get_mode

8d71462

dpu-tools: Consume get_mode in the main interface

dedae0d

utils/common_bf.py: Add set_mode

0cc8f03

dpu-tools: Consume set_mode in the main interface

b780a41

Here we use a single mode subcommand that retrieves the mode by default and sets it when passed the set-mode flag.

utils/common_bf.py: Move functionality to download BFB images to the BF

d4aee3e

utils/common_bf.py: Move functionality to reset BF into its common module

7913a1a

Move and convert pxeboot into a utils module

61a28d5

dpu-tools: Add command to interact with PXEBOOT logic

296546d

dpu-tools: Let the reset command call the reset function which can call BF reset

bcd218a

dpu-tools: Properly validate the console subcommand

6f05dd5

utils/fwutils.py: Add Firmware Upgrade for cx

3a75d5d

dpu-tools: Consume Firmware Upgrade for CX Cards

034e60b

dpu-tools: Fix main function return annotation

05032ff

mypy.ini: Adjust mypy to match new structure and imports

65ee3dd

dpu-tools: Consume bfb for BF

0d4c92b

README.md: Update Readme with new guide to interact with the tools

c885804

pyproject.toml: Match new structure

c440661

SalDaniele reviewed Nov 12, 2024
