add node type indexing #225

sunya-ch · 2024-01-29T10:18:53Z

This PR is related to issue #216 and to the TODO list in the previous PR #222.

The main change adds node type indexing in src/train/profiler/node_type_index.py.
I introduce NodeTypeSpec and NodeTypeIndexCollection to group input machine data by spec for each pipeline training run.
At collection time, we autogenerate a NodeTypeSpec (processor, #cores, #chips, memory, and so on) and keep it in the data path. At training time, we read that spec by machine ID and then index it in the NodeTypeIndexCollection.
If the same spec has already been indexed, the existing index number is reused. However, we expect a step that appends data from the same group before training. For AWS instances, we expect a single profile per index. The machine index is kept under the pipeline folder in JSON format (node_type_index.json). We can read this file to generate the machine index on export.
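
The indexing behavior described above can be illustrated with a minimal sketch. This is not the code in node_type_index.py: the field names (`processor`, `cores`, `chips`, `memory_gb`), the method names (`index`, `save`), and the JSON layout are assumptions made for illustration. Only the class names and the node_type_index.json file name come from this description.

```python
import json
import os
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class NodeTypeSpec:
    # Hypothetical fields; the PR lists processor, #cores, #chips, memory "and so on".
    processor: str
    cores: int
    chips: int
    memory_gb: int


class NodeTypeIndexCollection:
    # File name taken from the PR description; its internal layout is assumed here.
    FILENAME = "node_type_index.json"

    def __init__(self, pipeline_path):
        self.path = os.path.join(pipeline_path, self.FILENAME)
        self.specs = {}  # index number -> NodeTypeSpec
        # Reload a previously saved index so numbers stay stable across runs.
        if os.path.exists(self.path):
            with open(self.path) as f:
                for key, value in json.load(f).items():
                    self.specs[int(key)] = NodeTypeSpec(**value)

    def index(self, spec):
        # Reuse the existing index number when the same spec was seen before.
        for idx, known in self.specs.items():
            if known == spec:
                return idx
        # Otherwise assign the next free index number.
        new_idx = max(self.specs, default=-1) + 1
        self.specs[new_idx] = spec
        return new_idx

    def save(self):
        # Persist the index as JSON under the pipeline folder.
        with open(self.path, "w") as f:
            json.dump({str(i): asdict(s) for i, s in self.specs.items()}, f, indent=2)
```

In this sketch, two machines with identical specs map to the same index number, and reloading the collection from the pipeline folder preserves the numbering, which is the property the export step would rely on.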

In addition to the above enhancement, this PR also includes multiple bug fixes to the CI workflow, including adding a complete-train pipeline run to the Tekton test.

Signed-off-by: Sunyanan Choochotkaew <[email protected]>

sunya-ch · 2024-02-01T09:08:06Z

I will update the exporter to separate each node type. Here are examples of exported values for the pipeline trained on SPECPower data.

https://github.com/sunya-ch/kepler-model-db/blob/specpower/models/v0.7/README.md

Pipeline README page

Model error report page (per node_type)

rootfs · 2024-02-01T13:49:50Z

also cc @KaiyiLiu1234

sunya-ch force-pushed the profiler2 branch from 04cbdbe to cb262bf Compare January 29, 2024 10:34

sunya-ch marked this pull request as draft February 1, 2024 08:29

add node type indexing

5abd240

Signed-off-by: Sunyanan Choochotkaew <[email protected]>

sunya-ch force-pushed the profiler2 branch from cb262bf to 5abd240 Compare February 1, 2024 09:18

sunya-ch marked this pull request as ready for review February 1, 2024 09:18

rootfs approved these changes Feb 1, 2024

rootfs merged commit dc4d631 into sustainable-computing-io:main Feb 1, 2024
9 checks passed
