update model_training process (train and export) #169

sunya-ch · 2023-09-20T11:02:43Z

**Rebased from PR #172**

This PR also refers to the archived pipeline, which should first be merged into kepler-model-db via sustainable-computing-io/kepler-model-db#16.

Need

to be merged first.

This PR includes:

- README auto-generation in the export script (the auto-generated document mentioned in "enrich PR template for adding trained pipeline", kepler-model-db#14)
- sort metadata with the feature group first
- handle a missing trainer class in the list
- update get_url to also handle the weight URL
- add an option to disable assure_path when getting the remote machine path
- add a trainers_with_weight list to list models with weights
- update the train script to use only the stressng benchmark
- update the stressng benchmark according to the result of the community vote (19/09/2023)

Signed-off-by: Sunyanan Choochotkaew [email protected]

marceloamaral · 2023-09-21T17:13:07Z

model_training/script.sh

@@ -180,7 +181,7 @@ function quick_collect() {
 }

 function train() {
-    train_model stressng_kepler_query,coremark_kepler_query,parsec_kepler_query ${VERSION}
+    train_model stressng_kepler_query ${VERSION}


Is this selecting the data only for training?

If it is also used for validation: should we also test on the coremark results to verify the accuracy of the model on a different workload?

sunya-ch replied (Sep 22, 2023): The data is shuffled, and 10% of it is held out for validation.

kepler-model-server/src/train/trainer/__init__.py

Line 28 in 6b4f716

def normalize_and_split(X_values, y_values, scaler, test_size=0.1):
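As a rough illustration of the behaviour described in this reply (not the actual body of `normalize_and_split`, which is not shown here), a shuffled 90/10 hold-out split can be sketched in plain Python. The helper name `shuffle_and_split` and the `seed` parameter are assumptions made for this example, and the scaling step is omitted.

```python
import random


def shuffle_and_split(X_values, y_values, test_size=0.1, seed=None):
    """Shuffle paired samples and hold out a fraction for validation.

    Hypothetical helper for illustration only: it mirrors the
    test_size=0.1 default of normalize_and_split but applies no scaler.
    """
    if len(X_values) != len(y_values):
        raise ValueError("X_values and y_values must have the same length")

    # Shuffle indices rather than the data, so X and y stay aligned.
    indices = list(range(len(X_values)))
    random.Random(seed).shuffle(indices)

    # Hold out at least one sample for validation.
    n_test = max(1, round(len(indices) * test_size))
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    X_train = [X_values[i] for i in train_idx]
    X_test = [X_values[i] for i in test_idx]
    y_train = [y_values[i] for i in train_idx]
    y_test = [y_values[i] for i in test_idx]
    return X_train, X_test, y_train, y_test
```

Because the split is drawn from a single shuffled pool, the validation samples come from the same benchmark as the training samples, and they change between runs unless a seed is fixed.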

We need to refactor the code to use a fixed validation dataset. That should be tracked in a separate issue.
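One possible shape for such a fixed validation set, sketched under assumptions: each sample carries a label naming the benchmark query it came from, and validation always uses the samples from a chosen set of benchmarks. The function `split_by_benchmark` and the `"benchmark"` key are hypothetical and are not part of kepler-model-server; only the query names are taken from the diff above.

```python
def split_by_benchmark(samples, validation_benchmarks):
    """Partition samples into (train, validation) by benchmark label.

    Hypothetical helper: assumes every sample is a dict with a
    "benchmark" key. The result is the same on every run because no
    shuffling is involved.
    """
    validation_benchmarks = set(validation_benchmarks)
    train, validation = [], []
    for sample in samples:
        if sample["benchmark"] in validation_benchmarks:
            validation.append(sample)
        else:
            train.append(sample)
    return train, validation


# Usage: train on stressng only, validate on coremark (made-up values).
samples = [
    {"benchmark": "stressng_kepler_query", "power": 10.0},
    {"benchmark": "coremark_kepler_query", "power": 12.5},
    {"benchmark": "stressng_kepler_query", "power": 11.0},
]
train_set, validation_set = split_by_benchmark(
    samples, ["coremark_kepler_query"]
)
```

Holding out a whole benchmark also addresses the question of how the model behaves on a workload it was not trained on.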

Signed-off-by: Sunyanan Choochotkaew <[email protected]>

rootfs · 2023-09-28T13:10:25Z

@sunya-ch the CI passed now.

sunya-ch marked this pull request as draft September 20, 2023 11:03

sunya-ch mentioned this pull request Sep 20, 2023

Finalize standard training pipeline and exported information #147

Closed

16 tasks

sunya-ch added this to the kepler-release-0.6 milestone Sep 20, 2023

sunya-ch force-pushed the exporter branch from 1559367 to 26c0e3e Compare September 20, 2023 11:19

sunya-ch marked this pull request as ready for review September 20, 2023 11:19

marceloamaral reviewed Sep 21, 2023

sunya-ch marked this pull request as draft September 27, 2023 06:15

fix extractor/profileisolator bug and update test case

dbe1166

Signed-off-by: Sunyanan Choochotkaew <[email protected]>

sunya-ch force-pushed the exporter branch from 26c0e3e to 3f3ec0a Compare September 28, 2023 05:50

fix/update model training and export

7c4568a

Signed-off-by: Sunyanan Choochotkaew <[email protected]>

sunya-ch force-pushed the exporter branch from 3f3ec0a to 7c4568a Compare September 28, 2023 06:01

sunya-ch marked this pull request as ready for review September 28, 2023 13:22

rootfs approved these changes Sep 29, 2023

rootfs merged commit 7e4e716 into sustainable-computing-io:main Sep 29, 2023
