Merge dev_postgresql into the master branch #1115

Merged
merged 90 commits into from
Oct 25, 2023
264f9e0
Add DeprecationWarning ignore message to the running script of cnn ex…
wannature Apr 9, 2023
468c34a
Merge pull request #1053 from wannature/singa_v32
chrishkchris Apr 10, 2023
8edcb96
Add training scripts for xceptionnet
Apr 27, 2023
374fef4
Merge pull request #1056 from zmeihui/23-4-27
lzjpaul May 2, 2023
0f16a27
mlp model examples on top of postgresql
wannature May 11, 2023
a39f616
Merge pull request #1057 from wannature/singa_v33
lzjpaul May 13, 2023
df937b0
readme for malaria prediction with cnn model
liuchangshiye May 30, 2023
8aca66c
Merge pull request #1060 from liuchangshiye/malaria-cnn
lzjpaul Jun 2, 2023
2f78a24
add shell command file to run the codes
liuchangshiye Jun 8, 2023
0dce2a4
Merge pull request #1064 from liuchangshiye/malaria-cnn
lzjpaul Jun 8, 2023
12b9204
Init the model selection examples, and add README.md
NLGithubWP May 31, 2023
32f018a
Merge pull request #1065 from NLGithubWP/add_model_selection_psql
lzjpaul Jun 10, 2023
f4d820f
Init armnet examples, add README.md
solopku Jun 12, 2023
8261ada
Merge pull request #1066 from solopku/dev-postgresql
lzjpaul Jun 12, 2023
10a7fc9
Add data loader for malaria dataset
liuchangshiye Jun 13, 2023
1970158
Merge pull request #1070 from liuchangshiye/dev-postgresql
lzjpaul Jul 17, 2023
4257283
add cnn model for malaria detection
lemonviv Jul 20, 2023
c847d46
Merge pull request #1072 from lemonviv/dev-postgresql
lzjpaul Jul 20, 2023
a76f3d7
add mlp model for malaria detection
liuchangshiye Aug 25, 2023
377582f
Merge pull request #1073 from liuchangshiye/mlp-malaria-cnn
nudles Aug 28, 2023
d9831d9
Add the training file for model selection
NLGithubWP Aug 28, 2023
1dc8487
Merge pull request #1074 from NLGithubWP/add_train_mlp
chrishkchris Aug 29, 2023
ba5f573
Add SGD optimizer for model selection
NLGithubWP Aug 30, 2023
b56eaab
Merge pull request #1075 from NLGithubWP/update_train_mlp
chrishkchris Aug 31, 2023
53c664f
Add implementation for a single optimization step in model selection
NLGithubWP Aug 31, 2023
1e55cfb
Merge pull request #1076 from NLGithubWP/single_opt_ms
lzjpaul Sep 1, 2023
dc1a7f3
Increase the step counter, learning rate for the model selection SGD
liuchangshiye Sep 1, 2023
bb878cd
Merge pull request #1078 from liuchangshiye/model-selection-psql
lzjpaul Sep 4, 2023
7f0152a
Add implementation for get and set states
NLGithubWP Sep 2, 2023
af1634c
Merge pull request #1080 from NLGithubWP/update_train_mlp_
nudles Sep 5, 2023
96b0799
Add the implementation for model selection optimizer
daoducanhc Sep 5, 2023
5bdb8fb
Merge pull request #1082 from daoducanhc/chris-sep-5th
chrishkchris Sep 5, 2023
3f1d9cc
Add implementation for mlp models in model selection
Sep 5, 2023
d47766c
Merge pull request #1083 from zmeihui/23-9-5-ms
lzjpaul Sep 6, 2023
94e8e43
Add implementation for sum error loss for model selection
daoducanhc Sep 6, 2023
210d7a6
Merge pull request #1085 from daoducanhc/chris-sep-6th
chrishkchris Sep 6, 2023
0c151e7
Add the backward implementation for sum error loss
NLGithubWP Sep 6, 2023
ec2d81c
Merge pull request #1086 from NLGithubWP/add_model
lzjpaul Sep 6, 2023
2cf2a5b
Add the SumErrorLayer implementation
liuchangshiye Sep 6, 2023
e6cb95f
Merge pull request #1088 from liuchangshiye/model-selection-psql-msmlp
nudles Sep 7, 2023
05cd43c
Create the ms_model_mlp folder for dynamic model creation
Sep 7, 2023
915d7d9
Merge pull request #1089 from zmeihui/23-9-7-ms-model
lzjpaul Sep 7, 2023
d14fc3b
Add the autograd implementation for msmlp
daoducanhc Sep 7, 2023
c5f210e
Merge pull request #1090 from daoducanhc/chris-sep-7th
chrishkchris Sep 7, 2023
62a94cd
Add train one batch function for the dynamic model
KimballCai Sep 7, 2023
5b36459
Merge pull request #1091 from KimballCai/dev-postgresql
lzjpaul Sep 7, 2023
9e955b1
Add create model function for the dynamic model
liuchangshiye Sep 8, 2023
20c9d79
Merge pull request #1092 from liuchangshiye/model-selection-psql-ms-m…
lzjpaul Sep 8, 2023
f84ebe5
Add training process for the dynamic model
NLGithubWP Sep 8, 2023
5df0cec
Merge pull request #1093 from NLGithubWP/update_model
lzjpaul Sep 8, 2023
41e260d
Add autograd implementation for dynamic model creation
Sep 11, 2023
9d89392
Merge pull request #1094 from zmeihui/23-9-11-autograd
chrishkchris Sep 11, 2023
1d2f6e7
Debug the training process for training free model evaluation metric
NLGithubWP Sep 12, 2023
7154dad
Merge pull request #1095 from NLGithubWP/dev-postgresql
lzjpaul Sep 12, 2023
369f0f2
Training process for mlp models with varying layer sizes
lzjpaul Sep 14, 2023
cc741f5
Merge pull request #1096 from lzjpaul/23-9-14-ms
chrishkchris Sep 14, 2023
2c955ee
Training script for model selection
lzjpaul Sep 15, 2023
aa5cc42
Merge pull request #1097 from lzjpaul/23-9-15-ms
chrishkchris Sep 15, 2023
957caaf
Update the training file for dynamic models
Wu-Junran Sep 16, 2023
95a5536
Merge pull request #1099 from Wu-Junran/junran-dev-postgresql
lzjpaul Sep 16, 2023
593dc78
Add the doap file for the project
Sep 20, 2023
22252e7
Merge pull request #1100 from zmeihui/23-9-20-doap
lzjpaul Sep 20, 2023
41a3772
Add the training script for CNN models
daoducanhc Sep 22, 2023
931a312
Merge pull request #1101 from daoducanhc/chris-sep-22th
lzjpaul Sep 22, 2023
a7f59a1
Add the training script for models using MPI
NLGithubWP Sep 23, 2023
41aa437
Merge pull request #1102 from NLGithubWP/add_train_mpi
lzjpaul Sep 23, 2023
c7752fa
Model class file for dynamic models
NLGithubWP Oct 2, 2023
5c71962
Merge pull request #1103 from NLGithubWP/dynamic_models
lzjpaul Oct 2, 2023
5b8d928
Add the ModelMeta for dynamic models
lzjpaul Oct 3, 2023
647e5f3
Merge pull request #1104 from lzjpaul/23-10-3-modelmeta
chrishkchris Oct 3, 2023
26deb2c
Update the ModelMeta class
daoducanhc Oct 5, 2023
a73c7b2
Merge pull request #1105 from daoducanhc/chris-oct-5th
chrishkchris Oct 5, 2023
8f3a7ab
Add the running script for model selection
Oct 6, 2023
aa683ce
Merge pull request #1106 from zmeihui/23-10-6-run
lzjpaul Oct 6, 2023
1e4e272
Add the training script for model selection
NLGithubWP Oct 7, 2023
772ba55
Merge pull request #1107 from NLGithubWP/add_ms_model
lzjpaul Oct 7, 2023
3b9658f
Add data processing functions for model selection
lzjpaul Oct 8, 2023
30b7980
Merge pull request #1108 from lzjpaul/23-10-8-data-process
chrishkchris Oct 9, 2023
3ce7c33
Add the optimizer for model selection
Oct 9, 2023
9d8a2f5
Merge pull request #1109 from zmeihui/23-10-9-optimizer
lzjpaul Oct 10, 2023
063effd
Add the implementation of MSSGD for model selection
daoducanhc Oct 11, 2023
830e1f7
Merge pull request #1110 from daoducanhc/chris-oct-11th
lzjpaul Oct 11, 2023
619d119
Add the implementation of a single optimization step for model selection
NLGithubWP Oct 12, 2023
31ea937
Merge pull request #1111 from NLGithubWP/add_opt_ms
chrishkchris Oct 12, 2023
850bcf1
Add the implementation for the model selection example
lzjpaul Oct 24, 2023
b911ce2
Merge pull request #1112 from lzjpaul/23-10-24-v410
chrishkchris Oct 24, 2023
daa9c17
Update license headers for v4.1.0
lzjpaul Oct 25, 2023
3990365
Merge pull request #1113 from lzjpaul/23-10-24-v410-1
chrishkchris Oct 25, 2023
b25cd92
Add the release notes for v4.1.0
Oct 25, 2023
17ac0ad
Merge pull request #1114 from zmeihui/23-10-25-release-note
lzjpaul Oct 25, 2023
Update the training file for dynamic models
Wu-Junran committed Sep 16, 2023
commit 957caaf36640fc4fe74a6ea1a42bf71919115703
25 changes: 21 additions & 4 deletions examples/model_selection_psql/ms_mlp/train_mlp.py
@@ -349,6 +349,19 @@ def run(global_rank,
         from msmlp import model
         model = model.create_model(data_size=data_size,
                                    num_classes=num_classes)
+
+    elif model == 'ms_model_mlp':
+        import os, sys, inspect
+        current = os.path.dirname(
+            os.path.abspath(inspect.getfile(inspect.currentframe())))
+        parent = os.path.dirname(current)
+        sys.path.insert(0, parent)
+        from ms_model_mlp import model
+        model = model.create_model(data_size=data_size,
+                                   num_classes=num_classes,
+                                   layer_hidden_list=layer_hidden_list)
+        # print ("model: \n", model)
+

     # For distributed training, sequential has better performance
     if hasattr(mssgd, "communicator"):
@@ -399,8 +412,8 @@ def run(global_rank,
         print ("num_train_batch: \n", num_train_batch)
         print ()
         for b in range(num_train_batch):
-            if b % 200 == 0:
-                print ("b: \n", b)
+            # if b % 200 == 0:
+            #     print ("b: \n", b)
             # Generate the patch data in this iteration
             x = train_x[idx[b * batch_size:(b + 1) * batch_size]]
             if model.dimension == 4:
@@ -422,6 +435,7 @@ def run(global_rank,
                 ty.copy_from_numpy(y)
                 ### step 2: all weights turned to positive (done)
                 ### step 3: new loss (done)
+                ### print ("before model forward ...")
                 pn_p_g_list, out, loss = model(tx, ty, dist_option, spars, synflow_flag)
                 ### step 4: calculate the multiplication of weights
                 synflow_score = 0.0
@@ -430,11 +444,13 @@ def run(global_rank,
                     if len(pn_p_g_item[1].shape) == 2: # param_value.data is "weight"
                         print ("pn_p_g_item[1].shape: \n", pn_p_g_item[1].shape)
                         synflow_score += np.sum(np.absolute(tensor.to_numpy(pn_p_g_item[1]) * tensor.to_numpy(pn_p_g_item[2])))
+                print ("layer_hidden_list: \n", layer_hidden_list)
                 print ("synflow_score: \n", synflow_score)
             elif epoch == (max_epoch - 1) and b == (num_train_batch - 2): # all weights turned to positive
                 # Copy the patch data into input tensors
                 tx.copy_from_numpy(x)
                 ty.copy_from_numpy(y)
+                # print ("before model forward ...")
                 pn_p_g_list, out, loss = model(tx, ty, dist_option, spars, synflow_flag)
                 train_correct += accuracy(tensor.to_numpy(out), y)
                 train_loss += tensor.to_numpy(loss)[0]
@@ -449,6 +465,7 @@ def run(global_rank,
                 # print ("normal before model(tx, ty, synflow_flag, dist_option, spars)")
                 # print ("train_cnn tx: \n", tx)
                 # print ("train_cnn ty: \n", ty)
+                # print ("before model forward ...")
                 pn_p_g_list, out, loss = model(tx, ty, dist_option, spars, synflow_flag)
                 # print ("normal after model(tx, ty, synflow_flag, dist_option, spars)")
                 train_correct += accuracy(tensor.to_numpy(out), y)
@@ -500,7 +517,7 @@ def run(global_rank,
         description='Training using the autograd and graph.')
     parser.add_argument(
         'model',
-        choices=['cnn', 'resnet', 'xceptionnet', 'mlp', 'msmlp', 'alexnet'],
+        choices=['cnn', 'resnet', 'xceptionnet', 'mlp', 'msmlp', 'alexnet', 'ms_model_mlp'],
         default='cnn')
     parser.add_argument('data',
                         choices=['mnist', 'cifar10', 'cifar100'],
@@ -511,7 +528,7 @@ def run(global_rank,
                         dest='precision')
     parser.add_argument('-m',
                         '--max-epoch',
-                        default=10,
+                        default=3,
                         type=int,
                         help='maximum epochs',
                         dest='max_epoch')
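
Note on the `synflow_score` accumulation in the hunk at -430,11 +444,13: the loop appears to sum the absolute value of the element-wise product of each 2-D parameter and the tensor stored next to it, skipping anything that is not 2-D (such as biases). The sketch below reproduces that arithmetic in plain Python so it can be checked without SINGA. It rests on assumptions that are not confirmed by the diff: that each `pn_p_g_list` item is a `(name, param, grad)` triple, and that nested lists can stand in for tensors. The helper names `is_matrix` and `synflow_score` are hypothetical and are not part of `train_mlp.py`.

```python
def is_matrix(value):
    """Return True for a non-empty 2-D nested sequence (rows of numbers)."""
    return (isinstance(value, (list, tuple)) and len(value) > 0
            and all(isinstance(row, (list, tuple)) for row in value))


def synflow_score(pn_p_g_list):
    """Sum |param * grad| element-wise over every 2-D parameter.

    Assumes each item is a (name, param, grad) triple; this layout is
    inferred from the indices [1] and [2] used in the diff.
    """
    score = 0.0
    for _name, param, grad in pn_p_g_list:
        if not is_matrix(param):
            continue  # skip 1-D entries, mirroring the len(shape) == 2 check
        for p_row, g_row in zip(param, grad):
            score += sum(abs(p * g) for p, g in zip(p_row, g_row))
    return score


# Made-up values for illustration only.
example = [
    ("fc1.W", [[1.0, -2.0], [0.5, 4.0]], [[0.5, 0.5], [-2.0, 0.25]]),
    ("fc1.b", [0.1, 0.2], [1.0, 1.0]),  # 1-D, so it is ignored
]
print(synflow_score(example))  # 0.5 + 1.0 + 1.0 + 1.0 = 3.5
```

In the commit itself the same quantity is computed with `np.sum(np.absolute(...))` over `tensor.to_numpy(...)` outputs; the pure-Python version only makes the per-element arithmetic explicit.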