
Commit

reset default eval_batch_size
kervias committed Feb 25, 2024
1 parent 09c0693 commit 3f659ef
Showing 6 changed files with 8 additions and 5 deletions.
Binary file added docs/source/assets/dataflow.jpg
2 changes: 1 addition & 1 deletion docs/source/features/standard_datamodule.md
@@ -5,7 +5,7 @@ For data module, we provide a standardized design with three protocols (see foll
 - Middle Data Format Protocol
 - Atomic Operation Protocol
 
-![](https://s2.loli.net/2024/02/09/trocOIdR4wxXJDQ.png)
+![](../assets/dataflow.jpg)
 
 The first step of Data Templates is to load the raw data from the hard disk. Then, a series of processing steps are performed to obtain model-friendly data objects. Finally, these data objects are passed on to other modules.
 We simplify the data preparation into three stages:
6 changes: 4 additions & 2 deletions edustudio/datatpl/common/general_datatpl.py
@@ -541,8 +541,10 @@ def _preprocess_feat(df):
 for col in df.columns:
     col_name, col_type = col.split(":")
     if col_type == 'token':
-        # df[col] = df[col].astype('int64')
-        pass
+        try:
+            df[col] = df[col].astype('int64')
+        except:
+            pass
     elif col_type == 'float':
         df[col] = df[col].astype('float32')
     elif col_type == 'token_seq':
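
The hunk above attempts an `int64` cast on `token` columns and silently keeps the original values when the cast fails. A minimal standalone sketch of the same idea follows; the function name `cast_columns` and the demo column names are hypothetical, and catching `(ValueError, TypeError)` instead of using a bare `except:` is a suggested variation, not what the commit does.

```python
import pandas as pd


def cast_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Cast columns named '<name>:<type>' according to their type suffix."""
    for col in df.columns:
        _, col_type = col.split(":")
        if col_type == "token":
            try:
                df[col] = df[col].astype("int64")
            except (ValueError, TypeError):
                # Tokens that are not integer-like (e.g. string IDs)
                # are left unchanged.
                pass
        elif col_type == "float":
            df[col] = df[col].astype("float32")
    return df


# Hypothetical demo data: one integer-like token column, one string token
# column, and one float column.
demo = pd.DataFrame({
    "stu_id:token": ["1", "2"],
    "exer_name:token": ["a", "b"],
    "score:float": [0.5, 1.0],
})
demo = cast_columns(demo)
```

Narrowing the exception types keeps unrelated failures, such as a `KeyboardInterrupt`, from being swallowed along with the expected cast errors.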
1 change: 1 addition & 0 deletions edustudio/traintpl/general_traintpl.py
@@ -18,6 +18,7 @@ class GeneralTrainTPL(GDTrainTPL):
     'unsave_best_epoch_pth': True,
     'ignore_metrics_in_train': [],
     'batch_size': 32,
+    'eval_batch_size': 32
 }

 def _check_params(self):
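
With `eval_batch_size` now defaulting to 32 independently of `batch_size`, a run that overrides only `batch_size` would still evaluate with the default. The sketch below illustrates that behavior with a plain dictionary merge; `DEFAULT_CFG` and `merge_cfg` are hypothetical names, and the merge logic is an assumption for illustration, not EduStudio's actual config resolution.

```python
# Hypothetical defaults mirroring the two keys touched by this commit.
DEFAULT_CFG = {"batch_size": 32, "eval_batch_size": 32}


def merge_cfg(defaults: dict, overrides: dict) -> dict:
    """Return a copy of `defaults` with user-supplied `overrides` applied."""
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# Overriding only batch_size leaves eval_batch_size at its default.
partial = merge_cfg(DEFAULT_CFG, {"batch_size": 1024})

# Overriding both keys, as the CDGK demo in this commit does.
full = merge_cfg(DEFAULT_CFG, {"batch_size": 1024, "eval_batch_size": 1024})
```

Under this assumption, setting both keys explicitly is what keeps training and evaluation batch sizes aligned.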
3 changes: 2 additions & 1 deletion examples/single_model/run_cdgk_demo.py
@@ -11,7 +11,8 @@
 cfg_file_name=None,
 traintpl_cfg_dict={
     'cls': 'GeneralTrainTPL',
-    'batch_size': 1024
+    'batch_size': 1024,
+    'eval_batch_size': 1024
 },
 datatpl_cfg_dict={
     'cls': 'CDGKDataTPL',
1 change: 0 additions & 1 deletion examples/single_model/run_dimkt_demo.py
@@ -14,7 +14,6 @@
 },
 traintpl_cfg_dict={
     'cls': 'GeneralTrainTPL',
-    'device': 'cpu',
 },
 modeltpl_cfg_dict={
     'cls': 'DIMKT',
