Commit 324c436

chore: Merge branch 'fsdp2_min_integration' into fix_failing_test_for_fsdp2_integration
2 parents 201d215 + fac98bd commit 324c436

7 files changed, 54 additions and 345 deletions

README.md

Lines changed: 2 additions & 2 deletions
@@ -94,7 +94,7 @@ Example:
 CUDA_VISIBLE_DEVICES=0,1,2,3 torchrun --rdzv-endpoint localhost:29515 \
 --nnodes 1 \
 --nproc_per_node 4 \
-$(which modalities) run --config_file_path configs/pretraining_config.yaml
+$(which modalities) run --config_file_path config_files/training/config_lorem_ipsum_long_fsdp2.yaml
 ```
 
 Explanation:
@@ -111,7 +111,7 @@ Explanation:
 
 * `$(which modalities) run`: This part dynamically finds the path to the Modalities executable and runs it. The run command triggers the main process to start the training.
 
-* `--config_file_path configs/pretraining_config.yaml`: The --config_file_path argument provides the path to the configuration file for the training job. In the example above, it is given by `configs/pretraining_config.yaml`. A configuraton file contains an exhaustive parameterization for all the training components (e.g., dataset, model, optimizer, etc.), making training fully reproducible. An example configuration file can be found [here](tutorials/getting_started/example_config.yaml), and a complete list of components available in Modalities is provided [here](docs/components/components.md).
+* `--config_file_path config_files/training//config_lorem_ipsum_long_fsdp2.yaml`: The --config_file_path argument provides the path to the configuration file for the training job. In the example above, it is given by `config_files/training/config_lorem_ipsum_long_fsdp2.yaml`. A configuraton file contains an exhaustive parameterization for all the training components (e.g., dataset, model, optimizer, etc.), making training fully reproducible. An example configuration file can be found [here](tutorials/getting_started/example_config.yaml), and a complete list of components available in Modalities is provided [here](docs/components/components.md).
 
 If you are a VSCode user, you may want to add this to your `launch.json`:
 ```json
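
The launch command changed in the README hunks above can also be assembled programmatically, for example when scripting several runs. The following is only a sketch: the function name `build_launch_command`, its parameters, and the placeholder executable path are assumptions made for illustration and are not part of Modalities. `CUDA_VISIBLE_DEVICES` would still have to be set in the environment of whatever process executes the command.

```python
import shlex
import shutil
from typing import List, Optional


def build_launch_command(
    config_file_path: str,
    nproc_per_node: int = 4,
    nnodes: int = 1,
    rdzv_endpoint: str = "localhost:29515",
    modalities_executable: Optional[str] = None,
) -> List[str]:
    """Assemble the torchrun argument list from the README example (hypothetical helper)."""
    # Python counterpart of `$(which modalities)`: look the executable up on PATH
    # unless the caller supplies a path explicitly.
    executable = modalities_executable or shutil.which("modalities")
    if executable is None:
        raise FileNotFoundError("modalities executable not found on PATH")
    return [
        "torchrun",
        "--rdzv-endpoint", rdzv_endpoint,
        "--nnodes", str(nnodes),
        "--nproc_per_node", str(nproc_per_node),
        executable, "run",
        "--config_file_path", config_file_path,
    ]


command = build_launch_command(
    "config_files/training/config_lorem_ipsum_long_fsdp2.yaml",
    modalities_executable="/usr/bin/modalities",  # placeholder path, for illustration only
)
print(shlex.join(command))
```

Returning a list rather than a single string keeps the arguments safe to pass to `subprocess.run` without shell quoting; `shlex.join` is used only to display the command.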

config_files/training/config_example_coca.yaml

Lines changed: 17 additions & 3 deletions
@@ -182,9 +182,23 @@ loss_fn:
     target_key: ${settings.referencing_keys.target_key}
     prediction_key: ${settings.referencing_keys.prediction_key}
 
+app_state:
+  component_key: app_state
+  variant_key: raw
+  config:
+    model:
+      instance_key: wrapped_model
+      pass_type: BY_REFERENCE
+    optimizer:
+      instance_key: optimizer
+      pass_type: BY_REFERENCE
+    lr_scheduler:
+      instance_key: lr_scheduler
+      pass_type: BY_REFERENCE
+
 wrapped_model:
   component_key: model
-  variant_key: fsdp_wrapped
+  variant_key: fsdp1_wrapped
   config:
     model:
       instance_key: model
@@ -256,7 +270,7 @@ model_raw:
     bias_attn_pool: False
     epsilon_attn_pool: 1e-5
 
-scheduler:
+lr_scheduler:
   component_key: scheduler
   variant_key: onecycle_lr
   config:
@@ -286,7 +300,7 @@ optimizer:
 
 gradient_clipper:
   component_key: gradient_clipper
-  variant_key: fsdp_logging_only
+  variant_key: fsdp1_logging_only
   config:
     wrapped_model:
       instance_key: wrapped_model
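
The renames in this file (`fsdp_wrapped` to `fsdp1_wrapped`, `fsdp_logging_only` to `fsdp1_logging_only`, and the top-level `scheduler` key to `lr_scheduler`) could be applied to other, already loaded configs with a small helper. This is a rough sketch, not tooling from the repository: `migrate_config` and the rename tables are hypothetical names, and it is an assumption that any `instance_key` pointing at the old `scheduler` key needs the same rename. The helper does not add the new `app_state` block.

```python
import copy
from typing import Any, Dict

# Renames taken from the diff; the table names themselves are hypothetical.
VARIANT_KEY_RENAMES = {
    "fsdp_wrapped": "fsdp1_wrapped",
    "fsdp_logging_only": "fsdp1_logging_only",
}
TOP_LEVEL_KEY_RENAMES = {"scheduler": "lr_scheduler"}


def _rename_in_place(node: Any) -> None:
    """Walk nested dicts/lists and rewrite variant_key and instance_key values."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "variant_key" and value in VARIANT_KEY_RENAMES:
                node[key] = VARIANT_KEY_RENAMES[value]
            elif key == "instance_key" and value in TOP_LEVEL_KEY_RENAMES:
                # Assumption: references to a renamed top-level key must follow it.
                node[key] = TOP_LEVEL_KEY_RENAMES[value]
            else:
                _rename_in_place(value)
    elif isinstance(node, list):
        for item in node:
            _rename_in_place(item)


def migrate_config(config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a migrated copy of `config`; the input dict is left untouched."""
    migrated = {
        TOP_LEVEL_KEY_RENAMES.get(key, key): value
        for key, value in copy.deepcopy(config).items()
    }
    _rename_in_place(migrated)
    return migrated


old_config = {
    "scheduler": {"component_key": "scheduler", "variant_key": "onecycle_lr"},
    "wrapped_model": {"component_key": "model", "variant_key": "fsdp_wrapped"},
    "gradient_clipper": {
        "component_key": "gradient_clipper",
        "variant_key": "fsdp_logging_only",
        "config": {"wrapped_model": {"instance_key": "wrapped_model"}},
    },
    # Hypothetical consumer that referenced the old top-level key.
    "consumer": {"config": {"scheduler": {"instance_key": "scheduler"}}},
}
new_config = migrate_config(old_config)
```

Note that `component_key: scheduler` is deliberately left alone, matching the diff, where only the top-level key and the variant names change.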
