adding LSTM support to pretrain #315
base: master
Conversation
add frame index alignment to ExpertDataset
update fork
# Conflicts:
#	stable_baselines/gail/dataset/dataset.py
#	stable_baselines/trpo_mpi/trpo_mpi.py
#	tests/test_gail.py
The problem is that
convert float envs_per_batch into int envs_per_batch
Hello, do you consider this PR ready for review? (After a quick look, I saw that a saved file was still there (nano) and there seems to be some code duplication that can be improved ;))
o/
I removed nano.
I am not quite sure, but I think you are referring to the code duplication. The only thing I still plan to do is to add a bit more functionality to it. The code that is already there is finalized so far and can be reviewed.
The tests have failed after updating my branch.
You can ignore this, I've attempted a fix in #467 |
Now that my PR has finally passed all the unit tests again, could you start reviewing it, so that I can then change it if necessary?
Please remove the test_recorded_images folder too.
stable_baselines/a2c/a2c.py
Outdated
@@ -87,9 +87,20 @@ def __init__(self, policy, env, gamma=0.99, n_steps=5, vf_coef=0.25, ent_coef=0.

    def _get_pretrain_placeholders(self):
        policy = self.train_model

        if self.initial_state is None:
You can do:
states_ph, snew_ph, dones_ph = None, None, None
so it's more compact, same for the else case
I think having the variable declarations both vertical and horizontal interrupts the reading flow. Yes, it would make the code shorter, but also less readable in my opinion. But I will change it if you really want it that way.
stable_baselines/acer/acer_simple.py
Outdated
@@ -152,8 +152,18 @@ def __init__(self, policy, env, gamma=0.99, n_steps=20, num_procs=1, q_coef=0.5,

    def _get_pretrain_placeholders(self):
        policy = self.step_model
        action_ph = policy.pdtype.sample_placeholder([None])

        if self.initial_state is None:
same remark as before
stable_baselines/a2c/a2c.py
Outdated
@@ -87,9 +87,20 @@ def __init__(self, policy, env, gamma=0.99, n_steps=5, vf_coef=0.25, ent_coef=0.

    def _get_pretrain_placeholders(self):
        policy = self.train_model

        if self.initial_state is None:
You should rather check the recurrent attribute of the policy; it is in the base policy class.
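What the reviewer suggests can be sketched roughly as follows. This is a simplified stand-in, not the library's actual code: the `recurrent` flag mirrors the attribute on stable-baselines' base policy class, and the placeholder strings stand in for real TensorFlow placeholders.

```python
class BasePolicy:
    """Simplified stand-in for stable-baselines' base policy class."""
    recurrent = False  # LSTM policies override this to True


class LstmPolicy(BasePolicy):
    recurrent = True


def get_pretrain_placeholders(policy):
    # Branch on the recurrent flag every policy already carries,
    # instead of testing whether self.initial_state is None.
    if policy.recurrent:
        return "obs_ph", "action_ph", "states_ph", "dones_ph"
    return "obs_ph", "action_ph", None, None
```

The advantage is that the check expresses intent directly and needs no extra attribute on the model.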
@@ -50,6 +50,10 @@ def __init__(self, policy, env, verbose=0, *, requires_vec_env, policy_base, pol
        self.sess = None
        self.params = None
        self._param_load_ops = None
        self.initial_state = None
you don't need that variable, there is the recurrent attribute for that
@@ -246,13 +250,24 @@ def pretrain(self, dataset, n_epochs=10, learning_rate=1e-4,
        else:
            val_interval = int(n_epochs / 10)

        use_lstm = self.initial_state is not None
same remark, you can use the recurrent attribute
@@ -272,13 +287,23 @@ def pretrain(self, dataset, n_epochs=10, learning_rate=1e-4,

        for epoch_idx in range(int(n_epochs)):
            train_loss = 0.0
            if use_lstm:
                state = self.initial_state[:envs_per_batch]
initial state is an attribute of the policy
Yes and no. All the models which can use LSTM policies have the variable self.initial_state, which gets set to the initial state from the policy. The variable self.initial_state gets used, not the one in the policy. It is also not that easy to access the initial state from the BaseRLModel. It was much simpler to add the self.initial_state variable to the base model and then let it be overwritten later at model initialization.
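The pattern described above, declaring the attribute on the base model and overwriting it when a recurrent model builds its policy, might look like this minimal sketch. The class names and the shape of the zero state are illustrative assumptions, not the library's actual implementation:

```python
class BaseRLModel:
    """Stand-in for stable-baselines' BaseRLModel."""
    def __init__(self):
        # Declared here so pretrain() can rely on the attribute
        # existing, even for non-recurrent models where it stays None.
        self.initial_state = None


class RecurrentModel(BaseRLModel):
    """Stand-in for a model using an LSTM policy."""
    def __init__(self, n_envs, n_lstm=8):
        super().__init__()
        # Overwrite with the policy's zero state at model
        # initialization: one row per env, cell + hidden state.
        self.initial_state = [[0.0] * (2 * n_lstm) for _ in range(n_envs)]
```

This is why `pretrain` can test `self.initial_state is not None` uniformly, though the reviewer prefers the policy's `recurrent` flag for the same purpose.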
        if use_lstm:
            feed_dict.update({states_ph: state, dones_ph: expert_mask})
            val_loss_, = self.sess.run([loss], feed_dict)
You only need to update the feed_dict; self.sess.run can be called outside, so you avoid code duplication.
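The reviewer's point can be sketched like this: branch only on building the feed dict and keep a single run call outside the branch. To stay self-contained this uses a stand-in `run` callable and string keys rather than TensorFlow itself; all names here are illustrative.

```python
def compute_loss(run, loss_op, obs, actions,
                 state=None, mask=None, use_lstm=False):
    # Common placeholders first...
    feed_dict = {"obs_ph": obs, "actions_ph": actions}
    # ...then the LSTM-only extras, added conditionally.
    if use_lstm:
        feed_dict.update({"states_ph": state, "dones_ph": mask})
    # A single run call outside the branch avoids duplicating it
    # in both the recurrent and non-recurrent paths.
    return run([loss_op], feed_dict)


def fake_run(fetches, feed_dict):
    """Stand-in for sess.run: just reports which keys were fed."""
    return sorted(feed_dict)
```

With a real session, `fake_run` would simply be `sess.run` and the keys would be placeholder tensors.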
:param batch_size: (int) the minibatch size for behavior cloning
:param traj_limitation: (int) the number of trajectory to use (if -1, load all)
:param randomize: (bool) if the dataset should be shuffled
:param randomize: (bool) if the dataset should be shuffled, this will be overwritten to False
    if LSTM is True.
Where is the LSTM variable?
except StopIteration:
    dataloader = iter(dataloader)
    return next(dataloader)

if traj_data is not None and expert_path is not None:
Can't this be checked in the base class? Looks like duplicated code. Also, I'm not sure two classes are needed...
I originally had it in one class, but someone who used the PR suggested splitting it into two classes. I think that was a good idea, because it clearly improved the user-friendliness of the ExpertDataset class.
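The restart-on-StopIteration pattern shown earlier in the diff can be sketched with a small re-iterable loader. The class and helper names here are hypothetical stand-ins for the PR's dataloader, which works the same way in spirit: when a pass over the data is exhausted, a fresh iterator is created so minibatch fetching can continue indefinitely.

```python
class ExpertMiniBatchLoader:
    """Re-iterable loader: each iter() starts a fresh pass."""
    def __init__(self, data, batch_size):
        self.data = data
        self.batch_size = batch_size

    def __iter__(self):
        for i in range(0, len(self.data), self.batch_size):
            yield self.data[i:i + self.batch_size]


def next_batch(loader, it):
    # Mirrors the try/except in the diff: on exhaustion, restart
    # from the loader and return the first batch of the new pass.
    try:
        return next(it), it
    except StopIteration:
        it = iter(loader)
        return next(it), it
```

Note this only works because the loader is an iterable (defines `__iter__` as a generator), not a bare iterator: calling `iter()` on an already-exhausted iterator would just return the same exhausted object.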
tests/test_gail.py
Outdated
model.pretrain(dataset, n_epochs=20)
model.save("test-pretrain")
del dataset, model

@pytest.mark.parametrize("model_class_data", [[A2C, 4, True, "MlpLstmPolicy",
Looks like duplicated code; I think you can handle the different types of policy and expert path in the if.
…aselines into LSTM-pretrain
# Conflicts:
#	stable_baselines/gail/dataset/dataset.py
Hi everyone, I would like to know whether there is still active work on "LSTM support to pretrain". I've seen that this feature was removed from the v2.8.0 milestone more than a month ago. Is the work on hold? Kind regards!
@skervim |
Hello @skervim, as @XMaster96 says, you can use his fork for now if you want to try the feature.
Are you referring to the requested changes? I have partially implemented them and partially commented on why I think I shouldn't change that. Or are you referring to future change requests? I am also aware that I haven't yet written documentation for the website; I was planning to do that when everything is OK and merge-ready.
This PR adds LSTM support to pretrain. I am not quite done yet, but there are some implementation matters that I need to discuss first.

Personal edit: I finally found the time to work more on this PR. The problem is that I took so long that I forgot half of what I did, so if there is some rough code in there, that is why. I still do not have that much time, so do not expect me to answer immediately.
closes #253