Bug Fix

Fix tqdm issue (#481)
Fix atari wrapper to be deterministic (#467)
Add writer.flush() in TensorboardLogger to ensure results are logged in real time (#485)
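
The reason an explicit flush matters for real-time logging can be illustrated with plain buffered file I/O. This is only an analogy under stated assumptions: it uses Python's built-in file objects, not TensorboardLogger or the TensorBoard writer, and the file name and log line are made up for the example.

```python
import os
import tempfile

# Hypothetical log file standing in for an event file on disk.
path = os.path.join(tempfile.mkdtemp(), "events.log")

writer = open(path, "w")  # default (buffered) text mode
writer.write("step=1 reward=0.5\n")

# A separate reader (think: a dashboard polling the file) sees nothing yet,
# because the small write is still held in the writer's buffer.
with open(path) as reader:
    before_flush = reader.read()

# Flushing pushes the buffered data to the file.
writer.flush()

with open(path) as reader:
    after_flush = reader.read()

writer.close()

print(repr(before_flush))
print(repr(after_flush))
```

Without the flush, the reader would only see the record once the buffer fills or the writer is closed, which is why a logger that flushes after writing appears to update live.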

Enhancement

Implement set_env_attr and get_env_attr for vector environments (#478)
Implement BCQPolicy and offline_bcq example (#480)
Enable test_collector=None in 3 trainers to turn off testing during training (#485)
Fix an inconsistency in the implementation of Discrete CRR: it now uses the Critic class for its critic, following the convention of other actor-critic policies (#485)
Update several offline policies to use the ActorCritic class for their optimizers, eliminating randomness caused by parameter sharing between actor and critic (#485)
Move Atari offline RL examples to examples/offline and tests to test/offline (#485)
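
As a rough illustration of what set_env_attr and get_env_attr mean for a vector environment, here is a minimal sketch. Only the two method names come from these notes. ToyEnv, ToyVectorEnv, the difficulty attribute, and the (key, value, id) parameter layout are assumptions made for the example, not the library's implementation.

```python
from typing import Any, List, Optional, Sequence, Union


class ToyEnv:
    """Hypothetical stand-in for a single environment."""

    def __init__(self, difficulty: int = 0) -> None:
        self.difficulty = difficulty


class ToyVectorEnv:
    """Hypothetical vector environment that wraps several ToyEnv objects."""

    def __init__(self, envs: Sequence[ToyEnv]) -> None:
        self.envs = list(envs)

    def _resolve_ids(
        self, id: Optional[Union[int, Sequence[int]]]
    ) -> List[int]:
        # None selects every sub-environment; an int selects exactly one.
        if id is None:
            return list(range(len(self.envs)))
        if isinstance(id, int):
            return [id]
        return list(id)

    def get_env_attr(
        self, key: str, id: Optional[Union[int, Sequence[int]]] = None
    ) -> List[Any]:
        # Read the attribute from each selected sub-environment.
        return [getattr(self.envs[i], key) for i in self._resolve_ids(id)]

    def set_env_attr(
        self,
        key: str,
        value: Any,
        id: Optional[Union[int, Sequence[int]]] = None,
    ) -> None:
        # Write the same value to each selected sub-environment.
        for i in self._resolve_ids(id):
            setattr(self.envs[i], key, value)


venv = ToyVectorEnv([ToyEnv() for _ in range(3)])
venv.set_env_attr("difficulty", 2, id=[0, 2])
print(venv.get_env_attr("difficulty"))
```

The point of such accessors is that callers can read or change an attribute on some or all sub-environments without reaching into the wrapper's internals, which matters when the sub-environments live in other processes.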