@@ -523,19 +523,279 @@ If you would like to contribute to new features, check our [call for contributio
## Examples, tutorials and demos

A series of [examples](https://github.com/pytorch/rl/blob/main/examples/) are provided with an illustrative purpose:
- - [DQN](https://github.com/pytorch/rl/blob/main/sota-implementations/dqn)
- - [DDPG](https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py)
- - [IQL](https://github.com/pytorch/rl/blob/main/sota-implementations/iql/iql_offline.py)
- - [CQL](https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py)
- - [TD3](https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py)
- - [TD3+BC](https://github.com/pytorch/rl/blob/main/sota-implementations/td3+bc/td3+bc.py)
- - [A2C](https://github.com/pytorch/rl/blob/main/examples/a2c_old/a2c.py)
- - [PPO](https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/ppo.py)
- - [SAC](https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py)
- - [REDQ](https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py)
- - [Dreamer](https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py)
- - [Decision Transformers](https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer)
- - [RLHF](https://github.com/pytorch/rl/blob/main/examples/rlhf)
+
+ <table>
+ <tr>
+ <td><strong>Algorithm</strong>
+ </td>
+ <td><strong>Compile Support**</strong>
+ </td>
+ <td><strong>Tensordict-free API</strong>
+ </td>
+ <td><strong>Modular Losses</strong>
+ </td>
+ <td><strong>Continuous and Discrete</strong>
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/dqn">DQN</a>
+ </td>
+ <td> 1.53x
+ </td>
+ <td> +
+ </td>
+ <td> NA
+ </td>
+ <td> + (through <a href="https://pytorch.org/rl/stable/reference/generated/torchrl.envs.transforms.ActionDiscretizer.html?highlight=actiondiscretizer">ActionDiscretizer</a> transform)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py">DDPG</a>
+ </td>
+ <td> 1.54x
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/iql/">IQL</a>
+ </td>
+ <td> 2.55x
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py">CQL</a>
+ </td>
+ <td> 1.91x
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py">TD3</a>
+ </td>
+ <td> 1.79x
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <a href="https://github.com/pytorch/rl/blob/main/sota-implementations/td3_bc/td3_bc.py">TD3+BC</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <a href="https://github.com/pytorch/rl/blob/main/examples/a2c/">A2C</a>
+ </td>
+ <td> 1.76x
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td>
+ <a href="https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/">PPO</a>
+ </td>
+ <td> 2.67x
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py">SAC</a>
+ </td>
+ <td> 2.01x
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py">REDQ</a>
+ </td>
+ <td> 2.35x
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py">Dreamer v1</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> + (<a href="https://pytorch.org/rl/stable/reference/objectives.html#dreamer">different classes</a>)
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer">Decision Transformers</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> NA
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/crossq">CrossQ</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/gail">Gail</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> NA
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/impala">Impala</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/iql.py">IQL (MARL)</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/maddpg_iddpg.py">DDPG (MARL)</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> +
+ </td>
+ <td> - (continuous only)
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/mappo_ippo.py">PPO (MARL)</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/qmix_vdn.py">QMIX-VDN (MARL)</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> NA
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/sac.py">SAC (MARL)</a>
+ </td>
+ <td> untested
+ </td>
+ <td> +
+ </td>
+ <td> -
+ </td>
+ <td> +
+ </td>
+ </tr>
+ <tr>
+ <td><a href="https://github.com/pytorch/rl/blob/main/examples/rlhf">RLHF</a>
+ </td>
+ <td> NA
+ </td>
+ <td> +
+ </td>
+ <td> NA
+ </td>
+ <td> NA
+ </td>
+ </tr>
+ </table>
+
+ ** The number indicates the expected speed-up compared to eager mode when executed on CPU. Numbers may vary depending on
+ architecture and device.

and many more to come!
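
The speed-up figures in the Compile Support column are ratios against eager mode. The README does not say how they were measured, so the following is only a minimal sketch of one way such a ratio could be computed, assuming speed-up is defined as eager runtime divided by compiled runtime. The helper names `mean_runtime` and `speedup` are hypothetical and are not part of TorchRL.

```python
import time
from typing import Callable


def mean_runtime(fn: Callable[[], object], warmup: int = 3, iters: int = 20) -> float:
    """Average wall-clock seconds per call, measured after some warm-up calls."""
    for _ in range(warmup):
        fn()  # warm-up absorbs one-off costs (e.g. compilation, cache fills)
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    return (time.perf_counter() - start) / iters


def speedup(baseline_seconds: float, candidate_seconds: float) -> float:
    """Assumed definition: baseline time / candidate time (>1 means faster)."""
    if candidate_seconds <= 0:
        raise ValueError("candidate_seconds must be positive")
    return baseline_seconds / candidate_seconds
```

With PyTorch installed, the baseline could be a closure around one eager update step and the candidate the same step wrapped with `torch.compile`. The warm-up calls matter in that case because compilation happens on the first invocations, and as the note above says, the resulting number depends on architecture and device.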