Commit b67d893

[Doc] Document losses in README.md
ghstack-source-id: b75d4e08349532b001c91ea3ae5f1e796de26ec5
Pull Request resolved: #2408
1 parent b1eedbf commit b67d893

1 file changed

+273
-13
lines changed

README.md

+273-13
@@ -523,19 +523,279 @@ If you would like to contribute to new features, check our [call for contributio
 ## Examples, tutorials and demos

 A series of [examples](https://github.com/pytorch/rl/blob/main/examples/) are provided with an illustrative purpose:
-- [DQN](https://github.com/pytorch/rl/blob/main/sota-implementations/dqn)
-- [DDPG](https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py)
-- [IQL](https://github.com/pytorch/rl/blob/main/sota-implementations/iql/iql_offline.py)
-- [CQL](https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py)
-- [TD3](https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py)
-- [TD3+BC](https://github.com/pytorch/rl/blob/main/sota-implementations/td3+bc/td3+bc.py)
-- [A2C](https://github.com/pytorch/rl/blob/main/examples/a2c_old/a2c.py)
-- [PPO](https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/ppo.py)
-- [SAC](https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py)
-- [REDQ](https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py)
-- [Dreamer](https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py)
-- [Decision Transformers](https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer)
-- [RLHF](https://github.com/pytorch/rl/blob/main/examples/rlhf)
+
+<table>
+<tr>
+<td><strong>Algorithm</strong>
+</td>
+<td><strong>Compile Support**</strong>
+</td>
+<td><strong>Tensordict-free API</strong>
+</td>
+<td><strong>Modular Losses</strong>
+</td>
+<td><strong>Continuous and Discrete</strong>
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/dqn">DQN</a>
+</td>
+<td> 1.53x
+</td>
+<td> +
+</td>
+<td> NA
+</td>
+<td> + (through <a href="https://pytorch.org/rl/stable/reference/generated/torchrl.envs.transforms.ActionDiscretizer.html?highlight=actiondiscretizer">ActionDiscretizer</a> transform)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/ddpg/ddpg.py">DDPG</a>
+</td>
+<td> 1.54x
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/iql/">IQL</a>
+</td>
+<td> 2.55x
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/cql/cql_offline.py">CQL</a>
+</td>
+<td> 1.91x
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/td3/td3.py">TD3</a>
+</td>
+<td> 1.79x
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td>
+<a href="https://github.com/pytorch/rl/blob/main/sota-implementations/td3_bc/td3_bc.py">TD3+BC</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td>
+<a href="https://github.com/pytorch/rl/blob/main/examples/a2c/">A2C</a>
+</td>
+<td> 1.76x
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td>
+<a href="https://github.com/pytorch/rl/blob/main/sota-implementations/ppo/">PPO</a>
+</td>
+<td> 2.67x
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/sac/sac.py">SAC</a>
+</td>
+<td> 2.01x
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/redq/redq.py">REDQ</a>
+</td>
+<td> 2.35x
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/dreamer/dreamer.py">Dreamer v1</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> + (<a href="https://pytorch.org/rl/stable/reference/objectives.html#dreamer">different classes</a>)
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/decision_transformer">Decision Transformers</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> NA
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/crossq">CrossQ</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/gail">Gail</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> NA
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/impala">Impala</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/iql.py">IQL (MARL)</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/maddpg_iddpg.py">DDPG (MARL)</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> +
+</td>
+<td> - (continuous only)
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/mappo_ippo.py">PPO (MARL)</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/qmix_vdn.py">QMIX-VDN (MARL)</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> NA
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/sota-implementations/multiagent/sac.py">SAC (MARL)</a>
+</td>
+<td> untested
+</td>
+<td> +
+</td>
+<td> -
+</td>
+<td> +
+</td>
+</tr>
+<tr>
+<td><a href="https://github.com/pytorch/rl/blob/main/examples/rlhf">RLHF</a>
+</td>
+<td> NA
+</td>
+<td> +
+</td>
+<td> NA
+</td>
+<td> NA
+</td>
+</tr>
+</table>
+
+** The number indicates expected speed-up compared to eager mode when executed on CPU. Numbers may vary depending on
+architecture and device

 and many more to come!

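Reviewer note on the "Compile Support" column: the footnote defines each figure as a speed-up relative to eager mode on CPU. A minimal sketch of how such a ratio could be measured is shown here. The helper names `median_runtime` and `speedup` are hypothetical and not part of TorchRL, and the commit does not state how the published numbers were obtained. With PyTorch, the two timed callables would presumably be a function and its `torch.compile`-wrapped counterpart; plain Python callables work equally well for the sketch.

```python
import statistics
import time
from typing import Callable


def median_runtime(fn: Callable[[], object], warmup: int = 3, repeats: int = 10) -> float:
    """Return the median wall-clock time, in seconds, of one call to fn.

    Hypothetical helper: warm-up calls run first so that one-off costs
    (for example, compilation on the first call) are not timed.
    """
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def speedup(eager_seconds: float, compiled_seconds: float) -> float:
    """Ratio of eager runtime to compiled runtime; values above 1 mean faster."""
    if compiled_seconds <= 0:
        raise ValueError("compiled_seconds must be positive")
    return eager_seconds / compiled_seconds
```

Under this definition, an entry such as `1.53x` would correspond to `speedup(median_runtime(eager_fn), median_runtime(compiled_fn))` returning about 1.53, where `eager_fn` and `compiled_fn` are placeholders for the two variants of the same training step.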